Re: LVM timeout and SANs

 
SOLVED
Guy Humphreys
Valued Contributor

LVM timeout and SANs

This is a new thread carried over from:

http://forums1.itrc.hp.com/service/forums/questionanswer.do?threadId=362426

(I thought I should start a new one instead of hijacking that old thread! :))

Pat, (or whoever else is knowledgeable in this area)

I am just setting up my LUNs to be used on our servers and have been thinking about the LVM issue, discussed above.

Say I wanted to create a 100GB volume. I had intended to just create one 200GB raw LUN with RAID1 on the SAN, bringing the total usable space to 100GB, and then present it directly.
From your experiences I gather that this could be hazardous, so I need to use LVM and increase the timeout value.

Now, do I need to create two separate 100GB raw LUNs and then use LVM, rather than the SAN, to mirror them and give me the usable 100GB?

Thanks as ever for your sage counsel,
Guy

'If it ain't broke, don't fix it!'
Pete Randall
Outstanding Contributor

Re: LVM timeout and SANs

Guy,

In my experience, hardware mirroring outperforms software mirroring hands down, so I would say no, you don't want to use lvm to do your mirroring.


Pete
Alzhy
Honored Contributor

Re: LVM timeout and SANs

Guy.. if your lone SAN array is a controller-based one (i.e. an EVA), then I suggest you do all your striping and RAIDing on the array itself. Whatever the recommended PV timeout values for the array are, you need to set them.

IF, however, you are using multiple "controller centric" arrays (again, say, EVAs) OR "cache centric" arrays (like the XP or Hitachi line), then you will get the best performance by striping (not mirroring, since each LUN presented is already protected at the SAN/array level) across LUNs on different array controllers, controller pairs, or arrays (EVAs). If you've got 4 EVAs, I'd stripe my lvol/volume on the host across 4 disks, with each disk coming from a different EVA. Or 4 or 8 disks if coming from an XP.

Again, for LVM: follow the recommended PV timeout values, if there are any.
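A rough sketch of that host-level striping with HP-UX LVM (the volume group name and device files below are made up for illustration; I'm assuming one LUN presented from each of four arrays):

```shell
# One LUN from each of four arrays (hypothetical device files)
pvcreate /dev/rdsk/c4t0d1
pvcreate /dev/rdsk/c5t0d1
pvcreate /dev/rdsk/c6t0d1
pvcreate /dev/rdsk/c7t0d1

# Create the volume group device file, then the VG itself
mkdir /dev/vgsan01
mknod /dev/vgsan01/group c 64 0x010000
vgcreate /dev/vgsan01 /dev/dsk/c4t0d1 /dev/dsk/c5t0d1 \
    /dev/dsk/c6t0d1 /dev/dsk/c7t0d1

# Stripe a 100GB (102400MB) logical volume across all four PVs, 64KB stripe size
lvcreate -i 4 -I 64 -L 102400 -n lvol_data /dev/vgsan01
```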

Hope this helps.
Hakuna Matata.
Pat Obrien_1
Regular Advisor
Solution

Re: LVM timeout and SANs

Yes, 2 in 2 weeks is unfortunate; however, 5 events in 5 years is better, and 2 of those events were on a model 1, one with VCS 1.21 and the other on VCS 2.003 or 2.005. I also have over a dozen now.

Now to your question: the others really answered it, so I am left to summarize. Two EVAs with host mirroring is the safest, though I still say never use RAID0. I always use RAID5; I don't have funds to mirror EVAs, and don't really care to do RAID1 within the EVA.

Now, of course, all the above experience is with an EVA5K. I am demoing an EVA8K with AutoPath, and AutoPath has a timeout value which I think does the same as LVM's, but I'm not sure.

I get my best performance creating 2 x 100GB LUNs, preferring each to opposite controllers, and then host-striping them together into a 200GB volume and filesystem.
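That two-LUN stripe would look something like this in HP-UX LVM (a sketch; the volume group name and device files are hypothetical, and each LUN is assumed to be preferred to a different EVA controller):

```shell
# Two 100GB LUNs, each preferred to a different controller (hypothetical device files)
pvcreate /dev/rdsk/c8t0d1
pvcreate /dev/rdsk/c9t0d1

mkdir /dev/vgora01
mknod /dev/vgora01/group c 64 0x020000
vgcreate /dev/vgora01 /dev/dsk/c8t0d1 /dev/dsk/c9t0d1

# Stripe the two PVs into one 200GB (204800MB) logical volume, then put a filesystem on it
lvcreate -i 2 -I 64 -L 204800 -n lvol_ora /dev/vgora01
newfs -F vxfs /dev/vgora01/rlvol_ora
```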
Guy Humphreys
Valued Contributor

Re: LVM timeout and SANs

Thanks for the replies so far guys.

I had always intended to use RAID1 from the SAN (which is a lone EVA5000 - we can't afford to mirror ours either!).

What is tripping me up is Pat's mention of the timeout problem. This problem forces me to use LVM - not that I am averse to LVM per se, you understand - but I wanted to step away from the old and move to the new, the EVA.

Can I get some clarification on what the problem is - just so I am doubly sure I understand it.

As I see it, the problem is that if a disk fails in the SAN, it can lead to loop failures where the SAN tries to find the failed disk and runs self-diagnostics - this process takes longer than the default SCSI timeout value on the host, and so data is lost!

If I am wrong with this assumption please tell me.

Now, what I am confused about is: will the SAN not recognise that the failed disk is part of a RAID1 set and just use the other mirrored disk automatically? Or does it have to go through these diagnostics BEFORE it swaps over to the other disk?

cheers
Guy
'If it ain't broke, don't fix it!'
Kevin Wright
Honored Contributor

Re: LVM timeout and SANs

Create 1 LUN on the SAN, and use LVM PVLinks, AutoPath, or Secure Path, depending on the type of SAN, to provide HA.
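For the PVLinks option, a sketch of what that looks like in HP-UX LVM (the device files are hypothetical - the same LUN seen through two different paths/HBAs):

```shell
# The same LUN visible through two paths (hypothetical device files)
pvcreate /dev/rdsk/c10t0d1

mkdir /dev/vgsan02
mknod /dev/vgsan02/group c 64 0x030000

# Create the VG on the primary path, then add the second path
# as an alternate link (PVLink) for failover
vgcreate /dev/vgsan02 /dev/dsk/c10t0d1
vgextend /dev/vgsan02 /dev/dsk/c12t0d1

# vgdisplay -v should show the second path as an alternate link
vgdisplay -v /dev/vgsan02
```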
Florian Heigl (new acc)
Honored Contributor

Re: LVM timeout and SANs

Hi,

regarding Pat's post:

- LVM mirroring / not mirroring has no effect on PV timeouts, except that the chance of a path/channel failure is 50% lower.
- man pvchange shows a -t option that sets the timeout applied to a specific physical volume under LVM control. Use
pvchange -t 180 /dev/rdsk/cXtYdZ
to give a quite high timeout value that will cover e.g. a full core switch reboot, so your disk accesses can sustain 'connectivity' during SAN issues.
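To apply that timeout to every PV in a volume group in one go, something like this should work (a sketch; the volume group name is hypothetical):

```shell
# Set a 180-second IO timeout on every PV in the (hypothetical) volume group
# vgsan01, then verify via pvdisplay's "IO Timeout (Seconds)" field
for pv in $(vgdisplay -v /dev/vgsan01 | awk '/PV Name/ {print $3}')
do
    pvchange -t 180 $pv
    pvdisplay $pv | grep -i timeout
done
```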

As already stated, mirroring is better done in hardware, the only exception being *some* D/R scenarios (i.e. when You have no SRDF ;)

Pat also seems to be running an FC-AL (loop) config, which is unfortunately prone to trespassing storms, especially with HP-UX.

Adjusting MAX_FCP_REQUESTS to below the vendor recommendation takes care of that, and even *raises* performance for most people running FC-AL. :)
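On HP-UX 11.x a kernel tunable like that is usually inspected and set with kmtune; a sketch, assuming the parameter name from the post (verify the exact tunable name for your HBA driver, and get a safe value from your vendor, before changing anything):

```shell
# Query the current value (parameter name as mentioned above - confirm it
# matches your HBA driver's documentation)
kmtune -q max_fcp_reqs

# Lower it below the vendor default (512 here is purely illustrative),
# then rebuild the kernel; a reboot is needed for it to take effect
kmtune -s max_fcp_reqs=512
mk_kernel
```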
yesterday I stood at the edge. Today I'm one step ahead.
Guy Humphreys
Valued Contributor

Re: LVM timeout and SANs

Florian, thank you for a very complete and concise answer. My faith in EVAs is now restored.

Cheers
Guy
'If it ain't broke, don't fix it!'
Pat Obrien_1
Regular Advisor

Re: LVM timeout and SANs

The timeout issue in its goriest details - and this is a fabric, not a loop as others stated:

The rogue-drive syndrome: a bad disk will start a storm on a backend loop from one of the ports on the defective disk drive. This storm causes the loop to fail, leaving you a single loop to half the shelves on that backend loop. At this point I have seen 2 scenarios:
1) The defective drive will then begin a storm on the remaining loop, causing this to fail also. There are several drives in each loop called quorum disks, and when quorum is lost, both controllers of the EVA will begin a reboot sequence. During this reboot, you will not have host access to the storage. Depending on firmware version, this may be seconds or minutes.
2) The defective disk will fail and disappear, at which point the EVA diagnostics perform a scan of the backend loops, find the defective disk, and attempt to take ownership of it, during which the drive will fail again. During this internal activity, host IO becomes spotty at best. It took 13 minutes for the EVA to mark this drive bad and stabilize itself. Some IO will happen and other IO will not during this time frame. From experience, Oracle was vastly upset.
The 13-minute delay has been reduced to 50 seconds in 3.020, and I understand it is better still in more current versions.

This behavior in the dual-loop backend of the EVA5K is the nemesis of this product, which HP recognized when rebuilding it as the EVA8K. They have added "E-PBC-IO" chips to the shelf IO modules or the EMU to bypass a shelf with such a bad drive, and there have been drive firmware upgrades with just about every VCS update I have done, with the exception of VCS 1.21.

Now, a few weeks ago I did have 2 different EVAs (VCS 3.020) lose host access (one for 2 seconds and the other for 12 seconds) because of a single bad drive in each case. I have almost finished converting non-LVM volumes to LVM volumes, and the systems attached to these EVAs logged a few EMS messages, but Oracle did not even know this occurred. Including these latest 2 events, I think I have seen 4-5 in about the same number of years, and I now have over a dozen EVA5Ks and 2 EVA8Ks inbound.
Guy Humphreys
Valued Contributor

Re: LVM timeout and SANs

Thanks again to Pat for some very good info - forewarned is forearmed, as they say.

Cheers
Guy
'If it ain't broke, don't fix it!'