
Linux boot 40 minute stall whilst starting multipath

 
Malcolm Wade
Valued Contributor


I recently attached an EVA4400 to my test SAN and configured my Linux RH 4 Update 8 server to use a LUN presented by the EVA.

Initial boot saw the server find 4 new disk devices; that's one per path to the LUN on the EVA.

I then configured multipath and all seemed fine until the next reboot. Now the server stalls during boot, just after checking the root filesystem:

26-Nov-2009 12:50:38 Checking root filesystem
26-Nov-2009 12:50:38 [/sbin/fsck.ext3 (1) -- /] fsck.ext3 -a /dev/vg0/lv0
26-Nov-2009 12:50:39 /dev/vg0/lv0: clean, 9479/524288 files, 83820/1048576 blocks
26-Nov-2009 12:50:39 [ OK ]
26-Nov-2009 12:50:39 Remounting root filesystem in read-write mode: [ OK ]
26-Nov-2009 13:31:36 No RAID disks
26-Nov-2009 13:31:37 Setting up Logical Volume Management: [ OK ]

Note the 40 minute gap before the "No RAID disks" line. The boot sequence then continues, finds and mounts my EVA LUN, and all is good.

Anyone got any pointers?

This server is also zoned to use some SAN-based tapes via the same HBA. It's a test lab server.

The server is fully patched via up2date from my RH Satellite server.

Thanks,
Malcolm
Matti_Kurkela
Honored Contributor
Solution

Re: Linux boot 40 minute stall whilst starting multipath

This part of the boot procedure is controlled by /etc/rc.d/rc.sysinit.

After the root filesystem is successfully mounted in read-write mode, the system runs these commands (for simplicity, I've omitted all if/then conditionals and followed the most likely execution path in this case):

modprobe dm-mod
restorecon /dev/mapper/control
modprobe dm-multipath
/sbin/multipath-static -v 0
/sbin/dmsetup ls --target multipath --exec "/sbin/kpartx -a"

modprobe dm-mirror
/sbin/dmraid -i -a y

"No RAID disks" apparently comes from /sbin/dmraid. So the delay must be caused either by dmraid itself (before it prints that message) or by one of the commands before it.

The modprobe and restorecon commands should be pretty benign: they operate on the system disk and memory only. The remaining three commands all do some disk probing, so might cause a delay if something does not work as intended:

/sbin/multipath-static -v 0
/sbin/dmsetup ls --target multipath --exec "/sbin/kpartx -a"

/sbin/dmraid -i -a y

My suggestion:
None of these commands should be harmful to execute. Try running them manually, one at a time, to see which one takes a lot of time.
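One way to do this is to wrap each command in a small timer so the slow one stands out. The following is only a sketch: the helper names time_step and run_if_present are made up for illustration, the three commands are the ones quoted above, and it assumes you run it as root on the affected server. A binary that is missing or not executable is skipped rather than run.

```shell
#!/bin/bash
# Run one command and print how many whole seconds it took and its exit status.
time_step() {
    local start end rc
    start=$(date +%s)
    "$@"
    rc=$?
    end=$(date +%s)
    echo "$((end - start))s rc=$rc : $*"
}

# Only time the command if its binary exists and is executable.
run_if_present() {
    if [ -x "$1" ]; then
        time_step "$@"
    else
        echo "skipped (not executable): $1"
    fi
}

# The three disk-probing commands from rc.sysinit, in boot order.
run_if_present /sbin/multipath-static -v 0
run_if_present /sbin/dmsetup ls --target multipath --exec "/sbin/kpartx -a"
run_if_present /sbin/dmraid -i -a y
```

The last line printed for each command shows the elapsed time; a step that takes minutes rather than seconds is the likely culprit.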

If the problem is caused by dmraid, "chmod a-x /sbin/dmraid" would cause the script to skip the dmraid commands. But that would be just a work-around.
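The reason the chmod trick works can be illustrated with a stand-in file. This sketch assumes rc.sysinit guards the dmraid call with a shell "-x" test, which I have not quoted here; the function name guarded and the temporary file are purely illustrative and nothing in it touches /sbin/dmraid.

```shell
#!/bin/bash
# Mimic a guard of the form: if [ -x FILE ]; then run it; fi
guarded() {
    if [ -x "$1" ]; then
        echo "run"
    else
        echo "skip"
    fi
}

# Use a temporary stand-in instead of the real /sbin/dmraid.
demo=$(mktemp /tmp/dmraid-demo.XXXXXX)
chmod a+x "$demo"
guarded "$demo"      # execute bit set: the guarded command would run
chmod a-x "$demo"
guarded "$demo"      # no execute bits: the guarded command would be skipped
rm -f "$demo"
```

To undo the work-around on the real binary later, "chmod a+x /sbin/dmraid" puts the execute bits back.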

If possible, try temporarily unpresenting the SAN tapes and see if it makes any difference. The dm-multipath tools have been developed for disks only; it might be possible that they don't (yet) know the difference between a SAN disk and a SAN tape drive.
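To confirm whether the tape drives are actually visible to the server before and after unpresenting them, you can count the SCSI device types the kernel reports. A minimal sketch, assuming a 2.6 kernel that still provides /proc/scsi/scsi; the function name list_scsi_types is made up for illustration. Disks normally appear as Direct-Access and tape drives as Sequential-Access.

```shell
#!/bin/bash
# Count attached SCSI devices per device type, reading the kernel's
# device list (or another file in the same format given as $1).
list_scsi_types() {
    awk '$1 == "Type:" { count[$2]++ }
         END { for (t in count) print count[t], t }' "${1:-/proc/scsi/scsi}"
}

if [ -r /proc/scsi/scsi ]; then
    list_scsi_types
else
    echo "/proc/scsi/scsi is not available on this system"
fi
```

If Sequential-Access entries disappear after unpresenting the tapes and the stall goes away with them, that points at the tape devices.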

MK
Malcolm Wade
Valued Contributor

Re: Linux boot 40 minute stall whilst starting multipath

Matti,

Thanks for this great info!

I narrowed the issue down to dmraid. For the moment (until I have more time) I have simply removed execute permission as you suggested, and the boot cycle now proceeds as normal.

I'll look a bit further later in the week.

Thanks again,
Malcolm