01-30-2008 12:05 PM
Problem: Boot displays "Attempting boot From Hard Drive (C:)" and hangs.
HP ProLiant DL580 G5
Smart Array E200
Smart Array P600
The E200, which I think was the 'included' (integrated) controller, sits in Slot 3 (PCI-Express) and is intended to control the two internal SAS drives (RAID1)... where the Linux OS will be installed.
The P600 (64-Bit PCI, in Slot 4) was an add-on to control the external 10-disk SAS HDD array, various RAID1 and RAID1+0 combos... where databases and applications will be installed.
First symptom: noticed within SmartStart CD's Array Configuration Utility ("More Information") that the Logical Device Name for the E200 was: /dev/cciss/c1d0 and the P600 had: /dev/cciss/c0d0, /dev/cciss/c0d1, /dev/cciss/c0d2. This was different, because the other 5 Linux boxes I admin all had the lower number (c0d0) assigned to the internal controller. But, I said to self 'whateveeeer'.
In BIOS, changed controller boot order so that the E200 was before the P600.
Ran a quick "test" Linux install (GUI interactive) just as far as Disk Druid to see how it viewed the logical disks. Sure enough, it saw the E200 as c1d0.
I adjusted my Kickstart script partitioning specs accordingly, such that the OS file systems (/boot, swap, /, etc.) would all use the logical disk on controller c1d0.
Now, skipping forward: reboot, install via kickstart, reboot... encountered problem noted at start.
Post-mortem diagnosis: examination of the system via boot DVD and "linux rescue" mode reveals the GRUB config referencing "(hd3,0)", with "cciss/c0d0" in the comments.
Long story short... If RHEL installed a bootloader (GRUB) at all, I believe it was on the (P600) external drive array controller, the one set as 2nd in BIOS boot order but which had the c0d0 logical device name... and maybe even on the 3rd logical drive of that?!?!
I did lots of Google searching, experimentation, and hacking to try to get this system to boot with no success. I finally called my on-site guy (another country) and asked him to remove the P600 controller.
Rebooting into SmartStart now showed the E200 array with the more normal device ID: /dev/cciss/c0d0. So, after revising my Kickstart cfg back to use c0d0, I proceeded to re-install RHEL Linux from DVD and the install went perfectly. The system now reboots and loads the OS off the hard disk successfully.
NOW: What is going to happen when I put the P600 back? Is the system going to re-scan the PCI slots and renumber the controllers (reversing the device IDs) and muck up everything? Will I need to hack OS config files (e.g. /boot/grub/grub.conf, /etc/fstab, etc.) to point to c1d0?
I need a plan of attack! :-)
Thanks in advance,
Jared Middleton
01-30-2008 03:27 PM
Solution
Looks like you're not using LVM.
My experience has been that LVM is not fazed by any changes of device names: if the controller drivers are available, LVM will auto-identify the correct disks/partitions, wherever they might be located. That leaves only the bootloader, and the /boot partition.
There are ways to make a traditionally-partitioned system behave as robustly: do a "man fstab" and search for words "LABEL" and "UUID". You can use these to identify the partitions in a device-agnostic way.
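As a rough illustration of the LABEL approach, here is a minimal Python sketch that rewrites the device column of fstab-style text to LABEL= entries. The sample fstab contents and the label names are hypothetical, not taken from the system in this thread; on ext2/ext3 the labels themselves would be set beforehand with a tool such as e2label.

```python
def use_labels(fstab_text, labels):
    """Replace device paths in the first fstab field with LABEL=<name>."""
    out = []
    for line in fstab_text.splitlines():
        fields = line.split()
        # Leave comments, blank lines, and unmapped devices untouched.
        if fields and not line.lstrip().startswith("#") and fields[0] in labels:
            fields[0] = "LABEL=" + labels[fields[0]]
            line = "\t".join(fields)
        out.append(line)
    return "\n".join(out)


# Hypothetical fstab entries for illustration only.
sample = """/dev/cciss/c0d0p2 / ext3 defaults 1 1
/dev/cciss/c0d0p1 /boot ext3 defaults 1 2
/dev/cciss/c0d0p3 swap swap defaults 0 0"""

# Hypothetical device-to-label mapping (assumed labels, not from the thread).
labels = {"/dev/cciss/c0d0p2": "/", "/dev/cciss/c0d0p1": "/boot"}

print(use_labels(sample, labels))
```

Once the entries refer to labels, a controller renumbering from c0 to c1 no longer changes what fstab points at; the swap line is left alone here only because no label was supplied for it.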
To understand the behavior of the bootloader, you'll need to be aware of some traditional features of the BIOS. The bootloader is not aware of controllers: it just sees a list of disks. The traditional behavior is to assume that the boot disk is the first one on that list. Any deviation from this makes things more complicated to handle.
In fact, the standard way for choosing the disk to boot from at the BIOS level is to manipulate this list, so that the desired boot disk goes at the top of the list. So, changing the boot controller options at the BIOS level is likely to change the way GRUB sees things... and to thoroughly confuse an unprepared sysadmin.
On the other hand, Linux won't necessarily have any information about the disk order as seen by the BIOS and the bootloader. Each driver can choose how to number the devices it handles, but usually the detection of storage devices happens in the PCI bus order. Check the output of the "lspci" command.
(This means the ordering between the P600 and the E200 might be changed by sticking the P600 into a different slot.)
So the GRUB installation program must essentially make some educated guesses. You saw "(hd3,0)" with the comment "cciss/c0d0". This is the installer's documentation about the guesses it made.
The guesses made by the installer:
1.) The /boot partition is located on the first partition of the fourth disk in the BIOS list (GRUB uses zero-based counting).
2.) This disk is known by Linux as /dev/cciss/c0d0.
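To make the zero-based counting concrete, here is a small Python sketch that splits a GRUB (legacy) disk identifier into its disk and partition numbers. The function name is made up for this example; it is not part of GRUB.

```python
import re


def parse_grub_device(spec):
    """Split a GRUB legacy device like '(hd3,0)' into (disk, partition)."""
    m = re.fullmatch(r"\(hd(\d+)(?:,(\d+))?\)", spec)
    if m is None:
        raise ValueError("not a GRUB hard-disk spec: %r" % spec)
    disk = int(m.group(1))
    # The partition part is optional: '(hd0)' names a whole disk.
    part = int(m.group(2)) if m.group(2) is not None else None
    return disk, part


disk, part = parse_grub_device("(hd3,0)")
# GRUB counts from zero, so add one for the human-readable position.
print("BIOS disk #%d, partition #%d (counting from 1)" % (disk + 1, part + 1))
```

For "(hd3,0)" this reports the fourth BIOS disk and its first partition, which is exactly the guess described above.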
In a system with multiple disk controllers, the installer can easily get these guesses wrong. In such a system, you may have to help GRUB out: use the "grub --device-map=/boot/grub/device.map" command to enter the GRUB shell. The first time you do this, the GRUB shell will create the /boot/grub/device.map file if it does not already exist. That file allows you to verify and/or correct the guesswork made by the installer. Each line in that file has a GRUB disk identifier and the corresponding Linux device name. If the initial guesses are wrong, edit the device.map file: if the file exists, the GRUB installer will use the information in it instead of guessing.
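A sketch of what checking and correcting device.map could look like, written in Python purely for illustration. The map contents below are hypothetical (they assume the P600 is present and Linux names the E200 disk c1d0), and both helper functions are invented for this example. The assumption being modelled is that the BIOS lists the boot controller's disk first, so it should be (hd0).

```python
def parse_device_map(text):
    """Return {grub_disk: linux_device} from device.map-style text."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        grub_disk, linux_dev = line.split()
        # This sketch only handles hard disks; floppy entries are skipped.
        if grub_disk.startswith("(hd"):
            mapping[grub_disk] = linux_dev
    return mapping


def promote_boot_disk(mapping, boot_dev):
    """Rebuild the map so boot_dev is (hd0); other disks keep their order."""
    ordered = sorted(mapping.items(), key=lambda kv: int(kv[0][3:-1]))
    devices = [dev for _, dev in ordered]
    devices.remove(boot_dev)  # raises ValueError if boot_dev is not mapped
    devices.insert(0, boot_dev)
    return "\n".join("(hd%d)     %s" % (i, d) for i, d in enumerate(devices))


# Hypothetical installer guess: the E200 disk ended up last in the list.
guessed = """# hypothetical device map
(hd0)     /dev/cciss/c0d0
(hd1)     /dev/cciss/c0d1
(hd2)     /dev/cciss/c0d2
(hd3)     /dev/cciss/c1d0
"""

print(promote_boot_disk(parse_device_map(guessed), "/dev/cciss/c1d0"))
```

Whether the corrected order really matches what the BIOS presents has to be verified on the machine itself; the script only shows the shape of the edit.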
As you see, the handling of multiple disk controllers in a PC architecture can be a bit of a dark art - and at the moment, I'm getting too tired for a coherent explanation. Please ask for more details if necessary: I'll try to look at this thread again tomorrow.
MK
01-30-2008 04:15 PM
Re: Dualing SmartArray controllers and bootloader (RHEL 4)
Matti, you confirmed some things I'd read or assumed. It might be a day or two before I add the P600 controller back in and report my status/results.
I didn't see any BIOS option for changing order of logical disk/devices (e.g. at the /dev/cciss/cXdX level), only the order of the controllers themselves. It's set to something like:
1) E200 controller <-- internal disks (OS)
2) Integrated IDE controller <-- for DVD-ROM?
3) P600 controller <-- external disk array
4) SCSI controller <-- for tape drive
Note: The other BIOS option is for ordering: CDROM, hard drive, USB, NIC, etc.
My wish was/is for the E200 to boot first, show as cciss/c0XXXX in Linux, and thus match my other systems to avoid potential mistakes down the line. With the P600 present, the P600 got the c0 designation (reverse of what I wanted/expected)... probably based on the slot it's in and/or the PCI bus scan order. I wrongly assumed Linux would reflect the BIOS order (shown above).
At the moment, the system is working fine on just the E200 (as cciss/c0XXXX), but once the P600 is added back (assume: same slot), I expect it might steal the c0 assignment and force me into some device-map tweaking so that GRUB knows where the bootloader is.
Fun Fun. :-)
-Jared
01-30-2008 07:40 PM
Re: Dualing SmartArray controllers and bootloader (RHEL 4)
You're running into a PCI enumeration issue that showed up in the 2.6 kernel. The 2.4 kernel did a breadth-first sort of the PCI bus; the 2.6 kernel does a depth-first sort. An option was added in RHEL4U5 to address the issue: on your kernel boot line add pci=bfsort and the E200 should show up as c0d0. Most people see the bus enumeration issue on the network cards: what they think should be eth0 shows up as eth1. Even with the pci=bfsort option, if at a later date you add a 3rd controller, what was c1d0 might become c2d0, depending on where the new controller shows up on the bus. If you use labels instead of device names, this one shouldn't bite you.
Red Hat has "whitelisted" some of the systems that this affects, with a patch in pci.c, but the DL580 G5 isn't in the list yet. The patch basically forces the listed systems to use the pci=bfsort option.
To fix your current situation, after you add the P600 back into the mix, try the pci=bfsort option if you're running RHEL4U5 or later. Worst case, you have to boot into rescue mode and edit the files you mentioned.
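For anyone scripting this, here is a minimal Python sketch that appends pci=bfsort to every kernel line of grub.conf-style text. The sample stanza, including the kernel version string, is illustrative only, and the helper name is made up for this example.

```python
def add_kernel_option(grub_conf, option="pci=bfsort"):
    """Append option to each 'kernel' line that does not already have it."""
    out = []
    for line in grub_conf.splitlines():
        words = line.split()
        # Kernel lines look like: "kernel /vmlinuz-... ro root=LABEL=/"
        if words and words[0] == "kernel" and option not in words:
            line = line.rstrip() + " " + option
        out.append(line)
    return "\n".join(out)


# Illustrative grub.conf stanza (assumed contents, not from this system).
sample = """title Red Hat Enterprise Linux AS
\troot (hd0,0)
\tkernel /vmlinuz-2.6.9-55.ELsmp ro root=LABEL=/
\tinitrd /initrd-2.6.9-55.ELsmp.img"""

print(add_kernel_option(sample))
```

Running it twice leaves the text unchanged the second time, so it is safe to re-apply; the same option can also be typed by hand at the GRUB menu for a one-off test boot before committing it to the file.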
No support by private messages. Please ask the forum!