Operating System - Linux

SOLVED
Alzhy
Honored Contributor

Create alternate LVM based Boot/OS Environment

My OS is typically LVM based on HW RAID disks as follows:

/dev/sda:

sda1 -- /boot
sda2 -- LVM PV for vg00
vg00: partitioned minimally for /, swap, tmp and /var

Now I partition /dev/sdb in the same fashion as /dev/sda BUT my LVM VG is named vgbroot. I copy files and data over to this environment via cpio or tar or even cp or dd. I modify fstab of course afterwards.

What will it take to make this 2nd disk bootable aside from editing /boot/grub/menu.lst? This is where I am missing steps.

The idea is that periodically we back up (copy vg00 and sda1 over to vgbroot and sdb1), so that if the primary boot disk becomes corrupt through accident or bad patching, we can simply boot off the other disk.

TIA.



Hakuna Matata.
Matti_Kurkela
Honored Contributor

Re: Create alternate LVM based Boot/OS Environment

You must install the boot loader (typically GRUB) to the 2nd disk too, so that it will be available even if the 1st disk has failed.

In your current configuration, /dev/sda is known to GRUB as (hd0) and the 2nd disk /dev/sdb as (hd1). But when the 2nd disk is alone, it will be seen by GRUB as (hd0). When you install GRUB to the 2nd disk, you must prepare for this.

Background:
GRUB's disk names map directly to disk ID numbers used by the PC BIOS. (hd0) = BIOS disk 0x80, (hd1) = BIOS disk 0x81 etc.
Also, if the BIOS setup menus are used to select a non-default boot disk, the BIOS usually does this by re-arranging the disk ID numbers so that the disk selected for booting is always listed as the first hard disk, 0x80. Thus it will also be known to GRUB as (hd0).

If your GRUB is a reasonably modern version (you did not identify your Linux distribution!), there should be a file named /boot/grub/device.map.

In your situation, it should look like this:

# cat /boot/grub/device.map
(hd0) /dev/sda
(hd1) /dev/sdb

Now, create another copy of the device.map file, e.g. /boot/grub/device.map.otherdisk.
Edit it to flip the disk device names around:

# cat /boot/grub/device.map.otherdisk
(hd0) /dev/sdb
(hd1) /dev/sda

To install GRUB on the 2nd disk, first make sure /dev/sdb1 is *unmounted* and start the GRUB shell using the other device map file:

# grub --device-map=/boot/grub/device.map.otherdisk

Select the GRUB root device, i.e. the /boot partition on the 2nd disk:

grub> root (hd0,0)

Then install GRUB the same way it's installed on the 1st disk. GRUB might be located in the MBR, or in the beginning of the /boot partition. It would be much simpler to install GRUB into the MBR in this situation.

To install GRUB into the MBR:
grub> setup (hd0)

An alternative would be to install GRUB into the beginning of the /dev/sdb1 partition, but that would add three more requirements:

1.) You would have to ensure that the MBR of /dev/sdb has functional boot code. The "ms-sys" tool might be the easiest way to do it:
http://ms-sys.sourceforge.net/

2.) You would have to make sure /dev/sdb1 is flagged as bootable in the partition table.

3.) If you later modify the /boot/grub/stage* files on the 2nd disk in any way, you should remember to re-install GRUB on the 2nd disk afterwards. This is because in this installation method, GRUB may refer to those files using disk block numbers, not file names.

If you update the files, the new versions may be stored in different disk blocks than the old ones, and the old block number references stored in the boot record would not work any more.
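If you did go the partition-boot-record route, the sequence would look roughly like this (just a sketch; check the ms-sys documentation for its exact options, and remember that with the swapped device map (hd0) means /dev/sdb):

# parted /dev/sdb set 1 boot on
# ms-sys --mbr /dev/sdb
# grub --device-map=/boot/grub/device.map.otherdisk
grub> root (hd0,0)
grub> setup (hd0,0)

The only difference to the MBR method is the last command: setup (hd0,0) embeds GRUB into the boot record of the first partition instead of the MBR.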

MK
Ralph Grothe
Honored Contributor

Re: Create alternate LVM based Boot/OS Environment

As an alternative to Matti's suggested usage of the "entries swapped" device.map file, you can issue the "device" command within the grub "shell" directly.

e.g.

grub> device (hd0) /dev/sdb
device (hd0) /dev/sdb
grub> root (hd0,0)
root (hd0,0)
Filesystem type is ext2fs, partition type 0xfd
grub> setup (hd0)
setup (hd0)
Checking if "/boot/grub/stage1" exists... no
Checking if "/grub/stage1" exists... yes
Checking if "/grub/stage2" exists... yes
Checking if "/grub/e2fs_stage1_5" exists... yes
Running "embed /grub/e2fs_stage1_5 (hd0)"... 15 sectors are embedded.
succeeded
Running "install /grub/stage1 (hd0) (hd0)1+15 p (hd0,0)/grub/stage2 /grub/grub.conf"... succeeded
Done.



There is one caveat.
If you have already booted with one disk defunct, beware: if the dead disk was sda, then your mirror disk, formerly sdb, will now have become sda, so take care that you do not accidentally wipe out your standby disk.

If you have a look at the scsi section of man proc, you will find a remark that so far only the add-single-device scsi command has been implemented.
So if your controller/disks support hotplugging, you could try to hotplug a spare disk and spin it up with something like

# echo "scsi add-single-device 1 0 5 0" > /proc/scsi/scsi

Of course, as this is merely cited from the man page, you would have to find out the correct indexing for your controller/disk by issuing

# cat /proc/scsi/scsi

beforehand.
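The four numbers given to add-single-device are host, channel, id and lun, in that order. The entries in /proc/scsi/scsi have roughly this form (generic placeholders, your vendor/model strings will differ):

Host: scsi0 Channel: 00 Id: 00 Lun: 00
  Vendor: SOMEVEND Model: SOMEDISK        Rev: 1.00
  Type:   Direct-Access                   ANSI SCSI revision: 05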
Madness, thy name is system administration
Alzhy
Honored Contributor

Re: Create alternate LVM based Boot/OS Environment

Matti and Ralph,

Thanks both.

My scenario for having an alternate boot/OS environment is quick fallback -- not because /dev/sda will disappear, but because of bad patching, "accidents", or the need to fall back to the previous rev/unpatched OS. So the desired environment is that each boot disk is independent of the other, with vg00 (on the first disk) as the "primary" and vgbroot (on the 2nd disk) as the secondary.

The idea is that we regularly copy the LVM and /boot partitions over to the vgbroot/sdb disk and have an automated way of updating fstab and other updatables on the dormant but mountable vgbroot backup OS disk.
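Roughly, the periodic sync I have in mind would be something like this (only a sketch -- the /mnt/altroot mount point and the sed edit are illustrative, and the copy step would be repeated for /var, /opt and /tmp):

# mount /dev/vgbroot/root /mnt/altroot
# mount /dev/sdb1 /mnt/altroot/boot
# (cd / && find . -xdev | cpio -pdum /mnt/altroot)
# (cd /boot && find . -xdev | cpio -pdum /mnt/altroot/boot)
# sed -i 's|/dev/vg00/|/dev/vgbroot/|g' /mnt/altroot/etc/fstab
# umount /mnt/altroot/boot /mnt/altroot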

Our OS is Redhat EL 5.4.

Our servers are fairly large X86 servers - aka RISC Killers ;^) - in our UNIX-away efforts. These servers offer the ability to select the boot disk.

Hakuna Matata.
Matti_Kurkela
Honored Contributor

Re: Create alternate LVM based Boot/OS Environment

The boot procedure of x86 PC systems is really showing its age here: it was originally designed in the 1980s for a cheap single-user workstation, and did not really offer any good paths to enhance/upgrade the boot procedure seamlessly. So we're stuck with the legacy.

If you accept that the bootloader on /dev/sda1 will be a single point of failure, your plan is sound otherwise: you really don't need to do anything other than edit /boot/grub/menu.lst.

The weak points are bootloader patches (which should be rare, as GRUB is pretty stable) and the possibility that the system loses power or has a hardware fault just as the system is writing an updated kernel to /boot, corrupting the bootloader files on sda1 - which is a somewhat unlikely chain of bad luck.

And even if you lose your bootloader, reinstalling it is pretty simple: most Linux installation discs offer a rescue mode, which will allow you to access the installed system (in this case, either of your two installations) while the system is booted from an external media. Using this to re-establish the bootloader would not take much time at all. Then reboot once again, and you should be back in business.
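From a RHEL installation disc that would go roughly like this (a sketch; the rescue mode finds the installed system and mounts it under /mnt/sysimage):

boot: linux rescue
# chroot /mnt/sysimage
# grub-install /dev/sda     (or whichever disk needs its boot loader restored)
# exit
# reboot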

MK
Alzhy
Honored Contributor

Re: Create alternate LVM based Boot/OS Environment

Matti,
I think I am missing something. When I boot off sdb (my 2nd disk), it sees and activates the alternate OS VG named vgbroot, but the OS just panics in setuproot, seemingly unable to mount the OS filesystems.

Here's what I did:

* I created the exact same partitions on sdb.
* Created a vgbroot LVM VG with the exact same logical volumes for /, /opt, /tmp and /var.
* Did a cpio copy of each filesystem (including /boot).
* Installed grub on sdb (using grub-install or your procedure).
* Edited menu.lst on sdb.
* Edited fstab on sdb.

The error messages are:

mount: could not find filesystem '/dev/root'
setuproot: moving /dev failed: No such file or directory
setuproot: error mounting /proc: No such file or directory
setuproot: error mounting /sys: No such file or directory
switchroot: mount failed: No such file or directory
Kernel panic - not syncing: ....

And here's my partitioning:

Filesystem 1K-blocks Used Available Use% Mounted on
/dev/mapper/vg00-root
20314748 3841572 15424600 20% /
/dev/mapper/vg00-var 40342864 1699604 36560856 5% /var
/dev/mapper/vg00-opt 10157368 993208 8639872 11% /opt
/dev/mapper/vg00-tmp 2031440 75988 1850596 4% /tmp
/dev/sda1 256666 28738 214676 12% /boot
tmpfs 66050764 0 66050764 0% /dev/shm
/dev/mapper/ocfs01p1 104856192 38974656 65881536 38% /ocfs
/dev/mapper/vgbroot-root
20642428 3811632 15782220 20% /mnt/tmp
/dev/mapper/vgbroot-opt
10321208 993060 8803860 11% /mnt/tmp/opt
/dev/mapper/vgbroot-var
40994128 1712936 37198788 5% /mnt/tmp/var
/dev/mapper/vgbroot-tmp
4128448 139392 3779344 4% /mnt/tmp/tmp
/dev/sdb1 256666 28731 214683 12% /mnt/tmp/boot

And here's my menu.lst on sdb:

default=0
timeout=30
splashimage=(hd1,0)/grub/splash.xpm.gz
hiddenmenu
title Red Hat Enterprise Linux Server (2.6.18-164.6.1.el5) VGBROOT
root (hd1,0)
kernel /vmlinuz-2.6.18-164.6.1.el5 ro root=/dev/vgbroot/root rhgb quiet
initrd /initrd-2.6.18-164.6.1.el5.img

My device.map on sdb:

root@sapserver # cat device.map
(fd0) /dev/fd0
(hd0) /dev/sda
(hd1) /dev/sdb

My fstab on sdb:

/dev/vgbroot/root / ext3 defaults 1 1
/dev/vgbroot/var /var ext3 defaults 1 2
/dev/vgbroot/opt /opt ext3 defaults 1 2
/dev/vgbroot/tmp /tmp ext3 defaults 1 2
LABEL=/boot1 /boot ext3 defaults 1 2
tmpfs /dev/shm tmpfs defaults 0 0
devpts /dev/pts devpts gid=5,mode=620 0 0
sysfs /sys sysfs defaults 0 0
proc /proc proc defaults 0 0
/dev/vgbroot/swap swap swap defaults 0 0

Anything I am missing?

Hakuna Matata.
Steven E. Protter
Exalted Contributor

Re: Create alternate LVM based Boot/OS Environment

Shalom,

Thinking out of the box here: I'm wondering if the current exploration path is too complex.

Can we not use dd to back up the entire boot disk, including the grub configuration, to a second disk? In the event of failure, we would use fdisk to mark the second disk as bootable and run off that.
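Roughly what I have in mind (a sketch only, to be run while the filesystems are quiesced, and assuming sdb is at least as large as sda):

# dd if=/dev/sda of=/dev/sdb bs=1M
# fdisk /dev/sdb     (use the 'a' command to toggle the bootable flag on partition 1)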

Alternatively, on a server with hardware RAID, could we not bypass this procedure altogether? It is a lot easier to handle that way. Hardware RAID presents 1 disk to the system, but that is really two disks.

Lastly, what about software RAID?

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
Alzhy
Honored Contributor

Re: Create alternate LVM based Boot/OS Environment

SEP,

These servers are megaservers with built-in RAID on them. sda and sdb are RAIDed (mirrored) disks. The "IDEA" is to have a very quick FALL BACK of the OS in the event of bad patching, an errant app, someone whacking system files, misconfigs, etc.... so the environment can get back to what it was before by simply booting the alternate path.

I have not explored dd yet, but to copy all 146 GB every so often is just preposterous. Not to mention the gyrations to alter the UUIDs and partition labels after the dd.

With an alternate LVM based OS disk -- I can update the backup boot environment from time to time.
Hakuna Matata.
Alzhy
Honored Contributor

Re: Create alternate LVM based Boot/OS Environment

Matti, could it be that my issue is with SELinux when booting off of my backup vgbroot?

I forgot to mention my System is SELinux enabled.

Hakuna Matata.
Matti_Kurkela
Honored Contributor
Solution

Re: Create alternate LVM based Boot/OS Environment

The error seems to happen just before mounting the root filesystem, while the kernel is still using initrd.

The error messages don't indicate a permissions problem, so it's unlikely to be a SELinux issue. It's more like the system cannot find the correct root file system.

I took apart a RHEL5 initrd file and examined what it does.

(For reference:
# mkdir /tmp/initrd
# cd /tmp/initrd
# gunzip < /boot/initrd-2.6.18-164.6.1.el5.img | cpio -vid
Then examine the contents.)

The root filesystem is mounted at the end of the "init" script. That part looks like this:
-----
echo Scanning logical volumes
lvm vgscan --ignorelockingfailure
echo Activating logical volumes
lvm vgchange -ay --ignorelockingfailure vg00
resume /dev/vg00/lvol0
echo Creating root device.
mkrootdev -t ext3 -o defaults,ro /dev/vg00/lvol1
echo Mounting root filesystem.
mount /sysroot
echo Setting up other filesystems.
setuproot
echo Switching to new root and running init.
switchroot
-----

Note that in multiple locations the name of the root VG is embedded into the script! No wonder it did not work: the script probably attempted to activate vg00 as on your original system disk. This script is created by the "mkinitrd" command.

Mounting the root filesystem probably takes its parameters from the kernel command line (this "mount" is not the standard version but a special built-in command of nash, the RedHat boot NotASHell), so it failed because the initrd init script had activated the wrong VG.

Looks like you would have to somehow create a new version of initrd for your backup boot setup, or extract + modify + re-package the one you currently have.
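One way would be to chroot into the backup environment and run mkinitrd there, so that it picks up the vgbroot names from that environment's /etc/fstab. Roughly (a sketch only -- /mnt/altroot is just an example mount point, adjust to wherever you mount the vgbroot filesystems):

# mount /dev/vgbroot/root /mnt/altroot
# mount /dev/sdb1 /mnt/altroot/boot
# mount --bind /dev /mnt/altroot/dev
# mount --bind /proc /mnt/altroot/proc
# mount --bind /sys /mnt/altroot/sys
# chroot /mnt/altroot
# mkinitrd -f /boot/initrd-2.6.18-164.6.1.el5.img 2.6.18-164.6.1.el5
# exit

If you instead edit the extracted init script by hand (change vg00 to vgbroot and fix the mkrootdev line), you can re-package the initrd with something like

# find . | cpio -o -c | gzip -9 > /tmp/initrd-2.6.18-164.6.1.el5.vgbroot.img

and then copy it to the /boot on the 2nd disk and point the menu.lst there at it.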

MK