
Re: Stale extents

 
Eric Bradley
Occasional Contributor

Stale extents

The primary vg00 PV is taking errors. Because of read errors on some extents while syncing to the mirror PV, the mirror shows those extents in lvol7 (/usr) as stale rather than current. Result: the primary vg00 PV is going bad at the hardware level, and the good mirror PV has stale extents. I cannot lvreduce; it fails. I haven't tried getting the disk key and reducing that way, but if I do, I'm afraid /usr will be corrupt with some stale extents on the mirror disk. Any suggestions for fixing this and replacing the primary PV without an outage? They are hot-swap internal drives.
Geoff Wild
Honored Contributor

Re: Stale extents

How about adding a third disk and mirroring to it?

Rgds...Geoff
Proverbs 3:5,6 Trust in the Lord with all your heart and lean not on your own understanding; in all your ways acknowledge him, and he will make all your paths straight.
Lolupee
Regular Advisor

Re: Stale extents

Eric,

This error could be tricky. Please run

dd if=/dev/dsk/cXtXdX of=/dev/null bs=128k

on each of the drives and see whether either one returns an error, to confirm which disk is actually defective. We can then work on the defective drive.

You can reduce the bs size if there is no error.
I hope you tried to re-sync the drives before all this:

lvsync /dev/vg00/lvol7
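
A minimal sketch of that read test as a small shell function, in case it helps. The name check_disk is made up for illustration, and the device paths in the usage comment are placeholders, as above. It only reads from the disk and discards the data.

```shell
# Read a device from start to end and report whether dd hit an error.
# check_disk is a hypothetical helper name, not an HP-UX command.
check_disk() {
    if dd if="$1" of=/dev/null bs=128k 2>/dev/null; then
        echo "$1: read OK"
    else
        echo "$1: READ ERRORS"
        return 1
    fi
}

# Usage on the server (substitute the real device files for both mirror halves):
#   check_disk /dev/dsk/cXtXdX
```

A failing read points at the disk to replace. A clean pass on both disks would suggest the problem lies somewhere else.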
Eric Bradley
Occasional Contributor

Re: Stale extents

Both lvsync and lvreduce fail. The dd on the Primary disk (c2t6d0) has errors and is reporting incidents to ISEE. The dd on the mirror disk (c0t6d0) passes. The lvsync fails because c2t6d0 has 2 bad extents which are taking read errors resulting in the matching extents showing stale on c0t6d0. I can't mirror to a third disk because I have unreadable and stale extents on both of the current disks so I have nothing to mirror from. I have considered using lvreduce -m -k disk_key# but I'm not sure how the system would react to having /usr running on a disk with 2 stale extents. Any ideas?
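
For anyone following along, the stale extents can be picked out of lvdisplay -v output. This is a rough sketch, assuming the usual column layout of the logical extents table (LE, PV1, PE1, Status 1, PV2, PE2, Status 2). stale_extents is a hypothetical helper name, not an LVM command.

```shell
# Read `lvdisplay -v` output on stdin and print, for every stale copy,
# the logical extent number and the PV that holds the stale copy.
# Assumes rows of the form: LE PV1 PE1 Status1 PV2 PE2 Status2.
stale_extents() {
    awk '$1 ~ /^[0-9]+$/ {
        if ($4 == "stale") print $1, $2
        if ($7 == "stale") print $1, $5
    }'
}

# Usage on the server:
#   lvdisplay -v /dev/vg00/lvol7 | stale_extents
```

Each output line names one logical extent and the disk whose copy of it is stale.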
Lolupee
Regular Advisor

Re: Stale extents

That confirms the primary disk is the bad one, and it can be physically removed by an HP CE.

The next question: do you have a good lvmconf file?

Check /etc/lvmconf/vg00.conf and confirm that the file is a good one. Are we discussing a RISC server or Itanium?
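
A small sketch of a first sanity check on the backup file. check_lvmconf is a hypothetical helper name, and the default path assumes the standard vgcfgbackup location for vg00. It only tests that the file exists and is not empty, so it does not prove the contents are valid.

```shell
# Report whether an LVM configuration backup exists and is non-empty.
# check_lvmconf is a hypothetical helper, not an HP-UX command.
check_lvmconf() {
    conf=${1:-/etc/lvmconf/vg00.conf}
    if [ -s "$conf" ]; then
        echo "$conf: present"
    else
        echo "$conf: missing or empty"
        return 1
    fi
}

# On the HP-UX host, once the file check passes, list what the backup holds:
#   vgcfgrestore -l -n /dev/vg00
```

The listing from vgcfgrestore -l should show both vg00 disks. If the bad disk is missing from it, the backup predates the mirror or was taken from a different configuration.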
Eric Bradley
Occasional Contributor

Re: Stale extents

Itanium, rx8602. My concern with replacing the bad disk is that I won't be able to sync back to the new one from the current mirror since there are 2 stale extents on the mirror. Not sure how to get past that - maybe replace the bad disk and then recover /usr to the mirror disk from a backup to get rid of the stale extents, then vgcfgrestore, vgchange -a y, and vgsync. Any ideas on that one?
rariasn
Honored Contributor

Re: Stale extents

Hi Eric,
Try this:

pvchange -A n -an /dev/dsk/cxtydz /dev/vg00
Change the disk, then:

pvchange -a y /dev/dsk/cxtydz /dev/vg00

vgchange -a y vg00

Devender Khatana
Honored Contributor

Re: Stale extents

Hi,

I would suggest physically removing the device after shutdown and then trying to boot from the disk that has only 2 stale PEs. This should not cause any problem if the LVOLs are quite free; if it gives errors, you will have to recover /usr from a good backup.

I do not think you can avoid an outage here: even if you replace the disk and resync, you cannot confirm functionality without booting from each disk individually.

Here is the official disk replacement guide:
http://docs.hp.com/en/5991-1236/When_Good_Disks_Go_Bad.pdf

HTH,
Devender


Impossible itself mentions "I m possible"
RAC_1
Honored Contributor

Re: Stale extents

I would boot from the mirror disk and take the primary out. If the boot from the mirror disk goes well, replace the primary and re-mirror.
There is no substitute to HARDWORK
Steve Faidley
Valued Contributor

Re: Stale extents

What came of this?
It would be good to hear the steps taken.
My thought is to set up a PVG that includes the mirror disk and do an lvsplit to get /usr onto the primary disk only, then try to mirror back to the "mirror" disk. If that works, follow the normal procedure for replacing a vg00 disk. If it still fails because of the 2 bad extents, you could try to cheat, provided you have at least 2 extents available:
1. create a dummy lvol up to those bad extents
2. create a 2nd dummy lvol on those 2 extents.
3. lvremove the 1st dummy lvol
4. mirror /usr to the "mirror" disk.

If the 2 disks were not configured correctly in the first place (the mirror disk was not properly set up to boot from), then all bets are off.
If it ain't broke, let me have a look at it.
Eric Bradley
Occasional Contributor

Re: Stale extents

Couldn't get an outage window until tomorrow, Saturday night. I will update with results when complete. We have a good Ignite image prior to the stale extents so worst case we will be able to blow that down. Fortunately, the stale extents are in /usr, which is very static since the server has been in freeze mode for a month.
Lolupee
Regular Advisor

Re: Stale extents

Eric,

Here is my advice. It may work, but still go ahead with your outage request. This is just the process for replacing the defective disk, other things being equal.

Follow the steps below and you may not need an outage. You will know immediately if you do.

Please do this during the outage window. To be on the safe side, shut down all databases and go to init 2. Unmount all file systems not relevant to the OS, but remember that umount -a would also unmount /stand, so mount it back afterwards.

If you have your vgcfgbackup file, you can proceed. By default the backup file is /etc/lvmconf/vg00.conf.

If you physically hot-swap the bad drive out of the system and can still work and access /usr, there is a chance that you do not need an outage.

Follow these steps:
1. # vgcfgrestore -n /dev/vg00 /dev/rdsk/
Start the mirror synchronize process.
2. # vgsync /dev/vg00 (This process takes time to resync.)
Place the boot utilities in the boot area.
3. # mkboot /dev/rdsk/
Add an Auto file in the boot LIF area.
4. # mkboot -a "hpux (;0)/stand/vmunix" /dev/rdsk/
Note: The -lq switch could be added to override quorum on HIGH AVAILABILITY systems.
EXAMPLE: # mkboot -a "hpux -lq (;0)/stand/vmunix" /dev/rdsk/

Update the LIF's LABEL file with the information contained in the BDRA (Boot Data Reserved Area).
5. # lvlnboot -Rv
Getting diagnostics on the root mirror disk
List the offline diagnostics from the good root boot disk.
6. # lifls -il /dev/rdsk/cXtXd0
Copy the diagnostics to the mirror root disk.
7. # cd /usr/sbin/diag/lif
# mkboot -b updatediaglif -p ISL -p HPUX -p LABEL -p AUTO /dev/rdsk/
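
To review the main sequence before touching anything, the commands can be printed rather than run. This is a dry-run sketch only: print_parisc_steps is a hypothetical helper, the disk name passed in is a placeholder you supply, and the order simply follows the steps listed above.

```shell
# Dry run: print the PA-RISC replacement commands for one disk.
# Nothing here executes an LVM or boot command; it only echoes text.
# print_parisc_steps is a hypothetical helper name.
print_parisc_steps() {
    disk=$1
    cat <<EOF
vgcfgrestore -n /dev/vg00 /dev/rdsk/$disk
vgsync /dev/vg00
mkboot /dev/rdsk/$disk
mkboot -a "hpux (;0)/stand/vmunix" /dev/rdsk/$disk
lvlnboot -Rv
EOF
}

# Example with the replaced primary disk from this thread:
print_parisc_steps c2t6d0
```

Printing first makes it easy to check the device file in every line before running the commands one at a time.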




Lolupee
Regular Advisor

Re: Stale extents

Eric,

I just discovered that you have an Itanium server.

Then the process is different, apologies. I have not tested this thoroughly, but the process below combines the steps for replacing an HP-UX root disk with the steps for creating a mirrored root disk, and I believe it should work. You do not have anything to lose except time.

You need to create the EFI partition first.


1. Create the system, OS, and service partitions.
# vi /tmp/partitionfile
3
EFI 500MB
HPUX 100%
HPSP 400MB
# idisk -wf /tmp/partitionfile /dev/rdsk/

idisk version: 1.31
********************** WARNING ***********************
If you continue you may destroy all data on this disk.
Do you wish to continue(yes/no)? yes <-- Answer "yes" and not "y"

3. Create device files needed for the new partitions.
# insf -eC disk

4. Verify the partition table.
# idisk /dev/rdsk/
5. Verify that the device files were created properly.
# ioscan -efnC disk
6. Populate the /efi/hpux/ directory in the new EFI system partition.
# mkboot -e -l /dev/rdsk/

7. Change the auto file for the mirror to boot without quorum.
NOTE: Using "s1"
# echo "boot vmunix -lq" > /tmp/AUTO.lq
# efi_cp -d /dev/rdsk/s1 /tmp/AUTO.lq /EFI/HPUX/AUTO

Please do not forget the s1 after the new disk's CTD.

NOTE: We assume that if we boot from the primary, the mirror is fully
functional and therefore we don't need to override quorum. Your site might
require that both disks override quorum.

9. Verify the contents of the auto file on the primary and the mirror.
NOTE: Using "s1"
# efi_cp -d /dev/rdsk/s1 -u /EFI/HPUX/AUTO /tmp/AUTO.pri
# efi_cp -d /dev/rdsk/s1 -u /EFI/HPUX/AUTO /tmp/AUTO.alt
# cat /tmp/AUTO.pri
# cat /tmp/AUTO.alt
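
The partition file from step 1 can also be written without an editor. This is a minimal sketch using the same three entries and sizes quoted in that step. write_partition_file is a hypothetical helper name, and the idisk call itself is left as a comment because it destroys data on the target disk.

```shell
# Write the idisk partition description non-interactively.
# write_partition_file is a hypothetical helper; the contents match step 1.
write_partition_file() {
    cat > "$1" <<'EOF'
3
EFI 500MB
HPUX 100%
HPSP 400MB
EOF
}

write_partition_file /tmp/partitionfile
cat /tmp/partitionfile

# Then, on the server, against the NEW disk only (device name left for you to fill in):
#   idisk -wf /tmp/partitionfile /dev/rdsk/<new disk>
```

The first line of the file is the number of partitions, followed by one line per partition.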


Then you can proceed with vgcfgrestore and the remaining steps.

1. # vgcfgrestore -n /dev/vg00 /dev/rdsk/s2
(Note this is section 2)
Start the mirror synchronize process.
2. # vgsync /dev/vg00 (This process takes time to resync.)
Update the LIF's LABEL file with the information contained in the BDRA (Boot Data Reserved Area).
5. # lvlnboot -Rv
Getting diagnostics on the root mirror disk
List the offline diagnostics from the good root boot disk.
6. # lifls -il /dev/rdsk/cXtXd0s2
Copy the diagnostics to the mirror root disk.
Simon Wickham_6
Regular Advisor

Re: Stale extents

Hi Eric,

I would recommend you issue a vgsync and then issue a strings /etc/lvmtab. Then try running the following command.

dd if=/dev/rdsk/c4t8d0 of=/dev/null bs=1024k

If you get an I/O error from dd, the disk is bad.

Regards,
Simon