12-21-2002 08:15 AM
I/O error while reading the VGDA
We have a cluster of two L1000 machines running HP-UX B.11.00 and MC/ServiceGuard. Both machines have the same configuration:
vg00 --> 2x internal disks (c1t2d0 and c2t2d0)
other VGs on the FC60 external array
Today we had the following problem:
1. While running bdf, the session hung, and the same then happened in every other session. We stopped the cluster (cmhaltcl) on MACH1 and tried to shut it down. The shutdown hung at the "Unmounting filesystems" stage, so we had to power the machine off.
2. When we tried to boot it, it showed "System alert: 12 - Software failure". After we acknowledged that, it showed "alert: 3" a couple of times and stayed there.
3. We then booted into LVM maintenance mode with the "hpux -lm" command.
4. When trying to activate vg00 (vgchange -a y vg00), we got "I/O error while reading the VGDA".
5. We checked the disks with diskinfo and they were OK; dd worked as well.
6. After some research, we issued the command "vgcfgrestore -n /dev/vg00 /dev/rdsk/c1t2d0", but activation still failed with the same error as in step 4.
7. We ran the same command for disk c2t2d0.
8. This time the activation of vg00 worked.
9. We rebooted the machine and everything was fine.
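Steps 6 to 8 can be summarized as a short dry-run sketch. It only prints the commands used in this thread so they can be reviewed before being typed at the maintenance-mode prompt; nothing is executed. The function name print_vg_recovery and its arguments are illustrative and not part of LVM. By default vgcfgrestore reads the configuration backup that vgcfgbackup keeps under /etc/lvmconf.

```shell
#!/bin/sh
# Dry-run sketch: prints the recovery commands from steps 6-8 for review.
# Nothing is executed. print_vg_recovery is a hypothetical helper name.
print_vg_recovery() {
    vg="$1"      # volume group name, e.g. vg00
    shift
    for disk in "$@"; do
        # Restore the saved LVM configuration onto each physical volume.
        echo "vgcfgrestore -n /dev/$vg /dev/rdsk/$disk"
    done
    # Activate the volume group once the LVM headers are rewritten.
    echo "vgchange -a y $vg"
}

print_vg_recovery vg00 c1t2d0 c2t2d0
```

Restoring to both mirror disks before activating mirrors what finally worked in step 8; whether both restores are needed in another situation is not something this thread establishes.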
Now, the questions are:
1) Why did this happen?
2) Is there a way to find out what happened?
Thanks for your support
Julio Quadros
12-21-2002 11:48 AM
Re: I/O error while reading the VGDA
It looks like the on-disk LVM structures were corrupted. Since dd works but you got an I/O error reading the VGDA, I think an underlying structure (the BDRA) was pointing to an invalid or nonexistent disk location, hence the I/O error. Check:
- /var/adm/crash for crashdumps;
- /var/adm/tombstones for processor logs;
- the expert tool in STM, to see whether either of these two disks has defects logged in its grown defect list.
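The first two checks can be scripted. This is a minimal sketch that reports whether each directory exists and lists its newest entries; check_dir is an illustrative helper name, not an HP-UX command, and the sketch assumes that ls -A is available on the system.

```shell
#!/bin/sh
# Report whether a directory exists and whether it holds any entries.
# check_dir is a hypothetical helper, not an HP-UX command.
check_dir() {
    dir="$1"
    if [ ! -d "$dir" ]; then
        echo "$dir: missing"
    elif [ -z "$(ls -A "$dir" 2>/dev/null)" ]; then
        echo "$dir: empty"
    else
        echo "$dir: has entries"
        ls -lt "$dir" | head -5   # newest entries first
    fi
}

check_dir /var/adm/crash
check_dir /var/adm/tombstones
```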
Did you try to boot with '-lm' from both disks? That would tell us whether the information was corrupt on only one disk or on both.
Since you ran 'vgcfgrestore -n /dev/vg00 /dev/rdsk/c1t2d0', I guess this disk had corrupt LVM structures. I would check its health with Diagnose/Verify in STM and pay attention to its grown defect list. Also, since the two boot disks are on different controllers, the problem could be with the controller that c1t2d0 is connected to.
If you have an active contract or warranty, I would suggest calling HP.
Eugeny
12-22-2002 01:17 AM
Re: I/O error while reading the VGDA
1) What is STM? How do I check the disks for errors?
2) At the PDC or ISL level, is there a way to see whether the disks are working properly? Something like diskinfo?
I will follow your recommendations and post here the results.
Thanks
JQ
12-22-2002 02:07 AM
Re: I/O error while reading the VGDA
STM is the Support Tools Manager. It has three interfaces:
cstm - command line
mstm - menued
xstm - graphical
If you have a 700/96 terminal, use mstm. Select the desired disk, go to Tools, and then try 'Information', 'Verify' and 'Diagnose'. Try the expert tools to see the disk defect table. If they are not available, you need a temporary password: call HP.
At the PDC level, go to the service menu and run the 'pim' command to see whether its output contains a valid timestamp. Memory statistics are in the information menu.
At ISL there is ODE (the Offline Diagnostics Environment), but only if you have it installed. It has an expert tool to diagnose disks, but it is better to use STM in HP-UX.
As I already mentioned, check the files in /var/adm/crash and /var/adm/tombstones (the most recent file should be ts99).
You can do all of this yourself.
Eugeny
12-22-2002 08:45 PM
Re: I/O error while reading the VGDA
HELP HELP HELP
It happened again today, and once again I had to run vgcfgrestore on c2t2d0.
Following your advice, I ran xstm and this is what I got:
1) No errors on either disk
2) "Diagnose" and "Expert tool" are not available
3) /var/adm/crash has no crash dumps
4) My machine has no /var/adm/tombstones. In /etc/rc.config.d I have the file pdcinfo with PDCINFO=1 and PDCINFO_OPTS=
5) "Information" on both controllers does not show any errors. If there were a controller error, would it show up here? For both controllers it shows:
Device status:
Bit 9-10:DEVSEL timing 01 - medium
I am really worried because this is a critical cluster and I don't know what the root cause is.
Thanks for your support
JQ
12-22-2002 09:33 PM
Re: I/O error while reading the VGDA
Are the disks c1t2d0 and c2t2d0 mirrored?
It seems that your c1t2d0 disk is about to fail, so it is time to take a make_recovery tape and look into replacing the disk.
Since STM reports no errors on the disks, the LVM information on the disk may be getting corrupted.
Also check the SCSI termination on the bus.
Are there any errors logged in the syslog file, such as disk access errors on c1t2d0?
Srini
12-22-2002 11:32 PM
Re: I/O error while reading the VGDA
I think you should call HP. They will diagnose the system and develop an action plan.
Eugeny
12-27-2002 08:01 AM
Re: I/O error while reading the VGDA
# tail -f /var/adm/syslog/syslog.log
and check what kind of events are being logged at that point in time. You can always go back in time in syslog.log and OLDsyslog.log for any indications:
# grep -Ei "scsi|power|lbolt" syslog.log
should return any SCSI resets or power-fail messages. Watch for messages such as: .. dev_t 0x1F012000
The hex code above means that disk c1t2d0 has a problem, not necessarily a hardware one. Power-fail messages can be corrected by increasing the I/O timeout (shown by pvdisplay /dev/dsk/c#t#d#). It is usually set to the default, but can be changed with pvchange -t 90 disk, for example. More information is in the pvchange man page.
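To show how 0x1F012000 maps to c1t2d0, here is a small decoding sketch. It assumes the legacy HP-UX disk layout in which the top byte of the dev_t is the major number and the minor number is laid out as card instance (8 bits), target (4 bits), LUN (4 bits) and 8 option bits. That layout and the helper name decode_dev_t are assumptions for illustration; confirm against ioscan -fn or ls -l /dev/dsk on the actual system.

```shell
#!/bin/sh
# Decode a legacy HP-UX disk dev_t into its cXtYdZ device name.
# Assumed layout: 0xMMIITLOO = major, card instance, target, LUN, options.
# decode_dev_t is a hypothetical helper, not an HP-UX command.
decode_dev_t() {
    dev=$(( $1 ))
    major=$((  (dev >> 24) & 0xFF ))   # driver major number
    card=$((   (dev >> 16) & 0xFF ))   # interface card instance (the "c" number)
    target=$(( (dev >> 12) & 0xF ))    # SCSI target (the "t" number)
    lun=$((    (dev >> 8)  & 0xF ))    # SCSI LUN (the "d" number)
    echo "major=$major c${card}t${target}d${lun}"
}

decode_dev_t 0x1F012000   # prints: major=31 c1t2d0
```

Under the same assumption, a message with dev_t 0x1F022000 would point at c2t2d0, the other boot disk in this thread.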
Generally, a hardware failure can be predicted when a specific device shows a high number of I/O errors. With diagnostics installed, run cstm ---> ru --> logtool option --> rs (to run a summary). This shows the number of I/O errors per device.
To reset the log, you can run SL (for switch log) at the LogUtility prompt.
Cheers,
T?