HPE 9000 and HPE e3000 Servers

K580 hanging

 
SOLVED
John Peace
Frequent Advisor

K580 hanging

K580
HPUX 11.00

I administer a K580 remotely. A drive (SCSI 5) went bad last week and was replaced by an HP tech. Since then the system hangs after boot-up. Sometimes it runs as long as 11 hours, sometimes only 30 minutes. When it hangs there is no access to the machine via applications, ssh, telnet, ftp, or the console. All internal crons also stop. The front panel shows F14F (sometimes Fa4F or F24F). The key has to be turned off and back on to restart. I have run cm.collect and loaded "all" patches needed, so it is up to date on patches. On some of the reboots after patching, the system has hung with "can't find boot device". I have had my local "eyes and hands" do a SEA, and the boot disk (SCSI 6) is not showing up. Once the power is turned off and back on, it boots and works for an undetermined amount of time. So, what do I look at next? My root disk, system board, SCSI cables? I have a current backup and Ignite.
Any thoughts?
A. Clay Stephenson
Acclaimed Contributor

Re: K580 hanging

I think the key point here is that SCSI ID 6 is sometimes not showing up. That strongly suggests that you have a problem in the I/O subsystem. It could be the disk itself, the controller (probably the multi-function I/O card), poor connections, or bad termination. There are simply not enough data at this point. A very salient point: the next time the SEA is done, do the other devices on the same bus appear? If all the other expected devices appear (except for SCSI ID 6), then you can be confident that you have a failing disk. Your problem really indicates the importance of mirrored disks. Had all your LVOLs been mirrored (preferably on devices on an entirely separate bus), then your box would probably never have missed a beat. In fact, I never use the internal drives on K-boxes because they are not hot-pluggable. I always use external drives for K's and have never had to shut one down to replace a failing/failed disk.
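The SEA check described above boils down to a small decision rule. Here is a minimal sketch of it in Python, purely as an illustration: the function name `diagnose_bus`, the result labels, and the example SCSI IDs are all made up for this sketch, and the ID lists would have to be read off the SEA output by hand.

```python
def diagnose_bus(expected_ids, seen_ids, suspect_id):
    """Classify one boot-time device search (SEA) result for a single SCSI bus.

    expected_ids: SCSI IDs that should be on the bus (hypothetical input).
    seen_ids:     SCSI IDs the search actually listed.
    suspect_id:   the ID that has been going missing (6 in this thread).
    """
    missing = set(expected_ids) - set(seen_ids)
    if not missing:
        # Nothing missing this time; with an intermittent fault, repeat later.
        return "all_present"
    if missing == {suspect_id}:
        # Everything else answered, so the shared bus, controller, and
        # termination are probably fine: look at the disk itself.
        return "suspect_disk"
    # Other devices are gone too: look at what they share
    # (controller, cabling, termination).
    return "suspect_bus"


# Example with made-up IDs: devices 2 and 5 answer, 6 does not.
print(diagnose_bus([2, 5, 6], [2, 5], suspect_id=6))
```

A "suspect_disk" result still leaves that one drive's own power or SCSI connector as a possibility, so it narrows things down rather than proving the disk is dead.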
If it ain't broke, I can fix that.
John Peace
Frequent Advisor

Re: K580 hanging

All other devices show up on the sea.

As a side note, I would mirror if the powers that be would pay for the hardware and software. Maybe this will push them that way. All of my "L" series servers (12) are mirrored.
John Peace
Frequent Advisor

Re: K580 hanging

I am going to do an Ignite recovery onto a new disk for SCSI 6 today. Another observation:

If I have a remote Glance session to the system, it still runs and updates every 8 seconds even while no one else can connect, including the console. Once I exit Glance, my connection hangs as well. I have swapped between screens, and it has also hung after a few different screens.

I am lost and confused. There are no errors in any of the logs at all. The tombstones look great.
A. Clay Stephenson
Acclaimed Contributor
Solution

Re: K580 hanging

All of this is consistent with the intermittent failure of the boot drive. I assume that /var/adm/syslog is physically located on the same drive so that when trouble occurs, it can't be logged. Glance (and other processes) keeps running until some disk i/o is actually required.
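To illustrate the point that a process looks healthy right up until it needs the disk, here is a rough sketch of a disk I/O probe in Python. It is only an illustration under assumptions: the name `probe_disk_io`, the timeout value, and the `io_func` hook are invented for this sketch, and it is not something that shipped with HP-UX 11.00. The idea is to attempt a tiny write plus `fsync` in a helper thread and give up if it does not finish in time.

```python
import os
import tempfile
import threading
import time


def probe_disk_io(directory, timeout_s=5.0, io_func=None):
    """Return True if a small write+fsync in `directory` completes in time.

    io_func is a hypothetical hook that replaces the real I/O, used here
    only to simulate a hung disk.
    """
    done = threading.Event()
    result = {}

    def default_io():
        # Create, write, and flush a small temp file, then remove it.
        fd, path = tempfile.mkstemp(dir=directory)
        try:
            os.write(fd, b"probe")
            os.fsync(fd)  # force the data to the device
        finally:
            os.close(fd)
            os.unlink(path)

    def worker():
        try:
            (io_func or default_io)()
            result["ok"] = True
        except OSError:
            result["ok"] = False
        finally:
            done.set()

    # Daemon thread: a stuck I/O call must not block the caller forever.
    threading.Thread(target=worker, daemon=True).start()
    finished = done.wait(timeout_s)
    return finished and result.get("ok", False)


# A healthy temp directory should pass; a simulated stall should not.
print(probe_disk_io(tempfile.gettempdir()))
print(probe_disk_io(tempfile.gettempdir(), timeout_s=0.1,
                    io_func=lambda: time.sleep(1)))
```

For a probe like this to be useful, its result would have to be reported somewhere that does not depend on the suspect drive (for example, over the network), since anything written to a log on the same disk is lost for the reason given above.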
If it ain't broke, I can fix that.
John Peace
Frequent Advisor

Re: K580 hanging

Thanks, A. Clay. I am about to do the Ignite recovery and am going to fly to the site to provide a "warm fuzzy" to the client. I will award points once I get back.

Thanks again. I will update from the site.
Dave Unverhau_1
Honored Contributor

Re: K580 hanging

John,

It seems like an interesting coincidence that you're having a problem with drive 6 immediately after having drive 5 replaced...

I assume (always a bad thing to do, I know) that these are internal FWD drives. My gut tells me that (assuming, again, that drive 6 didn't decide to join its old friend drive 5 in the junk pile) the replacement drive at address 5 has a problem where it's responding inappropriately to commands meant for drive 6 *OR* the cabling to drive 6 (power or SCSI) may have been partly dislodged during the repair.

Before replacing more drives, you might just want to have a careful look at the cables attached to all the internal drives.

Happy troubleshooting!

Dave
Romans 8:28
John Peace
Frequent Advisor

Re: K580 hanging

The problem was the root disk. Once replaced and reloaded (with Ignite) it has run for 3 weeks now with no problems. Thanks for all the help.