L2000 crashes

 
Marty Hoff
Advisor

L2000 crashes

We have an L2000 that is part of a ServiceGuard environment. Recently it failed in such a way that I could get no response from the machine, except that its ports would still accept network connections. The connections themselves would NOT work, but I could telnet to a port (the SSH port, for example) and establish the connection. I hooked up a console to the machine and it was locked up as well. I eventually powered the machine off and back on. Since then, the machine continues to fail after two or three days of running (just sitting idle). I discovered that I can reboot the machine from the GSP, but other than that everything locks up completely, including my login on the console. And nothing gets logged, at least nothing that I have been able to find.

Does anyone have any suggestions about how I can diagnose this problem? Of course, we do not have a current service contract on the machine.

Thanks for your help. Vexed in Austin,

Marty
17 REPLIES
Steven E. Protter
Exalted Contributor

Re: L2000 crashes

Are you getting a fault light?

If so, you are probably getting dumps in /var/adm/crash

You need to follow the attached q4 procedure, create output and get it to HP.

Last time I had to do this I was missing a patch.

I have three L2000/rp5450 servers with Ultrium external tape backup drives. I was getting errors restoring large files: the system just crashed, red-lighted, and went down. HP identified the problem an hour after they got the q4 crash dump analysis.

See attachment.

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
Marty Hoff
Advisor

Re: L2000 crashes

Nothing is showing up in /var/adm/crash after the reboots.
S.K. Chan
Honored Contributor

Re: L2000 crashes

Have you taken a look at the chassis code? Typically, if there is one, it will be captured in /var/stm/logs/os, and the filename is "log1.raw.cur" (something like that). Look at its timestamp to see whether it's the latest one. To view it you have to run STM:
# cstm
cstm> ru logtool
At the "logtool" prompt, enter "ce" (i.e., chassis error log) to view it:
logtool util> ce
Post any findings here.
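
Editor's sketch of the timestamp check described above: a small shell helper that prints the most recently modified entry in a log directory. The function name `newest_log` is made up for illustration, and the default directory is simply the path quoted in the post; pass a different directory (for example /var/adm/crash) as the first argument if your logs live elsewhere.

```shell
#!/bin/sh
# newest_log: hypothetical helper, not part of STM or HP-UX.
# Prints the name of the most recently modified entry in a directory.
# Defaults to the STM log directory mentioned in the post.
newest_log() {
    dir=${1:-/var/stm/logs/os}
    # ls -t sorts by modification time, newest first
    ls -t "$dir" 2>/dev/null | head -n 1
}

# Example use (commented out so that loading this file has no side effects):
# f=$(newest_log /var/stm/logs/os) && ls -l "/var/stm/logs/os/$f"
```

If the newest file predates the last hang, the hang probably did not leave a record in that directory.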
Jeff Schussele
Honored Contributor

Re: L2000 crashes

Steven,

You forgot the attachment.

Cheers,
Jeff
PERSEVERANCE -- Remember, whatever does not kill you only makes you stronger!
Jeff Schussele
Honored Contributor

Re: L2000 crashes

Hi Marty,

I'd also get into the GSP (Ctrl-B on the console) & check the service logs for any logged errors.
Some older Ls have a known problem with the power monitor board that caused *phantom* fan or power supply errors. It's remotely possible that a bad PM board could cause the system to think you've lost power supplies and cause the system to halt, but the GSP would log that.

Rgds,
Jeff
Marty Hoff
Advisor

Re: L2000 crashes

There was nothing in the ce log that you pointed to. The one entry was the machine name and IP address. The only log that has anything newer is the ccbootlog file, and that seems to be full of messages about the last boot.

Marty
James A. Donovan
Honored Contributor

Re: L2000 crashes

Sounds like it could be a slowly failing part, possibly exacerbated by a thermal issue.

I'd run the offline diags from the SupportPlus CD and stress test components as much as you can.

http://docs.hp.com/cgi-bin/otsearch/getfile?id=/hpux/onlinedocs/diag/ode/ode_over.htm&searchterms=offline%7cdiagnostics%7cODE%7crunning&queryid=20030227-151854
Remember, wherever you go, there you are...
Marty Hoff
Advisor

Re: L2000 crashes

Jeff, how do I check the GSP logs? I'm not very familiar with the GSP. Also, I don't think it is shutting the machine down, simply because the network keeps responding. If the machine were being shut down, I would think it would no longer be on the network.

Marty
S.K. Chan
Honored Contributor

Re: L2000 crashes

To check the GSP log, do a "ctrl B" to get to the GSP (from the console) and simply enter "sl" at the GSP prompt. You would want to select "E" for the error log. Another way is to use the "cclogview" command, which I've never used before. That command looks at /var/stm/logs/os/ccerrlog or "ccbootlog", depending on what argument you give to cclogview.
Jeff Schussele
Honored Contributor

Re: L2000 crashes

Hi Marty,

Attach a terminal to the console port - 9600 baud 8-N-1 xon/xoff
Hit enter a time or two to see if you get a login prompt.
If you see it then enter Ctrl-B to get to the GSP login
Press Enter twice to get through the GSP login & PW prompts.
At the GSP> prompt enter SL
Then select E for error & then N for no filter.
There may be quite a few in there, but look them all over.

Do a Q then CR to exit the log & then at the GSP> prompt enter CO to go back to console mode.

Rgds,
Jeff
Steven E. Protter
Exalted Contributor

Re: L2000 crashes

We had the same problem Jeff describes this week.

It wasn't a fan at all, but ours went down and refused to come up, with the GSP complaining it didn't have enough fans to start the box.

We replaced no fans, just the power card, and we're back in business.

SEP
Marty Hoff
Advisor

Re: L2000 crashes

I've checked the logs from the gsp and there are no error entries since 4/14/2002. I've checked and it is logging things under the correct date as there are entries in the other logs from 2/26/2003.

I guess I'm going to try to use the diagnostics on the support CD.

Marty
Eugeny Brychkov
Honored Contributor

Re: L2000 crashes

Marty,
do you have the GSP LAN console connected to a public network? If so, disconnect it and connect it to a private LAN console subnet. I have seen symptoms similar to those you described, where everything hung because of LAN broadcast storms: the GSP LAN console went crazy and hung, unable to process all the broadcast packets.
Eugeny
Marty Hoff
Advisor

Re: L2000 crashes

The console is not on the LAN. I actually have it plugged into a PC that we have at the colo and I use VNC to get to the PC and then Hyperterm to get on the console.

Thanks, though. I hadn't heard that about LAN consoles.
Saurav_1
Valued Contributor

Re: L2000 crashes

Hi,
can you tell me what happens once you log in to the system? Also, do you get any message like "no more ttys" or something like that? And check the free space in /var.
I faced the same sort of problem, in which I was unable to get even a login prompt. The HP engineer said that the root hard disk was being overloaded by too many applications. He replaced the internal SCSI cable and installed the latest SCSI driver patches. To date we have not faced this problem again. One test you can do: after you reboot, search for all the hard disks at the BCH prompt with the search command, and check whether all of them are detected.

Regards

Saurav
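
Editor's sketch of the /var free-space check suggested above, assuming a `df` that accepts the POSIX `-P` option (on HP-UX, `bdf` is the traditional way to get the same information). The function names `pct_used` and `check_space` and the 90% threshold are illustrative choices, not anything from the thread.

```shell
#!/bin/sh
# pct_used: hypothetical helper. Reads `df -P` style output on stdin and
# prints the Capacity column (without the % sign) of the last line.
pct_used() {
    awk 'NR > 1 { gsub("%", "", $5); p = $5 } END { print p }'
}

# check_space: hypothetical helper. Reports whether a mount point is at or
# above a usage threshold. Defaults: /var and 90 percent (both assumptions).
check_space() {
    mount=${1:-/var}
    limit=${2:-90}
    used=$(df -P "$mount" | pct_used)
    if [ "$used" -ge "$limit" ]; then
        echo "WARNING: $mount is ${used}% full"
    else
        echo "OK: $mount is ${used}% full"
    fi
}

# Example use (commented out so that loading this file has no side effects):
# check_space /var 90
```

A full /var would not by itself explain a hard hang, but it is cheap to rule out before moving on to hardware diagnostics.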



Marty Hoff
Advisor

Re: L2000 crashes

Logging into the system is no problem. We never see any error messages, but after 1-3 days of the machine running, it simply hangs. The console will not even let us log in. If I am already logged into the console, then the first command that I try to run hangs and I never get any response whatsoever. I have to go into the GSP and reset the box.

Marty
Marty Hoff
Advisor

Re: L2000 crashes

Also, I see all of the disks when doing a SEArch at the boot prompt before booting. I do not believe it is a SCSI issue.