SCSI: Abort abandoned -- lbolt:
03-06-2006 10:49 PM
An N-class server appeared to crash at the weekend and I need some help working out why. I received an alert that the ssh service had stopped on the box, so I tried to telnet to it remotely, without success. I hooked a console up to the server and the screen was blank. The LEDs showed a flashing green light on the 2nd disk but not on the primary disk. As I could not get access to the GDP, I rebooted the box (power off/on). The system did not come back up. I could now access the GDP, but was unable to boot the box from the pri or alt disk, nor was I able to boot into single user mode.
I was able to boot into maintenance mode from the ISL (hpux -lm) and from here managed to manually bring the system up through the run levels to multi-user mode.
The only error I could find to explain what happened was in OLDsyslog.log, which had the following as its last entry:
vmunix: SCSI: Abort abandoned -- lbolt: 1410939057, dev: 1f026000, io_id: 20c3f76, status: 200
The two system disks seem fine: I tested them using 'dd' and no i/o errors were detected. Has anyone seen a similar issue before?
Rgds,
Duffs.
03-06-2006 11:01 PM
Solution
lbolts are normally down to SCSI bus problems, caused by failed disks, power interruptions, cable faults or termination issues.
I think 1f026000 translates as follows:
1f = decimal 31, which is the sdisk device major number,
and 026000 is the minor number (card instance 2, target 6, LUN 0).
So the faulty device should be c2t6d0.
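As a rough illustration of the decoding above, the sketch below splits the dev: value from the lbolt message into major and minor numbers, and then into card instance, target and LUN. The bit layout used here (8-bit major, then 8-bit instance, 4-bit target, 4-bit LUN and 8 option bits) is an assumption based on the legacy sdisk device-file convention, and the function name decode_dev is made up for this example. Treat the result as a hint and confirm it against the device files on the system itself.

```python
def decode_dev(dev_hex: str) -> dict:
    """Split a dev: value such as '1f026000' into its assumed parts."""
    dev = int(dev_hex, 16)
    major = (dev >> 24) & 0xFF       # top 8 bits: driver major number
    minor = dev & 0xFFFFFF           # low 24 bits: minor number
    instance = (minor >> 16) & 0xFF  # assumed: card instance (the "c" number)
    target = (minor >> 12) & 0xF     # assumed: SCSI target (the "t" number)
    lun = (minor >> 8) & 0xF         # assumed: LUN (the "d" number)
    return {
        "major": major,
        "minor": minor,
        "name": f"c{instance}t{target}d{lun}",
    }


if __name__ == "__main__":
    info = decode_dev("1f026000")
    print(info["major"], hex(info["minor"]), info["name"])
```

For the value in this thread the sketch gives major 31 and the name c2t6d0, which matches the device suggested above.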
Can you do a:
diskinfo -v /dev/rdsk/c2t6d0
and check the output?
03-06-2006 11:28 PM
Re: SCSI: Abort abandoned -- lbolt:
The output of the following command:
# diskinfo -v /dev/rdsk/c2t6d0
SCSI describe of /dev/rdsk/c2t6d0:
vendor: SEAGATE
product id: ST318203LC
type: direct access
size: 17783240 Kbytes
bytes per sector: 512
rev level: HP01
blocks per disk: 35566480
ISO version: 0
ECMA version: 0
ANSI version: 2
removable media: no
response format: 2
(Additional inquiry bytes: (32)52 (33)46 (34)33 (35)33 (36)36 (37)35 (38)33 (39)0 (40)0 (41)0 (42)0 (43)0 (44)0 (45)0 (46)0 (47)0 (48)0 (49)0 (50)0 (51)0 (52)0 (53)0 (54)0 (55)0 (56)0 (57)0 (58)0 (59)0 (60)0 (61)0 (62)0 (63)0 (64)0 (65)0 (66)0 (67)0 (68)0 (69)0 (70)0 (71)0 (72)0 (73)0 (74)0 (75)0 (76)0 (77)0 (78)0 (79)0 (80)0 (81)0 (82)0 (83)0 (84)0 (85)0 (86)0 (87)0 (88)0 (89)0 (90)0 (91)0 (92)43 (93)6f (94)70 (95)79 (96)72 (97)69 (98)67 (99)68 (100)74 (101)20 (102)28 (103)63 (104)29 (105)20 (106)31 (107)39 (108)39 (109)39 (110)20 (111)53 (112)65 (113)61 (114)67 (115)61 (116)74 (117)65 (118)20 (119)41 (120)6c (121)6c (122)20 (123)2 (124)1e (125)b3 (126)90 (127)0 (128)0 (129)2 (130)0 (131)0 (132)0 (133)0 (134)0 (135)0 (136)0 (137)0 (138)0 )
To me this looks normal?
Rgds,
Duffs
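The "Additional inquiry bytes" in the diskinfo output above are hard to read by eye. Each entry appears to be a decimal offset in parentheses followed by a hex value, so a small sketch like the one below can turn the printable runs back into text. The helper names and the SAMPLE_BYTES string (a shortened copy of the bytes posted above) exist only for this example, and what the vendor-specific bytes at offsets 32 to 38 mean is not stated anywhere in this thread.

```python
import re

# Shortened copy of the inquiry bytes from the post above (offset)hexvalue.
SAMPLE_BYTES = (
    "(32)52 (33)46 (34)33 (35)33 (36)36 (37)35 (38)33 (39)0 "
    "(92)43 (93)6f (94)70 (95)79 (96)72 (97)69 (98)67 (99)68 (100)74 "
    "(101)20 (102)28 (103)63 (104)29 (105)20 (106)31 (107)39 (108)39 "
    "(109)39 (110)20 (111)53 (112)65 (113)61 (114)67 (115)61 (116)74 "
    "(117)65 (118)20 (119)41 (120)6c (121)6c (122)20"
)


def parse_inquiry_bytes(text: str) -> dict:
    """Map each decimal offset to its byte value (values are read as hex)."""
    pairs = re.findall(r"\((\d+)\)([0-9a-fA-F]{1,2})", text)
    return {int(offset): int(value, 16) for offset, value in pairs}


def ascii_run(data: dict, start: int, end: int) -> str:
    """Render offsets start..end as text, using '.' for non-printable bytes."""
    chars = []
    for offset in range(start, end + 1):
        value = data.get(offset, 0)
        chars.append(chr(value) if 32 <= value < 127 else ".")
    return "".join(chars)


if __name__ == "__main__":
    data = parse_inquiry_bytes(SAMPLE_BYTES)
    print(repr(ascii_run(data, 32, 38)))
    print(repr(ascii_run(data, 92, 122)))
```

On the bytes posted here, offsets 92 to 122 decode to a Seagate copyright string, so nothing in that part of the output points at a fault.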
03-06-2006 11:30 PM
Re: SCSI: Abort abandoned -- lbolt:
With the exception of when I swap out a hot swap drive, every time I get an lbolt it eventually results in drive replacement.
I'd get good backups made and prepare for that eventuality.
SEP
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
03-06-2006 11:38 PM
Re: SCSI: Abort abandoned -- lbolt:
If you use a console terminal, the screen will of course be blank if you've only just connected it to the server. The server does not generally keep track of what should be on the console's screen: the console does that by itself. After connecting the console, you generally press some keys to see if the server reacts.
Pressing Enter once or twice should normally bring up a login prompt. If that does not work, press Ctrl-B to access the GSP. If that does not work either, the GSP is probably hung. The server may be fine; the data just isn't going through the GSP to the console.
See the image on page 97 of this file:
http://docs.hp.com/en/3687/rp7400_customer_hardwaremanual.pdf
Item 20 is the GSP reset button. If you cannot access the GSP, try pushing that first. It resets only the GSP without bothering the server proper.
Resetting the server from the power switch will not actually reset the GSP. Only physically unplugging all the power cords will remove power from the GSP.
Check the GSP firmware version (from the console, press Ctrl-B, then enter command HE and see the top line of the help information). If it's very old, it might be useful to update it. Newer GSP firmware versions are generally more stable than the old ones.
Did you check the GSP error log? (Ctrl-B and command SL, then E for error)
You can decode the dev: number from the lbolt message. The first two digits are 1f hexadecimal, which is 31 in decimal. That tells us it's a device in /dev/dsk: the device nodes in /dev/dsk all have major number 31, and the rest of the dev: number is the minor number.
Do a "ll /dev/dsk" and see which of the disk devices has the numbers "31 0x026000" in its listing.
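If the listing is long, the lookup can be scripted. The sketch below scans a saved "ll /dev/dsk" listing for the device file whose major and minor numbers match. The sample lines are made up for illustration (the owner, dates and the c1t6d0 entry are not from this system), and the column positions assume the usual long-listing layout for device files, so adjust the indexes if your output differs.

```python
# Illustrative listing only; these lines are NOT taken from the poster's system.
SAMPLE_LISTING = """\
total 0
brw-r-----   1 bin        sys         31 0x016000 Jan  1  2000 c1t6d0
brw-r-----   1 bin        sys         31 0x026000 Jan  1  2000 c2t6d0
"""


def find_device(listing: str, major: int, minor: int) -> list:
    """Return names of block/character device files matching major and minor."""
    matches = []
    for line in listing.splitlines():
        fields = line.split()
        # Assumed columns: mode, links, owner, group, major, minor, date x3, name
        if len(fields) < 10 or fields[0][0] not in "bc":
            continue
        try:
            line_major = int(fields[4])
            line_minor = int(fields[5], 16)
        except ValueError:
            continue  # not a device line in the expected layout
        if line_major == major and line_minor == minor:
            matches.append(fields[-1])
    return matches


if __name__ == "__main__":
    print(find_device(SAMPLE_LISTING, 31, 0x026000))
```

Matching on the numbers rather than on the file name is deliberate: it follows the same route as reading the listing by hand and does not depend on any naming convention.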
Check the firmware version of the disks using the "diskinfo -v" command. There was a disk firmware problem that caused disks to fail in use and then come back after the power was cycled. It was mainly seen on L-class servers, but I guess an N-class server might be just the right age to have the same problem.
03-07-2006 12:11 AM
Re: SCSI: Abort abandoned -- lbolt:
lbolt errors are usually down to termination and timeouts on larger systems. If it's just a small system with not many disks, then this could be the start of a disk failure.
It is, however, a bit strange that this error was in OLDsyslog.log. What was the date stamp on the log?
Are the disks mirrored? The box shouldn't have hung if only one disk failed, and you should still have been able to boot from the alt disk.
From the diskinfo output you posted, the disk has an older firmware version (HP01) that should be upgraded to stop the disk from going offline. Excerpt from the patch details:
PF_DSEACH3HP04:
A problem has been identified with certain Cheetah III
disk drives that use a Cypress SRAM (9, 18 and 36 GB).
The most common symptom seen is that the drive goes
offline and is inaccessible to the system. In some
instances, the drive has been reported to have a solid
LED or a flash code. The problem may be temporarily
corrected by a bus reset or by unplugging and
reconnecting the drive. Updated firmware that addresses
the problem is labeled HP04.
Downloadable from here:
http://www4.itrc.hp.com/service/patch/patchDetail.do?BC=patch.breadcrumb.main|patch.breadcrumb.search|&patchid=PF_DSEACH3HP04&context=firmware:disk
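To check several disks for the old firmware, the "rev level" line from diskinfo -v can be compared against the fixed revision named in the excerpt (HP04). The sketch below does that for a saved copy of the output. It assumes that revision labels have the form HP followed by a number and that a higher number means newer firmware; the function names are invented for this example, and whether a particular drive model is actually covered by the patch still has to be checked against the patch text.

```python
import re

# First lines of the diskinfo output posted earlier in the thread.
SAMPLE_DISKINFO = """\
SCSI describe of /dev/rdsk/c2t6d0:
vendor: SEAGATE
product id: ST318203LC
type: direct access
size: 17783240 Kbytes
bytes per sector: 512
rev level: HP01
"""


def parse_diskinfo(text: str) -> dict:
    """Collect 'key: value' lines from diskinfo -v output into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip():
            fields[key.strip()] = value.strip()
    return fields


def rev_number(rev: str):
    """Return the number in a label like 'HP01', or None if it does not fit."""
    match = re.fullmatch(r"HP(\d+)", rev)
    return int(match.group(1)) if match else None


def below_fixed_rev(text: str, fixed: str = "HP04"):
    """True if the disk's rev level is older than the fixed one, None if unknown."""
    current = rev_number(parse_diskinfo(text).get("rev level", ""))
    target = rev_number(fixed)
    if current is None or target is None:
        return None
    return current < target


if __name__ == "__main__":
    info = parse_diskinfo(SAMPLE_DISKINFO)
    print(info["product id"], info["rev level"], below_fixed_rev(SAMPLE_DISKINFO))
```

Returning None for an unrecognised label keeps the check from silently reporting a disk as up to date when its revision simply could not be parsed.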
It's probably unlikely for both disks to end up in the same state at the same time, though?
As you were in fact able to go through the run levels to get the box up, I would also check the LIF area on the disks, and check the mirroring.
I would also check the GSP logs and check for any HPMCs with STM or pdcinfo.
Andy
03-07-2006 01:50 AM
Re: SCSI: Abort abandoned -- lbolt:
Yes, I meant GSP; I should have been clearer. I did hit a few keys after hooking up the console, and Ctrl-B didn't bring up the GSP login either.
I checked the GSP error logs but they didn't tell me much. I will check the GSP firmware version, thanks for your help.
Andrew,
The OLDsyslog.log entry was timestamped around the time the server alerts began, which I suspect is when the server hung. The LIF area looks fine on both disks, as does the mirroring. I will look into possible firmware upgrades and hope that eliminates the chance of this occurring again. Thanks for your help!
Rgds,
Duffs