12-12-2004 08:06 PM
Cause of and action upon SCSI bus disconnects
Hello,
we have this
# model;uname -srv
9000/804/K450
HP-UX B.11.11 U
which is exhibiting quite a lot of SCSI bus disconnects
# grep vmunix /var/adm/syslog/syslog.log|tail
Dec 10 15:06:19 tiber vmunix: SCSI: Unexpected Disconnect -- lbolt: 571835, dev: cb00e002, io_id: d1
Dec 10 15:06:19 tiber vmunix: SCSI: Unexpected Disconnect -- lbolt: 571836, dev: cb00e002, io_id: d1
Dec 10 15:06:19 tiber vmunix: SCSI: Unexpected Disconnect -- lbolt: 571833, dev: cb00e002, io_id: d1
Dec 10 15:06:19 tiber vmunix: SCSI: Unexpected Disconnect -- lbolt: 571837, dev: cb00e002, io_id: d1
Dec 10 15:06:19 tiber vmunix: SCSI: Unexpected Disconnect -- lbolt: 571838, dev: cb00f002, io_id: d2
Dec 10 15:06:19 tiber vmunix: SCSI: Unexpected Disconnect -- lbolt: 571839, dev: cb00f002, io_id: d2
Dec 10 15:06:19 tiber vmunix: SCSI: Unexpected Disconnect -- lbolt: 571840, dev: cb00f002, io_id: d2
Dec 10 15:06:19 tiber vmunix: SCSI: Unexpected Disconnect -- lbolt: 571841, dev: cb00f002, io_id: d2
Dec 10 15:06:19 tiber vmunix: SCSI: Unexpected Disconnect -- lbolt: 571838, dev: cb00f002, io_id: d2
Dec 10 15:06:19 tiber vmunix: SCSI: Unexpected Disconnect -- lbolt: 571842, dev: cb00f002, io_id: d2
# grep -c 'SCSI: Unexpected Disconnect' /var/adm/syslog/syslog.log|tail
2485
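The total alone does not show how the disconnects are spread across devices. A minimal sketch of a per-device tally, assuming the syslog line format shown above (a `dev:` token followed by the hex identifier and a comma); `count_disconnects` is a hypothetical helper name, not an HP-UX command:

```shell
# Tally "Unexpected Disconnect" messages per dev: value.
# Assumes each message carries "dev: <hex>," as in the excerpt above.
count_disconnects() {
    awk '/SCSI: Unexpected Disconnect/ {
        for (i = 1; i < NF; i++)
            if ($i == "dev:") {
                d = $(i + 1)
                sub(/,$/, "", d)      # drop the trailing comma
                n[d]++
            }
    }
    END { for (d in n) print n[d], d }' "$@" | sort -rn
}

# Usage: count_disconnects /var/adm/syslog/syslog.log
```

Each output line is a count followed by the device identifier, busiest device first.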
To me this looks like a termination problem, or similar.
The volumes however are all in sync
# lvdisplay -v $(vgdisplay -v|awk '/LV Name/{print$NF}')|grep -ic stale
0
And a look at the disks with OnlineDiag showed no errors.
On the other hand, this box's patch level is rather out of date.
For instance, I'm sure there is a more recent SCSI patch (which may have fixed some spurious SCSI bus errors, who knows?)
# swlist -l fileset -a create_date -a install_date -a title PHKL_25896|sed -n 5,\$p
#
# PHKL_25896 Fri Dec 27 13:46:12 MET 2002 200212270852.50 SCSI IO Cumulative Patch
PHKL_25896.C-INC Fri Dec 27 13:46:12 MET 2002 200212270852.50 ProgSupport.C-INC
PHKL_25896.CORE2-KRN Fri Dec 27 13:46:12 MET 2002 200212270852.50 OS-Core.CORE2-KRN
First, how do I translate the device hex identifier from syslog entries to single out the affected devices?
Then what would you recommend?
Admittedly, the SCSI errors have not reappeared in syslog since Dec 10th, which also happens to be the date of the last reboot:
# who -b
. system boot Dec 10 13:33
I think this requires further investigation.
Regards
Ralph
Madness, thy name is system administration
12-12-2004 09:14 PM
Re: Cause of and action upon SCSI bus disconnects
Hi Ralph,
It reminds me of a failure I had a few years ago, a mirrored root disk failure...
The nasty part was that the disk misbehaved from time to time but never crashed or failed outright. I had unexplained crashes but couldn't diagnose them, because after each reboot everything was fine until the next time (a few days to a few weeks later). The crashes came from an HDS 5750 subsystem sharing the same SCSI controller with the internal root disks: each time the disk went berserk it sent a reset, which the HDS acknowledged. I only found out by looking at the HDS logs: thousands of resets had been issued. I called HDS, who said this can happen when a disk fails but not completely, switching itself on and off. The crashes were due to swap I had on the HDS...
The difficulty was to decide which internal mirrored disk was causing all the trouble since after every reboot EMS found nothing...
A support engineer asked me to type a command I have since forgotten, which returned an error blaming one of the disks. I was asked which disk it reported as faulty, and we changed the OTHER one. I've been told experience shows that devices often lie and blame their alter ego...
And the problem was solved...
So I would keep an eye on this system to see if you get more occurrences...
The leading cb (hex) would be major number 203:
# pwd
/dev/dsk
# ll|more
total 0
brw-r----- 1 bin sys 31 0x003000 Feb 26 2002 c0t3d0
brw-r----- 1 bin sys 31 0x012000 Feb 5 2002 c1t2d0
cr-------- 1 root root 203 0x012000 Feb 26 2002 c1t2d0.pt
These are the vg00 disks, and at one time I did have VxVM...
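To answer the translation question more generally, here is a rough sketch that splits a syslog `dev:` value into major number and cXtYdZ name. It assumes the usual HP-UX 11.11 layout of 8 bits major, 8 bits card instance, 4 bits target, 4 bits LUN and 8 option bits, which is consistent with the listing above (203 0x012000 is c1t2d0) but is not confirmed anywhere in this thread. `decode_dev` is a made-up helper name, and the snippet is written for a shell that accepts 0x constants inside $(( )), such as bash; whether the stock HP-UX shell does has not been checked here.

```shell
# Split a hex dev_t from syslog into major number and cXtYdZ.
# Assumed layout (not confirmed in this thread):
#   bits 31-24 major, 23-16 card instance, 15-12 target, 11-8 LUN, 7-0 options
decode_dev() {
    dev=$(( 0x$1 ))
    major=$((  (dev >> 24) & 0xff ))
    card=$((   (dev >> 16) & 0xff ))
    target=$(( (dev >> 12) & 0xf  ))
    lun=$((    (dev >> 8)  & 0xf  ))
    opts=$((   dev & 0xff ))
    printf 'major=%d c%dt%dd%d opts=0x%02x\n' \
        "$major" "$card" "$target" "$lun" "$opts"
}

decode_dev cb00e002
decode_dev cb00f002
```

Under that assumption the two identifiers from the syslog excerpt would point at targets 14 and 15 on card instance 0. It would be worth cross-checking the result against `lsdev` (major numbers) and `ioscan -fnC disk` (hardware paths and device files) before pulling any hardware.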
Good luck
All the best
Victor