
Re: possible HDD-fault?

 
SOLVED
Thomas Greig
Regular Advisor

possible HDD-fault?

I found this in my syslog.log file, and did NOT like it one bit:

Jan 24 08:12:28 kristin vmunix: beeper_restore: dma still occurring
Jan 24 08:51:57 kristin vmunix: SCSI: Device violation of Contingent Allegiance -- lbolt: 62780597, dev: 1f038100
Jan 24 08:51:57 kristin vmunix: lbp->state: 4060
Jan 24 08:51:57 kristin vmunix: lbp->offset: f0
Jan 24 08:51:57 kristin vmunix: lbp->uPhysScript: f4000000
Jan 24 08:51:57 kristin vmunix: From most recent interrupt:
Jan 24 08:51:57 kristin vmunix: ISTAT: 01, SIST0: 00, SIST1: 00, DSTAT: 84, DSPS: 00000010
Jan 24 08:51:57 kristin vmunix: lsp: 000000004a8d8500
Jan 24 08:51:57 kristin vmunix: bp->b_dev: 1f038100
Jan 24 08:51:57 kristin vmunix: scb->io_id: 38946e7
Jan 24 08:51:57 kristin vmunix: scb->cdb: 2a 00 01 13 65 40 00 00 04 00
Jan 24 08:51:57 kristin vmunix: lbolt_at_timeout: 0, lbolt_at_start: 0
Jan 24 08:51:57 kristin vmunix: lsp->state: 1
Jan 24 08:51:57 kristin vmunix: lbp->owner: 0000000000000000
Jan 24 08:52:04 kristin vmunix: SCSI: Resetting SCSI -- lbolt: 62780697, bus: 3
Jan 24 08:52:04 kristin vmunix: SCSI: Reset detected -- lbolt: 62780697, bus: 3
Jan 24 08:52:04 kristin EMS [2252]: ------ EMS Event Notification ------ Value: "MAJORWARNING (3)" for Resource: "/storage/events/disks/default/10_0_15_1.6.0" (Threshold: >= " 3") Execute the following command to obtain event details: /opt/resmon/bin/resdata -R 147587074 -r /storage/events/disks/default/10_0_15_1.6.0 -n 147587073 -a
Jan 24 16:02:09 kristin vmunix: beeper_restore: dma still occurring
Jan 24 16:08:27 kristin vmunix: beeper_restore: dma still occurring
Jan 24 16:10:22 kristin above message repeats 3 times

It's been over a week since it appeared in the log. I have not seen it since.
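
For readers who want to interpret the `scb->cdb` bytes in the log above, here is a minimal sketch in POSIX shell. It assumes the standard 10-byte CDB layout (opcode in byte 0, FUA as bit 3 of byte 1, logical block address in bytes 2-5, transfer length in bytes 7-8). `decode_cdb10` is a hypothetical helper name, not an HP-UX tool, and it only names the two opcodes listed in its table.

```shell
# Decode a 10-byte SCSI CDB given as ten hex bytes (hypothetical helper).
decode_cdb10() {
    if [ "$#" -ne 10 ]; then
        echo "expected 10 CDB bytes" >&2
        return 1
    fi
    opcode=$(( 0x$1 ))
    fua=$(( (0x$2 >> 3) & 1 ))                                   # bit 3 of byte 1
    lba=$(( (0x$3 << 24) | (0x$4 << 16) | (0x$5 << 8) | 0x$6 ))  # bytes 2-5, big-endian
    len=$(( (0x$8 << 8) | 0x$9 ))                                # bytes 7-8, big-endian
    case "$opcode" in
        40) name="READ(10)" ;;    # 0x28
        42) name="WRITE(10)" ;;   # 0x2A
        *)  name="unknown" ;;
    esac
    printf 'op=0x%02X %s fua=%d lba=%d len=%d\n' "$opcode" "$name" "$fua" "$lba" "$len"
}

# The CDB from the syslog excerpt:
decode_cdb10 2a 00 01 13 65 40 00 00 04 00
# prints: op=0x2A WRITE(10) fua=0 lba=18048320 len=4
```

So the command that was in flight when the violation was logged was a 4-block write to the disk.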
11 REPLIES
Pete Randall
Outstanding Contributor

Re: possible HDD-fault?

As syslog suggests, execute the "/opt/resmon/bin/resdata" command and see what you get for event details. As long as you're not seeing any recurrences of this and the details give no cause for alarm, I would ignore it for now but file it away for future reference.


Pete

Robert-Jan Goossens
Honored Contributor

Re: possible HDD-fault?

Could you post your HP-UX version?
Thomas Greig
Regular Advisor

Re: possible HDD-fault?

kristin 41: uname -a
HP-UX kristin B.11.11 U 9000/785 2008036996 unlimited-user license

It's an HP Visualize C3750 with HP-UX 11i.

I'm not familiar with the 'resdata' command. What will it do? The system is in production and I do not want to risk anything during operations.
Pete Randall
Outstanding Contributor

Re: possible HDD-fault?

The resdata command will just read and report the details for that particular event from the log files.


Pete

Thomas Greig
Regular Advisor

Re: possible HDD-fault?

The command "/opt/resmon/bin/resdata -R 147587074 -r /storage/events/disks/default/10_0_15_1.6.0 -n 147587073 -a" gave me the following reply:



CURRENT MONITOR DATA:

Event Time..........: Mon Jan 24 08:52:04 2005
Severity............: MAJORWARNING
Monitor.............: disk_em
Event #.............: 100091
System..............: kristin

Summary:
Disk at hardware path 10/0/15/1.6.0 : Software configuration error


Description of Error:

The device is in a condition where it requires action on the part of the
device driver or a human operator.

Probable Cause / Recommended Action:

The device has been reset by a Bus Device Reset message, a hard reset
condition, or a power-on reset.

If this is the case, no action is necessary.

Alternatively, a removable medium has been loaded or replaced.

If this is the case, no action is necessary.

Alternatively, the mode parameters, microcode, or inquiry data for the
device have been changed.

If this is the case, no action is necessary.

Alternatively, the installed version of the device driver does not match
that of the installed version of HP-UX. Install the correct version of the
driver.

Additional Event Data:
System IP Address...: x.x.x.x
Event Id............: 0x41f4a92400000000
Monitor Version.....: B.01.01
Event Class.........: I/O
Client Configuration File...........:
/var/stm/config/tools/monitor/default_disk_em.clcfg
Client Configuration File Version...: A.01.00
Qualification criteria met.
Number of events..: 1
Associated OS error log entry id(s):
0x41f4a92300000000
Additional System Data:
System Model Number.............: 9000/785
OS Version......................: B.11.11
STM Version.....................: A.45.00
EMS Version.....................: A.04.00
Latest information on this event:
http://docs.hp.com/hpux/content/hardware/ems/scsi.htm#100091

Details:

Component Data:
Physical Device Path...: 10/0/15/1.6.0
Device Class...........: Disk
Inquiry Vendor ID......: HP 36.4G
Inquiry Product ID.....: ST336607LC
Firmware Version.......: HPC3
Serial Number..........: 3JA16YGQ000073355BMW

Product/Device Identification Information:

Logger ID.........: sdisk
Product Identifier: SCSI Disk
Product Qualifier.: HP36.4GST336607LC
SCSI Target ID....: 0x06
SCSI LUN..........: 0x00

I/O Log Event Data:

Driver Status Code..................: 0x0000000B
Length of Logged Hardware Status....: 22 bytes.
Offset to Logged Manager Information: 24 bytes.
Length of Logged Manager Information: 34 bytes.

Hardware Status:

Raw H/W Status:
0x0000: 00 00 00 02 70 00 06 00 00 00 00 0A 00 00 00 00
0x0010: 29 02 02 00 00 00

SCSI Status...: CHECK CONDITION (0x02)
Indicates that a contingent allegiance condition has occurred. Any
error, exception, or abnormal condition that causes sense data to be
set will produce the CHECK CONDITION status.

SCSI Sense Data:

Undecoded Sense Data:
0x0000: 70 00 06 00 00 00 00 0A 00 00 00 00 29 02 02 00
0x0010: 00 00

SCSI Sense Data Fields:
Error Code : 0x70
Segment Number : 0x00
Bit Fields:
Filemark : 0
End-of-Medium : 0
Incorrect Length Indicator : 0
Sense Key : 0x06
Information Field Valid : FALSE
Information Field : 0x00000000
Additional Sense Length : 10
Command Specific : 0x00000000
Additional Sense Code : 0x29
Additional Sense Qualifier : 0x02
Field Replaceable Unit : 0x02
Sense Key Specific Data Valid : FALSE
Sense Key 0x06, UNIT ATTENTION, indicates that the target has been
reset by a BUS DEVICE RESET message, a hard reset condition, or by a
power-on reset. If not a reset, then one of the following may have
occurred.
1. A removable medium may have been changed.
2. The mode parameters in effect for this initiator have been
changed by another initiator.
3. The version or level of microcode has been changed.
4. Tagged commands queued for this initiator were cleared by
another initiator.
5. INQUIRY data has been changed.
6. The mode parameters in effect for this initiator have been
restored from non-volatile memory.
7. A change in the condition of a synchronized spindle.
8. Any other event that requires the attention of the initiator.

SCSI Command Data Block:

Command Data Block Contents:
0x0000: 2A 08 00 88 66 20 00 00 20 00

Command Data Block Fields (10-byte fmt):
Command Operation Code...(0x2A)..: WRITE
Logical Unit Number..............: 0
DPO Bit..........................: 0
FUA Bit..........................: 1
Relative Address Bit.............: 0
Logical Block Address............: 8939040 (0x00886620)
Transfer Length..................: 32 (0x0020)

Manager-Specific Data Fields:
Request ID.............: 0x038946E8
Data Residue...........: 0x00004000
CDB status.............: 0x00000002
Sense Status...........: 0x00000000
Bus ID.................: 0x03
Target ID..............: 0x06
LUN ID.................: 0x00
Sense Data Length......: 0x12
Q Tag..................: 0x7E
Retry Count............: 0
Sense Key Specific Data : 0x00 0x00 0x00
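
The key fields in the "Undecoded Sense Data" above can be pulled out by hand. The following is a rough sketch in POSIX shell, assuming fixed-format sense data (sense key in the low nibble of byte 2, additional sense code in byte 12, qualifier in byte 13). `decode_sense` is a hypothetical helper, and its lookup table is deliberately tiny: it holds only the UNIT ATTENTION key and two 0x29 codes as defined in the standard SCSI additional sense code assignments.

```shell
# Decode sense key, ASC and ASCQ from fixed-format sense bytes (hypothetical helper).
decode_sense() {
    if [ "$#" -lt 14 ]; then
        echo "need at least 14 sense bytes" >&2
        return 1
    fi
    key=$(( 0x$3 & 0x0F ))    # byte 2, low nibble
    asc=$(( 0x${13} ))        # byte 12
    ascq=$(( 0x${14} ))       # byte 13

    case "$key" in
        6) key_name="UNIT ATTENTION" ;;
        *) key_name="other sense key" ;;
    esac

    # ASC 0x29 is decimal 41.
    case "$asc/$ascq" in
        41/0) detail="power on, reset, or bus device reset occurred" ;;
        41/2) detail="SCSI bus reset occurred" ;;
        *)    detail="not in this small table" ;;
    esac

    printf 'key=0x%02X (%s) asc=0x%02X ascq=0x%02X: %s\n' \
        "$key" "$key_name" "$asc" "$ascq" "$detail"
}

decode_sense 70 00 06 00 00 00 00 0A 00 00 00 00 29 02 02 00 00 00
# prints: key=0x06 (UNIT ATTENTION) asc=0x29 ascq=0x02: SCSI bus reset occurred
```

That reading is consistent with the "SCSI: Reset detected" line on bus 3 in the syslog excerpt.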
Robert-Jan Goossens
Honored Contributor

Re: possible HDD-fault?

Inquiry Product ID.....: ST336607LC is an Ultra320 SCSI disk.

Could you check whether either of the patches below is installed?

PHKL_30511 PHKL_32089

# swlist -l product | grep PHKL_30511
# swlist -l product | grep PHKL_32089

Best regards,
Robert-Jan
Pete Randall
Outstanding Contributor
Solution

Re: possible HDD-fault?

Since you've had no recurrences and you would probably know if someone had removed media, I believe cause #1 is most likely:

" The device has been reset by a Bus Device Reset message, a hard reset
condition, or a power-on reset.

If this is the case, no action is necessary."

As the message says, no action is necessary, but I would make note of this for further reference. If you see any further messages concerning this same device, it's time to call in HP for a look at hardware problems.


Pete

Thomas Greig
Regular Advisor

Re: possible HDD-fault?

kristin 54: swlist -l product |grep PHKL_30511
kristin 55: swlist -l product | grep PHKL_32089
kristin 56:
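
Empty grep output is easy to misread, so a small wrapper that prints an explicit result per patch may help. This is only a sketch: `check_patches` is a hypothetical helper that reads a `swlist -l product` listing on standard input and does a plain substring match, exactly like the grep commands above.

```shell
# Report, for each patch name given as an argument, whether it appears
# anywhere in the listing read from standard input (hypothetical helper).
check_patches() {
    listing=$(cat)
    for patch in "$@"; do
        if printf '%s\n' "$listing" | grep -q "$patch"; then
            echo "$patch: listed"
        else
            echo "$patch: not listed"
        fi
    done
}

# On the HP-UX host it would be fed from the command used in this thread:
#   swlist -l product | check_patches PHKL_30511 PHKL_32089
```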

To my knowledge no media have been removed. We removed an external HDD cabinet in November, but the machine has since been in for service at HP.
Robert-Jan Goossens
Honored Contributor

Re: possible HDD-fault?

Hi,

As Pete said, it only happened once, so I would keep an eye on the syslog.

Check the patch instructions for the recommended patch I gave. Search for the word "violation".


http://www4.itrc.hp.com/service/patch/patchDetail.do?BC=patch.breadcrumb.main|patch.breadcrumb.patchDetail{PHKL_30509,hpux:800:11:00}|&patchid=PHKL_30511&context=hpux:800:11:00

Regards,
Robe
Thomas Greig
Regular Advisor

Re: possible HDD-fault?

I just checked the log again and the original message has not repeated itself, but the message

Feb 14 23:32:42 kristin above message repeats 8 times
Feb 14 23:33:09 kristin vmunix: beeper_restore: dma still occurring
Feb 14 23:43:33 kristin vmunix: beeper_restore: dma still occurring

is appearing about every 20 minutes. It started occurring in the log at the same time as the original message I posted earlier.
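
To put a number on how often the beeper_restore message appears, the "above message repeats N times" lines need to be counted too. A minimal sketch with awk follows. `count_beeper` is a hypothetical helper, and it assumes that a "repeats" line refers to the message logged immediately before it in the same file.

```shell
# Count beeper_restore messages in syslog text, adding the N from any
# "above message repeats N times" line that follows one (hypothetical helper).
# Reads the named files, or standard input if none are given.
count_beeper() {
    awk '
        /beeper_restore: dma still occurring/ { n++; last = 1; next }
        /above message repeats [0-9]+ times/ {
            if (last) {
                for (i = 1; i < NF; i++)
                    if ($i == "repeats") n += $(i + 1)
            }
            next
        }
        { last = 0 }
        END { print n + 0 }
    ' "$@"
}

# Example, assuming the usual HP-UX syslog location:
#   count_beeper /var/adm/syslog/syslog.log
```

A "repeats" line whose preceding message is not in the input is ignored rather than guessed at.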
Robert-Jan Goossens
Honored Contributor

Re: possible HDD-fault?

This could still be an error message from the SCSI channel / backplane.

If you have a support contract I would call them.

Regards,
Robert-Jan