
Re: HSG80 Error

 
Andrew_168
Regular Advisor

HSG80 Error

Hi, I had a problem the other day. I think a 1GB Brocade switch caused it, because when the switch was rebooted the HSG80s did this:
show last all
Last Failure Entry: 7. Flags: 006FF901
Template: 1.(01) Description: Last Failure Event
Occurred on 16-DEC-2003 at 12:55:59
Power On Time: 3. Years, 141. Days, 5. Hours, 41. Minutes, 30. Seconds
Controller Model: HSG80
Serial Number: ZG02601351 Hardware Version: E12(2A)
Software Version: V86F-13(BA)
Instance Code: 0102030A Description:
An unrecoverable software inconsistency was detected or an intentional
restart or shutdown of controller operation was requested.
Reporting Component: 1.(01) Description:
Executive Services
Reporting component's event number: 2.(02)
Event Threshold: 10.(0A) Classification:
SOFT. An unexpected condition detected by a controller software component
(e.g., protocol violations, host buffer access errors, internal
inconsistencies, uninterpreted device errors, etc.) or an intentional
restart or shutdown of controller operation is indicated.
Last Failure Code: 02DD0101
Last Failure Parameter[0.] 00000009
Last Failure Code: 02DD0101 Description:
The VA state change deadman timer expired, and at least one VSI was still
interlocked.
> Last Failure Parameter[0] contains the nv_index.
Reporting Component: 2.(02) Description:
Value Added Services
Reporting component's event number: 221.(DD)
Restart Type: 0.(00) Description: Full software restart

Both of them!

Anyone got any ideas what could have caused this?

Andy
2 REPLIES
Mike Naime
Honored Contributor

Re: HSG80 Error

Rebooting your SAN switch(es) should not cause your HSG(s) to reboot if you have multiple fabrics. Did you have only one SAN switch that this controller was connected to? A number of things could cause an individual controller to restart: a failing disk drive that was hanging a bus, a bad memory DIMM, running CONFIG with a disk not properly seated, the controller itself failing, etc. But having both controllers reboot at the same time is unusual behavior.

Your last error indicates: "The VA state change deadman timer expired, and at least one VSI was still interlocked."
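
For what it's worth, the code lines up with the fields in your dump: the top byte of 02DD0101 is the reporting component (02, Value Added Services) and the next byte is that component's event number (DD = 221). Here is a minimal Python sketch of that split. It assumes only what the posted log shows; the function name is made up, and the low 16 bits are left undecoded because the log doesn't spell them out.

```python
def split_last_failure_code(code_hex):
    """Split an 8-hex-digit last failure code into the fields the log names.

    Only the top two bytes are interpreted, because the posted dump
    confirms them (Reporting Component 02, event number DD).
    """
    value = int(code_hex, 16)
    return {
        "component": (value >> 24) & 0xFF,     # 0x02 in the posted log
        "event_number": (value >> 16) & 0xFF,  # 0xDD = 221 in the posted log
        "low_word": value & 0xFFFF,            # left raw; meaning not assumed here
    }


fields = split_last_failure_code("02DD0101")
print(fields)
```

The same split applied to the Instance Code 0102030A gives component 01 and event number 02, which matches the Executive Services lines in the dump.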

Is this same error also seen on the other controller? Which OS was using the HSG? Is this really a SecurePath issue, i.e. the OS did not fail over paths like it should have?


Do you monitor (log) both CLI ports on your HSGs?

My TAM usually has me pull logs of the following from each controller when troubleshooting a problem.

RUN FMU
show time
show last most
show last all
show mem all
exit
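
If you capture that output to a text file from each controller, something like this rough Python sketch can pull out the key fields so the two controllers can be compared side by side. It assumes the text layout shown in the `show last all` dump earlier in this thread; the function names and the field list are just my picks, not anything from the HSG tools.

```python
import re

# Field patterns based on the `show last all` output posted in this thread.
PATTERNS = {
    "occurred": re.compile(r"Occurred on (\S+) at (\S+)"),
    "serial": re.compile(r"Serial Number:\s*(\S+)"),
    "software": re.compile(r"Software Version:\s*(\S+)"),
    "instance_code": re.compile(r"Instance Code:\s*([0-9A-Fa-f]{8})"),
    "last_failure_code": re.compile(r"Last Failure Code:\s*([0-9A-Fa-f]{8})"),
}


def summarize(log_text):
    """Return the first match for each known field in one captured log."""
    summary = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(log_text)
        if match:
            summary[name] = " ".join(match.groups())
    return summary


def differing_fields(a, b):
    """List the field names whose values differ between two summaries."""
    return sorted(k for k in set(a) | set(b) if a.get(k) != b.get(k))


# Lines copied from the dump posted above, used here as sample input.
SAMPLE = """\
Occurred on 16-DEC-2003 at 12:55:59
Serial Number: ZG02601351 Hardware Version: E12(2A)
Software Version: V86F-13(BA)
Instance Code: 0102030A Description:
Last Failure Code: 02DD0101
"""

this_ctrl = summarize(SAMPLE)
print(this_ctrl)
```

Run `summarize` on the capture from the other controller as well and pass both results to `differing_fields`; if both really hit the same failure, only the serial number (and possibly the timestamp) should show up as different.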

Gee... More questions than answers. :-)
VMS SAN mechanic
Andrew_168
Regular Advisor

Re: HSG80 Error

Hi,

Thanks for that. The SAN is dual fabric with two 16-port switches in each. Two servers did lose contact with the HSG (12:06pm); all the others were OK until the controllers rebooted (12:54pm). The OS is Windows 2000 SP4. The other controller had exactly the same error. This has been in and running for 3 years, and no configuration was taking place at the time. All servers connected to the SAN are running Secure Path, and on visiting the site yesterday all paths were operational; whether they were on the day could be questioned. No, the ports are not monitored, but a SAN management appliance is soon to be purchased. There are no disk errors reported and VTDPY shows all disks are OK.
I have the full FMU logs and this is the last error; there were none since I last upgraded the firmware.
Thanks

Andy