errors on CIXCD device

 
Joewee
Regular Advisor

errors on CIXCD device

Hi All,

We received a few errors on the following device, and I don't have much idea how to handle them. Please send me your suggestions.


SYSTEM> sho dev PNA0:/full

Device PNA0:, device type CIXCD, is online, shareable, error logging is enabled.

Error count 4 Operations completed 0
Owner process "" Owner UIC [SYSTEM]
Owner process ID 00000000 Dev Prot S:RWPL,O:RWPL,G,W
Reference count 0 Default buffer size 0


errors::




******************************* ENTRY 624. *******************************
ERROR SEQUENCE 15937. LOGGED ON: CPU_TYPE 00000007
DATE/TIME 18-MAR-2009 08:11:31.77 SYS_TYPE 0000000C
SYSTEM UPTIME: 29 DAYS 02:35:52
SCS NODE: XXXX OpenVMS AXP V7.1-1H1

HW_MODEL: 00000621 Hardware Model = 1569.

ERL$LOGMSCP ENTRY AlphaServer 8400 5/440

MESSAGE TYPE 000B
DATAGRAM FOR NON-EXISTING "UCB"
CLASS DRIVER 4B534944
/DISK/
CDDB$Q_CNTRLID 012D0001 80407246
UNIQUE IDENTIFIER, 000180407246(X)
MASS STORAGE CONTROLLER
MODEL = 45.
CDDB$B_SYSTEMID 10011B24
4200
MSLG$L_CMD_REF 00000000
MSLG$W_SEQ_NUM 001A
SEQUENCE #26.
MSLG$B_FORMAT 00
CONTROLLER LOG
MSLG$B_FLAGS 02
NON-ERROR/INFORMATIONAL EVENT
MSLG$W_EVENT 022A
CONTROLLER ERROR
UNKNOWN SUBCODE #0011(X)
MSLG$Q_CNT_ID 012D0001 80407246
UNIQUE IDENTIFIER, 000180407246(X)
MASS STORAGE CONTROLLER
MODEL = 45.
MSLG$B_CNT_SVR 01
CONTROLLER SOFTWARE VERSION #1.
MSLG$B_CNT_HVR 0C
CONTROLLER HARDWARE REVISION #12.

CONTROLLER DEPENDENT INFORMATION

LONGWORD 1. 00640000
/..d./
LONGWORD 2. 24050705
/...$/
LONGWORD 3. 00000000
/..../
LONGWORD 4. D5D00000
/..ÐÕ/
LONGWORD 5. 00000852
/R.../
LONGWORD 6. 00000000
/..../
LONGWORD 7. 00008800
/..../
LONGWORD 8. 00000000
/..../
LONGWORD 9. 00000000
/..../
LONGWORD 10. 00000000
/..../
LONGWORD 11. 00000000
/..../
LONGWORD 12. 00000000
/..../
LONGWORD 13. 00000000
/..../
LONGWORD 14. 00000000
/..../
LONGWORD 15. 18020000
/..../
******************************* ENTRY 625. *******************************
ERROR SEQUENCE 15938. LOGGED ON: CPU_TYPE 00000007
DATE/TIME 18-MAR-2009 08:11:33.79 SYS_TYPE 0000000C
SYSTEM UPTIME: 29 DAYS 02:35:54
SCS NODE: XXXX OpenVMS AXP V7.1-1H1

HW_MODEL: 00000621 Hardware Model = 1569.

ERL$LOGMESSAGE AlphaServer 8400 5/440

CIXCD SUB-SYSTEM, _XXXX$PNA0:

DATA CABLE(S) STATE CHANGE PATH #1. WENT FROM GOOD TO BAD

LOCAL STATION ADDRESS, 00000000000C(X)
LOCAL SYSTEM ID, 00000000077E(X)

REMOTE STATION ADDRESS, 000000000003(X)
REMOTE SYSTEM ID, 420010030324(X)

UCB$L_ERTCNT 00000032
50. RETRIES REMAINING
UCB$L_ERTMAX 00000000
0. RETRIES ALLOWABLE
UCB$L_ERRCNT 00000002
2. ERRORS THIS UNIT

CIXCD DEVICE DEPENDENT REGISTERS
TOKEN 00000000 81021EC0
OPCODE 00
CHNL INDX 00
FLAGS 8001
STATUS 0829
CIOPC 05
CIFLAG 00
D_SNODE 00
D_PGRP 03
S_SNODE 00
S_PGRP 00
MSG LENGTH 0000
MSG TYPE 0000
RESERVED 00000000 00000000

BYTE <3:0> 00000000 /..../
BYTE <7:4> 00000000 /..../
BYTE <11:8> 00000000 /..../
BYTE <15:12> 00000000 /..../
BYTE <19:16> 00000000 /..../
******************************* ENTRY 626. *******************************
ERROR SEQUENCE 15939. LOGGED ON: CPU_TYPE 00000007
DATE/TIME 18-MAR-2009 08:11:33.80 SYS_TYPE 0000000C
SYSTEM UPTIME: 29 DAYS 02:35:54
SCS NODE: XXXX OpenVMS AXP V7.1-1H1

HW_MODEL: 00000621 Hardware Model = 1569.

ERL$LOGMESSAGE AlphaServer 8400 5/440

CIXCD SUB-SYSTEM, _XXXX$PNA0:

DATA CABLE(S) STATE CHANGE PATH #0. WENT FROM GOOD TO BAD

LOCAL STATION ADDRESS, 00000000000C(X)
LOCAL SYSTEM ID, 00000000077E(X)

REMOTE STATION ADDRESS, 000000000003(X)
REMOTE SYSTEM ID, 420010030324(X)

UCB$L_ERTCNT 00000032
50. RETRIES REMAINING
UCB$L_ERTMAX 00000000
0. RETRIES ALLOWABLE
UCB$L_ERRCNT 00000003
3. ERRORS THIS UNIT

CIXCD DEVICE DEPENDENT REGISTERS
TOKEN 00000000 81021EC0
OPCODE 00
CHNL INDX 00
FLAGS 4001
STATUS 0229
CIOPC 05
CIFLAG 00
D_SNODE 00
D_PGRP 03
S_SNODE 00
S_PGRP 00
MSG LENGTH 0000
MSG TYPE 0000
RESERVED 00000000 00000000

BYTE <3:0> 00000000 /..../
BYTE <7:4> 00000000 /..../
BYTE <11:8> 00000000 /..../
BYTE <15:12> 00000000 /..../
BYTE <19:16> 00000000 /..../
******************************* ENTRY 627. *******************************
ERROR SEQUENCE 15940. LOGGED ON: CPU_TYPE 00000007
DATE/TIME 18-MAR-2009 08:11:33.80 SYS_TYPE 0000000C
SYSTEM UPTIME: 29 DAYS 02:35:54
SCS NODE: XXXX OpenVMS AXP V7.1-1H1

HW_MODEL: 00000621 Hardware Model = 1569.

ERL$LOGMESSAGE AlphaServer 8400 5/440

CIXCD SUB-SYSTEM, _XXXX$PNA0:

SOFTWARE IS CLOSING VIRTUAL CIRCUIT

LOCAL STATION ADDRESS, 00000000000C(X)
LOCAL SYSTEM ID, 00000000077E(X)

REMOTE STATION ADDRESS, 000000000003(X)
REMOTE SYSTEM ID, 420010030324(X)

UCB$L_ERTCNT 00000032
50. RETRIES REMAINING
UCB$L_ERTMAX 00000000
0. RETRIES ALLOWABLE
UCB$L_ERRCNT 00000004
4. ERRORS THIS UNIT

CIXCD DEVICE DEPENDENT REGISTERS
TOKEN 00000000 81021EC0
OPCODE 00
CHNL INDX 00
FLAGS 4001
STATUS 0229
CIOPC 05
CIFLAG 00
D_SNODE 00
D_PGRP 03
S_SNODE 00
S_PGRP 00
MSG LENGTH 0000
MSG TYPE 0000
RESERVED 00000000 00000000

BYTE <3:0> 00000000 /..../
BYTE <7:4> 00000000 /..../
BYTE <11:8> 00000000 /..../
BYTE <15:12> 00000000 /..../
BYTE <19:16> 00000000 /..../


Kindly suggest how to proceed on this.
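For anyone reading the dump above: the hex fields are printed most significant byte first, while the ASCII rendering between the slashes reads the same longword least significant byte first. A minimal Python sketch of that decoding, checked only against values that appear in the log above (the function names are made up for illustration, and bytes outside printable ASCII are simply shown as dots here):

```python
def longword_to_ascii(hex_longword: str) -> str:
    """Render a hex longword as text, least significant byte first."""
    raw = bytes.fromhex(hex_longword)[::-1]  # the dump prints the high byte first
    return "".join(chr(b) if 0x20 <= b < 0x7F else "." for b in raw)


def hex_to_decimal(hex_value: str) -> int:
    """Convert a hex field from the log to the decimal value shown beside it."""
    return int(hex_value, 16)


print(longword_to_ascii("4B534944"))   # CLASS DRIVER field
print(longword_to_ascii("00640000"))   # LONGWORD 1.
print(hex_to_decimal("00000621"))      # HW_MODEL
print(hex_to_decimal("00000032"))      # UCB$L_ERTCNT
```

This only reproduces the translations the error log formatter already printed (for example /DISK/ and "Hardware Model = 1569."); it does not interpret the controller-dependent longwords.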
Hoff
Honored Contributor

Re: errors on CIXCD device

The CI and OpenVMS Alpha V7.1-1H1 and an AlphaServer 8400 5/440? That's some seriously old kit and all of it is in need of replacement. But ok, you've got what looks to be a potential hardware issue here.

Call HP or your locally-preferred hardware service organization. You've probably lost a CI cable (those do get damaged fairly easily, either through degradation or bending or getting stepped on or kinked or disconnected), or you could have a problem with the CIXCD or with the SC008 Star Coupler.

Those two path errors, and the way they are paired, tend to indicate that somebody might be working within the CI hardware, too.

And the usual recommendations apply here, too: patch to current, and move from V7.1-1H1 to V7.1-2 with current patches, or move to a supported release. There are a number of patches for OpenVMS V7.1-* releases.
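To make the pairing of the two path errors concrete, here is a rough Python sketch that pulls the path state changes and their timestamps out of saved error log text. The sample lines are copied from entries 625 and 626 in the original post; the regular expressions and the function name are assumptions about this particular text layout, not any official log format.

```python
import re
from datetime import datetime

# Excerpt copied from entries 625 and 626 of the posted error log.
SAMPLE = """\
******************************* ENTRY 625. *******************************
DATE/TIME 18-MAR-2009 08:11:33.79 SYS_TYPE 0000000C
DATA CABLE(S) STATE CHANGE PATH #1. WENT FROM GOOD TO BAD
******************************* ENTRY 626. *******************************
DATE/TIME 18-MAR-2009 08:11:33.80 SYS_TYPE 0000000C
DATA CABLE(S) STATE CHANGE PATH #0. WENT FROM GOOD TO BAD
"""

MONTHS = {name: number for number, name in enumerate(
    "JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC".split(), start=1)}

STAMP = re.compile(
    r"DATE/TIME\s+(\d{1,2})-([A-Z]{3})-(\d{4})\s+(\d{2}):(\d{2}):(\d{2})\.(\d{2})")
PATH = re.compile(r"STATE CHANGE PATH #(\d)\. WENT FROM (\w+) TO (\w+)")


def path_events(log_text):
    """Return (timestamp, path, old_state, new_state) for each path change."""
    events = []
    stamp = None
    for line in log_text.splitlines():
        match = STAMP.search(line)
        if match:
            day, mon, year, hh, mm, ss, hundredths = match.groups()
            stamp = datetime(int(year), MONTHS[mon], int(day),
                             int(hh), int(mm), int(ss), int(hundredths) * 10_000)
            continue
        match = PATH.search(line)
        if match and stamp is not None:
            events.append((stamp, int(match.group(1)),
                           match.group(2), match.group(3)))
    return events


events = path_events(SAMPLE)
gap = (events[1][0] - events[0][0]).total_seconds()
for stamp, path, old, new in events:
    print(stamp.isoformat(), f"path {path}: {old} -> {new}")
print(f"gap between the two path changes: {gap:.2f} s")
```

On the posted entries, both paths to the same remote station go from GOOD to BAD within a hundredth of a second of each other, which is the pairing referred to above.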
Joewee
Regular Advisor

Re: errors on CIXCD device

Many thanks Hoff...


I will call HP now and will update you on the status.

Also, how do I check whether connectivity has been restored in this case?

Thanks again,
Joe
Jim_McKinney
Honored Contributor

Re: errors on CIXCD device

> REMOTE STATION ADDRESS, 000000000003(X)
> REMOTE SYSTEM ID, 420010030324(X)

Is that an HSJ sitting out on the CI? Is it behaving?
Joewee
Regular Advisor

Re: errors on CIXCD device

Jim,

Yes we have 2 HSJ50 controllers.

One of them was hung today, and I crashed it to get it to respond again. Please find the snapshot below.


MRLB_SYSTEM> set h/dup/task=cli/server=mscp$dup HSJ029
%HSCPAD-F-NOLOCEXE, Local program not executing
-SYSTEM-F-PGMLDFAIL, program load failure
%HSCPAD-S-END, Control returned to node MRLB
MRLB_SYSTEM> SET HOST/DUP/SERVER=MSCP$DUP/TASK=cli hsj029
%HSCPAD-F-NOLOCEXE, Local program not executing
-SYSTEM-F-PGMLDFAIL, program load failure
%HSCPAD-S-END, Control returned to node MRLB
MRLB_SYSTEM> SET HOST/DUP/SERVER=MSCP$DUP/TASK=crash hsj029
%HSCPAD-I-LOCPROGEXE, Local program executing - type ^\ to exit
Controller will now crash...

%HSCPAD-F-NOLOCEXE, Local program not executing
-SYSTEM-F-VCBROKEN, virtual circuit broken
%HSCPAD-S-END, Control returned to node MRLB
MRLB_SYSTEM> SET HOST/DUP/SERVER=MSCP$DUP/TASK=cli hsj029
%HSCPAD-I-LOCPROGEXE, Local program executing - type ^\ to exit

Copyright Compaq Computer Corporation 1993, 1998. All rights reserved.
HSJ50-AX Firmware version V57J-1, Hardware version B04

Last fail code: 88000000

Press " ?" at any time for help.

The CLI will take up to 60 seconds to initialize.


MRLB_HSJ029> sho fail


Joe
Jim_McKinney
Honored Contributor

Re: errors on CIXCD device

> One of them was hung today

The CI errors observed on the 8400 may have been a reaction to the HSJ issue. Those errors were logged at 8:11 - is that when the HSJ misbehaved?
Joewee
Regular Advisor

Re: errors on CIXCD device

Hi Jim,

No, the controllers have been fine since 7:00 in the morning, because I crashed the hung one before that.

Joe
Hoff
Honored Contributor

Re: errors on CIXCD device

Might want to have HP look at the firmware on the HSJ50 while they're in looking at the rest of the CI and the error logs, too. There seems to be some slightly newer firmware (V57J-6, or later?) around.

(I would not hold out hope that the firmware is the issue here, however.)

If the CI cabling or a CI host or a CI storage controller goes wonky, all bets are off. And anything within or along CI path zero anywhere in the star could be involved here.

Joewee
Regular Advisor

Re: errors on CIXCD device

Hardware call has been placed with HP :)

This is just out of curiosity.

The controllers are fine now, all the disks are available to the system, and there is no problem with the system.

Will the problem still be there, or could it have cleared by itself? For example, a cable might have been disturbed and then fixed by accident.

How do I make sure that everything is fine? What am I supposed to check?
Hoff
Honored Contributor

Re: errors on CIXCD device

How can you be sure? Well... You're not going to like this answer ... But it involves commencing the planning for and then executing on the replacement strategy of this teenage-vintage gear.

In terms of the hardware, an rx2600 box will run rings around this AlphaServer 8400 box. In 3U of rack. With up to 24 GB of memory. Add 3U or 6U of rack if you need more than the near-terabyte of disk storage that can be stuffed into the base 3U box.

And yes, getting the software ported from OpenVMS Alpha V7.1-1H1 to OpenVMS I64 V8.3-1H1 will involve some work.

Moving to a newer AlphaServer would involve less work in porting over the software. And a 3U-class AlphaServer box would still easily outrun this AlphaServer 8400 box.

Otherwise, ask HP services when they arrive; they'll hopefully have a better idea of why your HSJ is tipping over and why your CI is throwing faults. They'll have a far better picture of the configuration than we do.
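
As a small complement to the service call, one thing that can be watched in the meantime is whether the error count on PNA0: keeps climbing. The Python sketch below compares the "Error count" field from two saved captures of the SHOW DEVICE/FULL output; the first capture line is copied from the original post, the second is a hypothetical later capture, and the function name is made up for illustration. A stable count does not prove the CI is healthy; it only shows that no new errors were logged against the device between the two captures.

```python
import re


def error_count(show_device_output: str) -> int:
    """Extract the 'Error count' value from saved SHOW DEVICE/FULL text."""
    match = re.search(r"Error count\s+(\d+)", show_device_output)
    if match is None:
        raise ValueError("no 'Error count' field found")
    return int(match.group(1))


# Line copied from the SHOW DEVICE output in the original post.
BEFORE = "Error count 4 Operations completed 0"
# Hypothetical later capture, used here only for illustration.
LATER = "Error count 4 Operations completed 0"

new_errors = error_count(LATER) - error_count(BEFORE)
if new_errors > 0:
    print(f"{new_errors} new error(s) logged since the first capture")
else:
    print("no new errors logged since the first capture")
```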