Shared SCSI problem

 
Willem Grooters
Honored Contributor

Shared SCSI problem

I have a problem with (at least) one machine on the shared SCSI bus (DIDO).
It can access all drives, but the first attempt to access a disk for write takes a long time, and gives console output as in the attached file.
The same happens if this node is booted from a system disk (containing the pagefile!). It DOES start but at some point it hangs with sequences like this.

On all systems, this is device PKB. On the system that never has a problem, PKB is a KZPBA (ID=7); on the other system it's a KZPAC (ID=6).

What can be the problem and how to overcome it?
Willem Grooters
OpenVMS Developer & System Manager
9 REPLIES
Uwe Zessin
Honored Contributor

Re: Shared SCSI problem

A KZPAC is a PCI backplane RAID controller.

I guess you have a 3X-KZPCA-AA instead.

http://h71000.www7.hp.com/wizard/wiz_8893.html
says:
"The KZPCA is not supported for multihost operations with OpenVMS."


Which variant of the KZPBA (-CA, -CB, -CC) do you use?
Willem Grooters
Honored Contributor

Re: Shared SCSI problem

Sorry: not a KZPAC, a KZPSA. The card holds the following data on stickers:
*54-22944-B1* *RB1* *ZGB2572224*
A09-KZPSAPS

The KZPBA is a -CY, and it is working perfectly in one machine. I have two others, but all have ID=7, and I read in the manual that I need some software to change that. And I obtained the cards, not the software....


Willem Grooters
OpenVMS Developer & System Manager
Robert_Boyd
Respected Contributor

Re: Shared SCSI problem

The SCSI ID can be changed at the console level with environment variables for most of the supported PCI cards.

at the >>> prompt enter the command SHOW PK* and see what you get.

You can change the id by selecting the right one and setting the address to something else so that you can share the bus. If you have only 2 systems it will be easier than if you have 3 (you'll need tri-link connectors or one of the active devices depending on how many total connections there are).
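
To illustrate the constraint (every host adapter on a shared bus needs its own SCSI ID), here is a minimal sketch in Python. The node names and the dictionary layout are hypothetical and exist only for the example; the IDs 7 and 6 are the ones quoted earlier in this thread.

```python
from collections import defaultdict

def find_id_conflicts(bus_ids):
    """Return {scsi_id: [device names]} for every ID claimed more than once."""
    by_id = defaultdict(list)
    for name, scsi_id in bus_ids.items():
        by_id[scsi_id].append(name)
    return {i: sorted(names) for i, names in by_id.items() if len(names) > 1}

# Hypothetical node names; one adapter at ID 7 and one at ID 6 do not collide.
ok_bus = {"NODE_A": 7, "NODE_B": 6}
# Two adapters both left at ID 7 collide.
bad_bus = {"NODE_A": 7, "NODE_B": 7}

print(find_id_conflicts(ok_bus))
print(find_id_conflicts(bad_bus))
```

An empty result means no two adapters claim the same ID. It says nothing about termination or whether the adapter supports multihost operation.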

Robert
Master you were right about 1 thing -- the negotiations were SHORT!
Ian Miller.
Honored Contributor

Re: Shared SCSI problem

As well as checking the SCSI ID on each card, also check termination. Some KZPSAs have termination resistors on board (a bank of yellow resistors) and some do not (and therefore need an external terminator).

Check also the other pk* environment variables for that card. IIRC pk*fast and pk*term
____________________
Purely Personal Opinion
Rinkens
Advisor

Re: Shared SCSI problem

Hi

Could it be the following parameter at your chevron (>>>) prompt? I remember that we had a problem like this a few years ago.

We changed scsi_reset from ON to OFF:

scsi_poll ON
scsi_reset OFF (ON)

With scsi_reset set to ON, you get an extra SCSI bus reset when you reboot the system.

Regards Kor

Willem Grooters
Honored Contributor

Re: Shared SCSI problem

Robert,
that does not solve the problem.
What I did observe:
With the KZPSA in the system, when I try to write to a disk on the shared SCSI bus, the connection to the HSZ50 is lost and access is swapped to MSCP via the other (running) node. Next, it swaps back, accesses the disk, and copies the files. When a cluster node is booted, this continues on and on, so the node won't boot at all. (MSCP is started.)
In another node, I exchanged the KZPSA for a KZPBA, because this device is a requirement for shared SCSI (according to the documentation). I set PKB_ID to 5, but that is not sufficient: the system will boot (from local disk) but hangs on starting MSCP, with no error. From the documentation I understand I need a program, eeromcfg, and I found that. It makes sense: if the firmware says ID=7 (that's what PKB_ID was set to by SRM) but PKB_ID has been set to another value, there is a mismatch.
Even more: when this system is in console mode, it may crash, and it continuously does so, expecting XDelta to be loaded, when another system (where PKB0_ID is correctly set to 7) is booting.
It is not a matter of termination; that is correct (as far as I can see).
A brief description is in the attachment.
Willem Grooters
OpenVMS Developer & System Manager
John H. Reinhardt
Frequent Advisor

Re: Shared SCSI problem

According to your last diagram, each node has a different node allocation class value. According to VMS Cluster Systems, Section 6.2.4.1, Rule #5 is "Each node for which MSCP serves a device should have the same nonzero allocation class value." You didn't specifically mention MSCP serving the disks, but apparently you are. I think you either need to turn off MSCP serving for them (MSCP_SERVE_ALL = 0 to disable all serving, or = 2 to serve only local disks, i.e. non-HS* or DSSI disks) or set the node allocation class to the same number for all three nodes.
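
The quoted rule is simple enough to check mechanically. Below is a minimal sketch in Python, assuming a hypothetical mapping from node name to node allocation class. The node names and values are invented for illustration and are not taken from the diagram.

```python
def alloclass_rule_ok(alloclass_by_node):
    """True if all serving nodes share one nonzero allocation class value."""
    values = set(alloclass_by_node.values())
    return len(values) == 1 and 0 not in values

# Hypothetical example: three nodes, each with a different allocation class.
print(alloclass_rule_ok({"N1": 1, "N2": 2, "N3": 3}))
# Hypothetical example: three nodes sharing the same nonzero value.
print(alloclass_rule_ok({"N1": 1, "N2": 1, "N3": 1}))
```

This only covers the "same nonzero value" condition of the rule. It does not model port allocation classes or which disks are actually served.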

Also, I think that setting the SRM variable PKx0_HOST_ID overrides the firmware setting for the SCSI card's ID.

In addition, since you said all the SCSI cards are PKB you do not need to use the Port Allocation Class method. You can use the Node Allocation Class method. For this you would still want to have all three nodes with the same allocation class.

Personally, I would give all three nodes the same Node Allocation Class, ditch the Port Allocations, turn off MSCP serving for non-local disks and put the same SCSI card in all three using the PKB0_HOST_ID environment variable to set the scsi id for each card. This may not solve your problem totally, based on the description I think there may be other problems, but it may be part of the cause. There may be a configuration error in the HSZ50(s) also. Unfortunately it's been a while since I've configured any and I couldn't tell you from memory what to do.
John H. Reinhardt
Frequent Advisor

Re: Shared SCSI problem

Never mind what I said before. I just read your other thread about creating a single system disk and it finally dawned on me that each node has at least two SCSI cards (PKB - duh!), so you probably can't have the same Node Allocation Class number on each. From what I read, what you need to do is turn off the MSCP serving for the HSZ disks and see what that does. In the other thread you mentioned the second added (SYS1) hangs somewhere about the time MSCP starts, so that may be the problem.
Willem Grooters
Honored Contributor

Re: Shared SCSI problem

Think I found it: the KZPSA cannot handle shared SCSI, so all systems should have a KZPBA.
Problem deferred...
Willem Grooters
OpenVMS Developer & System Manager