allant-au
Occasional Advisor

ds20 servers connect to hsz80

We have 2 ds20 servers, one with 2 pk interfaces and one with 3.

The hsz80 is set up in a standard failover configuration, with one controller being scsi 6 and the other 7.

 

Node 1 is a ds20 connected to the controller with scsi 6. It has 3 pk interfaces, which are set:

pka0_host_id = 6

pkb0_host_id = 6

pkc0_host_id = 6 <--- connected to hsz80

 

Node 2 is a ds20 connected to the controller with scsi 7. It has 2 pk interfaces, which are set:

pka0_host_id = 7

pkb0_host_id = 7 <--- connected to hsz80

 

The first host (6) boots ok.

 

The first host hangs as soon as I power on the second host and issue a show dev at the >>> prompt. 

If I try booting the second host, there is obvious scsi contention, since the boot disk keeps going offline / online.

 

We have tried replacing all the cables and the disk controllers.

 

It is not the hosts, since I can boot either node (individually) if I use the cable connected to the controller with scsi 6.

 

Neither node will boot with the cable connected to the controller with scsi 7.

 

We have changed all the cables and the controllers.

 

Any help would be appreciated.

 

We currently have another cluster in production which is basically identical, except I do not know what the pk*_host_id values are set to, and it is difficult to get downtime to check them.

 

Any help would be greatly appreciated, as I think I have tried everything.

11 REPLIES
Bob Blunt
Respected Contributor

Re: ds20 servers connect to hsz80

You should make a map, really.  You've got a variety of ways the hosts and controllers can be connected, and the HSZs can be set up in a variety of ways so the drives can be split across the controller SCSI interfaces.  This gets complex quickly, hence the need for the map.

 

bob

allant-au
Occasional Advisor

Re: ds20 servers connect to hsz80

I will take a photo when I return to work after the Easter break.
Basically, at the host end is a terminated Y-cable.
At the hsz end, each of the 2 controllers has a bi-connector.
One port of each bi-connector goes to a host and the other is used to loop to the other controller for failover.
allant-au
Occasional Advisor

Re: ds20 servers connect to hsz80

I have attached a diagram. I am grateful for any help. I think I have tried everything, including re-installing the OS and adding the second node via cluster_config.com. I have also tried most combinations of alloclass on the hosts (currently 0).

allant-au
Occasional Advisor

Re: ds20 servers connect to hsz80

Further information. I have added the configuration for the HSZ (show this, show other).

Bob Blunt
Respected Contributor

Re: ds20 servers connect to hsz80

In all honesty I don't see how this is working at all.  You HAVE to have unique SCSI addresses for the different things on the bus.  For instance, based on the configuration information you're providing it LOOKS like you have HSZ controllers at SCSI ID 7 and host SCSI controllers on the same bus that are also using SCSI ID 7.  Same for the other system: it looks like you've got HSZs at SCSI ID 6 and host SCSI controllers at SCSI ID 6.  These host controllers should be differential SCSI, so you've got addresses available from 0 through 15.  I'd change the SCSI controllers on one host to ID 8 and the other host to ID 9.

 

Ah, I see... the HSZs are HSZ70s and NOT HSZ80s.  I'd still change the IDs so there aren't any conflicts at all.  Change one host's SCSI controllers, all of them, to ID 8 and the other host to 9.

 

Since the HSZ70s only have one SCSI port you could either configure them connected to one common SCSI bus or as you've got them now; one bus from one system and a different bus from the other.  I'd recommend using one common bus and making sure that you use the same "controller letter" from both systems.  For example:  daisy chain from system 1 SCSI bus B to system 2 SCSI bus B to controller X's SCSI port to controller Y's SCSI port.  Depending on what operating system you're running this might be the best setup for a real cluster.  OpenVMS, for example, would prefer for all the drives to use the same names for all clustered systems.  I'm not so sure that other operating systems would need that same consideration.  The biggest part of your configuration issue at this time is the conflicting SCSI addresses though.  They've GOT to be unique.
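
As a rough illustration of the uniqueness rule above, here is a minimal Python sketch that checks a hand-written bus map for duplicate SCSI IDs. It is not taken from any HSZ or OpenVMS tooling. The helper name `find_id_conflicts`, the device labels, and the assumption that both hosts and both controllers share one common bus are all hypothetical and only mirror the IDs quoted in this thread.

```python
from collections import defaultdict

def find_id_conflicts(bus):
    """Return {scsi_id: [device labels]} for every ID claimed by more than one device.

    `bus` maps a free-form device label to the SCSI ID it uses on one shared bus.
    """
    by_id = defaultdict(list)
    for device, scsi_id in bus.items():
        by_id[scsi_id].append(device)
    return {scsi_id: names for scsi_id, names in by_id.items() if len(names) > 1}

# Assumed layout: the IDs from the original post, placed on one common bus.
original_bus = {
    "node 1 host adapter": 6,
    "node 2 host adapter": 7,
    "HSZ controller (ID 6)": 6,
    "HSZ controller (ID 7)": 7,
}

# Assumed layout: the same bus after moving the hosts to IDs 8 and 9.
proposed_bus = {
    "node 1 host adapter": 8,
    "node 2 host adapter": 9,
    "HSZ controller (ID 6)": 6,
    "HSZ controller (ID 7)": 7,
}

print(find_id_conflicts(original_bus))  # two conflicting IDs: 6 and 7
print(find_id_conflicts(proposed_bus))  # empty dict: no conflicts
```

Writing the map down this way, one dictionary per physical bus, is just a convenient form of the "map" suggested earlier; it says nothing about termination or adapter type, which have to be checked separately.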

allant-au
Occasional Advisor

Re: ds20 servers connect to hsz80

Firstly, sorry about the hsz80/70 confusion; it is difficult enough trying to explain the setup without getting the facts wrong.

 

I am returning to work tomorrow and will consult with the suppliers of our hardware.

 

It is reassuring to know that there are still people around who can help. I am the only OpenVMS person left and I do not normally deal with the hardware very much.

 

Thank you for your prompt reply.

allant-au
Occasional Advisor

Re: ds20 servers connect to hsz80

I set alpha2's interfaces to:

 

pka0_host_id = 8

pkb0_host_id = 8

pkc0_host_id = 8 <--- connected to hsz

 

I set alpha7's interfaces to:

 

pka0_host_id = 9

pkb0_host_id = 9 <-- connected to hsz

 

But I am still getting the boot disk going offline / online upon boot of alpha7, like there is still scsi contention even though I know there isn't.

 

When I can get the cables, I will re-connect via your suggestion:

 

pkb0 on alpha2 --> pkb0 on alpha7 --> controller 1 on hsz --> controller 2 on hsz --> terminator.

 

Another side effect which may be of significance is that if I connect the disk array on either system to other than the highest interface in that system, I cannot see the disks, e.g. pkb or pka on alpha2, or pka on alpha7.

Bob Blunt
Respected Contributor

Re: ds20 servers connect to hsz80

I'm not sure what O/S you're running...but...  The bus resets ARE an unfortunate part of the booting process, particularly for OpenVMS.  One system will boot fairly normally and you won't see the resets in device errors or errorlog.  When you boot any other systems you will definitely "see" the resets and offline states from the first booted system during the boot of the second.  This is unavoidable (well, it was on OpenVMS anyway).

 

I didn't focus on the SCSI adapter types, or on whether you provided a SHOW CONFIG output from the console.  It could possibly be different types of adapter (non-differential or HVD vs LVD or even one of those SE/LVD adapters).  Having the same connectors isn't always a guarantee, anymore, that the adapters are identical.  There are SCSI interfaces on the DS20s that reside on the motherboard that don't really function with peripherals on OpenVMS.  If, however, all of your SCSI interfaces are plugged into the PCI bus (which doesn't seem right, I'm pretty sure the DS20s have two on the MB of which one works, but I'd have to double-check that to be sure) one of those might not be differential.  In fact, I'm sure that NONE of the on-board SCSI adapters are differential so I'm not sure how you're talking to the HSZ from the system with only two PK adapters.

 

Just checked:  The HSZ70 DOES require differential (or as the manual states: Wide Ultra Differential SCSI–2).  I'm sure that would require the installation of a like controller into the PCI bus on both DS20s...which does beg the question:  Just where IS the controller connected to on DS20#2?

 

bob

allant-au
Occasional Advisor

Re: ds20 servers connect to hsz80

The 2 DS20 nodes are now identical. There are 3 wide ultra scsi 2 interfaces in each.

 

I tried daisy chaining as suggested:

 

hsz c1 --> alpha2 pkc0 --> alpha7 pkc0  --> terminator 

 

 

hsz c1 --> loop to hsz c2 --->terminator (failover setup).

 

Unfortunately, I could not see the disks on either system in this configuration.

 

Show config from one system (the other should be identical):

 

P00>>>show config
                        AlphaServer DS20 500 MHz

SRM Console:    V7.2-1
PALcode:        OpenVMS PALcode V1.98-79, Tru64 UNIX PALcode V1.92-74

Processors
CPU 0           Alpha EV6 pass 2.3 500 MHz      SROM Revision: V1.82
                Bcache size: 4 MB

CPU 1           Alpha EV6 pass 2.3 500 MHz      SROM Revision: V1.82
                Bcache size: 4 MB

Core Logic
Cchip           DECchip 21272-CA Rev 2.1
Dchip           DECchip 21272-DA Rev 2.0
Pchip 0         DECchip 21272-EA Rev 2.2
Pchip 1         DECchip 21272-EA Rev 2.2

TIG             Rev 4.14
Arbiter         Rev 2.10 (0x1)

MEMORY

Array #       Size     Base Addr
-------    ----------  ---------
   0         1024 MB    000000000
   1         1024 MB    040000000
   2         1024 MB    080000000
   3         1024 MB    0C0000000

Total Bad Pages = 0
Total Good Memory = 4096 MBytes


PCI Hose 00
     Bus 00  Slot 05/0: Cypress 82C693
                                                         Bridge to Bus 1, ISA
     Bus 00  Slot 05/1: Cypress 82C693 IDE
                                   dqa.0.0.105.0
     Bus 00  Slot 05/2: Cypress 82C693 IDE
                                   dqb.0.1.205.0
     Bus 00  Slot 05/3: Cypress 82C693 USB
                                   usba0.0.0.305.0
     Bus 00  Slot 07: DEGXA-TA Gigabit Ethernet
                                   ega0.0.0.7.0          00-D0-59-61-69-8B
     Bus 00  Slot 08: DEGXA-TA Gigabit Ethernet
                                   egb0.0.0.8.0          00-08-02-91-03-FA
     Bus 00  Slot 09: QLogic ISP10x0
                                   pkc0.8.0.9.0          SCSI Bus ID 8
                                   dkc0.0.0.9.0           HSZ70
                                   dkc1.0.0.9.0           HSZ70
                                   dkc2.0.0.9.0           HSZ70
                                   dkc3.0.0.9.0           HSZ70
                                   dkc4.0.0.9.0           HSZ70
                                   dkc5.0.0.9.0           HSZ70
                                   dkc6.0.0.9.0           HSZ70

PCI Hose 01
     Bus 00  Slot 07: QLogic ISP10x0
                                   pka0.8.0.7.1          SCSI Bus ID 8
                                   dka500.5.0.7.1         RRD46
     Bus 00  Slot 08: QLogic ISP10x0
                                   pkb0.8.0.8.1          SCSI Bus ID 8


ISA
Slot    Device  Name            Type         Enabled  BaseAddr  IRQ     DMA
0
        0       MOUSE           Embedded        Yes     60      12

        1       KBD             Embedded        Yes     60      1

        2       COM1            Embedded        Yes     3f8     4

        3       COM2            Embedded        Yes     2f8     3

        4       LPT1            Embedded        Yes     3bc     7

        5       FLOPPY          Embedded        Yes     3f0     6       2


P00>>>

 

 

 

Note:

 

To have at least one system up, I have broken the daisy chain and terminated at the first node. This at least shows me the disks.
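
For comparing the two nodes without reading the listings by eye, the adapter lines in a captured `show config` can be pulled out with a short script. This is only a sketch: the function name `adapter_ids` and the regular expression are assumptions based on the line format shown in the output above (for example `pkc0.8.0.9.0  SCSI Bus ID 8`), and other console versions may format these lines differently.

```python
import re

# Matches adapter lines such as "pkc0.8.0.9.0          SCSI Bus ID 8".
ADAPTER_LINE = re.compile(r"^\s*(pk[a-z]\d+)\.\S+\s+SCSI Bus ID (\d+)\s*$")

def adapter_ids(show_config_text):
    """Return {adapter name: SCSI bus ID} for every pk* line in the captured text."""
    ids = {}
    for line in show_config_text.splitlines():
        match = ADAPTER_LINE.match(line)
        if match:
            ids[match.group(1)] = int(match.group(2))
    return ids

# Excerpt copied from the show config output above.
sample = """\
     Bus 00  Slot 09: QLogic ISP10x0
                                   pkc0.8.0.9.0          SCSI Bus ID 8
                                   dkc0.0.0.9.0           HSZ70
     Bus 00  Slot 07: QLogic ISP10x0
                                   pka0.8.0.7.1          SCSI Bus ID 8
                                   dka500.5.0.7.1         RRD46
     Bus 00  Slot 08: QLogic ISP10x0
                                   pkb0.8.0.8.1          SCSI Bus ID 8
"""

print(adapter_ids(sample))
```

Running the same function over the capture from the second node and comparing the two dictionaries would show at a glance whether both hosts ended up with the same ID on the shared bus.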