
Hardware Configuration

admin1979
Super Advisor

Hardware Configuration

Hello,


We have a Tru64 hardware cluster with a 4-node (2+2 split) architecture.
Could anyone point me to the hardware installation manual for this? We have a task to power the whole cluster off and on again, so I am looking for a hardware manual to use as a reference for shutting it down properly.
I know the general approach is to shut down the hosts/nodes and then the disk enclosure, but I still need to figure out how the various parts are interconnected, mainly the power supply cables.
Can you help?

Thanx,
admin
12 REPLIES
Rob Leadbeater
Honored Contributor

Re: Hardware Configuration

Hi admin,

You've given us no details about the hardware configuration, so nobody will be able to tell you how to power it off...

As a start try searching for your servers here:

http://www.hp.com/go/alphaserver

Cheers,

Rob
admin1979
Super Advisor

Re: Hardware Configuration

Hi,
My apologies.

The storage used for this cluster is an RA7000 Storage System with dual HSZ70 UltraSCSI controllers. It uses Memory Channel as the cluster interconnect, with 2 Memory Channel hubs connecting all nodes.
The nodes are running Tru64 V5.1A.

Please let me know if any more details are required.

Thanks,
Admin

Rob Leadbeater
Honored Contributor

Re: Hardware Configuration

Server details?
admin1979
Super Advisor

Re: Hardware Configuration

Hi,

The 4-node cluster (2+2 split) consists of 4 Compaq DEC6600 DS20E servers.

If you need anything more, please let me know.


Thanx,
admin
Pieter 't Hart
Honored Contributor

Re: Hardware Configuration

I don't think there will be a manual that describes your configuration.

These are the necessary steps:
- you can do a cluster shutdown "shutdown -c ..."
to shut down all nodes at the same time.
- power down the hosts
- power down the memory hubs
(- check storage replication, if any?)
- flush the storage controllers' write-back cache
- shut down the storage controllers
- power down the storage controllers
- "shut down" the controller cache batteries
(else they maintain power on the cache that's now empty).
- power down the disks
- power down the network (LAN and SAN switches)
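
As a rough aid, the order above could be kept as a small checklist script so that no step is skipped or done early during the maintenance window. This is only a sketch in Python: the step texts are paraphrased from the list above, the helper name next_step is made up for illustration, and the script does not touch any hardware.

```python
# Ordered power-down checklist, paraphrased from the steps listed above.
# Nothing here talks to the cluster; it only tracks the order on paper.
POWER_DOWN_STEPS = [
    "Cluster shutdown (shutdown -c ...) so all nodes go down together",
    "Power down the hosts",
    "Power down the memory hubs",
    "Check storage replication, if any",
    "Flush the storage controllers' write-back cache",
    "Shut down the storage controllers",
    "Power down the storage controllers",
    "Shut down the controller cache batteries",
    "Power down the disks",
    "Power down the network (LAN and SAN switches)",
]


def next_step(completed):
    """Return the next step to perform, or None when every step is done.

    `completed` is the list of steps already carried out, in order.
    Raises ValueError if the steps were not done in the listed order.
    (next_step is a hypothetical helper, not part of any vendor tool.)
    """
    if len(completed) > len(POWER_DOWN_STEPS):
        raise ValueError("more steps completed than exist in the checklist")
    for done, expected in zip(completed, POWER_DOWN_STEPS):
        if done != expected:
            raise ValueError(f"out of order: {done!r}, expected {expected!r}")
    if len(completed) == len(POWER_DOWN_STEPS):
        return None
    return POWER_DOWN_STEPS[len(completed)]


if __name__ == "__main__":
    for number, step in enumerate(POWER_DOWN_STEPS, start=1):
        print(f"{number:2d}. [ ] {step}")
```

Running the script prints a numbered checklist that can be ticked off by hand.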
admin1979
Super Advisor

Re: Hardware Configuration

Thanks for the reply. Please find my questions/comments starting with >>>>>>


- you can do a cluster shutdown "shutdown -c ..." to shut down all nodes at the same time.

>>>>> OK, is shutting down the hosts one by one not recommended? Please clarify.

- power down the hosts
- power down memory hubs
(- check storage replication if any?)
- flush the storage controllers' write-back cache
>>>> Is this always required? How do we do that? If it is covered in the manual, I will check there.

- shut down the storage controllers
- power down the storage controllers
- "shut down" the controller cache batteries
(else they maintain power on the cache that's now empty).
>>>>> How?

- power down disks
- power down network (lan and san switches)



admin
Pieter 't Hart
Honored Contributor

Re: Hardware Configuration

>>>>> OK, is shutting down the hosts one by one not recommended? Please clarify.
No. If you shut them down one by one, you need to worry about maintaining quorum; otherwise you cannot easily shut down the remaining nodes once quorum is lost.
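
To illustrate the quorum concern, here is a small Python sketch. It assumes each of the 4 nodes holds one vote and there is no quorum disk (neither is stated in this thread), and it uses the quorum formula as I recall it from the TruCluster documentation, round_down((expected votes + 2) / 2). Check the real vote settings on your cluster before relying on the numbers.

```python
def quorum_votes(expected_votes):
    """Votes needed for quorum: round_down((expected_votes + 2) / 2).

    Formula as recalled from the TruCluster documentation; verify it
    against the cluster administration manual for your version.
    """
    return (expected_votes + 2) // 2


def simulate_one_by_one(node_votes, quorum_disk_votes=0):
    """Shut nodes down one at a time and report when quorum is lost.

    node_votes is a list with one entry per node (ASSUMED values).
    Returns (votes_needed, history), where history holds one tuple
    (nodes_shut_down, votes_remaining, has_quorum) per shutdown.
    """
    expected = sum(node_votes) + quorum_disk_votes
    needed = quorum_votes(expected)
    current = expected
    history = []
    for count, votes in enumerate(node_votes, start=1):
        current -= votes
        history.append((count, current, current >= needed))
    return needed, history


if __name__ == "__main__":
    # Assumption: 4 nodes, 1 vote each, no quorum disk.
    needed, history = simulate_one_by_one([1, 1, 1, 1])
    print(f"votes needed for quorum: {needed}")
    for count, remaining, ok in history:
        state = "quorum held" if ok else "QUORUM LOST"
        print(f"{count} node(s) down, {remaining} vote(s) left: {state}")
```

Under these assumptions quorum needs 3 votes, so it is already lost when the second node goes down, and the two remaining nodes are the ones that become hard to shut down cleanly.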

- flush the storage controllers' write-back cache
>>>> Is this always required? How do we do that? If it is covered in the manual, I will check there.
At the storage controller:
= connect a serial console
= run "show this_controller" and "show other_controller"
= wait until "no unflushed data in cache" is reported (you may shorten the wait by setting the unflushed cache timer to a low value: "set this_controller cache_flush_timer=")

Also see the attached document about powering down.
It is extracted from an HP HSZ70 document.
admin1979
Super Advisor

Re: Hardware Configuration

Thanks so much for the document and reply.

It seems things are pretty complicated here. To be precise, what we actually intend to do (if there really is a way) is to
avoid disturbing the whole configuration at all. There is a maintenance activity to replace the power sockets of the cluster. I can see 4 external power cables coming out of the cluster cage,
but I am really not sure what they belong to (cluster, controllers, or something else).
If there is any chance that we can remove one power cable and plug it into the new socket, then the next one, and so on, without taking down any of the hosts or controllers, that would be really great!
This is only possible if there is power redundancy.
But since we are not sure what these power cables feed, there is no way we can try this out.

So basically our first job would be to identify these 4 long power cables.

Any ideas?
Pieter 't Hart
Honored Contributor

Re: Hardware Configuration

It could be that each cable is connected to a power distribution unit (PDU) in the cabinet.
It's also likely that two cables go directly to the HSZ subsystem and just two go to PDUs for the hosts/MC hubs/switches/KVM.

If set up the right way, each component with redundant PSUs is connected to two different PDUs, and each component with a single PSU must have a failover path connected to a different PDU.

Technically you could disconnect one power cable and reconnect it without interrupting operation.
We had some DS20Es that at first did not report the second PSU online after reconnecting!
After reconnecting, the state still showed as redundant power failed.
I think it was resolved after a firmware update.
I would check this first by disconnecting and reconnecting on a single host.

So you need to check whether this holds for the whole cabinet.

If you have real doubts about whether the cabling is correct, you'll need to check all cables or do a complete shutdown of each of the cabinets.
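
Once the cables have been traced, the redundancy check described above can be written down as a small Python sketch. Every name in EXAMPLE_FEEDS is hypothetical: the component labels, the PDU labels and the wiring are placeholders, not your cabinet's actual layout, so replace the mapping with what you find on site.

```python
def components_losing_power(feeds, unplugged_pdu):
    """Return the components that go dark if one PDU cable is pulled.

    feeds maps a component name to the set of PDUs that power it.
    A component loses power when the unplugged PDU is its only feed.
    Components with no recorded feed are skipped.
    """
    return sorted(
        name
        for name, pdus in feeds.items()
        if pdus and not (pdus - {unplugged_pdu})
    )


# HYPOTHETICAL wiring for illustration only; trace the real cables first.
EXAMPLE_FEEDS = {
    "ds20e-1": {"PDU-A"},
    "ds20e-2": {"PDU-A"},
    "ds20e-3": {"PDU-B"},
    "ds20e-4": {"PDU-B"},
    "hsz70": {"PDU-A", "PDU-B"},   # dual-fed, survives either cable
    "mc-hub-1": {"PDU-A"},
    "mc-hub-2": {"PDU-B"},
}


if __name__ == "__main__":
    for pdu in ("PDU-A", "PDU-B"):
        affected = components_losing_power(EXAMPLE_FEEDS, pdu)
        if affected:
            print(f"pulling {pdu} powers off: {', '.join(affected)}")
        else:
            print(f"pulling {pdu} is safe: every component has another feed")
```

A cable can only be moved live if the list for its PDU comes back empty.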
Rob Leadbeater
Honored Contributor

Re: Hardware Configuration

Hi admin,

I think you'll be best to try and do a complete shutdown of (at least) the DS20Es.

Although these can have redundant power supplies, they only have a single power input lead, therefore just pulling one of the power leads to the rack is likely to turn the power off to at least one of them.

The HSZ70 is more likely to have dual inputs, so it should be easier to move the leads for this one at a time.

Cheers,

Rob
admin1979
Super Advisor

Re: Hardware Configuration

Hi,

Please find my comments as >>>>>>

I think you'll be best to try and do a complete shutdown of (at least) the DS20Es.

>>>> Yes, shutting down the 4 hosts looks preferable. If something goes wrong, at least the hosts will not be affected abruptly.
But as Pieter 't Hart suggested, does "shutdown -c" need to be given on all hosts? And does a simple "shutdown -h now" not work here?
What should the sequence be to shut down these 4 hosts? And what about the startup sequence?

Although these can have redundant power supplies, they only have a single power input lead, therefore just pulling one of the power leads to the rack is likely to turn the power off to at least one of them.

The HSZ70 is more likely to have dual inputs, so it should be easier to move the leads for this one at a time.

>>>> What about the other components, though, such as the disk enclosures, storage controllers and Memory Channel?


Admin
Rob Leadbeater
Honored Contributor

Re: Hardware Configuration

Hi,

You'll find the manuals for the storage system here:

http://vt100.net/mirror/mds-199909/cd3/storage/smcppuga.pdf

If it is configured with redundant power you would normally see black power leads into the left hand side and white power leads into the right.

I think the shutdown procedures for the array are in that doc.

You shouldn't need to worry about the memory channel controllers if you're shutting the servers down.

The -c option to shutdown performs an orderly shutdown of the cluster. You just need to run it on one node. Look at the man page for shutdown.

Cheers,

Rob