Operating System - HP-UX

SCSI hard disk failover problem

 
sk800i
Advisor

SCSI hard disk failover problem

I installed MCOE on an rx1600 and an rx2620 server running HP-UX 11.23 (Serviceguard version 11.17). Both servers are connected to an external SCSI HDD through their built-in SCSI ports, and I set the SCSI interface ID to 6 on the rx2620 only. I installed OnlineJFS too. When I configure Event Monitoring Services (EMS) to handle failover of the external HDD and test it by unplugging one side of the SCSI cable, the package does not fail over to the standby server. Network failover of the virtual IP works without problems.

I suspect a problem in the EMS configuration. Do you have a sample configuration that handles this case?
Steven E. Protter
Exalted Contributor

Re: SCSI hard disk failover problem

Shalom,

EMS merely monitors and reports.

Most likely EMS is incorrectly configured, I agree.

Check the package and cluster logs for Serviceguard to look into the virtual IP problem.

Perhaps upload your cluster configuration files to ITRC so people can take a look.

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com

Re: SCSI hard disk failover problem

So presumably you have defined an EMS service in Serviceguard package config to monitor the disk or LVM configuration?

If you do this, one requirement is that SERVICE_FAIL_FAST_ENABLED must be set to YES for the EMS monitor service. This is because, if the disk is not accessible, Serviceguard can't do a clean shutdown of your application and unmount the filesystems. In this scenario all Serviceguard can do is TOC (Transfer of Control, i.e. forcibly crash and reboot) the node.

HTH

Duncan

I am an HPE Employee

Re: SCSI hard disk failover problem

Oops, I was thinking of the case where you have a service that checks the disk rather than using EMS. If you're using EMS, then you'll need to set NODE_FAIL_FAST_ENABLED. Here's an example:


NODE_FAIL_FAST_ENABLED YES

RESOURCE_NAME /vg/vg01/pv_summary
RESOURCE_POLLING_INTERVAL 10
RESOURCE_START AUTOMATIC
RESOURCE_UP_VALUE = UP
RESOURCE_UP_VALUE = PVG_UP
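
A minimal Python sketch of the rule above, for sanity-checking a package ASCII file before applying it. It assumes the simple KEY VALUE layout of the example in this reply; the function names are hypothetical and not part of Serviceguard, and the check only encodes the advice given here (an EMS RESOURCE_NAME entry should be accompanied by NODE_FAIL_FAST_ENABLED YES).

```python
def parse_package_config(text):
    """Collect KEY -> list of values from a KEY VALUE style package file."""
    params = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        parts = line.split(None, 1)
        key = parts[0].upper()
        value = parts[1].strip() if len(parts) > 1 else ""
        params.setdefault(key, []).append(value)
    return params


def ems_fail_fast_warnings(text):
    """Warn when an EMS resource is defined without NODE_FAIL_FAST_ENABLED YES."""
    params = parse_package_config(text)
    warnings = []
    if params.get("RESOURCE_NAME"):
        # If the parameter is repeated, the last occurrence is the one checked.
        fail_fast = params.get("NODE_FAIL_FAST_ENABLED", ["NO"])[-1].upper()
        if fail_fast != "YES":
            warnings.append(
                "EMS resource defined but NODE_FAIL_FAST_ENABLED is not YES"
            )
    return warnings
```

Calling ems_fail_fast_warnings() on the text of the package file returns an empty list when the setting is present, and a one-line warning otherwise.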


HTH

Duncan

I am an HPE Employee
sk800i
Advisor

Re: SCSI hard disk failover problem

Thanks all,

So that means I should not use EMS to configure shared disk or network failover. In that case, which configuration can handle the external shared disk? I can configure the VIP, and failover works for system or network failures. Only a failure of the external shared disk does not trigger a failover.

dannyss
Advisor

Re: SCSI hard disk failover problem

Hello SEP & Duncan,

I have the same problem.

My environment is as follows:

2 HP Integrity servers (rx1600 and rx2620) sharing a single external SCSI bus with 2 external HDDs.
One HDD is configured as the lock disk, the other as the shared disk.

Serviceguard can fail over on a network failure or on the failure of a running process. But a disk failure leaves the package halted and down on all nodes.

I read your comments and applied the related settings in the config file.

But the problem is still there. I am posting the package log file for your reference. I have no idea why Serviceguard cannot complete the halt process for the package.



########### Node "SG02": Halting package at Tue Apr 8 17:13:23 EAT 2008 ###########
Apr 8 17:13:23 - Node "SG02": Halting service baseclock
Apr 8 17:13:23 - Node "SG02": Remove IP address 192.168.123.3 from subnet 192.168.123.0
Apr 8 17:13:23 - Node "SG02": Deactivating volume group vg01
Deactivated volume group in Exclusive Mode.
Volume group "vg01" has been successfully changed.
Apr 8 17:13:23 - Node "SG02": Deactivating volume group vg02
Deactivated volume group in Exclusive Mode.
Volume group "vg02" has been successfully changed.

########### Node "SG02": Package halt completed at Tue Apr 8 17:13:24 EAT 2008 ###########

########### Node "SG02": Starting package at Tue Apr 8 17:18:17 EAT 2008 ###########
Apr 8 17:18:17 - Node "SG02": Activating volume group vg01 with exclusive option.
Activated volume group in Exclusive Mode.
Volume group "vg01" has been successfully changed.
Apr 8 17:18:18 - Node "SG02": Activating volume group vg02 with exclusive option.
Activated volume group in Exclusive Mode.
Volume group "vg02" has been successfully changed.
Apr 8 17:18:18 - Node "SG02": Checking filesystems:
/dev/vg02/sharedisk
/dev/vg02/rsharedisk:file system is clean - log replay is not required
Apr 8 17:18:18 - Node "SG02": Mounting /dev/vg02/sharedisk at /sharedisk
Apr 8 17:18:18 - Node "SG02": Adding IP address 192.168.123.3 to subnet 192.168.123.0
Apr 8 17:18:18 - Node "SG02": Starting service baseclock using
"/usr/bin/X11/xclock -display 192.168.123.121:0.0"

########### Node "SG02": Package start completed at Tue Apr 8 17:18:18 EAT 2008 ###########

########### Node "SG02": Halting package at Tue Apr 8 17:19:44 EAT 2008 ###########
Apr 8 17:19:44 - Node "SG02": Halting service baseclock
Apr 8 17:19:44 - Node "SG02": Remove IP address 192.168.123.3 from subnet 192.168.123.0
Apr 8 17:19:44 - Node "SG02": Unmounting filesystem on /dev/vg02/sharedisk
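
The excerpt above ends with a halt that never reaches its "Package halt completed" banner. A rough Python sketch for spotting that pattern in a longer package log follows. It assumes the banner format seen in this excerpt (rows of # around Node "name": event at timestamp); the function name is hypothetical. A log pasted into a post may simply be cut off, so a hit is a hint, not proof of a hung halt.

```python
import re

# Matches banner lines such as:
# ########### Node "SG02": Halting package at Tue Apr 8 17:19:44 EAT 2008 ###########
BANNER = re.compile(
    r'^#+ Node "(?P<node>[^"]+)": (?P<event>.+?) at (?P<when>.+?) #+$'
)

STARTS = {"Halting package": "halt", "Starting package": "start"}
COMPLETIONS = {"Package halt completed", "Package start completed"}


def unfinished_operations(log_text):
    """Return (node, operation, timestamp) for each banner with no completion."""
    pending = None
    unfinished = []
    for line in log_text.splitlines():
        match = BANNER.match(line.strip())
        if not match:
            continue
        event = match.group("event")
        if event in STARTS:
            if pending is not None:
                # A new operation began before the previous one completed.
                unfinished.append(pending)
            pending = (match.group("node"), STARTS[event], match.group("when"))
        elif event in COMPLETIONS:
            pending = None
    if pending is not None:
        unfinished.append(pending)
    return unfinished
```

Feeding it the text of the package log returns a list of operations that were started but never reported as completed.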
Safarali
Valued Contributor

Re: SCSI hard disk failover problem

Hi, can you give the First_cluster_lock_vg and PV? Are they the same on both nodes?

Regards
Safar
Safarali
Valued Contributor

Re: SCSI hard disk failover problem

Please provide the output of cmviewconf

Regards

Safar
dannyss
Advisor

Re: SCSI hard disk failover problem

Hi Safar,

First_cluster_lock_vg and PV are the same.

Please check the cmviewconf output in the attachment.
Safarali
Valued Contributor

Re: SCSI hard disk failover problem

Hi,
Are you able to start the package on the second node?
If possible, halt the package on the first node and start it on the second node.


Regards
Safar
dannyss
Advisor

Re: SCSI hard disk failover problem

Hi Safar,
Thank you for your reply.

I can start the package on the second node.

I can kill the package process or unplug the network cable to trigger the failover from node 1 to node 2.

Danny
Safarali
Valued Contributor

Re: SCSI hard disk failover problem

Hi Dan, maybe I misunderstand your problem:

1. You are not able to fail over the package once you remove the LAN cable or shut down the system.

2. But you are able to start the package on the secondary node once you stop the package on the first node.

If that is correct, then I assume the problem is that your second node is not able to communicate with the lock disk.

Regards
Safar
dannyss
Advisor

Re: SCSI hard disk failover problem

Hi Safar,

I am able to fail over the package when I remove the LAN cable, shut down the system, or kill the process run by the package.

But I am not able to fail over the package (running on node 2) when I disconnect the external SCSI cable.

My SCSI bus is connected as follows:
Node1 ---> lock disk ---> sharedisk ---> Node2
X
Disconnect
dannyss
Advisor

Re: SCSI hard disk failover problem

Sorry Safar,

The drawing should be like this:

My SCSI bus is connected as follows:
Node1 ---> lock disk ---> sharedisk ---> Node2
                                      X
                                 Disconnect

I am disconnecting the cable between sharedisk and node2.
Safarali
Valued Contributor

Re: SCSI hard disk failover problem


Hi
Your problem is that the second node is not able to communicate with the lock disk. A lock disk only applies to a two-node cluster, so the cluster lock cannot determine which node is the master. But when you start the package manually, it forms a single-node cluster without the lock disk.
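
For background, here is a simplified Python model of the quorum rule as generally described in the Serviceguard documentation: a group with more than half of the previously running nodes can re-form the cluster on its own, a group with exactly half must acquire the cluster lock, and a smaller group cannot re-form. The function name and arguments are illustrative only; this is not Serviceguard code and it ignores timing and other details.

```python
def can_reform_cluster(surviving_nodes, previous_nodes, holds_cluster_lock):
    """Simplified quorum check for a group of nodes after a failure."""
    if surviving_nodes * 2 > previous_nodes:
        return True  # strict majority, no tie-breaker needed
    if surviving_nodes * 2 == previous_nodes:
        return holds_cluster_lock  # exact half, the cluster lock decides
    return False  # minority group cannot re-form the cluster
```

In a two-node cluster a single surviving node is always exactly half, which is why access to the lock disk matters so much in this setup.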

Regards
safar
dannyss
Advisor

Re: SCSI hard disk failover problem

Hi safar,

Thank you for your help.

I think I know now why the failover is not working.

Thank you very much!

Re: SCSI hard disk failover problem

What Safar described is part of your problem... also, if what you describe is a 'true' representation of your storage system, you have a big SPOF (single point of failure) in the SCSI bus, not least because modern SCSI speeds don't support ILT (in-line terminators):

(see the last paragraph here: http://docs.hp.com/en/B3936-90105/ch08s03.html#inlineterm )

I would *never* recommend a SCSI-based Serviceguard cluster now, there are just too many gotchas like this... go with Fibre Channel.

HTH

Duncan

I am an HPE Employee
dannyss
Advisor

Re: SCSI hard disk failover problem

Hi Duncan,

Thank you for the reminder.

Danny
Safarali
Valued Contributor

Re: SCSI hard disk failover problem

Hi, if you are happy with our suggestions, please assign the points.

Regards
Safar
dannyss
Advisor

Re: SCSI hard disk failover problem

Hi all,

I appreciate your suggestions and help.
But I am sorry.
I cannot find any way to assign points to you. If you know how to assign points, please let me know and I will assign them.
Safarali
Valued Contributor

Re: SCSI hard disk failover problem

Hi Dan, you should see the option to assign points. Only the author of the question is authorized to assign points to answers.
It is best practice to assign points if you are satisfied with an answer, since that encourages the experts to keep answering in the future.
Below is the information on how to assign points:

What is the member point system?
Member status is established using a points-based feedback mechanism. Authors of original questions "grade" (on a scale from 1-10) other members who reply to their questions. The higher the grade, the more effective the response was in solving the question. This feature is HP's way of turning the reward mechanism over to the community. Each response to a question is eligible to receive feedback for its success in solving the user's problems.

I've read the replies to my question and I see "unassigned" next to each reply. What does that mean?
If you were to login, the term "unassigned" would turn into a dropdown menu of points. The points are used to rate the repliers to your question. Points range from 1-10. Every reply deserves recognition. Another member has taken the time to try and help you, so please take the time to assign them a point value based on the value of their reply. See "How Do I Assign Points".

How do I earn points in the Support Forums?
Another great question! To earn points, simply provide "replies of value" to questions posted in the Support Forums. The authors of the questions "grade" your reply or "solution" using a point scale of 1-10 (higher points awarded to responses having greater value). The question author is the only person who can assign points to replies. The more questions you answer, the more points you'll be eligible to earn and the more status you'll hold amongst your peers!