TruCluster

Unable to get root2_domain's partitions correct

Frequent Advisor

Hello,

I have a fielded system that I am trying to get the 2nd cluster member working on. I run clu_add_member and all runs well until the boot disk is partitioned. It fails stating:

disklabel: partition h extends past end of unit
disklabel: Replacement label is invalid

A few more errors stating that the disklabel cannot be used are displayed, and the process dies. I have tried to manually set the partition sizes, only to have them overwritten by the clu_add_member process (kind of expected this). I have tried clu_bdmgr with the same results. I am at my wits' end...

I can tell you that my customer is using EMC instead of the HSG80 (which I know much better), they are running Tru64 5.1A, and apparently the clu_create worked fine (the partitions on dsk19 look fine).

Can anyone give me an idea of where to look next? Some of this was done while I was out on medical leave and now I am trying to piece it all together. Any help will be appreciated.

Thanks
Allan
12 REPLIES
HPE Pro

Re: Unable to get root2_domain's partitions correct

Can you post the output of disklabel -r dsk19?
I work for HP

Honored Contributor

Re: Unable to get root2_domain's partitions correct

Can you please check the disklabel of the root disk of the first member and of the second?

Try copying the disklabel from the first member to the second.

BR,
Kapil+
I am in this small bowl, I wane see the real world......
Valued Contributor

Re: Unable to get root2_domain's partitions correct

Or even, zero and reset the disklabel on member #2's proposed boot disk :

# disklabel -rz dskX
# disklabel -rw dskX

Then check that the default disklabel matches the size of the underlying unit or volume, i.e. partition 'c' should match the disk's capacity as reported by scu. If they don't agree, a label created by hand is required.

John M
Nobody can serve both God and Money
Frequent Advisor

Re: Unable to get root2_domain's partitions correct

Martin:

The disklabel for dsk19 looks like this:
a: 524288 0 AdvFS
b: 3028219 524288 swap
c: 104857600 0 unused
d: 0 0 unused
e: 0 0 unused
f: 0 0 unused
g: 52232192 393216 unused
h: 2048 3552507 cnx



For completeness, the disklabel for dsk20:
a: 131072 0 unused
b: 262144 131072 unused
c: 104857600 0 unused
d: 0 0 unused
e: 0 0 unused
f: 0 0 unused
g: 52232192 393216 unused
h: 52232192 52625408 unused
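
For anyone who wants to sanity-check a label the same way, here is a minimal sketch of the bounds test implied by the "extends past end of unit" error: size plus offset of each partition must not exceed the size of partition c. The helper name check_label is hypothetical (it is not a Tru64 tool), and it assumes the trimmed "name: size offset fstype" layout quoted in this thread, not the full disklabel -r output.

```shell
#!/bin/sh
# Sketch only: check_label is a hypothetical helper, not part of Tru64.
# Input: lines in the trimmed "name: size offset fstype" form used in this thread.
# Assumption: partition c describes the whole unit.
check_label() {
    awk '
        NF >= 3 {
            part = $1
            sub(/:$/, "", part)              # drop the trailing colon
            if (part == "c") unit = $2 + 0   # size of the whole unit
            name[++n] = part
            size[n] = $2 + 0
            off[n]  = $3 + 0
        }
        END {
            for (i = 1; i <= n; i++) {
                last = size[i] + off[i]
                if (last > unit) {
                    printf "%s ends at %.0f, past end of unit %.0f\n", name[i], last, unit
                    bad = 1
                }
            }
            exit (bad ? 1 : 0)               # non-zero status if anything overruns
        }
    '
}

# The dsk20 label as posted above.
check_label <<'EOF' && echo "all partitions fit inside c"
a: 131072 0 unused
b: 262144 131072 unused
c: 104857600 0 unused
d: 0 0 unused
e: 0 0 unused
f: 0 0 unused
g: 52232192 393216 unused
h: 52232192 52625408 unused
EOF
```

With the dsk20 numbers, h ends exactly at 104857600, which equals c, so the default label itself is within bounds.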


Thanks for looking at this with me!
Allan
Frequent Advisor

Re: Unable to get root2_domain's partitions correct

Kapil,

I have tried copying the disklabel information from dsk19 to dsk20 and verified that it worked. Then, when I run clu_add_member, the disklabel on dsk20 (member 2) is rewritten back to the default (see earlier post) and the procedure fails, stating that partition h is basically out of bounds. I have also tried to set the partition sizes manually and get the same results. I have even tried to use clu_bdmgr to set the partition sizes... it still fails.

Is there a way to tell clu_add_member NOT to rewrite the disklabel information? So far I have only been interested in setting the partition sizes; I have not tried to edit the "descriptive" information.

Thanks
Allan
Frequent Advisor

Re: Unable to get root2_domain's partitions correct

John,

I did as you suggested and I still get the same results. The partition table is included in an earlier post. I sure hope someone can lead me in the right direction. I have been banging away at this for 2 days now...

Thanks for your help
Allan
Frequent Advisor

Re: Unable to get root2_domain's partitions correct

Here is another piece of the puzzle that I found a minute ago.

I was looking through the tmp files (/cluster/admin/tmp) and there are files named "bootlabel.clu_add_member.{pid}" for each attempt I made to create this member. The file contains what the partition table would look like, and yes partition h is WAY out of bounds. Here is what the partition table looks like:

a: 524288 0 AdvFS
b: 16777216 524288 swap
c: 104857600 0 unused
d: 0 0 unused
e: 0 0 unused
f: 0 0 unused
g: 52232192 393216 unused
h: 2048 104855552 cnx

What seems to be happening is that the whitespace is missing between the 2048 and the rest of the numbers on the partition h line.

I do not know why this would happen....
Frequent Advisor

Re: Unable to get root2_domain's partitions correct

Oops, I put a space in partition h's numbers when I typed the table in my previous post. In the actual file there is no space after the 2048... sorry for the confusion.
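
A quick way to spot that kind of fused line in a saved bootlabel file is to count fields per partition line. This is only a sketch: find_fused is a hypothetical helper, the four-field layout is the trimmed form quoted in this thread, and the sample h line is reconstructed from the description above (2048 run together with 104855552), not copied from the real file.

```shell
#!/bin/sh
# Sketch only: find_fused is a hypothetical helper, not part of Tru64.
# It flags partition lines that do not have exactly the four fields
# "name: size offset fstype" used in the tables in this thread.
find_fused() {
    awk '/^[a-h]:/ && NF != 4 { printf "%s has %d fields, expected 4\n", $1, NF }'
}

# Sample input: the h line is an assumed reconstruction of the fused entry.
find_fused <<'EOF'
g: 52232192 393216 unused
h: 2048104855552 cnx
EOF
# prints: h: has 3 fields, expected 4
```

A line flagged this way has lost the separator between size and offset, so anything that reads the first number as a size sees a value far larger than partition c.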
Honored Contributor

Re: Unable to get root2_domain's partitions correct

Hi Allan,

AFAIK clu_add_member does not copy the disklabel and OS from the member-1 root domain, but from another disk.
There must be a disk (possibly a local disk) that contained the preinstalled software from which the cluster was originally created?

Is this a new installation, or an existing installation that broke down?
Is the EMC redundantly connected?

- first see that member-2 is powered down.
- run clu_check_config and inspect the output.
Frequent Advisor

Re: Unable to get root2_domain's partitions correct

Member-2 is at the boot prompt (of course). There is a seed disk for member-1, but its partitions do not look like any other boot disk I have seen, and it is smaller than the disk I am working with.

I have not been to the site where they elected to install the EMC. Actually, they replaced a perfectly working EMC with a newer model (I am waiting for the specs as I type this out). They did not do this to recover; they did this to upgrade.

I have heard that clu_add_member does not always work well with EMC, but I am unfamiliar with any other cluster tools to accomplish what needs to be done. I have run a clu_check_config, and NONE of the member-1 daemons are running (aliasd, kloadsrv, clu_wall, and so on).

What will powering off member-2 do for us? I can ask the site to do this, but they are going to ask why...

Thanks for your help
Allan
Frequent Visitor

Re: Unable to get root2_domain's partitions correct

fwiw,
From Trucluster Server Handbook
By Jim Lola and Scott Fafrak, Dennis O'Brien, Gregory Yates, Brad Nichols
Publisher: Digital Press
Paperback 448 pages
ISBN 1555582591

Section 10.1.2
To use a custom disklabel, create the
sizes you want and prelabel the partitions,
e.g.:
disklabel -sF dsk1a AdvFS
disklabel -sF dsk1b swap
disklabel -sF dsk1h cnx

The setup will see the labels and prompt
"whether you want to use the existing partition sizes", so the existing partitions will be reused rather than overwritten.

hth for the future,
Marlene
Frequent Advisor

Re: Unable to get root2_domain's partitions correct

OK... after a long review and a trial installation on one of our lab systems, we have discovered that the disk we were trying to use as a boot drive was too big. There is a known bug in TruCluster that will mess up the partition numbers if the leftover partition size for CNX is too large. I do not know what that number is, but ours apparently exceeded that threshold.

Using TruCluster (5.1A), once we modified the size of the disk on the EMC to something more in line with a regular disk that works, we had no problems with the clu_add_member.

Running TruCluster with 5.1B (and installing the patches), we had no problems with the installation.

Just thought I would post what I found out. I appreciate all of you offering me things to look at/try.

Allan