
Question with reference to HBMM and System Disk

SOLVED
VMS Support
Frequent Advisor

I have just set up a minimerge policy on all my shadow sets in a three-node DTCS cluster.
The cluster looks like this:
Site one: two nodes boot from DSA0.
Site two: one node boots from DSA99.

I enabled minimerge and a policy on all shadow sets in the cluster, including DSA0 and DSA99.

To test, I crashed a single node (one that boots from DSA0:).
The minimerges on all 6 shadow sets started after an estimated 30-second pause and completed within 30 seconds ... Wow!
I forced the crash of the node as follows:
CTRL/P to the SRM prompt, then:
>>>Crash
The node then wrote a crash dump file.

The help says the system disk is exempt from minimerge when the crash dump file is on the shadow set disk. In my case the system disk, DSA0, completed a minimerge.
How does this work? How does the other node know which blocks the node writing the crash dump has changed?

Some details below:
show shadow dsa0/full


_DSA0: Volume Label: ALPHASYS_DDC
Virtual Unit State: Steady State
Cluster Virtual Unit Status: 0001 - normal
Local Virtual Unit Status: 00000110 - enforce_local_read,hbmm_eval_policy_enabled
Enhanced Shadowing Features in use:
Host-Based Minimerge (HBMM)

Mount Count 2 Error Count 0
Total Devices 2 VU_UCB 8160FBC0
Source Members 2 SCB LBN 021E8FEC
Act Copy Target 0 Generation 00A550A1
Act Merge Target 0 Number 4F1230ED
Last Read Index 0 Master Mbr Index 0
Copy Hotblocks 0 Copy Collisions 0
SCP Merge Repair Cnt 0 APP Merge Repair Cnt 0
VU Timeout Value 16777215 VU Site Value 0
Copy/Merge Priority 5000 Mini Merge Enabled
Recovery Delay Per Served Member 30

HBMM Policy
HBMM Reset Threshold: 50000
HBMM Master lists:
Up to any 2 of the nodes: VLCPR1,VLCPR2
Any 1 of the nodes: VLCPR3
HBMM bitmaps are active on VLCPR2,VLCPR1
HBMM Reset Count 3 Last Reset 1-MAY-2006 19:16:04.84
Modified blocks since last bitmap reset: 44704

Device $1$DGA1198 Master Member
Index 0 Status 000000A0 mbr_src,mbr_valid
Ext. Member Status 00
Read Cost 2 Site 0
Member Timeout 120 UCB 815BBCC0
Error Count 0

Device $1$DGA2101
Index 1 Status 000000A0 mbr_src,mbr_valid
Ext. Member Status 00
Read Cost 2 Site 0
Member Timeout 120 UCB 818EFA40
Error Count 0

VLCPR2 $ana/crash sys$system
%SDA-I-USEMASTER, accessing dump file via _$1$DGA1198:, master member of shadow set _DSA0:

OpenVMS (TM) system dump analyzer
...analyzing an Alpha compressed selective memory dump...
System crash information
------------------------

Time of system crash: 1-MAY-2006 12:05:31.73


Version of system: OpenVMS (TM) Alpha Operating System, Version V7.3-2

System Version Major ID/Minor ID: 3/0


VMScluster node: VLCPR2, a AlphaServer ES40

Crash CPU ID/Primary CPU ID: 00/00

Bitmask of CPUs active/available: 00000003/00000003


CPU bugcheck codes:
CPU 00 -- OPERCRASH, Operator forced system crash
1 other -- CPUEXIT, Shutdown requested by another CPU





10 REPLIES
Volker Halle
Honored Contributor
Solution

Re: Question with reference to HBMM and System Disk

With recent versions of OpenVMS, SDA is clever enough to read the dump from the member it's been written to. See

%SDA-I-USEMASTER, accessing dump file via _$1$DGA1198:, master member of shadow set _DSA0:

Volker.
VMS Support
Frequent Advisor

Re: Question with reference to HBMM and System Disk

So what you're saying is that the other member of the shadow set is not in sync with the member the dump was written to. If I were to remove DGA1198 from the shadow set, I would not be able to get to my recently written dump file with $ana/crash sys$system: through the remaining DSA0 member.

So the minimerge on the system disk did not include the dump file.

Thanks
Kevin
Volker Halle
Honored Contributor

Re: Question with reference to HBMM and System Disk

Kevin,

if you're interested, try

$ ANAL/DISK/SHADOW disk

to find out whether all blocks on that shadow set are identical.

Volker.
VMS Support
Frequent Advisor

Re: Question with reference to HBMM and System Disk

Point proved.
Thanks for the quick reply.

KevinVLCPR1>show dev dsa0

Device                  Device           Error    Volume         Free  Trans Mnt
 Name                   Status           Count     Label        Blocks Count Cnt
DSA0:                   Mounted              0  ALPHASYS_DDC  42787452   476   3
$1$DGA1198:   (VLCPR1)  ShadowSetMember      0  (member of DSA0:)
$1$DGA2101:   (VLCPR1)  ShadowSetMember      0  (member of DSA0:)
KevinVLCPR1>
KevinVLCPR1>
KevinVLCPR1>
KevinVLCPR1>dismount /policy=minicop $1$DGA1198:/cluster
KevinVLCPR1>show dev dsa0

Device                  Device           Error    Volume         Free  Trans Mnt
 Name                   Status           Count     Label        Blocks Count Cnt
DSA0:                   Mounted              0  ALPHASYS_DDC  42787452   476   3
$1$DGA2101:   (VLCPR1)  ShadowSetMember      0  (member of DSA0:)
KevinVLCPR1>ana/crash sys$system
%SDA-I-SINGLEMEM, single member shadow set; accessing dump file via _DSA0:

OpenVMS (TM) system dump analyzer
%SDA-E-NOTALPHADUMP, dump file does not contain an OpenVMS Alpha dump
VMS Support
Frequent Advisor

Re: Question with reference to HBMM and System Disk

From the OpenVMS help (see below).
Is this still the case?
Does anyone run a system disk shadow set with minimerge enabled?

$help sys_p dumpstyle

Sys_Parameters

DUMPSTYLE

DUMPSTYLE specifies the method of writing system dumps.

DUMPSTYLE is a 32-bit mask, with the following bits defined.
Each bit can be set independently. The value of the system
parameter is the sum of the values of the bits that have been
set. Remaining or undefined values are reserved for HP use only.

Bit                 Mask      Description

0                   00000001  0 = Full dump (SYSGEN default).
                                  The entire contents of physical
                                  memory are written to the dump
                                  file.
                              1 = Selective dump. The contents of
                                  memory are written to the dump
                                  file selectively to maximize
                                  the usefulness of the dump file
                                  while conserving disk space.

1                   00000002  0 = Minimal console output.
                              1 = Full console output (includes
                                  stack dump, register contents,
                                  and so on).

2                   00000004  0 = Dump to system disk.
                              1 = Dump off system disk (DOSD) to
                                  an alternate disk. (Refer to
                                  the HP OpenVMS System Manager's
                                  Manual for details.)

3 (Alpha only)      00000008  0 = Do not compress.
                              1 = Compress. (VAX systems do not
                                  support dump compression.)

4 (Alpha only)      00000010  0 = Dump shared memory.
                              1 = Do not dump shared memory. (VAX
                                  systems do not support shared
                                  memory.)

5 - 14                        Reserved for HP use only.

15 (VAX only)       00008000  0 = Disable use of bits 16 - 27.
                                  (Specific to VAX 7000s.)
                              1 = Enable use of bits 16 - 27.

16 - 27 (VAX only)  0FFF0000  Range of DOSD unit numbers. (VAX
                              systems do not support shared
                              memory.)

28 - 31                       Reserved for HP use only.

If you plan to enable the Volume Shadowing minimerge feature on
an Alpha system disk, be sure to specify DOSD to an alternate
disk.
VMS Support
Frequent Advisor

Re: Question with reference to HBMM and System Disk

Sorry ... I dumped the whole help section ... The bit I found interesting was:

If you plan to enable the Volume Shadowing minimerge feature on
an Alpha system disk, be sure to specify DOSD to an alternate
disk.

I can't find a reference to it anywhere else.
Volker Halle
Honored Contributor

Re: Question with reference to HBMM and System Disk

Kevin,

the mini-merge operation would have been done by the 'other' cluster member (node) that also has this system disk mounted. It cannot know about the I/Os issued by the crashing system while it was writing the system dump to the boot/master member of the shadow set.
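
To illustrate that point, here is a toy Python model. It is only a sketch under simplifying assumptions and says nothing about how the shadowing driver is really implemented: the two dictionaries stand in for the two shadow set members, the set stands in for a write bitmap held by a surviving node, and all function names are hypothetical.

```python
# Toy model only: names and data layout are hypothetical, not OpenVMS internals.
member_a = {}    # block number -> contents (the member the dump is written to)
member_b = {}    # block number -> contents (the other member)
bitmap = set()   # blocks that the surviving node's write bitmap has recorded


def shadowed_write(block, data, complete_on_b=True):
    """A normal write through the shadow set: it is recorded in the bitmap."""
    bitmap.add(block)
    member_a[block] = data
    if complete_on_b:  # False models a write still in flight at crash time
        member_b[block] = data


def dump_write(block, data):
    """The crashing node writes the dump to one member; the bitmap never sees it."""
    member_a[block] = data


def minimerge():
    """Reconcile only the blocks that the bitmap marks as possibly inconsistent."""
    for block in bitmap:
        member_b[block] = member_a.get(block)
    bitmap.clear()


def differing_blocks():
    """Blocks whose contents differ between the two members."""
    blocks = set(member_a) | set(member_b)
    return sorted(b for b in blocks if member_a.get(b) != member_b.get(b))


shadowed_write(100, "application data", complete_on_b=False)
dump_write(5000, "dump contents")
minimerge()
print(differing_blocks())  # [5000]
```

In this model the in-flight application write is repaired because the bitmap knew about it, while the dump block stays different on the two members, which matches what was seen in this thread after the master member was removed.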

You can find a short description in the OpenVMS V7.3-2 Shadowing Manual (has not been updated for V8.2):

http://h71000.www7.hp.com/doc/732FINAL/aa-pvxmj-te/aa-pvxmj-te.HTML

Minimerge operations
. configuring for system disk

But this talks about controller-based mini-merge.

The description of HBMM is found in the V8.2 New Features Manual (Chapter 6):

http://h71000.www7.hp.com/doc/82FINAL/6675/6675pro_007.html#minimerge_fc_h

But there are no references to system-disks and HBMM in that chapter.

Because of the special behaviour of a shadowed system disk with respect to the dump file and HBMM, DOSD is recommended so that the dump file can always be accessed and recovered transparently, under all circumstances. As you've seen, it also 'works' without DOSD, but you need to understand the implications.

Volker.
VMS Support
Frequent Advisor

Re: Question with reference to HBMM and System Disk

Volker,
Thanks for your reply.
I will look at moving the crash dump away from the system disk.

Would you like to reply with the SYSGEN parameter I need to apply?
I want to give you the 10 points you need for your promotion. It would be a pleasure to see you promoted. Well deserved I think.

Thanks
Kevin
Volker Halle
Honored Contributor

Re: Question with reference to HBMM and System Disk

Kevin,

to set up DOSD on OpenVMS Alpha (or I64), you need to:

- set bit 2 in the DUMPSTYLE system parameter (typically changing the default value from 9 to 13). Run AUTOGEN or manually set the new value with SYSGEN.

- create the DOSD dump file on dosd_disk:[SYSn.SYSEXE]SYSDUMP.DMP, e.g.

Mount dosd_disk:
Add DUMPFILE_DEVICE="dosd_disk:" to MODPARAMS.DAT
Make sure there is no DUMPFILE=0 in MODPARAMS.DAT
Run @AUTOGEN GETDATA GENFILES NOFEEDBACK
to create the DOSD dump file (and directory)

- set console environment variable DUMP_DEV to point to your DOSD device, then your system disk(s), e.g.

>>> set DUMP_DEV DKA100, DKA0

- MOUNT the DOSD device in SYCONFIG.COM or SYLOGICALS.COM and assign the system logical name CLUE$DOSD_DEVICE to point to that device, so that CLUE will work, e.g.

$ MOUNT/SYS/NOASSI dosd_disk label CLUE$DOSD_DEVICE
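
As a quick check of the 9 to 13 change mentioned in the first step, here is a minimal Python sketch (not an OpenVMS tool) that decodes the low-order DUMPSTYLE bits using the mask values from the help text quoted earlier in this thread. The names `describe_dumpstyle` and `enable_dosd` are made up for illustration.

```python
# Mask values and meanings are copied from the DUMPSTYLE help text in this thread.
# Each entry maps a mask to (meaning when the bit is 0, meaning when it is 1).
DUMPSTYLE_BITS = {
    0x01: ("full dump", "selective dump"),
    0x02: ("minimal console output", "full console output"),
    0x04: ("dump to system disk", "dump off system disk (DOSD)"),
    0x08: ("do not compress", "compress"),
    0x10: ("dump shared memory", "do not dump shared memory"),
}


def describe_dumpstyle(value):
    """Return the meaning of each documented low-order bit, in bit order."""
    return [on if value & mask else off
            for mask, (off, on) in DUMPSTYLE_BITS.items()]


def enable_dosd(value):
    """Set bit 2 (mask 00000004) and leave every other bit unchanged."""
    return value | 0x04


old = 9                  # selective + compressed dump to the system disk
new = enable_dosd(old)   # 13
print(old, describe_dumpstyle(old))
print(new, describe_dumpstyle(new))
```

The value 9 is bits 0 and 3 (selective, compressed); adding bit 2 gives 13, which keeps those settings and redirects the dump to the alternate disk.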

Volker.
VMS Support
Frequent Advisor

Re: Question with reference to HBMM and System Disk

Volker and everyone.

Thanks for the reply.

Volker a well deserved promotion I think.

Hope these are the only points that Germany will score from England during the next few months :-) ...
Have a happy World Cup.

Thanks
Kevin