
Storage works merging for 24 hours after node crash

 

We have a two-node Alpha cluster. In the past couple of days, one of the nodes has crashed twice. After it reboots, the shadow sets take up to 24 hours to merge, and the merge pegs the CPU on the node that did not crash. Any ideas?
Robert Brooks_1
Honored Contributor

Re: Storage works merging for 24 hours after node crash


What version of VMS? If it's V7.3-2 (with recent patch kits) or later, you should strongly consider using Host-Based Minimerge (HBMM).

If HBMM is not an option, please see the documentation regarding the logical name SHAD$MERGE_DELAY_FACTOR.
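
As a rough illustration of how a logical name like this is set: a minimal sketch, assuming it is defined system-wide on each node that performs merges. The value shown is a placeholder and not a recommendation; the documentation explains how the value influences merge throttling.

```
$ ! Placeholder value, for illustration only. Check the Volume Shadowing
$ ! documentation before choosing a value for your site.
$ DEFINE /SYSTEM SHAD$MERGE_DELAY_FACTOR 200
$ ! Confirm what is currently defined.
$ SHOW LOGICAL SHAD$MERGE_DELAY_FACTOR
```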


-- Rob (one of the HBMM engineers)

Re: Storage works merging for 24 hours after node crash

OpenVMS V7.3-2. I've never seen the shadow server take this long to complete a merge.
Jan van den Ende
Honored Contributor
Solution

Re: Storage works merging for 24 hours after node crash

Kendall,

to start with:
WELCOME to the VMS Forum.

The time you observe for Shadow Merges has for a LONG time been an issue.
That is, it BECAME an issue when non-DSA-compliant disks came into wide use.
If you were shadowing devices on HSC (and I believe on both HSJ and HSD as well; I have no experience there, so I am not sure), you were used to shadow merge times of several SECONDS.

HBVS shadowing of SCSI devices, however, took MANY hours. Your 24 hours is no record, by far.

But Engineering FINALLY managed to get HBMM (Host-Based Minimerge) to work. It certainly earned them BIG applause from me (on behalf of our users)!

Obviously you do not have HBMM activated.
If ever there was an incentive to do just that, _YOU_ have found out the hard way!

Just install HBMM_002 (I do not have the exact name at hand, but you need V2).
The hardest part is finding the command in the release notes; after that you are all set.

Success!

Proost.

Have one on me.

jpe
Don't rust yours pelled jacker to fine doll missed aches.
Ian Miller.
Honored Contributor

Re: Storage works merging for 24 hours after node crash

There have been various issues with this sort of thing. Are you reasonably current on patches
(especially shadowing and fibre_scsi)?
____________________
Purely Personal Opinion

Re: Storage works merging for 24 hours after node crash

I saw a note out there about HBMM being on hold for VMS V7.3-2. Is there anything I should be concerned about?
Robert Brooks_1
Honored Contributor

Re: Storage works merging for 24 hours after node crash

With respect to HBMM being on hold for V7.3-2: that issue is almost two years old. HBMM for V7.3-2 was released in fall 2004; shortly after we released the kit, we found an issue that needed to be addressed. Any UPDATE kit from 2005 and beyond will contain the correct HBMM bits.


-- Rob
Thomas Ritter
Respected Contributor

Re: Storage works merging for 24 hours after node crash

We just completed Host-Based Minimerge (HBMM) deployment on all of our VMS 7.3-2 systems.
On the test cluster, after a crash, disk merges would take about 3 days to complete. After HBMM was introduced, the merges completed in about 15 minutes. Repeat: 15 minutes.
Treat HBMM deployment as a project. You will need to know your disk usage patterns and configure accordingly.

Robert Brooks_1
Honored Contributor

Re: Storage works merging for 24 hours after node crash

15 minutes? What's the reset_threshold on the various policies?


-- Rob
Jan van den Ende
Honored Contributor

Re: Storage works merging for 24 hours after node crash

Like Robert, I am surprised at those 15 minutes.
We went down from 18 hours to well under 1 minute.

fwiw.

Proost.

Have one on me.

jpe
Don't rust yours pelled jacker to fine doll missed aches.
Jiri_5
Frequent Advisor

Re: Storage works merging for 24 hours after node crash

We have good experience with HBMM too. Before activation we had to manage the merges ourselves (only 4 disks in parallel on each node in the cluster, because the application had to keep running), and the merge was slow and took a long time (12 hours). After activation it takes under 1 minute.


Jirza

Re: Storage works merging for 24 hours after node crash

What exact kit name am I looking for, so I can download it?
Volker Halle
Honored Contributor

Re: Storage works merging for 24 hours after node crash

Kendall,

any VMS732_UPDATE-Vnn00 kit (starting with VMS732_UPDATE-V0300) has the fixes from VMS732_HBMM-V0200 included.

If you have not installed that update kit (or any newer ones: check with PROD SHOW HIST *UPDATE*), you would want to download and install the most recent update kit (VMS732_UPDATE-V0700) and the other patches, which are not included in the UPDATE kits (see ftp://ftp.itrc.hp.com/openvms_patches/alpha/V7.3-2/ALPHA_V732_MASTER_ECO_LIST.txt).

Volker.
Robert Brooks_1
Honored Contributor

Re: Storage works merging for 24 hours after node crash

What exact kit name am I looking for, so I can download it?

----
What kits do you have installed now? It's quite possible that you already have the needed bits on your system.

PROD SHOW HIST VMS732_UPDATE

will show the relevant kits that are installed.

-- Rob

Re: Storage works merging for 24 hours after node crash


$ PROD SHOW HIST VMS732_UPDATE
----------------------------------- ----------- ----------- --------------------
PRODUCT                             KIT TYPE    OPERATION   DATE AND TIME
----------------------------------- ----------- ----------- --------------------
DEC AXPVMS VMS732_UPDATE V4.0       Patch       Install     08-JUL-2005 11:56:23
DEC AXPVMS VMS732_UPDATE V3.0       Patch       Install     11-MAR-2005 09:01:18
----------------------------------- ----------- ----------- --------------------

2 items found
Volker Halle
Honored Contributor

Re: Storage works merging for 24 hours after node crash

Kendall,

you are all set regarding the HBMM patch. You have not installed the most recent update kit, but you should be fine regarding HBMM.

Volker.

Re: Storage works merging for 24 hours after node crash

I imagine the release notes have the necessary instructions to configure and "turn on" HBMM?
Jan van den Ende
Honored Contributor

Re: Storage works merging for 24 hours after node crash

Kendall,

in one word: YES.

But it is a little bit hidden in there.
I do not have them at hand, but it is something like SET VOLUME Dxxx /HBMM. It can be issued while the shadow set is in use.
For now, do not worry about the various policies: the default will almost always do.
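
For comparison with the command recalled from memory above: the HBMM documentation describes policies as being assigned with SET SHADOW /POLICY=HBMM rather than with SET VOLUME. A minimal sketch, assuming a shadow set virtual unit called DSA1 and two cluster nodes called NODEA and NODEB (all three names are hypothetical and do not come from this thread):

```
$ ! DSA1, NODEA and NODEB are placeholder names.
$ ! Request HBMM master bitmaps on both nodes of the cluster.
$ SET SHADOW DSA1: /POLICY=HBMM=(MASTER_LIST=(NODEA,NODEB), COUNT=2)
$ ! Display the policy now associated with the shadow set.
$ SHOW SHADOW DSA1: /POLICY=HBMM
```

Treat this as a sketch under those assumptions, not a recommended configuration; the policy keywords and their defaults are described in the documentation.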

Proost.

Have one on me.

jpe
Don't rust yours pelled jacker to fine doll missed aches.
Volker Halle
Honored Contributor

Re: Storage works merging for 24 hours after node crash

Kendall,

the OpenVMS V8.2 New Features and Documentation Overview manual contains HBMM related information in Chapter 6:

http://h71000.www7.hp.com/doc/82FINAL/6675/6675pro_007.html#minimerge_fc_h

You can also refer to the OpenVMS Volume Shadowing manual.

Volker.

Re: Storage works merging for 24 hours after node crash


Thank you all for all your help. I have assigned as many points as possible so y'all
get the correct head gear!
Robert Brooks_1
Honored Contributor

Re: Storage works merging for 24 hours after node crash

HBMM can be rather confusing; if you have questions regarding the use of HBMM, either post them here or contact your local support centre.

The HBMM policy syntax can look a bit intimidating at first, but most sites won't need things like multi-masterlist policies.

Good luck!


-- Rob (part of the HBMM engineering team)

Re: Storage works merging for 24 hours after node crash

I'm searching for the "how to turn on" HBMM documentation and the policy stuff... where is it?
Robert Brooks_1
Honored Contributor

Re: Storage works merging for 24 hours after node crash

Volker Halle's last post contains a pointer to the relevant documentation.

"Turning on" HBMM isn't as simple as using a single DCL command. You really should read the entire HBMM documentation, to have a complete understanding of how it works, what resources are consumed, considerations to be taken when planning what nodes will have bitmaps, etc...
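
To give a feel for the planning choices involved (which nodes hold bitmaps, how many, and the reset threshold raised earlier in this thread), here is a hedged sketch of a fuller policy. The device name DSA1, the node names NODEA and NODEB, and the threshold value are all hypothetical; suitable values depend on your write patterns and are discussed in the documentation.

```
$ ! All names and numbers below are placeholders, for illustration only.
$ ! MASTER_LIST names the nodes eligible to hold master bitmaps, COUNT says
$ ! how many of them should, and RESET_THRESHOLD is given in blocks.
$ SET SHADOW DSA1: /POLICY=HBMM=(MASTER_LIST=(NODEA,NODEB), COUNT=2, RESET_THRESHOLD=50000)
$ ! List the write bitmaps that currently exist for the shadow set.
$ SHOW DEVICE /BITMAP DSA1:
```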


-- Rob

Re: Storage works merging for 24 hours after node crash

That's what my plan was.