What is the technical reason DSAx devices can't be MSCP served?

Jon Pinkley · ‎07-13-2007

Instead of hijacking
http://forums1.itrc.hp.com/service/forums/questionanswer.do?threadId=1139761
I will start a new thread.

Does someone remember the technical reason that Shadowing Phase II virtual units can't be MSCP served?

I think it had something to do with the fact that both MSCP and the SHDRIVER used SCS services (SHDRIVER uses lock manager over SCS) and there was some circular dependency.

From a networking standpoint, MSCP serving the virtual units to satellites would seem to reduce the amount of traffic generated by writes to the virtual unit from the satellite, at least for cases where there is more than one shadow set member.
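
As a rough illustration of that traffic argument, here is a back-of-the-envelope sketch in Python. It assumes every member is reached from the satellite over the interconnect, that a hypothetical VU server would have all members attached locally, and it counts only the write transfers leaving the satellite. The function names are made up for the example and do not correspond to anything in SHDRIVER or the MSCP server.

```python
def satellite_transfers_members_served(members: int, writes: int) -> int:
    # Today's arrangement: the satellite runs the shadowing code itself,
    # so each write is issued once per MSCP-served member.
    return members * writes


def satellite_transfers_vu_served(members: int, writes: int) -> int:
    # Hypothetical served virtual unit: the satellite sends each write once
    # and the serving node would fan it out to its locally attached members.
    # The member count does not change what leaves the satellite.
    return writes


if __name__ == "__main__":
    for members in (1, 2, 3):
        print(members,
              satellite_transfers_members_served(members, 1000),
              satellite_transfers_vu_served(members, 1000))
```

With a single member the two counts are equal, which is why the saving only shows up for shadow sets with more than one member.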

For cases where the satellite has only MSCP connections to the members, it seems it would be preferable to have the virtual unit served instead of the members, but I don't know if it would be possible for the satellites to ignore the MSCP served members.

The customer view at the time HBVS was released was that not allowing the DSAx: devices to be MSCP served was a way to increase HBVS licensing revenue, since each "consumer" of the shadowing service had to license the software. But I thought I heard there was also a technical reason that all users of the DSAx virtual units needed to have the shadowing code running, and MSCP serving the DSAx units was not allowed.

it depends

Jur van der Burg · ‎07-13-2007

One of the reasons I can think of is the same as why I don't allow LD devices to be MSCP served: a served device is controlled by SYS$DUDRIVER on the remote node and not by its original driver. This means that a 'remote' shadowset behaves differently in response to management commands than a 'local' shadowset. There are probably ways around that, but it would add a lot of complexity to the shadowset management code.
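
A toy model of that point, in Python rather than anything resembling real driver code: the class names, method names, and member names below are all hypothetical. It only shows the shape of the problem, namely that a generic served-disk client forwards block I/O but exposes none of the management operations of the original driver.

```python
class ShadowVirtualUnit:
    """Stand-in for a virtual unit as seen through its own driver."""

    def __init__(self, members):
        self.members = list(members)
        self.blocks = {}

    def read(self, lbn):
        return self.blocks.get(lbn)

    def write(self, lbn, data):
        self.blocks[lbn] = data

    def add_member(self, device):      # management operation
        self.members.append(device)

    def remove_member(self, device):   # management operation
        self.members.remove(device)


class ServedDiskClient:
    """Stand-in for the remote view: a generic disk client that only
    forwards block I/O and knows nothing about shadow set management."""

    def __init__(self, backing):
        self._backing = backing

    def read(self, lbn):
        return self._backing.read(lbn)

    def write(self, lbn, data):
        self._backing.write(lbn, data)


local = ShadowVirtualUnit(["MEMBER_A", "MEMBER_B"])
remote = ServedDiskClient(local)
remote.write(0, b"data")   # plain block I/O passes straight through
```

In this sketch `remote` can read and write blocks, but it has no `add_member` or `remove_member`; giving it equivalents is the extra complexity being described.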

For purely serving disk blocks it would be possible, but I think this was one of the trade-offs that was made.

This is also true for LD, although LD has the added complexity that it can create and remove devices on the fly, which is impossible (removal anyway) to do for devices that are already served.

Fwiw,

Jur.

Jan van den Ende · ‎07-13-2007

Jon,

I have always thought that to stem from the same distinction as the difference between shadow SETs and shadow set MEMBERs.
The first is a virtual thing, the latter are (well in principle anyway, now with SAN etc) the underlying physical stuff.

The first question would be: _WHICH NODE_ is to serve the DSA? In the typical case the member drives are attached to _DIFFERENT_ nodes!

To zoom in on your specific question: that would mean that the node doing the I/O would have to distinguish between shadow sets consisting only of members attached to (served by) ONE node (then use the served shadow set, and have the individual members handled by the server) and sets consisting of members attached to different nodes (you really do not want those I/Os handled by the server of the virtual unit, only to be transferred in part to yet another node).
--- and IF you decide to use served shadow sets in the case of the same node serving the two members, how would you go about adding a third member on another node? And what to do in case of a dismount of such a member?

How about shadow sets with one local and one remote ( = MSCP served) member?
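
The distinction described above can be sketched as a small Python check. This is only an illustration of the decision a client would face; the function name and the member and node names are hypothetical, and nothing like this exists in the shadowing code.

```python
def single_serving_node(member_to_node):
    """Return the one node that attaches every member, or None if the
    members are spread over several nodes (or the set is empty)."""
    nodes = set(member_to_node.values())
    return nodes.pop() if len(nodes) == 1 else None


# Both members attached to the same node: a served VU is conceivable.
shadow_set = {"MEMBER_1": "NODE_A", "MEMBER_2": "NODE_A"}
print(single_serving_node(shadow_set))   # NODE_A

# Add a third member on another node and the answer changes under the
# feet of every client that was already using the served VU.
shadow_set["MEMBER_3"] = "NODE_B"
print(single_serving_node(shadow_set))   # None
```

The check itself is trivial; the hard part is that its result can change at any mount or dismount, and every client would have to switch access paths consistently when it does.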

-- If only starting to think about the added complexities, I can see why it was not done.

Proost.

Have one on me.

jpe

Don't rust yours pelled jacker to fine doll missed aches.

Art Wiens · ‎07-14-2007

Thanks, but I think this question would have been ok there ... I was sort of wondering myself.

Cheers,
Art

Hoff · ‎07-14-2007

Ask yourself how much this configuration is "worth", in terms of engineering investment and customer demand, as compared with continuing with the current configuration, and as compared with other requests and other engineering projects.

A host-based volume shadowing (HBVS) virtual unit (VU) has a one-to-many relationship with its member devices, some local and some served, and the VU manages physical devices and particularly physical device errors locally. A VU that was itself served would quickly have a many-to-one-to-many or a many-to-fewer-to-many configuration, and accordingly far more complex error handling and error recovery processing.

Remember that the VU is already built on -- some might say balanced on -- multiple different device drivers, for the multitudes of local and various served devices that can be shadowed. The current HBVS I/O environment is already complex, and it's rather more constrained than would be a served-VU stack.

What happens when a served VU encounters an I/O error? Does it kick everybody in the whole environment out? Where did the error actually arise? How to run and to manage a merge or a copy?

What is being considered here -- served VUs -- is essentially what the old (and subsequently canceled) $qio server project was targeting. It's not an easy problem, given the product expectations of the folks typically using OpenVMS. And the assumed and intended reliability and robustness of HBVS increases the effort.

If you want something in this area, I'd tend to look toward looser-coupled configurations, and to served-storage mechanisms such as those provided by NFSv4.1. NFSv4.1 provides what is basically a loosely-coupled cluster; there are discussions and pointers over at the web site.

The likely available solution would involve NFS (NFSv2 and NFSv3 server, NFSv2 client) available within TCP/IP Services -- Process may well have different NFS versions -- or the classic DFS package; these are the typical remote-served disk I/O packages available for OpenVMS.

And FWIW, I haven't typically encountered served pseudo devices on OpenVMS for any purpose; these are fairly rare. Off the top, I can't recall one...

Stephen Hoffman
HoffmanLabs LLC

Robert Brooks_1 · ‎07-14-2007

What happens when a served VU encounters an I/O error? Does it kick everybody in the whole environment out? Where did the error actually arise? How to run and to manage a merge or a copy?

---

This, in a nutshell, is the answer to Jon's question. The shadowing driver is the most complex and bizarre driver in VMS; it's certainly the biggest, as measured by lines of MACRO-32 code. The majority of code in the driver deals with error recovery, whether it's HBVS's "volume processing" (shadowing's own version of mount verification) or dealing with copies or merges. As the driver is written, there *MUST* be a local SHDRIVER on every node for shadowing to work; DUDRIVER (the MSCP client-side driver) does not have the semantics to mimic SHDRIVER.

After the HBMM work, there's now an even tighter relationship between SHDRIVER and its ACP, otherwise known as the SHADOW_SERVER process. SHADOW_SERVER is now involved in the mounting of any virtual unit; if the process isn't there, the unit won't mount. Given this relationship, you simply could not get DUDRIVER to manage the state that SHDRIVER must have.

It was not, and is not, an attempt to wring out more licensing dollars from customers.

-- Rob

Jeff Friedrichs · ‎08-15-2007

Here are some other answers from other VMS engineers....

When we had "Controller Based Shadowing" DUS devices we allowed them to be served. This led to problems since there are times where you need the Virtual Device to be available before the physical devices are, which led to hangs. There was just no way to make it work so we dropped serving of Virtual Devices.

----

My recollection is that the MSCP - Controller Based Write Logging (i.e. only on CI (HSC HSJ) controllers) was already on the table, when HBVS was in the design phase. Getting write logging done - for devices that were MSCP served SSMs was a big enough challenge - much less adding the complexity of having multiple clients with the ability to have a "served" DSA device. That kind of design, where the MSCP system that was doing the serving would have to be "responsible" for "write logging" potentially multiple clients - with the requisite house keeping that that would entail, when one of those clients crashed, was a fair bit of work. It would have been MUCH more complicated to complete the write logging project, in an area that was already very complex.

---

Cheers,
Jeff
