Operating System - OpenVMS

White paper on distributed vs non-distributed applications and network overhead

 
john a caulfield
New Member

White paper on distributed vs non-distributed applications and network overhead

Back in the Digital days or early Compaq days I found a white paper, either on the Digital or DECUS website, that talked about when to distribute an application and when not to, due to network overhead. To encode a message in, say, XML (today), pass it down through all the TCP/IP layers/code, out over the network to another box, up through TCP/IP, through an XML parser, to an application, and then back through the same path requires a fair number of instructions/overhead. You wouldn't want to do it just to add one to a counter. This white paper talked about when it does make sense, how much overhead there was, etc. Other issues also come into play (reliability, std DB, etc). I was wondering if anyone had a copy of this or a similar white paper.
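To make the "add one to a counter" point concrete, here is a rough, hypothetical sketch (not from any white paper) that compares a plain local increment against the encode/parse work alone of a distributed request: building an XML message, serialising it, and parsing it back on the "server" side. The message layout is invented for illustration, and the real network hops, which would dominate, are omitted entirely.

```python
import timeit
import xml.etree.ElementTree as ET

def local_increment(counter):
    # The non-distributed case: one add instruction, more or less.
    return counter + 1

def remote_style_increment(counter):
    # Encode the request as XML, as a distributed client would...
    msg = ET.Element("request")
    ET.SubElement(msg, "op").text = "increment"
    ET.SubElement(msg, "value").text = str(counter)
    wire = ET.tostring(msg)  # stand-in for the send path
    # ...then parse it on the "server" side (TCP/IP stack and wire
    # time are not even modelled here).
    parsed = ET.fromstring(wire)
    return int(parsed.find("value").text) + 1

local = timeit.timeit(lambda: local_increment(41), number=10_000)
remote = timeit.timeit(lambda: remote_style_increment(41), number=10_000)
print(f"encode/parse alone is roughly {remote / local:.0f}x a local add")
```

Even with the network left out, the serialisation round trip is typically hundreds of times the cost of the local add, which is the intuition behind the white paper's question.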
2 REPLIES
Robert Gezelter
Honored Contributor

Re: White paper on distributed vs non-distributed applications and network overhead

John,

I did not attempt to quantify things, but I did publish an article in the long-gone Hardcopy magazine on just such a question.

I do not have it in machine-readable form, but I do have some copies somewhere, if it is of interest.

My general conclusion (in those days -- the mid-1980s) was that distributing made sense given a minimum reduction of data by an order of magnitude per level of architecture (and even that may be too generous).

- Bob Gezelter, http://www.rlgsc.com
John Gillings
Honored Contributor

Re: White paper on distributed vs non-distributed applications and network overhead

john,

I'm trying to imagine how this question could be quantified, especially without some understanding of what the application in question does. There are some that, by their nature, MUST be distributed, and others that must not be. I'd expect the decision would be based more on the requirements of the application and its users than on prejudging network overheads.

"You wouldn't want to do it just to add one to a counter." In some cases you don't have a choice, you will be required to do exactly that because that's what the application demands. You then have to choose an implementation that minimises the cost (in whatever terms you're measuring).

An example of an application that MUST be distributed: say, a mobile phone network. One that probably should not be: something like this forum (imagine trying to synchronise all those clients... hmmm, sounds like a "net news" architecture).

I'd recommend you think about the flow of data through the system, and have plenty of "backs of envelopes" to sanity check your design. Simple example... many years ago we had a character cell application that displayed incoming events. The data was received and stored in a central file, and updated events transmitted to clients. Updates to clients were expected to be displayed within 10 seconds of the event.

A client starting up would read the entire file to get started, then wait for updates.

Someone thought character cell was too old fashioned and wrote a GUI client to run on a PC. This involved reading a copy of the whole data file from the central location, processing it on the client and updating the display. The programmer hadn't thought this through. At the time ethernet was 10Mbit/s, and the file was between 500K and 1MB. So, with a 10 second refresh rate, how many clients could this system support before the ethernet was saturated? The system worked well under test with one or two users, but it didn't scale.
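The back-of-envelope arithmetic for that example works out like this (using the upper end of the figures given above; the exact numbers are illustrative):

```python
# Back-of-envelope: how many GUI clients before the Ethernet saturates?
link_bits_per_s = 10_000_000   # classic 10 Mbit/s shared Ethernet
file_bytes = 1_000_000         # upper end of the 500K - 1MB data file
refresh_s = 10                 # each client re-reads the file every 10 s

# Each client pulls the whole file every refresh interval.
bits_per_client_per_s = file_bytes * 8 / refresh_s   # 800,000 bit/s
max_clients = link_bits_per_s / bits_per_client_per_s
print(max_clients)  # 12.5
```

About a dozen clients in theory, and in practice fewer still, since shared Ethernet of that era degraded well before 100% utilisation. Hence "worked with one or two users, didn't scale".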

I don't have any "white papers" (one day I might figure out what that actually means...), but if it was Digital days it's likely that many of the assumptions are no longer valid. Networks are 10 to 100 times faster and cheaper, memory is 100 to 1000 times cheaper, the network is far more ubiquitous, and there are far more active components to which parts of systems may be distributed. In 1998 the idea of a PDA with several different types of wireless connectivity was almost science fiction, today it can be an assumption.

These days the default answer will probably be "distributed" - it's just a matter of determining what and where.