
Celebrating!!!

Jan van den Ende
Honored Contributor

Celebrating!!!

AMLVMS$ Write sys$output f$getsyi("cluster_ftime")
13-APR-1997 11:35:50.75

CET DST ~ GMT + 2:00:00.

Yes, this means the cluster uptime has reached seven years today, at 9:35 GMT.
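
For anyone who wants to check the arithmetic, here is a minimal Python sketch of the conversion above. It is an illustration only, not part of the original post. It assumes the timestamp printed by f$getsyi is local time at GMT+2, as stated, and the helper name parse_vms_time is made up for this example.

```python
from datetime import datetime, timedelta, timezone

MONTHS = ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"]


def parse_vms_time(text, utc_offset_hours):
    """Parse a VMS absolute time such as '13-APR-1997 11:35:50.75'.

    utc_offset_hours is an assumption supplied by the caller; the VMS
    time string itself carries no time zone information.
    """
    date_part, time_part = text.split()
    day, mon, year = date_part.split("-")
    hh, mm, ss = time_part.split(":")
    sec, hundredths = ss.split(".")
    tz = timezone(timedelta(hours=utc_offset_hours))
    return datetime(int(year), MONTHS.index(mon.upper()) + 1, int(day),
                    int(hh), int(mm), int(sec), int(hundredths) * 10000,
                    tzinfo=tz)


# Cluster formation time from the post, taken as GMT+2 (CET with DST).
formed = parse_vms_time("13-APR-1997 11:35:50.75", 2)
formed_utc = formed.astimezone(timezone.utc)

# Seventh anniversary: same instant, seven calendar years later.
anniversary = formed_utc.replace(year=formed_utc.year + 7)
uptime = anniversary - formed_utc

print(formed_utc.strftime("%H:%M:%S GMT"))  # 09:35:50 GMT
print(uptime.days, "days")                  # 2557 days
```

The 2557 days are seven 365-day years plus the two leap days (2000 and 2004) that fall inside the interval.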

On May 5, at the ENSA/Interex Symposium in Muenchen, Anton van Ruitenbeek and I will be presenting it as a case study.
Consider yourselves invited!!

Cheers,

jan
Don't rust yours pelled jacker to fine doll missed aches.
20 REPLIES
Willem Grooters
Honored Contributor

Re: Celebrating!!!

Jan,

Congratulations!

Time for a contest: who has been up longer than that? State the OS as well - it will be VMS only, I guess ;-)

I won't be in Muenchen but would very much like to get the presentation. Will it be available?

Willem
Willem Grooters
OpenVMS Developer & System Manager
Jan van den Ende
Honored Contributor

Re: Celebrating!!!

Willem,

we are trying to do the session for PinkRoccade internally before going to Muenchen as a try-out. If and when we succeed in doing that, I will ask if we can invite you there (argument: have someone that can ask the relevant questions). And maybe the VMS-SIG can be interested? That's all I can promise for now.

jpe
Don't rust yours pelled jacker to fine doll missed aches.
Ian Miller.
Honored Contributor

Re: Celebrating!!!

Any chance of the presentation appearing elsewhere, or can you post some details here?
____________________
Purely Personal Opinion
Willem Grooters
Honored Contributor

Re: Celebrating!!!

Jan,

VMS-SIG: Ok for the technical details, but the actual implications ("we had to convince management that the applications did NOT have to be stopped") are of higher importance OUTSIDE this SIG. Contact chairman and coordinator!

Willem
Willem Grooters
OpenVMS Developer & System Manager
Martin P.J. Zinser
Honored Contributor

Re: Celebrating!!!

Hello Jan,

Congratulations from here too! One of the obvious questions is how much of the infrastructure has been replaced over time. OS upgrades will be the trivial part, as well as adding/removing single nodes. How about the cluster interconnect, is this still the same?

You already have a room booked for the party in 2007?

All the best,

Martin
Uwe Zessin
Honored Contributor

Re: Celebrating!!!

> "we had to convince management that the applications did NOT have to be stopped"

He He - one of my former managers asked me to reboot the systems because they had been running for so long that even processes like ERRFMT had accumulated a noticeable amount of CPU time. Unfortunately the whole cluster locked up after 497.5 days of uptime because some 32-bit counter inside the MicroVAX-3400 wrapped around and VMS was not able to deal with this at that time. :-(
.
Willem Grooters
Honored Contributor

Re: Celebrating!!!

Uwe,

I happen to know Jan and the environment. He made this remark one day in a VMS-SIG meeting. Jan published a description of the environment on OpenVMS.org: http://www.openvms.org/stories.php?story=03/11/28/7758863 (dated Nov. 28th 2003). In that description, you'll see that EVERYTHING has been upgraded during that period and the application was NOT brought down (except for fail-over, I presume).
Willem Grooters
OpenVMS Developer & System Manager
Ganesh Babu
Honored Contributor

Re: Celebrating!!!

Hi Jan,
Congratulations..

Ganesh
Hein van den Heuvel
Honored Contributor

Re: Celebrating!!!


for your entertainment...
---
VMS internal timekeeping is generally 64 bits. However, elapsed time for a
process is kept () as a longword. These times are in 10 ms intervals, not
100 ns like a lot of timekeeping, so 31 bits lasts about 248 days (8ish months). Then it
wraps into bit 31 and becomes an abs time, as you see.

---
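
As a rough check on the numbers in the note above, here is a small Python sketch of how long a counter of 10 ms ticks lasts before it runs out of bits. The function name wrap_days is made up for this illustration; only the 10 ms tick size and the 31-bit and 32-bit widths come from the thread.

```python
TICKS_PER_SECOND = 100      # one tick per 10 ms, as described above
SECONDS_PER_DAY = 86400


def wrap_days(bits):
    """Days until a counter of 10 ms ticks with `bits` usable bits overflows."""
    return (1 << bits) / (TICKS_PER_SECOND * SECONDS_PER_DAY)


for bits in (31, 32):
    print(f"{bits} bits: {wrap_days(bits):.2f} days")
# 31 bits: 248.55 days
# 32 bits: 497.10 days
```

The 32-bit figure lines up with the lock-up after roughly 497 days of uptime mentioned earlier in the thread.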

$ show process /accounting /id=20800069

24-MAR-2004 17:13:21.34 User: SETIATHOME

Accounting information:
Buffered I/O count: 27892722
Direct I/O count: 23006056
Images activated: 600
Elapsed CPU time: 26-JUN-1859 21:54:19.90 <----- :-)
Connect time: 294 03:12:45.05

-----
A workaround/fix has been submitted to at least use 32 bits, not 31, expanding the range to 500 days or so. Yeah... Yeah... not enough for some, but how many other OSes do you know that would come close to this problem huh? (Plus... you can sort of back-track to see what the real value is based on the current + wrap-around :-).
------
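
The back-tracking can be sketched in Python as follows. This is only an illustration under a stated assumption: that once bit 31 is set, the date shown is the VMS base date (17-NOV-1858) plus the two's-complement remainder, i.e. (2^32 minus the real tick count) ticks of 10 ms. The name recover_cpu_time is hypothetical, and the sketch is not taken from any VMS source.

```python
from datetime import datetime, timedelta

VMS_BASE_DATE = datetime(1858, 11, 17)   # origin of VMS absolute times
TICK = timedelta(milliseconds=10)        # CPU time is counted in 10 ms ticks
WRAP_TICKS = 1 << 32                     # size of a 32-bit longword


def recover_cpu_time(displayed):
    """Recover the real CPU time from a wrapped, date-like display.

    ASSUMPTION: displayed == VMS_BASE_DATE + (2**32 - real_ticks) * TICK.
    """
    shown_ticks = (displayed - VMS_BASE_DATE) // TICK
    real_ticks = WRAP_TICKS - shown_ticks
    return real_ticks * TICK


# "Elapsed CPU time: 26-JUN-1859 21:54:19.90" from the SHOW PROCESS output above
shown = datetime(1859, 6, 26, 21, 54, 19, 900000)
print(recover_cpu_time(shown))   # 275 days, 4:33:33.060000
```

Under that assumption the process would have used about 275 days of CPU, which at least sits plausibly below its connect time of 294 days.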

XXXX> sho proc/id=53800128/acc

13-APR-2004 16:13:13.87
Process name: "SETI@home 39% "

Accounting information:
Buffered I/O count...
:
Images activated: 1
Elapsed CPU time: 495 05:30:43.16
Connect time: 2 04:07:56.31
XXXXX>

They are looking into promoting PHD$L_CPUTIM to a quadword in a future release.


:-).

Hein.
Terry Yeomans
Frequent Advisor

Re: Celebrating!!!

About time you did some patching then!
I bet the cluster won't stay up for another 7 years after a big patch update!
Yours Terry.
Jan van den Ende
Honored Contributor

Re: Celebrating!!!

So now,

time to try and give some answers.

Ian: no plans yet, but: Que sera, Sera.

Willem: as you know, my co-presenter is also SIG-coordinator; let's see how things develop.

Martin: yes, EVERYTHING was replaced. The oldest piece of hardware now present is aged just over half the cluster uptime.
Has the interconnect changed? From 10 Mb Ethernet early on, quite soon to multi-mode fiber FDDI, to single-mode FDDI (both with 10 Mb Ethernet fallback), to 100 Mb Ethernet with FDDI as fallback.
Location from 3 meters separation to 7 km.
OS from 6.2 to 7.1-2 to 7.2-1 to 7.3-1.
All DBMS's were upgraded (most of them more than once), as were all applics.
External communication changed from X25/DECnetPlus to TCP/IP (the most regretted change of all).
Enough?

Uwe: never met that one, although I seem to remember that we DID have some VAX with node-uptime approaching 2 years (5.5-2)
Hein: in view of your story: Is that possible, or is my memory exaggerating things?

Terry: my answer to Martin should show that for a "mere" patch we don't go down. It's called "rolling upgrade".

The issue of uptime in the future will be a political one: big reorganisations are being prepared, the current viewpoint being to throw it all away and start from scratch.
Obviously not everyone concurs. :-)

All: cluster uptime is a nice statistic, but of course what really matters is application uptime AND reachability.
Those statistics vary widely, although for the total running time we never lost ALL applics at the same time.
Most users nowadays have a (ugh) MS desktop, usually via Citrix, with terminal emulation for VMS (and *IX) access. If that breaks down once again, obviously those users lose their VMS apps also. That's why some department insisted on staying on VT's. The call-room and the car-to-callroom-communication also don't use Billyware.
The app with the best statistics is an RMS app (~ 4 M records in the various 'tables') of the VT-using department that was NEVER down... Except for the month of January 2002.
The app has a large financial aspect, and the supplier succeeded in NOT having the Euro version available until one month late..
Second best is a DBMS app which has been out for a day three times during upgrade conversions.

This will have to do for the moment.

Jan



Don't rust yours pelled jacker to fine doll missed aches.
Sanjiv Sharma_1
Honored Contributor

Re: Celebrating!!!

Congratulations Jan!!

Thanks, Sanjiv
Everything is possible
Martin P.J. Zinser
Honored Contributor

Re: Celebrating!!!

Hello Jan,

I think I also speak for the rest of us here when I thank you for sharing your story with us. I actually did expect that you changed all the bits and pieces in your environment, but as FUD spreaders tend to belittle such achievements by claiming "it's just a static box sitting in a corner" it is very nice to have this on record.

All the best and keep it running!

Martin
Antoniov.
Honored Contributor

Re: Celebrating!!!

Jan,
congratulations from me too.
Here some of my customer's users still use VT terminals without Windoze; I'm happy not everybody follows Billyware's choices!

@Antoniov
Antonio Maria Vigliotti
Jan van den Ende
Honored Contributor

Re: Celebrating!!!

Antoniov,
Well. -- I -- definitely am one of those guys who insist on using a VT!! It's much less tiring for my eyes, for one thing.

But you have to admit Billy G has achieved a lot, and not only financially:
- "Everybody" now knows that from using computers you get RSI.
- "Everybody" is used to 'computers' failing regularly, and then "you just hope you have not lost much"
- "Everybody" knows that a computer or an operating system older than last year "is from the Stone Age", and not fit for today's use.

Except of course those crazy people who keep saying that that's NOT true if you stay away from M$s*t...

:-(

Enjoy!

Jan
Don't rust yours pelled jacker to fine doll missed aches.
Uwe Zessin
Honored Contributor

Re: Celebrating!!!

Jan,
> Uwe: never met that one, although I seem to remember that we DID have some VAX with node-uptime approaching 2 years (5.5-2)

It was a problem in a specific version of OpenVMS that only happened on the MicroVAX-3400 as I recall.

I have now read your story and join the club: "well done". It is a bit over 7.5 years now that I have worked as a full-time system manager, but at the last position I was responsible for a few small clusters scattered over Germany, so I think I have an idea what your job is about.
.
Ian Miller.
Honored Contributor

Re: Celebrating!!!

A couple of other links
http://uptimes.hostingwired.com/stats.php?op=all&os=OpenVMSCluster
http://uptimes.hostingwired.com/stats.php?op=all&os=OpenVMS

and overall there is a VMS Cluster at No.2
http://uptimes.hostingwired.com/stats.php?op=active

but its uptime is nowhere near yours.

____________________
Purely Personal Opinion
Jan van den Ende
Honored Contributor

Re: Celebrating!!!

Ian,
I'm not fully sure, but I --think-- they are measuring NODE uptimes, i.e., time since last reboot. THERE we obviously don't score that high: Alpha ES series didn't even EXIST 7 years ago! (Neither did VMS 7.3-1). They even state you are considered bogus if you report boot times for OSes or hardware that didn't exist.
To all who will be in Muenchen: we don't plan a formal party, but wouldn't it be great to organise a "Now at least I know your faces" get-together (maybe in a Biergarten)?

Jan
Don't rust yours pelled jacker to fine doll missed aches.
Keith Parris
Trusted Contributor

Re: Celebrating!!!

Glad to see your session accepted for Munich. Unfortunately, the post-conference seminars all got cancelled and now I won't get to be there. But have fun without me!

Have you considered submitting this session for HP World 2004 in the USA? See http://www.hpworld.com/conference/hpworld2004/hpw04_speak_01.html Deadline is April 23.
Jan van den Ende
Honored Contributor

Re: Celebrating!!!

Keith,

sorry to miss you in Muenchen.
And the HP World Chicago question was asked before, but that is impossible without a sponsor for travel & accommodation expenses.
We already needed special permission to go over the allocated budget as it is!

We'll meet again, maybe at the TUD in October?


Jan
Don't rust yours pelled jacker to fine doll missed aches.