- Re: How to quantify uptime of MC Service Guard clu...
01-28-2005 05:27 AM
99.5
99.99
or what?
I need an actual doc - one of the clusters I have has an SLA of 99.3, but last year we were at 99.91
Thanks...Geoff
01-28-2005 05:42 AM
Re: How to quantify uptime of MC Service Guard cluster
Bill Hassell, sysadmin
01-28-2005 06:20 AM
Re: How to quantify uptime of MC Service Guard cluster
Unplanned downtime should be zero, but if it occurs, note the start, stop and length. Also note the reason so you can make it preventable next time.
Planned downtime for upgrades and such does not count.
If it's not downtime, it's uptime for calculation purposes.
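As a minimal sketch of that bookkeeping, assuming each outage is logged with a start, a stop, a reason and a planned/unplanned flag. The `Outage` record layout and the sample entries are made up for illustration; they do not come from any Serviceguard tool.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Outage:
    start: datetime
    stop: datetime
    reason: str
    planned: bool

    @property
    def length(self) -> timedelta:
        return self.stop - self.start


def uptime_percent(outages, period_start, period_end, count_planned=False):
    """Uptime over the period; planned outages are ignored unless count_planned is True."""
    period = period_end - period_start
    down = sum(
        (o.length for o in outages if count_planned or not o.planned),
        timedelta(),
    )
    return 100.0 * (1 - down / period)


# Hypothetical log entries, for illustration only.
log = [
    Outage(datetime(2004, 3, 6, 2, 0), datetime(2004, 3, 6, 6, 0), "OS patching", planned=True),
    Outage(datetime(2004, 9, 14, 10, 5), datetime(2004, 9, 14, 10, 30), "package failover", planned=False),
]

year_start, year_end = datetime(2004, 1, 1), datetime(2005, 1, 1)
print(f"unplanned only: {uptime_percent(log, year_start, year_end):.4f}%")
print(f"all downtime:   {uptime_percent(log, year_start, year_end, count_planned=True):.4f}%")
```

Keeping the planned flag in the log, rather than dropping planned outages at entry time, lets the same records answer both questions if the SLA definition is later disputed.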
SEP
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
01-28-2005 06:36 AM
Re: How to quantify uptime of MC Service Guard cluster
Environment:
Data centre has a redundant power grid, redundant PDUs, as well as diesel.
The servers themselves are completely redundant as well: 2-node cluster, multiple LAN cards, redundant networks, multiple paths to SAN, etc., as well as Mission Critical support with HP.
We also have in place a 48-hour DR plan to another city; that is, we will be up and running in less than 48 hours with this system in the event of a disaster (tested every year).
Thanks...Geoff
01-28-2005 07:37 AM
Solution
It's unlikely you'll find any kind of guarantee anywhere; the only reference I could find was in this rather old doc:
http://docs.hp.com/en/223/sgdtwb.pdf
This indicates that 99.8% to 99.998% could be achievable with Serviceguard. Of course, it all depends on how you measure your uptime.
One automated way of keeping track of this is the little-used foundation monitor toolkit, which is (or was?) part of the Enterprise Cluster Master Toolkit:
http://docs.hp.com/en/B5139-90038/B5139-90038.pdf
HTH
Duncan
I am an HPE Employee

01-28-2005 07:46 AM
Re: How to quantify uptime of MC Service Guard cluster
Anyhow, Slide 1-7 lists an availability of 99.95% for a 2-node cluster running MC/SG assuming a 10-minute package failover. When you get up to these levels, one of the most significant factors is the failover times of the packages themselves.
I can say that I am at 5.5+ years of zero unplanned downtime using MC/SG, redundant networks, redundant HVAC, generator, ...
01-29-2005 08:56 AM
Re: How to quantify uptime of MC Service Guard cluster
So last year you had 7hr 54mins 20sec of downtime! This is quite a lot of time; does it include all the planned downtime? The SLA requires no more than 61 hours of downtime per year. So I'd say job done.
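For reference, these figures can be reproduced with a short calculation. This is a sketch that assumes the year in question was 2004, a leap year of 8,784 hours; the function names are illustrative.

```python
def downtime_seconds(availability_pct, hours_in_period):
    """Downtime implied by an availability percentage over a period."""
    return round(hours_in_period * 3600 * (100 - availability_pct) / 100)


def as_hms(seconds):
    """Split a number of seconds into (hours, minutes, seconds)."""
    hours, rest = divmod(seconds, 3600)
    minutes, secs = divmod(rest, 60)
    return hours, minutes, secs


LEAP_YEAR_HOURS = 366 * 24  # 2004, an assumption

measured = as_hms(downtime_seconds(99.91, LEAP_YEAR_HOURS))
allowed = as_hms(downtime_seconds(99.3, LEAP_YEAR_HOURS))
print("measured 99.91%%: %dh %dm %ds" % measured)
print("SLA 99.3%%:       %dh %dm %ds" % allowed)
```

With a 365-day year the measured figure comes out about a minute lower, so the choice of period is worth stating whenever such numbers are quoted.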
To me, ServiceGuard is only one element in achieving high availability. External to SG:
o Network. No matter how resilient SG is, if you have a network broadcast storm you are stuffed.
o SCSI devices. Most environments have a database involved. If so, there are usually some form of logs and log archiving. SG may not be set up to detect this, and so it is quite easy for a DB to freeze because the log archive mechanism has failed.
o Storage devices. Most SG clusters make use of some form of shared or SAN storage. Again, a failure on this will cause the whole thing to be out of service.
o Application outage due to peak loading etc.
o Assurances about the backup generator and batteries
o Floods, fire, earthquakes, terrorists etc.
o and so on...
So in your search for a document, you need to find out what all the other things are to be able to derive a meaningful and achievable SLA. You may want to exclude certain items above (if so, explain why!!), but there are enough items on the list to mean that it is not just the availability of SG that determines the SLA but the combination.
To get to my point: if you have measured an availability of 99.91, then this is more meaningful than an academic exercise of gathering all the numbers for other potential sources of outage. You might want to analyse the source of these outages (e.g. 1 failover took 25 mins; 10 application patches of 40 mins each; 55 mins network storm), then concentrate on reducing the worst (10 x 40 mins of application patching would be the one!!!).
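The kind of analysis suggested here can be sketched in a few lines. The causes and minutes are the example figures from the paragraph above, not real measurements.

```python
from collections import Counter

# (cause, minutes) pairs from the example above; illustrative only.
outages = (
    [("failover", 25)]
    + [("application patching", 40)] * 10
    + [("network storm", 55)]
)

# Add up the minutes lost to each cause.
minutes_by_cause = Counter()
for cause, minutes in outages:
    minutes_by_cause[cause] += minutes

# List causes from worst to least, with each one's share of the total.
total = sum(minutes_by_cause.values())
for cause, minutes in minutes_by_cause.most_common():
    print(f"{cause:22s} {minutes:4d} min  {100 * minutes / total:5.1f}%")
print(f"{'total':22s} {total:4d} min")
```

The example adds up to 480 minutes, of which application patching accounts for 400, so that is where the effort pays off first.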
Regards
Tim
01-29-2005 09:56 PM
Re: How to quantify uptime of MC Service Guard cluster
"Planned downtime for upgrades and such do not count."
I sincerely beg to differ on that!!
Try running a police call-room and tell the callroom manager: "Well, next weekend we will be doing an upgrade. Expect your callroom systems to be unavailable for 6 hours minimum, 12 hours max."
In the last 10 years, we (actually, the NETWORK department) had to do that once, and it took 3 months of planning, all kinds of temporary measures, LOTS of extra manpower, and still the callroom manager had the right to veto up until the last minute.
_WE_ have to come up with good explanations for hiccups of minutes, and then plans for how to prevent or circumvent that next time!
I do recognise the reasoning.
Lately we had a national Interex symposium day on uninterrupted computing, and one session was by a bank, about how they set up their environment.
When it transpired that they defined "100% uptime" as Monday-Saturday 07:00-22:00, batch processing and backups at night, maintenance on Sundays, the man was violently attacked by the majority of the audience.
In the discussion afterward the audience agreed that 90 hours out of 168 does not even constitute 60% uptime!
Then again, 90 hours is all they needed, and if that is satisfied, GREAT! Only, do NOT try to sell that as 100%.
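The bank's arithmetic is easy to check. A minimal sketch, assuming the service window is exactly Monday to Saturday, 07:00 to 22:00, as described; the function name is illustrative.

```python
HOURS_PER_WEEK = 7 * 24  # 168


def window_hours(days, start_hour, end_hour):
    """Hours per week covered by a daily service window."""
    return days * (end_hour - start_hour)


service = window_hours(days=6, start_hour=7, end_hour=22)
share = 100 * service / HOURS_PER_WEEK
print(f"{service} of {HOURS_PER_WEEK} hours = {share:.1f}% of the calendar week")
```

Full availability inside that window is therefore roughly 53.6% when measured against the clock, which is why an uptime figure means little unless the denominator is stated with it.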
Geoff,
by my math, 48 hours is about 0.55% of a year (48 / 8,760). If you do a yearly rehearsal, is it so realistic that it takes down your production? In that case, your attainable upper limit is about 99.45%, barring any other unavailability.
As Bill started the first answer in this stream:
it depends HOW you define downtime.
Recently, in another stream,
http://forums1.itrc.hp.com/service/forums/questionanswer.do?threadId=793109
I did a breakdown of various (we hope: most, or all) points of view on downtime.
Especially when running many (sometimes interlinked) apps, accessed from a wide area by a segmented network, from WBTs via Citrix desktop servers, UPTIME is in the eye of the beholder!
The only thing strictly measurable is SERVER uptime, but to remote users of some application, that is NOT what THEY perceive!
hth,
Proost.
Have one on me.
Jan
01-29-2005 11:25 PM
Re: How to quantify uptime of MC Service Guard cluster
Maybe they suspect that they haven't been maintaining it properly, or don't trust their own testing.
So maybe you need a schedule to switch it over now and again to keep management's confidence in availability.
But as has been said above, ultimately it's the end user who defines availability.
01-31-2005 01:51 AM
Re: How to quantify uptime of MC Service Guard cluster
Thanks for the pointers to the docs and the course material (I have that too - didn't think to check it).
Rgds...Geoff
01-31-2005 03:30 AM
Re: How to quantify uptime of MC Service Guard cluster
Have a look at
http://www.amazon.com/exec/obidos/ASIN/1587130173
Up to now I have not found a more professional book on calculating system availability, and even though it's a Cisco book it is not at all focused on routers.
They also write about how to include part MTBFs and all such, so in the end you'll have a really good calculation.
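The usual starting point for that kind of calculation is steady-state availability, MTBF / (MTBF + MTTR), multiplied in series for components that must all work and combined in parallel for redundant ones. A minimal sketch, assuming independent failures and instantaneous failover; the MTBF and MTTR figures are invented for illustration and are not taken from the book or from any HP data sheet.

```python
from math import prod


def availability(mtbf_hours, mttr_hours):
    """Steady-state availability of one component."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def in_series(*parts):
    """All parts must be up: multiply the availabilities."""
    return prod(parts)


def in_parallel(*parts):
    """Any one part suffices: 1 minus the product of the unavailabilities."""
    return 1 - prod(1 - a for a in parts)


# Hypothetical figures, for illustration only.
node = availability(mtbf_hours=10_000, mttr_hours=8)
storage = availability(mtbf_hours=50_000, mttr_hours=4)

cluster = in_series(in_parallel(node, node), storage)
print(f"single node: {100 * node:.4f}%")
print(f"two nodes plus shared storage: {100 * cluster:.4f}%")
```

In this model the redundant pair of nodes stops being the limiting factor and the shared, non-redundant component sets the ceiling. A real cluster would also need a term for package failover time, which this sketch leaves out.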
02-24-2005 08:21 PM
Re: How to quantify uptime of MC Service Guard cluster
In my experience with writing and being involved in SLAs, it all depends on what the business wants. If you are a bank then you would want no 'unplanned' downtime, i.e. a 100% SLA. But if you are a 9-5, office-hours-only business then 99.5% is more appropriate and realistic.
The important thing is to 'plan' all changes and downtime/maintenance slots, and not to make changes on the fly, as this will only come back to bite you and cause an unplanned outage.
Chad.