Business Recovery Planning

DRP

 
Madanagopalan S
Frequent Advisor

DRP

I am preparing DRP documents and procedures for my
production server. I expect many of you have done
this before. I would appreciate your suggestions and advice
on this. Please point me to any web documents on DRP if
you know of some.
let Start to create peaceful and happy world
Mike McKinlay
Honored Contributor
Solution

Re: DRP

If this is only for your production server, then consider providing the following:

1. Complete hardware configuration.
2. Complete software configuration.
3. Step-by-step plans for restoring the complete system from backup.

Keep in mind that DR is relative. What is a disaster to your boss (in your absence) might be resolved with a simple reboot. Items #1-3 are useful in terms of a complete system meltdown, but consider what kinds of things are more likely to happen -- files get deleted, databases get corrupted, services hang or stop. A DR plan is not meant to be a replacement for you -- so don't kill yourself trying to document every possible scenario.

Do give some thought to your most common issues with this system, especially if you're very familiar with it.

#4 List the top five most common issues and quick ways to determine if that is the problem and how to resolve it.

#5 Train someone in the proper use of the backup program in use and how to replace specific files, etc. If possible, practice a full restore on a test system if you have one.

DR is no good if you don't test it out.

#6 Review the DR every six months or whenever changes are made to the system to make sure it's kept up to date.
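
As a rough illustration of the review rule in #6, here is a minimal Python sketch. The function name, the 182-day interval, and the idea of tracking a "last system change" date are assumptions made for the example, not part of any particular tool.

```python
from datetime import date, timedelta

# Assumption: "every six months" is approximated as 182 days.
REVIEW_INTERVAL = timedelta(days=182)


def review_due(last_review: date, last_system_change: date, today: date) -> bool:
    """Return True when the DR document should be reviewed again."""
    # The system changed after the document was last reviewed.
    changed_since_review = last_system_change > last_review
    # The regular review interval has elapsed.
    interval_elapsed = today - last_review >= REVIEW_INTERVAL
    return changed_since_review or interval_elapsed
```

A check like this could run from a scheduled job and mail a reminder, so the review does not depend on someone remembering it.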
"Hope springs eternal."
Tim Malnati
Honored Contributor

Re: DRP

I have to agree with Mike on this. Too often organizations spend all their time and money preparing for the unlikely 'end of the IT world' scenario and completely ignore the low and intermediate level disasters that usually have much more impact in the real world. One particular disaster to keep in mind is a total loss of power in the data center. Even with all the UPSes, generators, and the like, this sort of thing actually happens more frequently than most would suspect. In this situation, recovery may not mean completely restoring a machine from scratch, but will probably require a well-planned corruption verification instead.

Although the topic is your production server, other 'low level' servers providing support to your organization may be just as important (or more important). Disaster recovery is a coordinated effort. What good is recovering your production machine if the little Linux box in the corner providing DHCP/DNS, etc. is down for the count as well? The organization needs to define what the business priorities are and the overall disaster plan needs to include everything that impacts the recovery of the business. In many cases, getting email back online is more important to a business than the big machine application.
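
One way to make those dependencies explicit is to write them down and let a topological sort produce a recovery order. The sketch below is only an illustration: the service names and the dependency map are hypothetical, and it assumes Python 3.9 or later for the standard-library `graphlib` module.

```python
from graphlib import TopologicalSorter


def recovery_order(depends_on: dict[str, set[str]]) -> list[str]:
    """Return services ordered so that each one comes after its dependencies."""
    # Keys are services; values are the services they need running first.
    return list(TopologicalSorter(depends_on).static_order())


# Hypothetical example: both email and the production application
# need the small DHCP/DNS box to be up before they are usable.
services = {
    "dns-dhcp": set(),
    "email": {"dns-dhcp"},
    "production-app": {"dns-dhcp"},
}
order = recovery_order(services)
```

The sort only captures technical dependencies. Which of the remaining services comes back first is still a business-priority decision.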

Another very big piece of the pie is backup media. Having all the aspects of this well defined is critical. A common thing I see is organizations immediately sending tapes off to a vault when, in the real world, they are more likely to be needed in the data center for a day first. MAKE SURE THE BACKUPS ARE VERIFIED! I would not want to count how many times a client has asked me to attempt to recover an environment that has a few corrupt files and the backups are missing the data as well (due to an open file or whatever).
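
A simple form of verification is to restore to a scratch directory and compare checksums against the originals. The following is a minimal sketch under that assumption; the function names are made up for the example, and it does not replace whatever verify option the backup product itself provides.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Compute the SHA-256 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_restore_problems(source_dir: Path, restored_dir: Path) -> list[str]:
    """List source files that are missing from, or differ in, the restored copy."""
    problems = []
    for source_file in sorted(p for p in source_dir.rglob("*") if p.is_file()):
        relative = source_file.relative_to(source_dir)
        restored_file = restored_dir / relative
        if not restored_file.is_file():
            problems.append(f"missing: {relative}")
        elif sha256_of(source_file) != sha256_of(restored_file):
            problems.append(f"differs: {relative}")
    return problems
```

An empty result means every source file was found in the restore with identical contents. Files that changed after the backup ran will also show up as "differs", so the comparison is most meaningful against a quiesced or snapshot copy of the source.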
Mike McKinlay
Honored Contributor

Re: DRP

Tim -- thanks for agreeing with me!

One disaster I've never seen discussed that happens all too often is loss of key individuals. Few companies have a plan in that regard. You need to back up your human systems, too.
"Hope springs eternal."
Dave Wherry
Esteemed Contributor

Re: DRP

Let me give you an example of Mike's last comment on redundant human systems.
I was out of town this past weekend and I am the only Sys. Admin. Our QA server crashed. Not critical, but, still needed to be taken care of. My backup came in and "looked" at the system. Couldn't figure it out so went home. Did not even call HP.
I don't expect the backups to have as much experience as I do. That's why we have maintenance agreements. Pick up the phone.

One suggestion I have for you is to not get too bogged down in possible scenarios and too many details. I was on a team writing a DRP once and at each week's status meeting some manager would bring up "What if this happened or that happened?" You are not going to cover every scenario and every detail; it never ends.
Work on processes like good backups and being able to recover from them. That will bring you back from the majority of your disasters. A restorable copy of current data is your greatest asset. Most everything else depends on how much money the company wants to spend. Redundant systems, generators, etc. cost money. I like to say it is similar to a scale. On one side is your paranoia, on the other is the budget. Each company has to find its own balance.

One other thing, I highly recommend the current off-site resume. Most companies without a DRP never recover from a major disaster like the loss of a datacenter.
Rita C Workman
Honored Contributor

Re: DRP

To respond to the human issue that Mike mentioned.

I've had a similar problem. Only UNIX person here, and no backup. What I was finally able to get approval on was that I selected someone to start training as my backup.
For me, I picked a pretty smart young DBA, who also has a good head on his shoulders.
I think allowing input in the selection has helped, because there are always persons wanting to be the 'Admin', but not wanting to take the responsibility and work that goes with it.
Now since I'm one for documentation, a lot of issues are covered. And documenting is definitely something you need to have done.
One of the things I plan to do in training him is to have him take care of the system (..I just watch..) and have him decide how to address things. I want him to place the calls to HP for support, this way he'll be used to doing it.
So I guess, alongside hardware and data restoration (we chose to go with a remote site using MC/SG and SRDF for data mirroring to avoid disaster downtime), have your key people backed up by people who can handle it, or at least know where to go to find the answers.
People will be the ones to fix it, so make sure they are prepared for it.
Joseph T. Wyckoff
Honored Contributor

Re: DRP

 
Omniback and NT problems? double check name resolution, DNS/HOSTS...
Joseph T. Wyckoff
Honored Contributor

Re: DRP

In California we now have the concept of rolling blackouts on a daily basis - it is not that bad yet, but it could be.

Add that to your list of things to think about - not just the servers either. In my case above all of my servers were protected, as was my router... but guess what, my hubs and switches were not. Actually some were...but the batteries had like 3 minutes of life because maintenance had been deferred...

What would happen if all of your servers worked, but could not talk to each other? Take another look at your network diagram, this time with an idea of heat/cooling, and power, and alternate routing...

Do you have enough tape drives? Are there any oddball drives (one of a kind)?

It seems about once a week I see a call where a tape drive is unavailable (different site, failed drive, etc) and a recovery needs to be done. Make sure that when you buy a drive technology, you buy more than one at each site...
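
If you keep even a simple inventory of drives per site, spotting the one-of-a-kind cases can be automated. This is a minimal sketch; the inventory layout (a list of site and drive-type pairs) and the function name are assumptions for illustration.

```python
from collections import Counter


def single_drive_risks(drives: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """Return (site, drive_type) pairs where only one drive of that type exists."""
    # Each inventory entry is one physical drive: (site, drive_type).
    counts = Counter(drives)
    return sorted(pair for pair, count in counts.items() if count == 1)


# Hypothetical inventory: one entry per physical tape drive.
inventory = [
    ("HQ", "DLT"),
    ("HQ", "DLT"),
    ("HQ", "DDS-3"),
    ("Branch", "DLT"),
]
risks = single_drive_risks(inventory)
```

With this sample inventory the lone DDS-3 drive at HQ and the lone DLT drive at the branch would both be flagged, since a single failure at either site leaves tapes of that format unreadable there.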

Is your drive backward/forward compatible? The only way you will know is to test...