fung bing kwan
Occasional Contributor

Raid-5 is secure or not in tru64 with advanced file system

Dear all

My server uses a RAID-5 set on an SWXCR RAID controller, running HP Tru64 UNIX, version 4.0. About a week ago, after a reboot, one hard disk appeared to have died, and I can no longer mount the corresponding mount point. The Oracle database located on that disk cannot be started. The system reported that it was degraded. I am confused: why would the failure of a single hard disk make the system unavailable? (The mount point still exists, but there is nothing inside it.) Also, what happens if I replace the bad disk with a new one? The other database has kept running during the past week. Can the RAID-5 set rebuild itself even though there is a one-week gap?

Thanks in advance!

Kwan
5 REPLIES
Han Pilmeyer
Esteemed Contributor

Re: Raid-5 is secure or not in tru64 with advanced file system

If you only lost a single disk in a RAID-5 set, then the data should still be available. The danger is always that you already lost another disk earlier and didn't notice it. Sometimes SWXCR controllers drop a LUN after some problems even though there is no real problem (e.g. after power up problems with the disks).

I would use RCU to check the status of the disks. Perhaps you only need to re-enable the LUN. Of course the failing disk (if it is a permanent failure) needs to be replaced.
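
To illustrate why a single lost disk should not cost you data, here is a minimal Python sketch of the XOR parity idea behind RAID-5. It is only an illustration, not how the SWXCR firmware is implemented: the block contents and the names `xor_blocks` and `rebuild_missing` are made up for the example, and a real RAID-5 set rotates the parity block across all disks rather than keeping it on one.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte position by byte position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(surviving_blocks):
    """Recover the single missing block of a stripe.

    Assumes exactly one block of the stripe is missing and that
    surviving_blocks holds all the others (data and parity).
    """
    return xor_blocks(surviving_blocks)


# Hypothetical stripe: three data blocks plus one parity block.
data = [b"ORCL", b"DATA", b"FILE"]
parity = xor_blocks(data)

# One disk lost (the one holding data[1]): the block can be recomputed.
recovered = rebuild_missing([data[0], data[2], parity])
print(recovered == data[1])

# Two disks lost (data[0] and data[1]): the survivors only yield the XOR
# of the two missing blocks, so neither block can be recovered on its own.
leftover = xor_blocks([data[2], parity])
print(leftover == xor_blocks([data[0], data[1]]))
```

With one block missing, XOR-ing the survivors gives back the lost block, which is why the set keeps running in a degraded state. With two blocks missing, the same calculation only gives the XOR of the two lost blocks, which is why an earlier, unnoticed failure is the real danger.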
fung bing kwan
Occasional Contributor

Re: Raid-5 is secure or not in tru64 with advanced file system

Dear

Thanks for your valuable advice. However, I am not familiar with the Tru64 system. Could you briefly explain how to check the disks and re-enable the LUN?

Thanks

Kwan
Han Pilmeyer
Esteemed Contributor

Re: Raid-5 is secure or not in tru64 with advanced file system

If I remember correctly, then you need to use the RCU (RAID Configuration Utility) for this type of array. It came on a floppy with the hardware and you have to run it from the console.

I don't believe that this array had an online utility, but I could be wrong. It's been a very long time since I had to do this.
Johan Brusche
Honored Contributor

Re: Raid-5 is secure or not in tru64 with advanced file system


If the platform kit was installed correctly at the time, then swxcrmon is the program that monitors the SWXCR, and swxcrmgr is the GUI that lets you manage it from the OS.

If swxcrmon was properly configured, it should have sent mail to 'someone' about the failing RAID.
The binaries are normally installed in /usr/opt/swxcr*

The 'other' db instance is probably on another RAID logical unit, in another filesystem with another mount point.

Once the offline disks are replaced or re-enabled in their RAID set, you will have to restore the data of the non-running instance from backup (tape?), and hopefully the Oracle redo logs can then help to bring the db data up to date.

Explaining all this in detail goes well beyond what is possible in this forum. (It does not replace the manuals.)

___ Johan.

_JB_
Michael Schulte zur Sur
Honored Contributor

Re: Raid-5 is secure or not in tru64 with advanced file system

Hi,

As said before, one failed disk should not kill the logical drive. You can use swxcrmgr to see the status of the disks involved. With that tool you can also set the disk back to optimal, with all the risks that carries, or try to rebuild the set with a replacement disk.
What error does a mount attempt return?

greetings,

Michael