
RHEL4 System unresponsive when a disk crash on smartarray

Hello,

Here is my problem:

On a BL25p G1 I have a Smart Array 6i with the battery-backed write cache and 2 x 36 GB disks in a mirror.
A few days ago one of my disks crashed and the system stopped responding.
It was still pingable, but SSH and everything else no longer worked.

Here is what I have in the messages log:

Nov 18 21:48:38 servername Event Log Daemon:[4412]: SCSI bus speed downshift occurred:Port 0 Box 1 of Embedded Array Controller.
Nov 18 21:49:23 servername last message repeated 2 times
Nov 18 21:49:23 servername Event Log Daemon:[4412]: Physical drive failed: SCSI Bus 1 Target 1 of Embedded Array Controller.
Nov 18 21:49:27 servername Event Log Daemon:[4412]: Logical drive 1 of Embedded Array Controller, has changed from status OK to INTERIM RECOVERY MODE
Nov 20 13:47:35 servername syslogd 1.4.1: restart.

The system stopped responding just after the disk failed.
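
To put a number on how long the box stayed silent, the log excerpt above can be fed to a small script. This is only a minimal sketch: the function names `parse_line` and `silence_before_restart` are made up for illustration, the year is an assumed placeholder (classic syslog timestamps carry none), and month names are assumed to be in English.

```python
import re
from datetime import datetime

# Classic syslog prefix: "Nov 18 21:48:38 host rest-of-message"
LINE_RE = re.compile(r"^(\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) (\S+) (.*)$")


def parse_line(line, year=2000):
    """Return (timestamp, host, message), or None if the line does not match.

    `year` is an assumed placeholder because syslog lines omit the year.
    """
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    stamp, host, message = m.groups()
    ts = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
    return ts, host, message


def silence_before_restart(lines, year=2000):
    """Return the gap between a syslogd restart and the entry just before it."""
    previous = None
    for line in lines:
        parsed = parse_line(line, year)
        if parsed is None:
            continue
        ts, _host, message = parsed
        is_restart = message.startswith("syslogd") and "restart" in message
        if is_restart and previous is not None:
            return ts - previous
        previous = ts
    return None  # no restart entry found


# The five lines quoted from the messages log above.
SAMPLE = [
    "Nov 18 21:48:38 servername Event Log Daemon:[4412]: SCSI bus speed downshift occurred:Port 0 Box 1 of Embedded Array Controller.",
    "Nov 18 21:49:23 servername last message repeated 2 times",
    "Nov 18 21:49:23 servername Event Log Daemon:[4412]: Physical drive failed: SCSI Bus 1 Target 1 of Embedded Array Controller.",
    "Nov 18 21:49:27 servername Event Log Daemon:[4412]: Logical drive 1 of Embedded Array Controller, has changed from status OK to INTERIM RECOVERY MODE",
    "Nov 20 13:47:35 servername syslogd 1.4.1: restart.",
]

if __name__ == "__main__":
    print(silence_before_restart(SAMPLE))
```

On the excerpt above this reports a gap of 1 day, 15:58:08 between the INTERIM RECOVERY MODE entry and the syslogd restart, with nothing logged in between.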

Red Hat thinks the problem is hardware related.

Has anyone seen the same behaviour or problem?
Any ideas?

Regards,

Charles Castelain
Jean-Yves Picard
Trusted Contributor

Re: RHEL4 System unresponsive when a disk crash on smartarray

Hello,

The most important information is missing:
how is your array configured?

I have had several disk failures in BL20p G3 blades without any problem on Linux.
I had the 2 physical disks set up as 3 mirrored partitions (3 logical disks seen by Linux).
I run ProLiant Support Pack 7.1 or 7.4.
My Linux is Red Hat Enterprise Linux 3 Update 2 or RHEL4 U0.

Jean-Yves Picard

Re: RHEL4 System unresponsive when a disk crash on smartarray

Hello,

On a BL25p G1 I have a Smart Array 6i with the battery-backed write cache and 2 x 36 GB disks in a mirror.

And for the filesystems:

/boot: 100 MB, primary partition
swap: 8 GB, primary partition
/: 8 GB, primary partition
the rest as vg00

Regards,

Charles Castelain
TANHM
Advisor

Re: RHEL4 System unresponsive when a disk crash on smartarray

Hi Charles,

I think you can log a call with the vendor, ask them to perform a diagnostic test on it, and make sure the array controller is configured properly. Please make sure the hardware engineer has good knowledge of Linux.

I believe 90% of the time this is due to a hardware issue. I have encountered this situation before, and the root cause was a RAID card configuration problem.

Re: RHEL4 System unresponsive when a disk crash on smartarray

Hello,

Yes. This is in my plan, and it is what Red Hat suggested.

But do I need a special support contract, since this system is only covered by its one-year warranty?

Do you often have this problem?

Regards,

Charles Castelain

Re: RHEL4 System unresponsive when a disk crash on smartarray

Hello,

What do I need to check at the configuration level of the Smart Array?

When I set up a server I just use iLO:
Stop the boot to enter the Smart Array configuration.
Choose my RAID setup.
Create my "volume".
Save my config.
Reboot.

Is there something I must verify at the server level?

Then I start my kickstart install.

Right now the only problems I have found are the following:

It is impossible to install if a RAM stick is dead, even when the system claims it has disabled it.
It is impossible to install Red Hat 4 Update 3 on a BL25p via iLO. It crashes the blade when it tries to format the disks.

All the other installations go fine and smoothly.

Regards,

Charles Castelain
TANHM
Advisor

Re: RHEL4 System unresponsive when a disk crash on smartarray

Hi Charles,

Then you need to check with the reseller or HP, depending on whether you bought directly from HP.

I think you just need a normal hardware contract. I believe any new hardware comes with a contract; in your case, I think it is covered, as it is a hardware-related issue (a hardware configuration problem).
George Liu_4
Trusted Contributor

Re: RHEL4 System unresponsive when a disk crash on smartarray

Most problems of that kind are caused by kernel bugs. Try using an older kernel.