ProLiant Servers (ML,DL,SL)

Re: Lost access to Logicaldrive after change broken cache module

 
Ruben_Herold
Frequent Advisor

Lost access to Logicaldrive after change broken cache module

hi,

I have a DL380p Gen8 here. The RAID was configured as RAID 5 + hot spare. It seems that the cache module on the controller died. The controller reports an error at boot.

After replacing the cache module I got this message:

1785-Slot 0 Drive Array not Configured

Configuration information indicates drives were configured on a controller with a newer firmware version. To avoid data loss, reattach drives to original controller or upgrade firmware.

ArrayA.png

As you can see, the controller is on the newest firmware available.
I have booted into SSA via Intelligent Provisioning. It shows the same thing. But in the diagnostic report I found something:

In the section "Smart Array P420i in Embedded Slot -> Identify Controller":
ArrayB.png

Cache Size in MiB 2GB

But in the section "Smart Array P420i in Embedded Slot -> Cache Config Status":

 

ArrayC.png

Total Cache Memory Size 1 GiB...
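
The two reported values can be compared directly once they are in the same unit. The following is only a minimal sketch, assuming that both fields describe the same cache module and that the report's "GB" is meant as GiB (neither is confirmed by the report itself). The helper name `size_to_mib` is made up for this illustration and is not part of any HPE tool.

```python
import re

# Multipliers to MiB. Treating a bare "MB"/"GB" as binary units is an
# assumption made for this comparison only.
_UNIT_TO_MIB = {"MB": 1, "MIB": 1, "GB": 1024, "GIB": 1024}


def size_to_mib(text):
    """Convert a size string such as '2GB' or '1 GiB' to a number of MiB."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*([A-Za-z]+)\s*", text)
    if match is None:
        raise ValueError(f"unrecognized size: {text!r}")
    value, unit = match.groups()
    try:
        factor = _UNIT_TO_MIB[unit.upper()]
    except KeyError:
        raise ValueError(f"unknown unit: {unit!r}") from None
    return float(value) * factor


# Values as quoted from the diagnostic report in this post.
identify_controller = size_to_mib("2GB")    # "Identify Controller" section
cache_config_status = size_to_mib("1 GiB")  # "Cache Config Status" section
sizes_match = identify_controller == cache_config_status

print(identify_controller, cache_config_status, sizes_match)
```

Under these assumptions the two sections disagree by a factor of two, which is the discrepancy pointed out above.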

13 REPLIES
TVVJ
HPE Pro

Re: Lost access to Logicaldrive after change broken cache module

Hello, 

Please refer to page 84 of the HP ProLiant Gen8 Troubleshooting Guide Volume II: Error Messages. The solution given there is to power off the server and swap the SAS port connectors. If that does not help, move the drives back to their original positions, if they were changed.

Regards,



I work at HPE
[All opinions expressed here are mine, and not official statements on behalf of Hewlett Packard Enterprise]
Ruben_Herold
Frequent Advisor

Re: Lost access to Logicaldrive after change broken cache module

Neither the SAS ports nor the disks have been moved to other locations. But I can try this in the next few days.

ksram
HPE Pro

Re: Lost access to Logicaldrive after change broken cache module

Hi,

 

Thank you for the post.

Please refer to: https://support.hpe.com/hpesc/public/docDisplay?docLocale=en_US&docId=c01647912

You may try the steps mentioned in the article.

I guess a power cycle and reseating the cables would help here, so you may try that.

Also, please verify whether the cables were by any chance swapped or inserted incorrectly, and whether you can confirm the bay/drive locations.

 

Thank you

RamKS


I work for HPE.
[Any personal opinions expressed are mine, and not official statements on behalf of Hewlett Packard Enterprise]


Ruben_Herold
Frequent Advisor

Re: Lost access to Logicaldrive after change broken cache module

hi,

We have reseated the cables; nothing changed. We switched the ports; nothing changed. We power cycled the system; again nothing changed.

I can't open the last link I got:

 

ArrayD.png

ksram
HPE Pro

Re: Lost access to Logicaldrive after change broken cache module

Hi,

In that case it would be better to re-create the logical drives.

Thank you

RamKS



Ruben_Herold
Frequent Advisor

Re: Lost access to Logicaldrive after change broken cache module

Hi,

Sorry, this is not acceptable. It cannot be the case that all logical drives are gone just because a cache module failed!

We have a bunch of HPE servers and have used them since generation 1. We also have a bunch of storage systems such as 3PAR (7200, 7440c) and some Primeras. If it is true that a broken cache module will kill the whole array, we can no longer trust any of your RAID systems.

This system is a test system, which is why it is not under support. But even then it should not be possible to lose the whole array and logical drive because a cache module failed.

sudhirsingh
HPE Pro

Re: Lost access to Logicaldrive after change broken cache module

@Ruben_Herold 

Based on the screenshot, the write cache shows 0%, which indicates that caching is not enabled for the logical drive.

Since you have replaced the cache module, enable caching for the logical drive from SSA and also adjust the cache read/write ratio.

Hope this helps!

Regards,

Sudhir



I work at HPE
[Any personal opinions expressed are mine, and not official statements on behalf of Hewlett Packard Enterprise]
Ruben_Herold
Frequent Advisor

Re: Lost access to Logicaldrive after change broken cache module

How should I do that if the controller does not recognize the logical drive:

 

SSA.png

sudhirsingh
HPE Pro

Re: Lost access to Logicaldrive after change broken cache module

@Ruben_Herold 

All 12 drives are showing as unassigned; it seems the array/logical drive information has been cleared from these drives.

What is the critical error/message on the controller?

In your initial description you mentioned a cache failure and that you then exchanged the cache. Could you provide more information on what exactly you did?

Did the server come with a 512MB, 1GB, or 2GB cache module, and did you replace it with a new cache module of exactly the same capacity?

Or did you swap the cache/controller from a working machine into this server?

A few suggestions:

1. Install a new cache module of the same size as the one present earlier.

2. Disconnect all the drives, then power cycle the server (without any HDDs).

3. Power off and connect all the drives back.

4. Power on the server and check the status in SSA.

Check whether you get POST message 1779, and whether the failed array/logical drive information shows in SSA.

If still nothing, i am afraid to say that you may have to recreate the array.

Regards,

Sudhir



Ruben_Herold
Frequent Advisor

Re: Lost access to Logicaldrive after change broken cache module

hi,

OK, once again, here is what happened. Our monitoring marked the server as dead. We checked it and found it in a boot loop.
The error message was an error from the controller. The IML shows: " 38 Drive Array 09/14/2021 07:05 09/14/2021 07:05 1 Drive Array Controller Failure (Slot 0) ".

We replaced the cache module with another one (same part number, same size). The system boots and now shows the message:

"1785-Slot 0 Drive Array not Configured

Configuration information indicates drives were configured on a controller with a newer firmware version. To avoid data loss, reattach drives to original controller or upgrade firmware."

The controller is on the newest firmware. We checked all cabling and also booted the system once without disks and then reinserted the disks. Nothing changed.

We tried switching the SAS ports. Still the same message. The POST message is still 1785 on every boot.

"If still nothing, i am afraid to say that you may have to recreate the array." <-- This is not acceptable without a good explanation of how this could be caused by a failed cache module. If we risk losing access to our arrays every time a cache module dies, we will have to reconsider using HPE hardware in the future.

sudhirsingh
HPE Pro

Re: Lost access to Logicaldrive after change broken cache module

@Ruben_Herold 

IML shows: " 38 Drive Array 09/14/2021 07:05 09/14/2021 07:05 1 Drive Array Controller Failure (Slot 0) ".

The above message indicates a controller fault.

How did you determine that the cache is bad? Was any cache failure error logged in the IML?

Also, was the replacement cache of the same capacity, and was it from the same model of server/controller?

Did you consult anyone before swapping in a cache from a working machine?

Please put back the original cache module and check the status.
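
To make the point about the IML entry concrete, here is a minimal sketch that splits the quoted line into fields and checks whether its description mentions the cache at all. The column layout (ID, class, two timestamps, count, description) is an assumption read off the single line quoted in this thread. Which timestamp is the initial one and which is the last update is not stated here, so they are named neutrally. `parse_iml_line` is a hypothetical helper, not an HPE tool.

```python
import re

# The IML entry exactly as quoted in this thread (surrounding spaces dropped).
IML_LINE = (
    "38 Drive Array 09/14/2021 07:05 09/14/2021 07:05 1 "
    "Drive Array Controller Failure (Slot 0)"
)

# Assumed layout: id, class, timestamp, timestamp, count, description.
_TIMESTAMP = r"\d{2}/\d{2}/\d{4} \d{2}:\d{2}"
IML_PATTERN = re.compile(
    r"(?P<id>\d+)\s+"
    r"(?P<event_class>.+?)\s+"
    rf"(?P<first_time>{_TIMESTAMP})\s+"
    rf"(?P<second_time>{_TIMESTAMP})\s+"
    r"(?P<count>\d+)\s+"
    r"(?P<description>.+)"
)


def parse_iml_line(line):
    """Split one IML line into a dict of fields (layout is assumed)."""
    match = IML_PATTERN.fullmatch(line.strip())
    if match is None:
        raise ValueError(f"line does not match the assumed layout: {line!r}")
    entry = match.groupdict()
    entry["id"] = int(entry["id"])
    entry["count"] = int(entry["count"])
    return entry


entry = parse_iml_line(IML_LINE)
mentions_cache = "cache" in entry["description"].lower()

print(entry["event_class"], "|", entry["description"], "|", mentions_cache)
```

For the quoted entry the description names only the controller in Slot 0 and contains no reference to the cache, which is why the log line alone does not show which part failed.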

 

Regards,

Sudhir



Ruben_Herold
Frequent Advisor

Re: Lost access to Logicaldrive after change broken cache module


hi,

Yes, we contacted our HPE reseller. Since there is no warranty on the server, they only told us that either the onboard controller or the cache module could be broken.

We removed the cache module and the disks for testing, and the controller came up normally. Reinserting the old module led to the same controller error. After inserting a new cache module with the same part number, the controller came up normally. After reinserting the disks (same positions) we got the error message described above.

We did not swap in a cache from a working machine; the server was already in a state where the controller did not come up.

Ruben_Herold
Frequent Advisor

Re: Lost access to Logicaldrive after change broken cache module

Any news on the issue?