
MSA1000: Rebuild problem with cluster servers attached

 
Matthias Rettl
Frequent Advisor

Hi there!

With FW version 4.32 (Sep. 2003, but still current!) the MSA1000 will NOT rebuild an array after a disk failure, because MSCS accesses the drives every few seconds. This is a known and documented issue (sic!), but there is no solution for it yet. With the MSA1500cs they fixed this problem in one of the latest FW revisions.

Does HP plan to fix this major problem on the MSA1000 as well, or is FW 4.32 the last firmware ever for this box?

regards,
matt
Uwe Zessin
Honored Contributor

Re: MSA1000: Rebuild problem with cluster servers attached

Huh? I thought that was a limitation for the first-time parity initialization. Have you seen this problem in the field? (I don't do much with the MSA1000 and I haven't tried it myself)
.
Matthias Rettl
Frequent Advisor

Re: MSA1000: Rebuild problem with cluster servers attached

Yes, I have seen it often in the field. The array stays at "rebuild pending" and nothing happens as long as MSCS is running on one of the nodes. Horrible. Most clients don't know anything about this issue. The next thing is that hot-spare drives will not work either... of course not: the array would try to rebuild onto the hot spare but cannot.

I attached a "HP Best Practices Document" where you can find this information on page 2 ("In a cluster configuration, parity initialization of LUNs will not finish."...)

NEVER use MSA1000 in cluster configurations!!

Regards,
Matthias
John Kufrovich
Honored Contributor

Re: MSA1000: Rebuild problem with cluster servers attached

If you use ACU to configure the MSA, the controller's priority settings default to "Low". That is not recommended, especially in a cluster environment.

If you use the CLI, the settings default to "Medium".

There is a fix in the next FW release of MSA1000.

Matthias Rettl
Frequent Advisor

Re: MSA1000: Rebuild problem with cluster servers attached

Hi John!

Even if the rebuild priority is set to "High", the rebuild process will not start. I tried this first, of course.

You say "in the next FW version". Do you know when there will be a new version? For the MSA1500cs there are new releases every few weeks (I have an HP-internal v4.94 and it's working pretty well). The MSA1000 is still, as mentioned before, at 4.32... unloved child or what?

thanks and regards,
matthias
John Kufrovich
Honored Contributor

Re: MSA1000: Rebuild problem with cluster servers attached

Not unloved, it's just a matter of prioritizing test resources. It will be forthcoming soon, fingers crossed.

Re: MSA1000: Rebuild problem with cluster servers attached

Does anyone have an ETA for when the new firmware will be released?
Matthias Rettl
Frequent Advisor

Re: MSA1000: Rebuild problem with cluster servers attached

Feb, 30. 2017
or so

Tv_3
Respected Contributor

Re: MSA1000: Rebuild problem with cluster servers attached

I am concerned.

I'm starting a new implementation with an MSA 1000 as cluster storage and two DL 580s.

Are you guys saying that my disks will not be rebuilt after replacing a failed disk?
Chaos, Panic or Disaster??? Always Stay Calm, I will fix it.
Herman Sugeng_1
Frequent Advisor

Re: MSA1000: Rebuild problem with cluster servers attached

The problem is described in EC030424_CW01.
I have solved this problem by stopping all servers connected to the MSA1000 and waiting a while. You will then see the disks initializing (disk activity lights flickering), or you can go in via the serial port of the controller and check with the CLI. When it has finished (disk activity stops flickering, or the CLI confirms it), you can start all the servers again.
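For anyone who wants to script the "wait a while and check" part of this workaround, here is a minimal Python sketch of a polling loop. Everything in it is an assumption for illustration: `is_finished` is a hypothetical callable that you would have to implement yourself (for example by reading the controller's serial CLI), and the interval and retry values are arbitrary. It is not an HP tool and it does not talk to the MSA1000 on its own.

```python
import time


def wait_for_completion(is_finished, poll_seconds=60, max_polls=120, sleep=time.sleep):
    """Poll is_finished() until it returns True.

    is_finished  -- hypothetical callable returning True once the controller
                    reports that the rebuild/initialization is done
    poll_seconds -- pause between checks (arbitrary default)
    max_polls    -- give up after this many checks (arbitrary default)
    sleep        -- injectable so the loop can be exercised without waiting

    Returns the number of checks that were needed; raises TimeoutError
    if the operation never finished within max_polls checks.
    """
    for attempt in range(1, max_polls + 1):
        if is_finished():
            return attempt
        sleep(poll_seconds)
    raise TimeoutError(f"not finished after {max_polls} checks")


if __name__ == "__main__":
    # Simulated controller: reports "not finished" twice, then "finished".
    fake_answers = iter([False, False, True])
    checks = wait_for_completion(lambda: next(fake_answers), sleep=lambda _: None)
    print(f"finished after {checks} checks")
```

Only once the loop returns would you start the servers again, as described above.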
Best regards.
Greg Carlson
Honored Contributor

Re: MSA1000: Rebuild problem with cluster servers attached

Matthias,

Did the background parity initialization ever complete on that server? As Uwe stated, the rebuild process is very different from the background parity issue. Background parity initialization only runs during periods of no I/O. It is stated that you should make sure it completes before finishing the cluster install. The rebuild will start regardless of disk I/O, and the rebuild priority just determines how much processor power is directed to the rebuild process.

The advisory should not apply to rebuilds, only to background parity initialization.

Ciao,
Greg
Lets Roll!
Greg Carlson
Honored Contributor

Re: MSA1000: Rebuild problem with cluster servers attached

Matthias,

Please run the ADU (Latest version 7.31.2.0) and post the report.
http://h18023.www1.hp.com/support/files/server/us/download/22912.html

Ciao,
Greg
Lets Roll!
Matthias Rettl
Frequent Advisor

Re: MSA1000: Rebuild problem with cluster servers attached

Greg,

I know the advisory says it SHOULD not apply to rebuilds, but in real life it does. The rebuild will not start (regardless of the "rebuild priority" setting) unless host I/O is interrupted!

What Herman does is the only way to start the rebuilds. Anyone with an MSA1000 knows... except HP!

Thanks for the ADU link, but I cannot try it out: I got rid of the MSA, thank god!

best regards,
matthias
Tv_3
Respected Contributor

Re: MSA1000: Rebuild problem with cluster servers attached

This is what I see in customer advisory EC030424_CW01, revision 0:


For a new installation, do not enable or install the cluster service for Windows 2000 Advanced Server or Windows 2000 Datacenter Server or do not configure RAID 4, RAID 5, or ADG logical drives as a cluster resource until the parity initialization process is complete.


How do you know when the parity initialization process is complete?

Sincerely.
Chaos, Panic or Disaster??? Always Stay Calm, I will fix it.
Herman Sugeng_1
Frequent Advisor

Re: MSA1000: Rebuild problem with cluster servers attached

Hello TV,
You can see it in:
1. ACU: in the info for the array there is a message saying ... background parity initialization is queued and will start, performance is degraded .... (sorry, I do not have the exact message)
2. CLI: use the command show units; you will see "Parity Init Status: 0% complete" or "Mirror Init Status: 0% complete" for the affected logical unit. Normally it should be "Complete".
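If you capture the CLI output as text (for example from a serial terminal log), a small Python sketch like the one below could pick out those status lines. It only recognizes the status strings quoted above; the layout of the sample text around them (the "Unit 0:" lines) is made up for illustration and is not real show units output, and the function names are my own.

```python
import re

# Matches only the status strings quoted in this thread, e.g.
#   "Parity Init Status: 0% complete" or "Mirror Init Status: Complete"
STATUS_RE = re.compile(
    r"(Parity|Mirror) Init Status:\s*(?:(\d+)%\s*complete|(Complete))",
    re.IGNORECASE,
)


def init_progress(cli_output):
    """Return a list of (kind, percent) for every init status line found."""
    results = []
    for match in STATUS_RE.finditer(cli_output):
        kind = match.group(1).capitalize()
        # A bare "Complete" carries no percentage, so treat it as 100.
        percent = int(match.group(2)) if match.group(2) is not None else 100
        results.append((kind, percent))
    return results


def all_initialized(cli_output):
    """True only if at least one status line was found and all are at 100%."""
    progress = init_progress(cli_output)
    return bool(progress) and all(pct == 100 for _, pct in progress)


# Hypothetical sample text; only the two status lines follow the thread.
SAMPLE = """\
Unit 0:
  Parity Init Status: 0% complete
Unit 1:
  Mirror Init Status: Complete
"""

if __name__ == "__main__":
    print(init_progress(SAMPLE))
    print(all_initialized(SAMPLE))
```

Treating "no status line found" as not initialized is a deliberately cautious choice, so that empty or unexpected output is never read as a green light for installing the cluster service.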
Best regards
Tv_3
Respected Contributor

Re: MSA1000: Rebuild problem with cluster servers attached

Thanks, Herman!

I understand what you mean. Now I remember the message; I have seen it before.


Chaos, Panic or Disaster??? Always Stay Calm, I will fix it.