Disk Group Failure

 
Jacky Wing
Regular Advisor

Disk Group Failure

I have an EVA4000 with a disk group of 28 disks. One of the disks failed, so we replaced it. The data leveling is complete, but now we get an error in the disk group with a popup that says: "A hardware failure has occurred in this disk group. The advanced virtualization ....".
I did the following:
1- Restarted the Command View service --> same problem.
2- Restarted the SMA --> same problem.
3- Reduced the usage to about 2 TB out of 7 TB (I am running a test environment, so I can do that) --> same problem.
4- Ungrouped 3 disks --> same problem.
5- Checked the redundancy status: it says that a disk lost some data because one disk was removed?

All my vdisks are running with no problem. I browsed the whole EVA storage but did not find any Vraid0 vdisk. Everything runs normally from the vdisk point of view, but I keep getting this error.

Any idea how to troubleshoot this? I didn't find anything significant in the error log, just that one disk was removed, ....
Johan Guldmyr
Honored Contributor

Re: Disk Group Failure

Hi, can you please paste/attach the whole error message about the "hardware failure"?

It should tell you specifically which vdisk has failed.

If you have a test environment, wouldn't you be able to accept the losses? Do you get that option?

Also, which firmware and CV version are you running?
Jacky Wing
Regular Advisor

Re: Disk Group Failure

I am running a test environment, which is why I can reduce the storage to 2 TB out of 7 TB. But I cannot wipe out the whole disk group.

I am running CV 6.0 with CR0D63xc3p-6000.

The redundancy error says: "Your storage system contains data that is not protected by disk drive redundancy. Refer to your system documentation or your HP representative before removing disk drives from your system.
Removing a disk drive may cause loss of data because the following conditions exist in your system:

- Data on one or more disk drives is currently unavailable. A disk drive may have been removed from the system."

The disk group error is attached.

I am thinking of restarting both controllers one at a time.
Johan Guldmyr
Honored Contributor

Re: Disk Group Failure

When you look at the disk group in CV, does it say that everything is OK?
Jacky Wing
Regular Advisor

Re: Disk Group Failure

It gives a warning and the error message I attached before. It also says that all the vdisks are ok in the disk group (in the vdisk tab).

Re: Disk Group Failure

Can you please post the Controller Event Log and Configuration Log here?

I would suggest upgrading your CV EVA to the latest version:
https://h20392.www2.hp.com/portal/swdepot/displayProductInfo.do?productNumber=CommandViewEVA9.3

Also, please let me know the controller firmware version.

Is it the latest version as well?

I work for HPE
Johan Guldmyr
Honored Contributor

Re: Disk Group Failure

See two posts up, Subhajit.

Upgrading beyond firmware 6.000 and CV 6.x may not be possible for a test environment (licensing, etc.).

But I guess it's very likely that in one of the releases after 6.x there is a fix that relates to this.

Re: Disk Group Failure

According to the error message, it seems that data is being redistributed across the drives of the disk group.

If this is a false alert, it may be resolved by upgrading CV EVA to the latest version or by upgrading the controller firmware.

https://h20392.www2.hp.com/portal/swdepot/displayProductInfo.do?productNumber=T4256-63136


I work for HPE

Re: Disk Group Failure

Hello Johan/Jacky,

I am sorry that I missed it. You are correct that later firmware releases have this fix.

Jacky can also try the trial version of CV EVA 9.3, since this is a test environment.

Is it possible to set up a separate CV EVA server with CV EVA 9.3?

Otherwise you can try to reboot the master controller and check.

I work for HPE
Johan Guldmyr
Honored Contributor

Re: Disk Group Failure

Jacky: maybe you can provide a screenshot of the disk group page as well?
The configuration log is hard to get out of CV 6 unless you have a script, which I don't have. But maybe somebody with a good tool can help you look at the log.
It is probably a good idea to do that before attempting to restart a controller.

But if you do try restarting a controller, I would suggest starting with the one that is master (you can find out which one via the OCP, the display on the front of the controller).
On the one that is not master it says 'slave' in the OCP.

Command View only speaks with the master controller.