VDisk Rebuild issue MSA G2 2321FC

 
adamdb_uk
Occasional Advisor

Hi,

I've been helping someone rebuild a failed RAID 5 vdisk in a two-enclosure MSA 2321FC system. When a new drive is inserted, the rebuild runs for some time and eventually fails. The logs show that the reconstruction failed, as below:

 

B9320 2019-05-30 12:45:55 44 W B Unwritable cache data exists for a volume. (volume: , SN: 00c0ffd811c500003261c44a01000000) It comprises 1% of cache space.
B9321 2019-05-30 12:45:55 44 W B Unwritable cache data exists for a volume. (volume: , SN: 00c0ffd811c500003261c44a02000000) It comprises 1% of cache space.
B9322 2019-05-30 12:45:55 44 W B Unwritable cache data exists for a volume. (volume: , SN: 00c0ffd811c500003261c44a03000000) It comprises 1% of cache space.
B9323 2019-05-30 12:45:55 44 W B Unwritable cache data exists for a volume. (volume: , SN: 00c0ffd811c500003261c44a04000000) It comprises 1% of cache space.
B9324 2019-05-30 12:45:55 44 W B Unwritable cache data exists for a volume. (volume: , SN: 00c0ffd811c500003261c44a05000000) It comprises 1% of cache space.
B9325 2019-05-30 12:45:55 18 W B Vdisk reconstruct failed. Command failed. (error code: 1). (vdisk: VDISKAA SN: 00c0ffd811c50000c358c34a00000000)
B9326 2019-05-30 12:45:55 172 W B A vdisk was quarantined. (vdisk: VDISKAA SN: 00c0ffd811c50000c358c34a00000000)
B9341 2019-05-31 11:50:37 44 W B Unwritable cache data exists for a volume. (volume: , SN: 00c0ffd811c500003261c44a01000000) It comprises 1% of cache space.
B9342 2019-05-31 11:50:37 44 W B Unwritable cache data exists for a volume. (volume: , SN: 00c0ffd811c500003261c44a02000000) It comprises 1% of cache space.
B9343 2019-05-31 11:50:37 44 W B Unwritable cache data exists for a volume. (volume: , SN: 00c0ffd811c500003261c44a03000000) It comprises 1% of cache space.
B9344 2019-05-31 11:50:37 44 W B Unwritable cache data exists for a volume. (volume: , SN: 00c0ffd811c500003261c44a04000000) It comprises 1% of cache space.
B9345 2019-05-31 11:50:37 44 W B Unwritable cache data exists for a volume. (volume: , SN: 00c0ffd811c500003261c44a05000000) It comprises 1% of cache space.
B9346 2019-05-31 11:50:38 172 W B A vdisk was quarantined. (vdisk: VDISKAA, SN: 00c0ffd811c50000c358c34a00000000)
B9347 2019-05-31 11:50:38 18 W B Vdisk reconstruct failed. Command failed. (error code: 1). (vdisk: VDISKAA, SN: 00c0ffd811c50000c358c34a00000000)

I end up with this unwritable cache data error every time this drive is replaced. It's always the same drive in the same slot.

I'm a bit stumped as to why this is occurring. I've tried clearing the cache, clearing the metadata on the newly inserted drive, and repeating the reconstruction (with a couple of different replacement drives), but the outcome is always the same.
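For anyone reviewing a similar log, the events above can be grouped by event code to see the pattern at a glance. This is a minimal sketch, not an HPE tool: the column layout (event ID, date, time, event code, severity, controller, message) is inferred from the excerpt, and the names `EVENT_RE`, `parse_events` and `summarize` are made up for illustration.

```python
import re
from collections import Counter

# Column layout inferred from the log excerpt (an assumption):
# event ID, date, time, event code, severity, controller, message.
EVENT_RE = re.compile(
    r"^(?P<event_id>[AB]\d+)\s+"
    r"(?P<date>\d{4}-\d{2}-\d{2})\s+(?P<time>\d{2}:\d{2}:\d{2})\s+"
    r"(?P<code>\d+)\s+(?P<severity>[A-Z])\s+(?P<controller>[AB])\s+"
    r"(?P<message>.*)$"
)

def parse_events(lines):
    """Return one dict per line that matches the assumed layout."""
    events = []
    for line in lines:
        match = EVENT_RE.match(line.strip())
        if match:
            events.append(match.groupdict())
    return events

def summarize(events):
    """Count events by (code, message text before the first parenthesis)."""
    return Counter(
        (e["code"], e["message"].split("(")[0].strip()) for e in events
    )

# Three lines copied from the log excerpt in this post.
sample = [
    "B9320 2019-05-30 12:45:55 44 W B Unwritable cache data exists for a volume. (volume: , SN: 00c0ffd811c500003261c44a01000000) It comprises 1% of cache space.",
    "B9325 2019-05-30 12:45:55 18 W B Vdisk reconstruct failed. Command failed. (error code: 1). (vdisk: VDISKAA SN: 00c0ffd811c50000c358c34a00000000)",
    "B9326 2019-05-30 12:45:55 172 W B A vdisk was quarantined. (vdisk: VDISKAA SN: 00c0ffd811c50000c358c34a00000000)",
]

for (code, text), count in sorted(summarize(parse_events(sample)).items()):
    print(code, count, text)
```

Run against the full event list, this makes it easy to confirm that each failed attempt produces the same set of codes (44, 18 and 172) at the same timestamp.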

Details of one of the volumes referred to above, taken from the log dump:

<OBJECT basetype="volumes" name="volume" oid="157" format="rows">
<PROPERTY name="virtual-disk-name">VDISKAA</PROPERTY>
<PROPERTY name="volume-name">VDISKAA_v002</PROPERTY>
<PROPERTY name="size" units="GB">998.9GB</PROPERTY>
<PROPERTY name="size-numeric">1951171872</PROPERTY>
<PROPERTY name="preferred-owner">B</PROPERTY>
<PROPERTY name="preferred-owner-numeric">0</PROPERTY>
<PROPERTY name="owner">B</PROPERTY>
<PROPERTY name="owner-numeric">0</PROPERTY>
<PROPERTY name="serial-number" key="true">00c0ffd811c500003261c44a02000000</PROPERTY>
<PROPERTY name="write-policy">write-back</PROPERTY>
<PROPERTY name="write-policy-numeric">1</PROPERTY>
<PROPERTY name="cache-optimization">standard</PROPERTY>
<PROPERTY name="cache-optimization-numeric">0</PROPERTY>
<PROPERTY name="read-ahead-size">Default</PROPERTY>
<PROPERTY name="read-ahead-size-numeric">-1</PROPERTY>
<PROPERTY name="volume-type">standard</PROPERTY>
<PROPERTY name="volume-type-numeric">0</PROPERTY>
<PROPERTY name="volume-class">standard</PROPERTY>
<PROPERTY name="volume-class-numeric">0</PROPERTY>
<PROPERTY name="blocks" blocksize="512">1951171872</PROPERTY>
<PROPERTY name="volume-parent"></PROPERTY>
<PROPERTY name="snap-pool"></PROPERTY>
<PROPERTY name="virtual-disk-serial">00c0ffd811c50000c358c34a00000000</PROPERTY>
<PROPERTY name="volume-description"></PROPERTY>
<PROPERTY name="progress">0%</PROPERTY>
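As a quick sanity check on the dump, the `blocks` value and its `blocksize` attribute can be converted to a size and compared with the `size` property. This is only a sketch; it assumes the controller reports decimal gigabytes (10^9 bytes), which the dump does not state.

```python
# Values taken from the <PROPERTY name="blocks" blocksize="512"> element.
BLOCKS = 1951171872
BLOCK_SIZE = 512

def volume_size_gb(blocks, block_size=512):
    """Size in decimal gigabytes (assumes 1 GB = 10**9 bytes)."""
    return blocks * block_size / 10**9

size_gb = volume_size_gb(BLOCKS, BLOCK_SIZE)
print(f"{BLOCKS * BLOCK_SIZE} bytes = {size_gb:.6f} GB")
```

The result is just under 999 GB, which is consistent with the reported 998.9GB if the display truncates rather than rounds (again, an assumption).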

I am wondering whether other drives in this vdisk are also failing; however, I could not see anything in the log dump to indicate this during the reconstruction.

Firmware is M112R14, which I am aware is deprecated. I know there is an Oct 2013 firmware available (M114P01-01), which I may suggest they update to.

Has anyone ever seen this behaviour? Is there a workaround or fix? We are now considering deleting the entire VDISK, rebuilding from scratch and restoring to see if that will resolve it.

 

Any advice appreciated.

 

thanks

Ad

Re: VDisk Rebuild issue MSA G2 2321FC

"Unwritable cache data exists for a volume" means that new writes reached the controller cache but have not yet been written to the back-end drives, because the vdisk was not available at that point. The vdisk may have been quarantined, offline, or in the QTOF state.

You mentioned that you cleared the cache data. That was a big mistake, because that data is now lost.

As for why the vdisk reconstruction failed: this could be due to other drives in the same vdisk having hardware errors that prevent the rebuild from completing. In other words, the MSA controller cannot read from one or more of the other drives in the vdisk in order to reconstruct the RAID data. In RAID 5, if more than one drive has a problem or has failed, the RAID set has failed and data recovery is difficult.

In future, I would always suggest keeping your MSA firmware up to date across all components: controllers, I/O modules, and drives. For this type of case you should also always get help from HPE support.

Right now I would suggest identifying the other faulty drive(s) and deleting the vdisk. Then replace all faulty drives together, create a new vdisk and volumes as required, and restore the data from backup.
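To illustrate why a read error on a second drive stops a RAID 5 rebuild, here is a simplified model of single-parity reconstruction. It is a sketch of the general XOR-parity idea only, not the MSA controller's actual implementation; `xor_blocks` and `rebuild_missing` are hypothetical names.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def rebuild_missing(stripe, missing_index):
    """Rebuild the member at missing_index from the other stripe members.

    stripe is a list of bytes objects; None marks a member that cannot
    be read. With single parity, every surviving member must be readable.
    """
    survivors = [b for i, b in enumerate(stripe) if i != missing_index]
    if any(b is None for b in survivors):
        raise IOError("a second member is unreadable: stripe cannot be rebuilt")
    return xor_blocks(survivors)

# Three data blocks plus one parity block form one stripe.
data = [b"\x01\x02", b"\x10\x20", b"\x0f\x0f"]
parity = xor_blocks(data)
stripe = data + [parity]

# One missing drive: the block is recovered from the remaining three.
degraded = [stripe[0], None, stripe[2], stripe[3]]
assert rebuild_missing(degraded, 1) == data[1]

# A second unreadable block in the same stripe: reconstruction fails.
double_fault = [stripe[0], None, None, stripe[3]]
try:
    rebuild_missing(double_fault, 1)
except IOError as err:
    print("rebuild failed:", err)
```

In this model, a single unreadable sector on any surviving drive is enough to fail that stripe, which fits a rebuild that runs for some time before it aborts.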

 

Hope this helps!
Regards
Subhajit

I am an HPE employee

adamdb_uk
Occasional Advisor

Re: VDisk Rebuild issue MSA G2 2321FC

Thanks for the very detailed response. Your thoughts on the likely reason for the rebuild failure are pretty much the same as mine. The main issue I face is identifying the other potentially failing drives within this particular vdisk: I could not see anything about other drives having issues in the store.log zip file I was given for review. I guess it will be a case of flattening the vdisk, rebuilding it, and seeing if it throws any further errors. Thanks for the prompt response.

Regards
Adam

Re: VDisk Rebuild issue MSA G2 2321FC

I am not sure how much of the store.logs file you can decode. You can try unzipping the file and opening it with Notepad++, then look for the serial numbers of all drives currently present in the system. You can also get the drive serial numbers from the SMU (GUI) or the CLI. Take each drive serial number and search for it. In the results window, look for a medium error or hard error on the drive. To be more specific, look for the sense key. To understand more about sense keys, you can refer to the guide below:

https://en.wikipedia.org/wiki/Key_Code_Qualifier
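The manual search described above could also be scripted. The sketch below assumes the extracted log is plain text and that a drive's serial number appears on the same line as the error text; neither is confirmed for this log format, so treat any hits as a starting point only. `find_drive_errors` and `ERROR_RE` are hypothetical names.

```python
import re

# Keywords taken from the advice above; matching is case-insensitive.
ERROR_RE = re.compile(r"medium error|hard error|sense key", re.IGNORECASE)

def find_drive_errors(lines, serials):
    """Map each serial number to the (line number, text) pairs that
    mention both that serial and one of the error keywords."""
    hits = {serial: [] for serial in serials}
    for lineno, line in enumerate(lines, start=1):
        if not ERROR_RE.search(line):
            continue
        for serial in serials:
            if serial in line:
                hits[serial].append((lineno, line.rstrip()))
    return hits
```

To use it, open the extracted log with `open(path, errors="replace")`, pass the file object as `lines`, and supply the drive serial numbers collected from the SMU or CLI. If the log splits the serial number and the error across separate lines, the function would need to keep a few lines of context instead of matching line by line.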

You can also log a case with HPE support to have the logs reviewed for drive errors, because we have an internal tool to decode them.

We can also look for vdisk-level errors, which may have been captured as part of the scrub process.

If you have no further queries, kindly mark this as resolved so that others following this topic get the update.

 

Hope this helps!
Regards
Subhajit

I am an HPE employee

adamdb_uk
Occasional Advisor

Re: VDisk Rebuild issue MSA G2 2321FC

Thanks. I'll do a search of the log. Sadly, as the system is so old, I don't think HPE supports the G2 anymore. The device is currently on a hardware-only break-fix contract with a third party as a result. I have encouraged the customer to think about new hardware, but that's out of my hands. I'll see if I can identify which other drives may be contributing to the reconstruction failure by searching for the error types you describe.

 

thanks again.

Adam

Shawn_K
HPE Pro

Re: VDisk Rebuild issue MSA G2 2321FC

Hi Adam,

One thing you might consider is to create the new vdisk using offline init.

mode online|offline
Optional. Specifies whether the vdisk is initialized online or offline.

• offline: You must wait for the vdisk initialization process to finish before using the vdisk; however, offline takes less time to complete initializing than online. At the time of creation, a vdisk using offline initialization can have either one volume or none. If you want the vdisk to have more than one volume, create the vdisk with no volumes and then add volumes after initialization is complete.

This will move any bad blocks on the drives to the "do not use" list. Additionally, if another drive is marginal in terms of available blocks, it will flag that drive as bad and remove it from use. It will take a little longer to lay down the parity and stripes for the vdisk, but it might help ensure that, once the init is completed, you are not using a drive with potential problems.

Cheers,
Shawn

I work for Hewlett Packard Enterprise. The comments in this post are my own and do not represent an official reply from HPE. No warranty or guarantees of any kind are expressed in my reply.
