StoreVirtual Storage

P4300 G2-- 'Cache 1' status is 'Corrupt'

achile23
Occasional Advisor


Hello

My configuration is as follows:

- One storage array (P4300 G2, 8 TB), which is the only storage system
- One cluster
- A single volume using all the available space on the storage system, in a RAID 5 configuration

This configuration was set up initially so that we could familiarize ourselves with the storage solution, but now we have an error that prevents the volume from being mounted. The iSCSI connection is stuck in the "Reconnecting..." state, and pinging the cluster VIP gets no answer.

The log file contains the following messages:

 

Warning,E00000101:EID_MAINTENANCE_STATUS_ON,The management group '____' maintenance mode is 'On'.
Warning,E00020101:EID_UTILIZATION_STATUS_EXCESSIVE,The cluster '____' utilization is 100.00%.  This utilization value exceeds 95.00%.
Critical,E00060200:EID_S_SERVER_STATUS_DOWN,The storage system '____' status in cluster '____' is 'Down'.
Critical,E00080100:EID_V_REPLICATION_STATUS_OFFLINE,The volume '____' status in cluster '____' is 'Offline'.
Critical,E01020101:EID_CACHE_CORRUPT,The 'Cache 1' status is 'Corrupt'.  Contact technical support for assistance.,Cache,Cache 1,......
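
The entries above appear to follow a "severity,code:event ID,message" pattern. As a minimal sketch, assuming that pattern holds for the whole exported log, the critical events can be separated from the warnings with a few lines of Python. The names `Event`, `parse_event` and `critical_events` are hypothetical helpers for illustration, not part of any HP tool.

```python
from typing import NamedTuple


class Event(NamedTuple):
    severity: str
    code: str
    event_id: str
    message: str


def parse_event(line: str) -> Event:
    # Split on the first two commas only: the message itself can contain commas.
    severity, code_and_id, message = line.strip().split(",", 2)
    # The second field looks like "E01020101:EID_CACHE_CORRUPT".
    code, _, event_id = code_and_id.partition(":")
    return Event(severity, code, event_id, message)


def critical_events(lines):
    # Keep only the entries whose severity field is exactly "Critical".
    return [e for e in map(parse_event, lines) if e.severity == "Critical"]


# Two of the log lines quoted above, copied as posted.
sample = [
    "Warning,E00020101:EID_UTILIZATION_STATUS_EXCESSIVE,The cluster '____' utilization is 100.00%.  This utilization value exceeds 95.00%.",
    "Critical,E01020101:EID_CACHE_CORRUPT,The 'Cache 1' status is 'Corrupt'.  Contact technical support for assistance.,Cache,Cache 1,......",
]

for event in critical_events(sample):
    print(event.event_id, "-", event.message)
```

Sorting the entries this way makes it easier to see that the log reports two separate problems: the cluster utilization warning and the cache error.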

 

Please, I need your help: how can I solve this problem without losing the stored data?

13 REPLIES
oikjn
Honored Contributor

Re: P4300 G2-- 'Cache 1' status is 'Corrupt'

Looks like you have two issues. The first is that you have a corrupt cache module, which just needs to be replaced. While it is in its current state you will not get drive caching, which will hurt performance, but if you weren't monitoring your SAN (as it appears you weren't), you would likely not notice unless you were getting complaints about performance.

 

The other problem, and the one that is likely why you are actually posting here, is that it shows your SAN is 100% full. When that happens, all drives become read-only because there is nowhere left for them to write, and your servers start complaining about failed iSCSI LUNs. There is only one fix for this, but you can apply it through four different methods.

 

Fix: get some free space on your SAN.

Method 1: Delete some snapshots or some LUNs to make space.

Method 2: Convert any fully provisioned LUNs you have to thin provisioning.

Method 3: Convert some thin provisioned LUNs from Network RAID 10 to Network RAID 0, just to gain some free space so you can clean up and delete stuff, and then convert back to Network RAID 10.

Method 4: Add one or more additional nodes. Since your SAN is down, you probably don't want to wait for this, but if you can't free up space through method 1 you don't have an option. You can do this a little quicker if you already have some physical hardware with as much capacity as a single node (or more) and install a VSA on it. If you have the hardware, you don't need to buy a license for the VSA and can just use the trial install as a temporary solution until you get your SAN usage down.

 

 

No matter what you do, you are going to have to free up space or add more space to your SAN, since it is now 100% full.
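
To make the Network RAID conversion method concrete, here is a rough sketch of the arithmetic. It assumes the simplified model that Network RAID 10 keeps two copies of every block across storage systems and Network RAID 0 keeps one, and it ignores metadata and snapshot overhead. The function names and the 500 GB figure are hypothetical, chosen only for illustration.

```python
# Copies of each block kept per Network RAID level (only the levels named above).
COPIES = {"network_raid_0": 1, "network_raid_10": 2}


def consumed_gb(written_gb: float, level: str) -> float:
    # Cluster space consumed by a thin provisioned LUN with this much data written.
    return written_gb * COPIES[level]


def freed_by_conversion(written_gb: float, src: str, dst: str) -> float:
    # Space returned to the cluster when the LUN is converted from src to dst.
    return consumed_gb(written_gb, src) - consumed_gb(written_gb, dst)


example_written_gb = 500.0  # hypothetical amount of data written to one LUN
print(freed_by_conversion(example_written_gb, "network_raid_10", "network_raid_0"))
```

Under these assumptions the conversion frees as much space as the LUN has written, at the cost of losing the second copy until it is converted back. It only helps on a cluster with at least two storage systems, since a single node cannot hold a mirrored copy in the first place.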

 

Unfortunately, if you have thin provisioned LUNs that have grown but now contain mostly empty space, the only way to shrink them is to migrate your data off the LUN to something else, shrink the LUN size in CMC, and then expand it back; it will then be smaller again. There is no support for reclaiming free space from an expanded LUN without shrinking it, and shrinking will cause data loss on that LUN if you have anything on it, to the point where you would want to reformat the drive on the server afterwards.

achile23
Occasional Advisor

Re: P4300 G2-- 'Cache 1' status is 'Corrupt'

You should know that, for some reason, the storage system was forcibly shut down (manually). I think this is what caused these errors.

As for the message that the cluster is full, it is wrong: before the error occurred, about 6 TB was available and only about 100 MB was used.

As you know, I have just one cluster in my SAN, so I cannot temporarily detach the cluster: its manager is the only and last manager, so I can't stop it. You should also know that deleting the cluster will delete the volume, and so all data will be lost.

In my case, I cannot replicate the volume to another remote cluster: I have one management group and just a single cluster.

I think the forced shutdown has corrupted the contents of the cache. I was thinking of disconnecting the cache from its power source to clear its contents, then refitting it and restarting the system. What do you think?

oikjn
Honored Contributor

Re: P4300 G2-- 'Cache 1' status is 'Corrupt'

I can't say I 100% follow.  Are you saying you only actually have 100MB of space used on the SAN and there should be 6TB free?  And you are saying that you don't have that 100MB backed up or can't recover that 100MB?

 

Can you take a screenshot of your LUNs and SAN space in CMC so we can actually see the structure? What you are saying doesn't make sense.

Does the SAN say it has any space available in the cluster?

achile23
Occasional Advisor

Re: P4300 G2-- 'Cache 1' status is 'Corrupt'

I forgot to thank you for your help, and sorry if my English is not quite correct. I hope you will help me more than you blame me: I'm not an expert, and I am taking my first steps in data storage without having received any professional training. Imagine what a nightmare that is!

I swear that before the error occurred (cache corrupt), I was using the logical volume from my server normally.

It is a fresh install. I mounted the logical volume as a partition on my server after formatting the 6 TB of space as GPT so that the operating system could handle it. We had hardly worked on it, and I am sure that the content did not exceed 150 MB (space used). After the cache error, the CMC shows that almost the full 6 TB is used. I do not know how, but that is the reality.

I tried to map out the structure; once I am back at work I will post more details.

oikjn
Honored Contributor

Re: P4300 G2-- 'Cache 1' status is 'Corrupt'

No problem.

Can you show the storage system info on the right side of CMC?

With one storage system you don't have the option to convert to Network RAID 0 to gain space. Likewise, with no snapshots and only one LUN, you really don't have an option to get space back.

Can you say what the consumed space value is for your one LUN? Is it 150 MB or is it your entire SAN space?

If you don't have a backup of that 150 MB, you might need to contact support to help you get access to the LUN again, but honestly, for 150 MB, it's probably faster to just delete it and start over.

BTW, if you can at all afford it, get a second node, because it would save you from access problems like this when one node has a problem.

achile23
Occasional Advisor

Re: P4300 G2-- 'Cache 1' status is 'Corrupt'

Hello, here are more details.
oikjn
Honored Contributor

Re: P4300 G2-- 'Cache 1' status is 'Corrupt'

The first thing you should do is right-click on the volume you created, select Edit, then change the provisioning of the volume from "full provisioning" to "thin provisioning".

That should free up space on the SAN.
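
A quick back-of-the-envelope calculation shows why this change matters. The sketch below uses the figures from the thread (a volume of about 6 TB with about 150 MB written) and assumes a simplified model: a fully provisioned volume reserves its whole size on the cluster, while a thin provisioned volume reserves only what has been written. Real thin provisioning allocates in larger chunks and adds some overhead, and the usable cluster capacity of 6 TB is also an assumption. The function names are hypothetical.

```python
TB = 1000**4
MB = 1000**2


def provisioned_bytes(volume_size: int, written: int, thin: bool) -> int:
    # Simplified model: full provisioning reserves the whole volume size up front,
    # thin provisioning reserves only the data actually written.
    return written if thin else volume_size


def utilization_pct(cluster_capacity: int, volume_size: int, written: int, thin: bool) -> float:
    # Share of the cluster capacity reserved by the volume, in percent.
    return 100.0 * provisioned_bytes(volume_size, written, thin) / cluster_capacity


cluster = 6 * TB   # assumed usable capacity of the single-node cluster
volume = 6 * TB    # one volume covering all available space
written = 150 * MB # data actually written, per the original poster

print("full:", utilization_pct(cluster, volume, written, thin=False))
print("thin:", utilization_pct(cluster, volume, written, thin=True))
```

Under this model the fully provisioned volume alone accounts for 100% utilization regardless of how little data it holds, which would be consistent with the utilization warning in the log, while the thin provisioned case stays far below 1%.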

 

The cache issue is definitely something you need to talk to HP support about. If you don't have support, you will have to figure out what model that card is and order a replacement from somewhere.

achile23
Occasional Advisor

Re: P4300 G2-- 'Cache 1' status is 'Corrupt'

Thank you.

When I opened the storage array, I noticed that the cache battery was swollen. What does that mean?

I'll check the cache (interpretation of the LEDs).

Gediminas Vilutis
Frequent Advisor

Re: P4300 G2-- 'Cache 1' status is 'Corrupt'


achile23 wrote:

Thank you.

When I opened the storage array, I noticed that the cache battery was swollen. What does that mean?

I'll check the cache (interpretation of the LEDs).


Your cache battery is dead and you need to replace it. You can find the replacement part number on the battery.

oikjn
Honored Contributor

Re: P4300 G2-- 'Cache 1' status is 'Corrupt'

++ Get that battery out of the server ASAP, before it decides to leak and damage something harder to replace.

achile23
Occasional Advisor

Re: P4300 G2-- 'Cache 1' status is 'Corrupt'

Could the battery failure be the cause of the "cache corrupt" error?

Or is it too late, and I must replace the cache?
oikjn
Honored Contributor

Re: P4300 G2-- 'Cache 1' status is 'Corrupt'

Not sure. Maybe that's the error message given when the battery is bad, or maybe the cache actually got damaged when the battery failed.

Please let us know if you find out whether it's just the battery error message or actually a bad cache unit. I would guess it's the battery, since I didn't see a specific "battery bad" error and you can visually see the battery is bad.

vlho
Advisor

Re: P4300 G2-- 'Cache 1' status is 'Corrupt'