Serious Wrong "Overprovisioned" Error on VSA 2014

SOLVED
banaie
Occasional Advisor

Dear Friends,

I have run into a serious problem with VSA 2014. It's deployed on two BL460 servers that are attached to D2220 DAS.

Each node has about 7TB of storage, and I'm using them in a RAID 0 cluster (14TB of real storage). I have created a thin-provisioned volume and presented it to VMware as 50TB.

The problem is that I have been using this solution for three months and have never gone above 4TB on the storage.

VMware shows that I haven't used much, and now, even after removing a lot of files, VMware shows that the datastore is almost empty:

But CMC shows it's full and using 14TB!!

Now the storage has stopped working and I can't copy any files or use it in any way. I urgently need help.

I have migrated all the important virtual machines, but the overprovisioned error still persists. I even patched the system to the latest updates.

Please help me with this problem.

 

Thanks

13 REPLIES
Mr_NKC
HPE Pro

Re: Serious Wrong "Overprovisioned" Error on VSA 2014

It is possible that the changes made to the TP volume on the VMware side haven't been refreshed at the VSA software end. Maybe you could manually delete the volume from CMC.

Regards,
NKC

I work for HP

banaie
Occasional Advisor

Re: Serious Wrong "Overprovisioned" Error on VSA 2014

Thanks a lot for your reply. I can do that this time, because I have saved all my important data. However, this solution is becoming more and more central to our operations, so I'm worried that this problem will come up again at a bad time when I can't do anything about it.

Is there any way for me to solve this problem without deleting the volume? Can't I force the storage to refresh or revalidate its data to solve the problem?

 

 

Thanks a lot

oikjn
Honored Contributor
Solution

Re: Serious Wrong "Overprovisioned" Error on VSA 2014

You are going to have to delete. There is no SCSI UNMAP feature on the SAN, so once thin space is claimed, it cannot be reclaimed other than by deleting the LUN.

If you have the data backed up and it really is mostly empty, maybe you could roll the dice and resize the LUN to shrink it down and hope it doesn't take out part of your data. Then you can create a new LUN and move your data to that one.

For the love of god, use NR10 for any production data. Having production data on a RAID 0 anything is just irresponsible. It appears that you really don't need all the raw storage, so I see no reason why you can't use NR10.

In the future, you should consider many smaller LUNs instead of a single large LUN, as that limits the effect of a situation like this or one where the filesystem gets corrupted.

I'm not an ESX guy so I can't say for sure, but I'm pretty sure there is a format style you can choose that forces writes to the disk, and that will end up expanding thin provisioned LUNs... don't pick that one ;)

When I overprovision my SAN, I generally create one full provisioned LUN called "emergency space" that is big enough that deleting it gives me back my storage functionality. If I'm not paying attention and the SAN fills to 100%, I can delete it to get production going again and then deal with the response (spool up a VSA, buy another node, or trim data).

banaie
Occasional Advisor

Re: Serious Wrong "Overprovisioned" Error on VSA 2014

Thanks a lot for your enlightening reply. However, I thought it would be easier to present the storage as a single LUN and manage it in VMware.

So you are saying I should try using different LUNs for different data. Is that the correct approach?

About the RAID 0: since I have already RAIDed all the hard disks physically on the server and presented them to the server as a single storage volume, I thought I didn't need to RAID it again in the VSA. Should I present the hard disks (9*900GB & 3*250GB) to the server separately and try to RAID them in the VSA?

 

Many thanks again

oikjn
Honored Contributor

Re: Serious Wrong "Overprovisioned" Error on VSA 2014

The VSA software doesn't do RAID itself; it only stripes (RAID 0) across the disks presented to it, so you should continue to provide hardware RAID volumes to each VSA for it to use as storage. This will provide you with disk-level redundancy. What you are missing is node-level redundancy, which is the biggest selling point of the StoreVirtual SAN. When a SAN LUN is configured for NR0 (Network RAID 0), the LUN requires ALL nodes to be online and functioning to provide service to that LUN. When you have an NR10 LUN, at least one node (sometimes more, depending on the situation) can go offline and the LUN will keep functioning. This also allows online software updates to be non-disruptive.

CMC should be shouting at you, warning you that you are using NR0 and that you use it at your own risk.

Your environment isn't THAT big... I would suggest that you limit each LUN to 2TB in NR10, or something like 2.3TB or 1.7TB if you want to make all LUNs equal size and have some spare space.

Either you haven't moved all your data over, or you have some serious room to grow with your space. If you have room to grow, I would simply create three 1TB thin NR10 LUNs and leave the rest of the space unprovisioned for future use (you might want to provide direct iSCSI access to a guest VM for something like SQL clustering or file server clustering in the future).

I don't know if it's possible, but if you can afford the downtime on the SAN, your environment is small enough that I would suggest you delete it and set it up with a better naming convention. VSA-1 for your management group AND cluster is just confusing if you ever add a second cluster. Name the management group something like VSA-SAN or VSA-MGT-GRP or COMPANY-SAN-1. Unfortunately the management group can't be renamed, so you either have to delete the group or leave it as is. The cluster can be renamed, and at the least it should be. Call it something like VSA-Cluster-1 or Cluster-1 or anything other than the same name as your management group. At my company we used COMPANY-SAN -> COMPANY-Cluster-# -> COMPANY-VSA-#

As far as disk RAID with network RAID on top sounding really inefficient... it is. If you use NR10 and R10, you end up with a 1:4 ratio of usable to raw space, which is just awful, but what you get for that is something that is just 100% bulletproof. So your question becomes: what costs more, a couple of disks and a little power, or system downtime? For most places with <50TB usable space required, the answer is easy... AVAILABILITY. The answer is the same over 50TB, but then you can get that availability through other SAN solutions that might be more efficient than this one.
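
To make the 1:4 figure concrete, here is a minimal arithmetic sketch. It assumes both layers are plain two-way mirrors (RAID 10 on the disks, Network RAID 10 across nodes); the 16TB raw figure and the function name are made up for illustration and are not taken from this environment.

```python
def usable_fraction(disk_copies: int, network_copies: int) -> float:
    """Fraction of raw capacity left when data is mirrored at both layers.

    Each layer divides capacity by the number of copies it keeps.
    """
    return 1.0 / (disk_copies * network_copies)


# Assumption: RAID 10 keeps 2 copies on disk, NR10 keeps 2 copies across nodes.
fraction = usable_fraction(disk_copies=2, network_copies=2)

raw_tb = 16.0  # hypothetical raw capacity across all nodes
usable_tb = raw_tb * fraction

print(fraction)   # 0.25, i.e. 1:4 usable to raw
print(usable_tb)  # 4.0
```

Swapping the disk layer for a parity RAID level would change the first factor, so treat the number as a rough planning figure only.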

banaie
Occasional Advisor

Re: Serious Wrong "Overprovisioned" Error on VSA 2014

Thanks again for your precious points.

I was thinking that presenting more LUNs to VMware might create a performance issue. So I was trying to present the storage as one whole LUN and manage it from the different VMware clusters that have access to the same storage. I thought it might help me avoid wasting the unused space that would be left on each LUN.

Moreover, one of my friends told me that thin provisioning does not work correctly in the VSA (and that it's the cause of my problem with the storage). So you don't support this argument, and you say that it's perfectly safe to use this feature?

oikjn
Honored Contributor

Re: Serious Wrong "Overprovisioned" Error on VSA 2014

It works and it doesn't. The problem is that the SAN does NOT support space reclamation, so once a thin volume expands, it's expanded forever. If you do a complete secure format of a thin LUN, it will expand to 100% capacity and stay that way. So if your definition of thin provisioning requires space reclamation, then yes, it is broken. My biggest problem with this SAN is that it lacks this feature, but it's something you can work around once you know its limitation, and it was definitely a contributing factor in your problem here.
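
A toy model of that behaviour, as a sketch only: the `ThinVolume` class and its block-level bookkeeping are hypothetical and say nothing about how the SAN is actually implemented. It just shows why a volume with no reclamation stays expanded after the filesystem deletes its data.

```python
class ThinVolume:
    """Toy model of a thin volume on a SAN without space reclamation."""

    def __init__(self, size_blocks: int) -> None:
        self.size_blocks = size_blocks
        self.allocated: set[int] = set()  # blocks the SAN has backed with real storage
        self.in_use: set[int] = set()     # blocks the filesystem considers live

    def write(self, block: int) -> None:
        if not 0 <= block < self.size_blocks:
            raise ValueError("block out of range")
        self.allocated.add(block)  # first write to a block claims real space
        self.in_use.add(block)

    def delete(self, block: int) -> None:
        # The filesystem frees the block, but with no UNMAP the SAN is never
        # told, so the block stays allocated.
        self.in_use.discard(block)


vol = ThinVolume(size_blocks=1000)
for block in range(1000):
    vol.write(block)   # write a temporary file to a fresh block each time
    vol.delete(block)  # then delete it

print(len(vol.in_use))     # 0    -> the host sees an empty volume
print(len(vol.allocated))  # 1000 -> the SAN sees a full one
```

The host-side view and the SAN-side view diverge in exactly the way described in the original post: almost empty on the VMware side, full in CMC.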

 

That said, you SHOULD ALWAYS use thin provisioned volumes on this SAN because it is required for creating thin provisioned snapshots. What you should be careful of is your overprovisioning rate. Don't present 100TB of space when you only have 10TB unless you know your usage pattern really well and how it interacts with the SAN, because as you saw, you can quickly "run out of space" even when you should have plenty of space available. There is generally no good reason to have anything so highly overprovisioned unless you are talking about a huge number of very small LUNs.

 

Your intuition about consolidating LUNs is actually backwards... the more LUNs you make, the more connections the nodes can make to the SAN, so the more load balancing they can do... of course this assumes you are within the limitations of your switch capacity and iSCSI connection limits. The drawback you mention about wasted free space on each LUN is what you should be able to make up with thin provisioning. If you do this, I would suggest you also create a single thick LUN that is something like 50GB in size just as an emergency space that you can quickly delete if you do manage to fill your SAN again.

I would also suggest that you design your LUN sizes as if you were using FULL provisioning and oversize them only as little as required for your application, then make sure that you leave at least enough available space on the SAN to cover one LUN filling up its free space, plus a little extra. For example, let's say you have 10 LUNs at 100GB in size that you want to keep with 30GB free space in them. That means you have a total of 700GB of expected data and 300GB of expected free space. Given what I said above, you should be able to get away with something like 750GB of usable storage, which would be overprovisioned to present like 1000GB. This assumes you get some experience with what causes and doesn't cause LUNs to expand... like don't try a secure erase on a LUN, as that will quickly fill it to max capacity on the SAN.
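
The sizing rule above can be sketched as a small calculation. The function name and the 20GB of "a little extra" are assumptions chosen so the result lands on the roughly 750GB mentioned in the example; pick your own margin.

```python
def size_thin_pool(lun_count: int, lun_size_gb: int,
                   free_per_lun_gb: int, extra_gb: int) -> dict:
    """Estimate usable SAN space needed for equally sized thin LUNs.

    Rule of thumb from the post: cover the expected data, plus one LUN
    filling up all of its free space, plus a small extra margin.
    """
    presented_gb = lun_count * lun_size_gb
    expected_data_gb = lun_count * (lun_size_gb - free_per_lun_gb)
    required_gb = expected_data_gb + free_per_lun_gb + extra_gb
    return {
        "presented_gb": presented_gb,
        "expected_data_gb": expected_data_gb,
        "required_gb": required_gb,
        "overprovision_ratio": presented_gb / required_gb,
    }


# 10 LUNs of 100GB, each kept with 30GB free; extra_gb=20 is an assumed margin.
plan = size_thin_pool(lun_count=10, lun_size_gb=100, free_per_lun_gb=30, extra_gb=20)

print(plan["presented_gb"])      # 1000
print(plan["expected_data_gb"])  # 700
print(plan["required_gb"])       # 750
```

That works out to presenting about 1.33 times the usable space, far below the 50TB over 14TB in the original setup. The emergency thick LUN would come on top of the required figure.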

 

 

Bart_Heungens
Honored Contributor

Re: Serious Wrong "Overprovisioned" Error on VSA 2014

Hi,

 

Know that HP recently announced at VMworld that space reclamation will be included in the next update of the LeftHand OS software... No release date was specified; I assume sometime at the beginning of next year? To be continued...

 

 

Kr,

Bart

My blog: http://blog.bitcon.be
oikjn
Honored Contributor

Re: Serious Wrong "Overprovisioned" Error on VSA 2014

I'd love that if it were true, Bart. Hopefully they are better at holding their schedule than they were for v10.5... or v11... or v11.5. I think they average 9 months late on their projections :(

I'm at a coin flip on whether I'm going to renew support for the upcoming year or try to migrate to something else, as even the built-in iSCSI target in Windows is getting close to feature parity!

Bart_Heungens
Honored Contributor

Re: Serious Wrong "Overprovisioned" Error on VSA 2014

With all the previous versions there were only 'rumours' as far as I know; now it is shown on slides at VMworld... So let's hope indeed, a lot of customers are waiting for it... And with the recent firmware update on MSA, which supported space reclamation from day 1 at release, StoreVirtual must surely follow...

banaie
Occasional Advisor

Re: Serious Wrong "Overprovisioned" Error on VSA 2014

Thanks to your discussion, I think I might know what caused this problem for me.

We are using VMware Horizon View on the storage to deliver virtual desktops.

As far as I know, it uses linked clones to save space. Could it be that when it creates a disk on the storage and then deletes it, the space isn't released on the storage, and that causes this problem?

Is there any way to avoid this problem? If this is the case, I think it would occur often for me.

oikjn
Honored Contributor

Re: Serious Wrong "Overprovisioned" Error on VSA 2014

I'm not familiar with that so I can't say one way or the other, but if it writes to a new spot on the LUN and then deletes it, then writes to another new spot on the LUN and deletes it, and rinse and repeat, it COULD cause a thin LUN to expand, depending on how it was written.

You might want to check whether there is a way to take advantage of the LUN clone feature of the SAN to make that function happen at the SAN level, as snapshots and clones are almost always more efficient at that level.

The best thing to do is create a smaller LUN, then test and monitor to find out what is going on, but make sure you don't overprovision until you know you have your disk growth under control.

For me, I have a couple of VMs whose write profile just causes LUNs to expand to 100% capacity, so I had to isolate them onto their own smaller LUNs that I know will consume 100% of their provisioning. The rest of my LUNs have all been steady, with their growth following almost exactly what their actual data growth has been. The trick at the beginning was identifying those expanding LUNs and limiting their max size before they consumed all SAN capacity.

banaie
Occasional Advisor

Re: Serious Wrong "Overprovisioned" Error on VSA 2014

Thanks a lot.

Now I'm almost sure that the problem arises from thin provisioning with VMware Horizon View. It relies on space reclamation to save space, and since that doesn't work on the VSA, it makes the storage expand.

As evidence for this, I have seen that VMware View initiates a Wipe Flex-SE command on the storage from time to time. That must be the exact troublemaker!

I think that if I restrict VMware View to a limited LUN that can't expand very much, it should help me solve the problem and avoid it in the future.

 

 

I'm really grateful for your points, especially you "