MSA Storage

Slow write operations on Storage MSA 1040

 
makarand178
Visitor

Hi,

We have RAID 5 configured and a heavy write workload on the storage. One mount point is 10 TB in size, and it receives a very large volume of writes. Write speed is normally around 300 MB/s, but it suddenly degraded to 25-40 MB/s without any change in configuration. Can anyone suggest what the root cause might be and how this can be improved?

9 REPLIES
HPSDMike
HPE Pro

Re: Slow write operations on Storage MSA 1040

Well, this could certainly be a lot of things, including MSA errors or things completely outside the MSA (network, hosts, newly applied patches, etc.). Is a single host writing this traffic? Is it a VM? Did that VM get moved to a different physical host than it was on before? I'll private message you a link to an FTP site; if you want to drop your MSA logs there, I can see whether I notice anything obvious in them. You can also consider opening a case with HPE support.



I work for HPE. The comments in this post are my own and do not represent an official reply from the company. No warranty or guarantees of any kind are expressed in my reply.


makarand178
Visitor

Re: Slow write operations on Storage MSA 1040

Thanks a lot for your response.

As of now, a single host is writing to those mount points (in the future, two hosts will write). It is not a VM. We checked the write speed using the 'dd' command and found that it drops to 5-10 MB/s.
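For anyone who wants to repeat this kind of check, here is a minimal Python sketch of a sequential write test similar in spirit to a 'dd' run. It is not the exact command used in this thread (that was not posted). The function name, the 64 MiB test size, and the 1 MiB block size are assumptions chosen for illustration. It calls fsync before stopping the clock so that the host page cache does not inflate the result.

```python
import os
import tempfile
import time


def measure_write_throughput(directory, total_bytes=64 * 1024 * 1024,
                             block_size=1024 * 1024):
    """Write total_bytes sequentially into a temp file in `directory`.

    Returns the observed throughput in MiB/s. Sizes are illustrative
    defaults, not values taken from the thread.
    """
    block = b"\0" * block_size
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        written = 0
        with os.fdopen(fd, "wb") as f:
            while written < total_bytes:
                chunk = block[: min(block_size, total_bytes - written)]
                f.write(chunk)
                written += len(chunk)
            f.flush()
            # Force the data out of the page cache before stopping the clock.
            os.fsync(f.fileno())
        elapsed = time.perf_counter() - start
    finally:
        # Always clean up the test file, even if the write failed.
        os.remove(path)
    return written / (1024 * 1024) / elapsed


if __name__ == "__main__":
    # Replace with the path of the mount point under test (assumption:
    # the system temp directory is used here only so the sketch runs anywhere).
    target = tempfile.gettempdir()
    print(f"{measure_write_throughput(target):.1f} MiB/s")
```

Running it against a fast and a slow mount point with the same sizes gives a like-for-like comparison. A test file much larger than the controller cache is likely to be more representative of sustained writes.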

Let me know if I need to check anything else.

HPSDMike
HPE Pro

Re: Slow write operations on Storage MSA 1040

I private messaged you an FTP link to upload the logs. I don't see them out there as of now.




makarand178
Visitor

Re: Slow write operations on Storage MSA 1040

I have uploaded the logs for object name mmr_kc-0127549. The MD5 sum of the upload is 2ed126edde40b8b4763d688ada98157b
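A checksum like the one above can be verified on the receiving side. Here is a minimal Python sketch using the standard hashlib module. The function name and the chunk size are assumptions for illustration, and the log file name is not given in the thread, so it is left as a parameter.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large log bundles do not need to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Comparing the returned string with the posted MD5 sum confirms that the file arrived intact.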

I was not able to follow the instructions for the other object name, mmr_kc-0134161. After right-clicking on System in the configuration panel, I cannot see any 'Tool' option to download the logs.

HPSDMike
HPE Pro

Re: Slow write operations on Storage MSA 1040

Looking through the logs, the array currently looks healthy (though I am not a tech support engineer). You have 24 x 2TB NL disks. The disks are broken up into two linear vdisks, with a variety of volumes created on the vdisks. At the time of your capture, it looks like all host traffic was coming in to volumes on the vd01 vdisk (controller A). This means that, at the time, only 12 disks, one controller, and two host ports were participating in the workload. Depending on the workload type (random vs. sequential) and block size (8k vs. 256k), I calculate that the array should be able to perform between 9 and 906 MB/s with 12 disks. Has your host workload changed? Has data fragmentation made it more random than before?
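To show why block size and randomness move the number by orders of magnitude, here is a rough back-of-envelope sketch in Python. It is not the sizing calculation used above and is not meant to reproduce the 9 to 906 MB/s range. The per-disk IOPS figure (75 for a 7200 RPM nearline disk) and the RAID 5 small-write penalty of 4 are common rules of thumb, used here purely as assumptions.

```python
def random_write_mib_per_s(disks, iops_per_disk, block_kib, write_penalty):
    """Estimate host-visible random write throughput in MiB/s.

    Each host write costs `write_penalty` back-end I/Os (rule of thumb:
    4 for small random writes on RAID 5), so the usable host IOPS is the
    aggregate disk IOPS divided by that penalty.
    """
    host_iops = disks * iops_per_disk / write_penalty
    return host_iops * block_kib / 1024


if __name__ == "__main__":
    # Assumed figures, not measurements from this array.
    for block_kib in (8, 256):
        estimate = random_write_mib_per_s(12, 75, block_kib, 4)
        print(f"{block_kib} KiB random writes: ~{estimate:.1f} MiB/s")
```

Under these assumptions, the same 12 disks deliver 32 times more throughput at 256 KiB than at 8 KiB, which is why a workload that drifts from large sequential writes toward small random writes can look like a sudden slowdown.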

Looking at the event logs, it appears all sorts of stuff was happening towards the middle to end of December (management ports up and down,  host ports up and down, power supply errors, etc). What was happening during that time and is that when you noticed the issue starting? 




HPSDMike
HPE Pro

Re: Slow write operations on Storage MSA 1040

Also, for what it's worth, the firmware is quite old. GL225R003 is the most current and can be found here

http://h41111.www4.hpe.com/storage/msafirmware.html

 




makarand178
Visitor

Re: Slow write operations on Storage MSA 1040

Your analysis is 100% correct. Due to some network issues, the other host is kept disabled. Hence, all data is being written through Controller A. The workload has not changed. No defragmentation has been performed on the disks.

We went live in production in mid-December. From then until the end of December, there were many issues from a network, application, and power perspective. Everything has now been sorted out, but the storage has started responding very slowly. From 31 Dec 2017, everything was running as expected, but around 4-5 Jan 2018 we observed that one of the 10 TB mount points had a very low write speed. As a workaround, we moved the write operations to the other 10 TB mount point, which was already in use. Now both 10 TB mount points are giving very poor performance. The other mount points, of 500 GB each, are doing fine and giving a write speed of approximately 300 MB/s. Today, we shifted the complete load onto another 500 GB mount point and it is working fine.

1. Could a firmware upgrade be a solution?

2. Would defragmentation help here? If yes, is it risky to perform on a live production environment? Can you give the steps and precautions?

3. Could the large size of the mount point (10 TB) be a cause of this issue? If yes, is it possible to break the 10 TB mount point down into several smaller ones without corrupting the existing data?

4. Can you share the steps (commands) to identify how much space is left to create new mount points?

Thanks for your support (I hope I am not asking too much!)

makarand178
Visitor

Re: Slow write operations on Storage MSA 1040

Can you share the steps to upgrade the firmware? Is it possible to upgrade the firmware with production live, or do I need to take downtime?

HPSDMike
HPE Pro

Re: Slow write operations on Storage MSA 1040

The firmware update can be done online so long as the host has working paths to both controller A and B. There may be some performance degradation during the upgrade, and for a little while after the upgrade, so it is best to conduct the upgrade during off-peak time. I would suggest downloading the Windows GUI Updater and running it from a Windows device that has network access to the management ports. The GUI installer will do the whole thing for you; you just type in the IP address, user name, and password. It makes sure everything is done in the right order. Instructions are shown on the "Installation Instructions" tab of the link I provided in case you need them.

Remember that your MSA is configured with "linear" storage. This means you are configuring your RAID sets (vdisks) and creating volumes on the vdisks. I can have one volume on the vdisk or 100 volumes on the vdisk, but if one is being crushed (I/O wise) then all of the volumes on the same vdisk should experience similar behavior. This is because, at the end of the day, the same 12 disks are handling the workload from all volumes. Also, a given controller owns each vdisk. Since you have two vdisks, it's easy in your case: Controller A owns one vdisk and Controller B owns the other.

So, your best bet would be to spread out your heavy-hitter volumes (or mount points) between the two vdisks so they each have 12 disks to work with and their own paths (via a different SAS cable and controller). Or, I suppose you could take the alternate approach: if you have one volume that is completely crushing the performance and impacting all other mount points/volumes, then you could consider moving the rest of the volumes to the other vdisk so that they can operate properly even if one is getting crushed.

At the end of the day it comes down to properly planning the infrastructure to support the application requirements. Your maximum performance (for any one volume) is currently only 12 x nearline 7200 RPM disks. For large-block sequential stuff (backups, video, etc.) that might be OK, but for smaller-block and random stuff that isn't a lot of performance potential. You may get to the point where you need to redesign your approach and start using SSDs or faster spinning disks. Maybe you move to virtual storage and install more spindles so that the volumes get to benefit from more overall devices/spindles than they do now. If this is all you can afford, then maybe you rearchitect to put all 24 disks in the same vdisk (RAID50), or move to virtual storage and use pools with multiple RAID5 11+1 vdisks, etc.
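The idea of spreading heavy-hitter volumes across the two vdisks can be sketched as a simple greedy placement in Python. This is only an illustration of the planning step, not an MSA tool or command. The volume names and load figures are hypothetical, and "vd02" is an assumed name for the second vdisk (only vd01 is named in this thread).

```python
def spread_volumes(volume_load_mb_s, vdisks=("vd01", "vd02")):
    """Assign volumes to vdisks, heaviest first, always picking the
    vdisk with the lowest total load so far.

    Returns (placement, totals): a dict of vdisk -> list of volumes and
    a dict of vdisk -> summed load in MB/s.
    """
    placement = {name: [] for name in vdisks}
    totals = {name: 0.0 for name in vdisks}
    by_load = sorted(volume_load_mb_s.items(), key=lambda kv: kv[1], reverse=True)
    for volume, load in by_load:
        # On a tie, min() keeps the first vdisk in the given order.
        target = min(vdisks, key=lambda name: totals[name])
        placement[target].append(volume)
        totals[target] += load
    return placement, totals


if __name__ == "__main__":
    # Hypothetical volumes and observed write loads in MB/s.
    loads = {"data10tb_a": 300, "data10tb_b": 280, "small_1": 20, "small_2": 10}
    placement, totals = spread_volumes(loads)
    for vdisk in placement:
        print(vdisk, placement[vdisk], totals[vdisk])
```

With these made-up numbers the two 10 TB volumes land on different vdisks, so each gets its own 12 disks and its own controller. Actually moving data between vdisks on a linear MSA is a separate migration task that this sketch does not cover.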

Do you have firm application characteristics/requirements that you can use to make sure you have the appropriate configuration in place? Best wishes. 


