MSA Storage

Serious Performance Issue on SAN Infra

New Member

SAN #1
- 5xProLiant servers (4xDomino R6, 1xFile/Print)
- 5xFCA2101

SAN #2
- 5xProLiant servers (unknown, probably 2xDCs)
- 5xFCA2101

SAN Router/Switch:
NSR N1200
Switch 2/16-EL (pri)
Switch 2/16-EL (backup)

SAN Storage:

We have a major problem on SAN #1: disk performance is very poor, which results in freezing (the system simply stops responding for a few seconds, though this is rare) or in operations taking a long time to respond. We have monitored the servers and ruled out a server resource issue: all servers have 4 processors and 2GB of memory, and at peak, CPU utilization is below 30% and memory utilization is below 70%.

We have also observed the following error messages in the event log:

51: An error was detected on device \Device\Harddisk1\DR3 during a paging operation.

9: The device, \Device\Scsi\CPQKGPSA1, did not respond within the timeout period.

These error messages keep appearing at random times throughout the day, at no fixed interval.
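
One way to test the "no fixed interval" impression is to bucket the event 9 and event 51 entries by hour of day and look for clusters. The following is a minimal sketch only: it assumes the event log has been exported to (timestamp, event ID) pairs in a `YYYY-MM-DD HH:MM:SS` format, and the sample entries are made up for illustration, not taken from the real log.

```python
from collections import Counter
from datetime import datetime

def errors_per_hour(events, event_ids=(9, 51)):
    """Count matching events per hour of day (0-23)."""
    counts = Counter()
    for timestamp, event_id in events:
        if event_id in event_ids:
            hour = datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S").hour
            counts[hour] += 1
    return counts

# Hypothetical sample data, NOT from the real event log.
sample_events = [
    ("2004-01-12 02:15:07", 51),
    ("2004-01-12 02:15:09", 9),
    ("2004-01-12 14:40:31", 9),
    ("2004-01-13 02:03:55", 51),
    ("2004-01-13 09:12:00", 7),  # unrelated event ID, ignored
]

hourly = errors_per_hour(sample_events)
for hour in sorted(hourly):
    print(f"{hour:02d}:00  {hourly[hour]}")
```

If most of the errors pile up in a few hours, they are probably tied to a scheduled job rather than being truly random.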

We would appreciate it if the experts out there could take a look and offer some suggestions. We have tried the local HP tech support, but it was no help: no solution to the problem was provided.

Thank You.
Stephen Kebbell
Honored Contributor

Re: Serious Performance Issue on SAN Infra

Hi Anthony,

I'm not sure I understand your SAN Layout. You have listed 2 SANs, but only 1 MSA1000. Could you perhaps provide a diagram of your SAN showing how everything is connected?

Thanks and regards,
New Member

Re: Serious Performance Issue on SAN Infra

Dear Stephen,

Pardon me. I am still trying to get the exact diagram from the relevant team.

The two SAN arrays actually share the same NSR N1200, so 10 hosts share the same NSR.

SAN #1 has its own MSA1000 and MSL5052SL. As for SAN #2, I have no idea, as that array is outside my authority, which is why I am awaiting a response from the other team.

I am now trying to narrow down whether it is a SAN box problem or a SAN switch/router problem that is severely degrading performance day by day.


Re: Serious Performance Issue on SAN Infra

How is this SAN zoned?
Have you looked at the "porterrshow" on the switch?
What is the switch firmware level?
What are the firmware and driver levels for the FCAs?
How is the NSR zoned? Are FCAs zoned for both the NSR and the MSA? If so, are backups running while the FCA is involved in writing data to the MSA?
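
To check the backup question, the error timestamps can be split into those that fall inside the backup window and those that do not. This is only a rough sketch: the 22:00 to 04:00 window and the timestamps below are hypothetical placeholders, and the timestamp format is an assumption about how the event log was exported.

```python
from datetime import datetime, time

def in_window(moment, start, end):
    """True if a time of day falls in [start, end); handles windows crossing midnight."""
    if start <= end:
        return start <= moment < end
    return moment >= start or moment < end

def split_by_backup_window(timestamps, start, end):
    """Separate error timestamps into (inside window, outside window)."""
    inside, outside = [], []
    for ts in timestamps:
        t = datetime.strptime(ts, "%Y-%m-%d %H:%M:%S").time()
        (inside if in_window(t, start, end) else outside).append(ts)
    return inside, outside

# Hypothetical backup window and error times, for illustration only.
backup_start, backup_end = time(22, 0), time(4, 0)
error_times = [
    "2004-01-12 23:30:00",
    "2004-01-13 02:15:07",
    "2004-01-13 14:40:31",
]

during, outside_window = split_by_backup_window(error_times, backup_start, backup_end)
print(f"{len(during)} during backup, {len(outside_window)} outside")
```

If nearly all the errors land inside the window, backup traffic through the NSR becomes the prime suspect; an even spread points elsewhere in the fabric.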

This could be a port storm, or some other major problem in the fabric.
Experience is the name everyone gives to their mistakes. Oscar Wilde.