03-11-2009 07:54 AM
SAN got slow suddenly
I noticed recently that the Data Protector backups were taking 4 to 5 times longer to complete for the same amount of data.
To troubleshoot the issue I tested an RMAN backup directly to the server's disks, and it also takes more than twice as long as it did before the issue.
This has been happening since last month. Any ideas?
Thank you,
Eric Antunes
03-11-2009 08:10 AM
Re: SAN got slow suddenly
There are a lot of possible reasons for that, including storage utilization by other hosts, location, disk failures, and MPxIO-related issues.
Check the service times for your disks, the transfer rate, etc. If you have historical performance data, compare against it.
03-11-2009 08:31 AM
Re: SAN got slow suddenly
This is a 1.5 TB SAN.
It is part of an EVA 4000 with 2 HP-UX 11iv1 servers, 2 Windows servers (Windows Server 2003), 2 32-bit blades (Windows Server 2003), and 1 64-bit blade (Windows Server 2003).
It has the following hardware:
- 1 "HP 2-ch U320 SCSI HBA" (whatever that means);
- 1 "HP StorageWorks 4/8 Base SAN Switch";
- 1 "HP Universal Rack 10642 G2 Shock ALL";
- 1 "Mod PDU 16A HV WW ALL";
- 1 "HP EVA4000-A 2C1D Array";
- 8 "HDD 146GB FC 10K 1' ALL ADD-ON ALL";
- And, finally, last December we added 1 "HDD ~292GB (not sure about the exact size) FC 10K 1' ALL ADD-ON ALL".
Eric
03-11-2009 10:52 AM
Re: SAN got slow suddenly
- 8 "HDD 146GB FC 10K 1' ALL ADD-ON ALL";
- And, finally, last December we added 1 "HDD ~292GB (not sure about the exact size) FC 10K 1' ALL ADD-ON ALL".
Since you only have 9 drives listed, they must be in the same disk group. You will have 1373 GiB of total space in the group (protection level of None), but fully 20% of the space is held on the 300GB drive.
20% of the space means 20% of the data, which means 20% of the I/O. Your 300GB drive is badly hot-spotted.
Your best solution is to get two 146GB drives, add them to the disk group, and remove the 300GB drive. That configuration will add about the same space, but each of the 10 drives will have 10% of the space, data, and I/O.
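The percentages above can be checked with a few lines of arithmetic. This is only a sketch: it uses the nominal drive sizes from the hardware list (146 GB and 300 GB) and assumes, as the post does, that data and I/O spread across the disk group in proportion to capacity. The function name is illustrative, not part of any HP tool.

```python
def capacity_shares(drives_gb):
    """Return each drive's fraction of the disk group's raw capacity."""
    total = sum(drives_gb)
    return [size / total for size in drives_gb]


# Current group: eight 146 GB drives plus one 300 GB drive (nominal sizes).
current = [146] * 8 + [300]
# Proposed group: ten 146 GB drives, the 300 GB drive removed.
proposed = [146] * 10

big_drive_share = capacity_shares(current)[-1]
small_drive_share = capacity_shares(current)[0]
balanced_share = capacity_shares(proposed)[0]

# Assumption: I/O load follows capacity share, so these are also I/O shares.
print(f"300 GB drive today:   {big_drive_share:.1%}")
print(f"each 146 GB today:    {small_drive_share:.1%}")
print(f"each drive, proposed: {balanced_share:.1%}")
```

Under that assumption the single large drive carries about twice the load of any of its neighbours, which is the hot spot described above.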
03-11-2009 10:56 AM
Re: SAN got slow suddenly
Is the source and destination storage the same, or is the backup local?
Have you tried doing read and write performance tests only with dd, using /dev/zero and /dev/null?
What is your disk service time (avserv) as reported by sar -d 5 1000 during the backup?
If you have multiple paths, have you verified which path is being used, and tested forcing the use of the second path? We had a performance problem related to one HBA failure in a multi-HBA host.
Also, if the other hosts connected to the storage changed their load, they can affect the overall storage performance. The evaperf tool may help you check the status of your storage.
03-12-2009 04:21 AM
Re: SAN got slow suddenly
It turns out we have one more "HDD 146GB FC 10K 1' ALL ADD-ON ALL". So we have the following:
- 9 "HDD 146GB FC 10K 1' ALL ADD-ON ALL";
- 1 "HDD ~292GB (not sure about the exact size) FC 10K 1' ALL ADD-ON ALL".
Yes, our 292GB disk gets roughly 18% of the total I/O. You made a good point, but we can't add 146GB disks anymore because they were discontinued.
I noticed that this same 292GB disk has older firmware (H02) than the other disks (H03).
Ivan:
"Is the source and destination storage the same? or the backup is local?"
Yes, the source and destination storage was the same: from SAN to SAN, not local.
"Have you tried doing read and write performance test only with dd, using /dev/zero and /dev/null?"
What do you mean by "only with dd"? No activity on the server besides dd?
"What is your disks service time (avserv) as reported by sar -d 5 1000, during the backup?"
Bad, very bad:
"If you have multiple paths, have you verified which path is using, and tested forcing the usage of the second path? We had a performance problem related to one HBA failure in a multi HBA host."
We have only one HBA. Do we have a bottleneck here?
Will check the evaperf tool this afternoon ...
Thanks,
Eric
03-12-2009 04:28 AM
Re: SAN got slow suddenly
         device   %busy   avque   r+w/s   blks/s   avwait   avserv
Average  c9t0d0    0.46    0.63      13      220     0.04     0.42
Average  c4t0d1   14.95    1.24      56     1678     0.50     5.07
Average  c4t0d2    0.60    0.50      11      681     0.01     1.09
Average  c4t0d3    0.01    0.50       0        0     0.00     5.58
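To compare these averages between runs, the lines can be parsed with a short script. This is a rough sketch that assumes the usual HP-UX `sar -d` column order (device, %busy, avque, r+w/s, blks/s, avwait, avserv) and that blks/s counts 512-byte blocks; the `SarDisk` type and `parse_average` helper are made-up names for illustration.

```python
from typing import NamedTuple


class SarDisk(NamedTuple):
    device: str
    busy_pct: float
    avque: float
    rw_per_s: float
    blks_per_s: float
    avwait_ms: float
    avserv_ms: float


def parse_average(line):
    """Parse one 'Average <device> ...' line from sar -d output."""
    fields = line.split()
    # fields[0] is the literal "Average"; fields[1] is the device name.
    return SarDisk(fields[1], *(float(value) for value in fields[2:8]))


# The four Average lines posted above, copied as-is.
SAMPLE = """\
Average c9t0d0 0.46 0.63 13 220 0.04 0.42
Average c4t0d1 14.95 1.24 56 1678 0.50 5.07
Average c4t0d2 0.60 0.50 11 681 0.01 1.09
Average c4t0d3 0.01 0.50 0 0 0.00 5.58
"""

rows = [parse_average(line) for line in SAMPLE.splitlines()]
for row in rows:
    # Assumption: one block is 512 bytes, so blks/s * 512 / 1024 gives KiB/s.
    kib_per_s = row.blks_per_s * 512 / 1024
    print(f"{row.device}: busy={row.busy_pct}% avserv={row.avserv_ms} ms "
          f"throughput~{kib_per_s:.0f} KiB/s")
```

Keeping the parsed rows from a slow run and a normal run side by side makes it easier to see whether avserv or throughput is what changed.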
03-12-2009 06:36 AM
Re: SAN got slow suddenly
Also check your RMAN parallelism configuration and your overall system resource usage while the backup runs.
I would like to see some performance data collected with sar during your backup session: disk, memory, and CPU.
Also, please specify which device is your SAN disk in the output of sar -d.
03-12-2009 09:26 AM
Re: SAN got slow suddenly
Here are the stats collected during the RMAN backup (parallelism=4).
The SAN physical volumes are c4t0d1 and c4t0d2.
Eric
03-12-2009 11:50 AM
Re: SAN got slow suddenly
I have found this:
============================================
What is your database server doing during the backup? Is the CPU busy? Does it page a lot; are you getting large amounts of I/O wait? Setting filesperset=100 may be a mistake. If you're seeing a high %WIO, try setting this to the number of disks on your system (or less). Unless your disks are more than 10 years old, serial reads on one or two disks should be able to easily exceed the bandwidth of most networks.
It is recommended that 2 channels per device are allocated, but test to verify the performance improvement.
============================================
See also:
http://www.sc.ehu.es/siwebso/KZCC/Oracle_10g_Documentacion/server.101/b10734/rcmtunin.htm
Maybe there is some tuning that you should do to the RMAN backup process.
Please also report the results of the following (preferably on a file system and on a raw device you can afford to destroy):
FS
date; time dd if=/dev/zero of=testfile bs=8k count=131072
RAW
date; time dd if=/dev/zero of=/dev/rdsk/
FS
date; time dd if=/dev/zero of=testfile bs=256k count=4096
RAW
date; time dd if=/dev/zero of=/dev/rdsk/
Again, please include the sar output during the tests. Remember, writing to the raw device will destroy its data.
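To turn the `time` results of the two file system runs into a throughput figure, something like the following can be used. It is a minimal sketch: the block sizes and counts come from the dd commands above, while the elapsed time is a made-up placeholder to be replaced with the "real" value that `time` reports.

```python
def total_bytes(block_size_bytes, count):
    """Bytes written by dd for a given bs and count."""
    return block_size_bytes * count


def throughput_mib_s(block_size_bytes, count, elapsed_seconds):
    """Average write throughput in MiB/s for one dd run."""
    return total_bytes(block_size_bytes, count) / elapsed_seconds / (1024 * 1024)


# bs=8k count=131072 and bs=256k count=4096 from the commands above.
small_block_run = total_bytes(8 * 1024, 131072)
large_block_run = total_bytes(256 * 1024, 4096)

# Both runs write the same amount of data (1 GiB), so only the I/O size differs.
print(small_block_run, large_block_run)

# Hypothetical elapsed time in seconds; substitute the measured value.
example_elapsed = 20.0
print(f"{throughput_mib_s(8 * 1024, 131072, example_elapsed):.1f} MiB/s")
```

Because the two runs move the same volume of data, a large gap between their throughputs points at per-I/O overhead rather than raw bandwidth.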
03-12-2009 11:23 PM
Re: SAN got slow suddenly
"You made a good point but we can't add 146Gb disks anymore because they were discontinued"
It's true that 146GB 10K disks are discontinued, but you can add two or more 146GB 15K disks, which is a better solution than the 300GB disk.
364621-B22 / 146 GB 15K rpm dual-port 2/4 Gb/s FC-AL 1-inch (2.54 cm) drive.
Regards
Simon
03-16-2009 01:33 AM
Re: SAN got slow suddenly
Here are the "sar -d" stats collected Friday morning while doing some RMAN archivelog backups.
c4t0d1 and c4t0d2 (SAN physical devices) are 10 times slower than the local physical device, c9t0d0, which has all the redo log activity.
It is not always slow (my Thursday stats were OK), but it is frequently slow.
Eric
03-20-2009 09:42 AM
Re: SAN got slow suddenly
Do you have any EVAperf data? Ivan's comments on the disk sizes are accurate. I know that HP best practice is to keep different-size disks in different disk groups. There are several reasons behind that.
Equally important, I think you need to determine whether the problems you are having are SAN-based or host-based. The only way to do this is to look at performance logs from each and then start ruling out one or the other.
On the SAN side, I would look at the I/O of the disk group. 9 disks do not give a lot of I/O capability. If you are pushing the I/O capacity of the disk group, then you will probably run into high disk queues and busy percentages. For example, nine 146GB 10K drives will give you approximately 484 I/Os per second in total at about a 60/40 read/write mix and about 661 I/Os per second at 80/20 read/write. That really isn't a lot of I/O.
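The post does not say how the 484 and 661 figures were derived. One common rule of thumb divides the raw IOPS of the spindles by a read/write mix weighted with a RAID write penalty. The sketch below uses that rule with two values that are purely my assumptions, roughly 118 IOPS per 10K drive and a write penalty of 4 (typical for single-parity RAID); with them the results land close to the quoted numbers, but this is not necessarily the method the poster used.

```python
def effective_iops(n_drives, iops_per_drive, read_fraction, write_penalty):
    """Rule-of-thumb host-visible IOPS for a disk group.

    Each host read costs one back-end I/O; each host write costs
    `write_penalty` back-end I/Os.
    """
    raw_iops = n_drives * iops_per_drive
    write_fraction = 1.0 - read_fraction
    return raw_iops / (read_fraction + write_fraction * write_penalty)


# Assumed values, not taken from the thread or from HP documentation.
ASSUMED_IOPS_PER_10K_DRIVE = 118
ASSUMED_WRITE_PENALTY = 4

mix_60_40 = effective_iops(9, ASSUMED_IOPS_PER_10K_DRIVE, 0.60, ASSUMED_WRITE_PENALTY)
mix_80_20 = effective_iops(9, ASSUMED_IOPS_PER_10K_DRIVE, 0.80, ASSUMED_WRITE_PENALTY)

print(f"60/40 read/write: about {mix_60_40:.0f} IOPS")
print(f"80/20 read/write: about {mix_80_20:.0f} IOPS")
```

Whatever the exact inputs, the shape of the formula shows why a write-heavy backup target exhausts a small disk group faster than a read-heavy workload.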
As for having one 300GB 10K drive in there, this can eventually lead to performance issues on the array. RSS IDs are formed in sets of 6 to 12 disks. Within that RSS ID, you probably have a 146GB drive that mirrored itself to the 300GB drive. You are probably losing half the space on the 300GB drive, in addition to it taking 20% of the I/O. Also, when you set up a disk group, an EVA will take the largest drive in the disk group and reserve double that space for protection (the EVA assumes worst-case RAID1). If you aren't using any type of protection, then keep an eye on your space to make sure that you can lose at least 300GB of space.
My recommendation would be to run some EVAperf captures and perfmon from the host side. There are hundreds of possibilities here that could be the issue, but you need to know what your EVA is doing.
03-26-2009 09:25 AM
Re: SAN got slow suddenly
It is surely a SAN issue, since almost all of the server's I/O got slow suddenly.
I will explore all the suggestions, if allowed.
Thank you,
Eric Antunes