Bad I/O performance help
05-07-2003 09:04 PM
Throughput on my SAN (XP512) also looks OK - there are no FC problems, and the performance issues are isolated to this one host/application only.
The symptoms are:
* woeful I/O performance, as measured by Oracle seeing waits for I/O requests (sorry - not a DBA, I have no idea how this was measured)
* terrible backup performance (Omniback currently backs up 300GB of data in about 10-12 hours to 3 tape drives, which I would expect to take 5-6 hours).
Attached are sar buffer and disk statistics for a single day. As I read it, my disk stats are OK; I have some busy disks but nothing hot. The rcache stats I find confusing - they indicate poor buffering for large parts of the day. However, some other advice I've received is to mostly ignore rcache/wcache stats for Oracle hosts, as the SGA does most of the buffering for the database. Note that I've included only 03:00-09:00 for 'sar -d' as a typical range of data. Backup completes at about 4:00 am; the reporting jobs run from 3:00 am until late morning.
This host is set to use dynamic buffer cache (dbc_max_pct=20%, or 1.8GB). Other advice received from Oracle Metalink is that the buffer cache shouldn't be more than about 250MB in size. The HP-UX rule of thumb seems to indicate about 400MB. Is it possible that a large buffer cache such as this one is causing performance problems (e.g. poor read stats)?
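As a sanity check on those numbers: dbc_max_pct is a percentage of physical RAM, so 20% coming out at roughly 1.8GB implies a machine with about 9GB of memory. The 9GB figure is my inference, not stated in the post:

```shell
# dbc_max_pct caps the dynamic buffer cache as a percentage of physical RAM.
# 20% of RAM == ~1.8GB implies about 9GB physical memory (inferred, not stated).
ram_mb=9216          # assumed physical memory in MB
dbc_max_pct=20
cache_mb=$((ram_mb * dbc_max_pct / 100))
echo "max buffer cache: ${cache_mb} MB"   # 1843 MB, i.e. ~1.8GB
```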
I've also seen various suggestions about disabling the buffer cache for Oracle index VxFS filesystems. Does anyone have experience of this, and what performance improvements are likely?
Any other feedback you can offer would be appreciated.
05-07-2003 09:15 PM
Re: Bad I/O performance help
If swap is not between 1.0 and 2.0 times physical memory, I've seen Oracle run like a dog. This experience is on systems with less memory.
I've attached a sar script that collects more data into a file, which I'd upload to Oracle and HP support for further input.
With regard to your specific suggestion, I have not tried this.
What is the RAID level on the SAN disks? If it's not RAID 10 (1/0) there is a substantial performance hit.
That kind of RAID consumes a lot of disk, but I've seen a substantial performance boost by going RAID 10 on the data.
SEP
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
05-08-2003 12:09 AM
Re: Bad I/O performance help
For the SAN we use disk striping (64KB).
cf. this link
http://forums.itrc.hp.com/cm/QuestionAnswer/1,,0x998c402f24d5d61190050090279cd0f9,00.html
Rgds,
Jean-Luc
05-08-2003 02:49 AM
Re: Bad I/O performance help
Swap is set to 1.5 times physical memory. This should be adequate for this host; we NEVER page.
The RAID level on the array is RAID-5. This is probably irrelevant, as the array is a fairly hefty XP512.
I've heard a lot of talk about setting the buffer cache to about 400MB, but no evidence that setting large values will cause performance issues. Does anyone have any information which will confirm or deny this?
05-08-2003 03:02 AM
Re: Bad I/O performance help
Has this performance problem gradually or suddenly appeared?
Also :-
Recommended Uses: RAID 5 is seen by many as the ideal combination of good performance, good fault tolerance and high capacity and storage efficiency. It is best suited for transaction processing and is often used for "general purpose" service, as well as for relational database applications, enterprise resource planning and other business systems.
For write-intensive applications, RAID 1 or RAID 1+0 are probably better choices (albeit higher in terms of hardware cost), as the performance of RAID 5 will begin to substantially decrease in a write-heavy environment.
Might be worth looking at RAID 1+0.
Paula
05-08-2003 03:58 AM
Re: Bad I/O performance help
First, let's look at your maxssiz, maxdsiz, maxtsiz and their 64-bit counterparts. Here are ours from a small (3TB) server:
maxdsiz 0X50000000
maxdsiz_64bit 0X400000000
maxfiles 2048
maxfiles_lim 2048
maxssiz 0X800000
maxssiz_64bit 0X40000000
maxswapchunks 2048
maxtsiz 0X4000000
maxtsiz_64bit 0X40000000
If yours are smaller than these, you may want to adjust your tuning parms....
05-08-2003 04:17 AM
Re: Bad I/O performance help
I don't have any experience with this, just something I remembered from class.
I would also look at network issues, such as half duplex vs. full duplex, if your backups are going over the network. We have run into this issue in the past.
05-08-2003 05:19 AM
1 - Disk queues. sar's 0.5 disk queue is generally MeasureWare zero; anything above 0.5 is a real queue. Nearly ALL your disks had queues against them, some really quite large.
2 - Service times. The XP512 probably has caching; a typical LUN's service time would be 1-4ms, and nearly ALL of your LUNs were far greater than this. At a guess you are probably using 10,000 or 15,000 rpm disks, which in their RAW state should be producing service times between 8 and 5ms. You have times of up to 25ms!! I know you are using RAID5 LUNs, but in choosing RAID5 someone thought it would be BETTER than RAID1+0 or just RAID1. These service times are not very good. ** see later
3 - Root disks. I'm guessing that your root disks are c1t6d0 & c2t6d0. These are struggling with an average queue of 3-4 and service times of 12ms. I assume they are a straight mirrored pair. I suspect that either swapping or database I/Os are going on on these disks.
A previous reply suggested that RAID5 was OK for OLTP databases. I would disagree. RAID5 is slow for small random writes/updates, because a small write to one disk within a stripe causes the WHOLE RAID5 stripe to be read, and then updated parity and data to be written. As an example, assume 6 disks in RAID5 with a stripe size of, say, 64kB. A 2kB write/update will trigger a full stripe read (384kB), then write back 64kB parity and 2kB data. Thus 2kB has caused 450kB of activity! OLTP (On-Line Transaction Processing) is precisely this: updating customer details, placing orders etc.
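The write-amplification arithmetic in that example can be checked directly (same figures as quoted: 6 disks, 64kB chunk per disk, 2kB update, full-stripe read as described):

```shell
# Worked numbers from the RAID5 example above: a 2kB update triggers a
# full-stripe read plus a parity and data write-back.
disks=6; chunk_kb=64; write_kb=2
stripe_read_kb=$((disks * chunk_kb))                 # 384 kB read
total_kb=$((stripe_read_kb + chunk_kb + write_kb))   # + 64kB parity + 2kB data
echo "${write_kb}kB logical write -> ${total_kb}kB of I/O"   # 450kB total
```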
What could be done?
o Investigate why there are such high queues on the root disks (assuming c1t6d0 & c2t6d0 are the ones).
o Check out the disk activity more thoroughly over time. I would personally use MeasureWare.
o Take some advice (HP) about the best striping & RAID methodology. We use VA74xx and get 1-3ms service times with NO disk queues. The VA74xx is cheaper and not as good as the XP512, so I would expect more out of the XP512.
o If you can get MeasureWare, check out:
GBL_PRI_QUEUE
GBL_RUN_QUEUE
GBL_DISK_SUBSYSTEM_QUEUE
GBL_MEM_QUEUE
GBL_IPC_SUBSYSTEM_QUEUE
GBL_NETWORK_SUBSYSTEM_QUEUE
Generally speaking, if you are bottlenecked on I/O, GBL_RUN_QUEUE will be > 1 and GBL_PRI_QUEUE < 0.5 or 0. This simply means there are lots of runnable processes, but they are NOT in contention for CPU (and thus are contending for I/O).
You can also analyse each of your disks in more depth. I've attached the report file I would use and a script to sort through it.
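That run-queue/priority-queue rule of thumb can be sketched as a small helper. The thresholds (> 1 and < 0.5) are the ones quoted in the post; this is an illustration, not a MeasureWare tool:

```shell
# Heuristic from the post: run queue > 1 with priority queue < 0.5
# suggests processes are runnable but waiting on I/O, not CPU.
classify() {
  awk -v r="$1" -v p="$2" 'BEGIN {
    if (r > 1 && p < 0.5) print "likely I/O bound";
    else print "not clearly I/O bound"
  }'
}
classify 3.2 0.1
classify 0.4 0.1
```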
05-08-2003 05:21 AM
Re: Bad I/O performance help
05-08-2003 05:49 AM
Re: Bad I/O performance help
1. The disk stats show high busy%. This is caused by the number of requests as well as the number of blocks being transferred.
I noticed that they are all going over controller c5. You should set up another controller path to the devices and use both of them for everything, to share the load and incidentally help with redundancy. If it's just 1Gb fibre, then all these requests down a single strand could be a bottleneck; 2Gb fibre is obviously better. However, don't take my word for it - check the fibre stats.
2. The high ratio of requests to blocks may indicate a too-small read-ahead parameter in Oracle. Ask your DBA to try increasing it.
3. RAID 5 is especially awful for sequential operations, and backups are the biggest example, because the seek latency for each block is multiplied by the number of disks it is spread over. A backup will also reduce the efficiency of your XP cache.
4. If you are using filesystem files to store the Oracle data, consider mounting the VxFS filesystems with the options mincache=direct,convosync=direct.
Then your writes will go direct to disk, bypassing the buffer cache. You can then reduce the HP-UX buffer memory (dbc_max_pct) and give it to your Oracle SGA.
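For illustration, such a mount might look like the fragment below. The volume group, lvol and mount point names are placeholders, not taken from this thread:

```shell
# Hypothetical /etc/fstab entry for an Oracle data filesystem with VxFS
# direct I/O (device and mount point are placeholders):
#
#   /dev/vg01/lvol_oradata /u02/oradata vxfs delaylog,mincache=direct,convosync=direct 0 2
#
# Equivalent one-off remount:
#   umount /u02/oradata
#   mount -F vxfs -o mincache=direct,convosync=direct /dev/vg01/lvol_oradata /u02/oradata
```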
HTH, Steve
05-08-2003 06:53 AM
Re: Bad I/O performance help
(sar -d columns: device, %busy, avque, r+w/s, blks/s, avwait, avserv)
Average c5t0d1 92.41 1.74 232 3693 8.55 20.57
Average c5t0d2 60.04 0.64 231 3688 5.35 7.13
Average c5t0d3 92.54 1.79 232 3698 8.76 20.70
Average c5t0d4 45.20 0.64 128 2042 5.26 8.18
Average c5t0d5 79.40 0.74 127 2039 5.68 21.60
Average c5t0d6 48.44 0.64 127 2033 5.29 9.02
Average c5t0d7 72.00 1.66 175 2799 8.35 25.66
Average c5t1d1 71.81 1.68 175 2798 8.38 25.66
Average c5t1d2 33.17 0.70 176 2804 5.39 5.59
Average c5t1d3 63.04 2.24 94 2485 9.95 23.05
The XP512 has a notorious performance characteristic where Processor 1 handles all even-numbered disks in the array and Processor 2 all odd-numbered disks, which MANDATES properly load-balancing the I/Os across even and odd LUNs for good performance. That means striping the LUNs through LVM, as well as a round-robin PV link configuration (Pri Alt Alt Pri) or the AutoPath utility.
Can you attach your I/O stats from an 'fcmsutil' report? You seem to be heavily using only one HBA and ignoring the other. This is the c5 issue.
fcmsutil /dev/td0 (1,2,etc.) stat
This problem with the XP512 exists before cache, so adding more cache will not help. It's all about load balancing, and your load indicates all I/O goes through one HBA, c5.
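As a rough illustration of checking the even/odd balance, one could classify devices by number parity. Note that the d-number in an HP-UX device file is a SCSI LUN, not necessarily the XP LDEV number, so treat this purely as a sketch:

```shell
# Sketch only: tally devices by the parity of their d-number, per the
# even/odd processor split described above. Device names are from the
# sar output in this thread; the parity rule is illustrative.
parity() {
  n=${1##*d}                       # strip everything up to the last 'd'
  if [ $((n % 2)) -eq 0 ]; then echo even; else echo odd; fi
}
for dev in c5t0d1 c5t0d2 c5t0d3 c5t0d4; do
  echo "$dev $(parity "$dev")"
done
```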
05-08-2003 03:02 PM
Re: Bad I/O performance help
It appears that disk layout is probably my issue.
For those of you with XP512 knowledge, I'll describe the layout of the filesystems. Note that we had assistance laying out the filesystems from the disk array vendor, making the assumption they knew more than us...
The XP512 serves 10 LUNs to this volume group; these LUNs are LUSE OPEN-E*4 (4 x 14GB). At least one of these LUSEs has 2 members on the same array group. The filesystems are then striped across 3 LUNs (stripe size 8KB) using LVM. I've since found out that the XP has a cache line size of 64KB, so 8KB stripes probably aren't using the cache effectively. See the following link for more info.
http://forums.itrc.hp.com/cm/QuestionAnswer/1,,0xa5f2e822e739d711abdc0090277a778c,00.html
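The mismatch between the 8KB LVM stripe and the 64KB XP cache line can be put in numbers (figures as above):

```shell
# With an 8KB LVM stripe and a 64KB XP cache line (figures from the post),
# one cache-line-sized sequential I/O is split into 8 separate stripe chunks
# scattered across the 3 LUNs, instead of filling one cache line per LUN.
cache_line_kb=64; stripe_kb=8; luns=3
chunks=$((cache_line_kb / stripe_kb))
echo "a ${cache_line_kb}KB I/O touches ${chunks} stripe chunks across ${luns} LUNs"
```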
The application degradation has been slow. One factor which might account for this is the growth experienced over the last 6-12 months (after the app was moved to the XP512). Growth has been 'steady' rather than 'explosive'.
The array is laid out RAID-5 per HP's advice. As this array forms the backbone of our SAN, and we have already allocated 5 of the 8TB, migrating from RAID-5 to RAID-1 isn't an easy task, and certainly not at the top of my 'quick fix' list. I probably wouldn't even attempt this for one array group.
Backups are not network based, but go to local SCSI DLT7000s. FC is 1Gb; link utilisation from EFC Manager shows a 5% max watermark. Everything goes down c5, as I haven't investigated how to get LVM to do dynamic pathing (i.e. after a reboot c5 ALWAYS becomes the primary). The XP CHIP port utilisation for 'c5' (CHIP port 1A) sometimes bursts to 50%; average load is generally less than 5%. There are 5 HP-UX hosts on the SAN which can access CHIP 1A.
c1t6d0 and c2t6d0 are the mirrored root disks; primary swap is 4GB and is part of root. There is no swapping at any time of the day (sar -w shows zeros across the board). Yes, I'm disturbed by the usage of these 2 PVs, and have yet to identify the cause (this seems to be the norm for all my HP-UX servers).
I have Glance installed, but not MeasureWare. I haven't really got into using Glance yet and find it not very helpful. I'm an old-school admin who gets a lot of info out of a pile of text...
The things I hate about my current setup:
* LVM striping - this gives me no flexibility to alter the disk layout online. For example, if I want to pvmove one of the PVs I need to do it offline.
* The stripe size probably isn't large enough. Our initial testing obviously wasn't robust enough to simulate full user loads.
* LUSE volumes need to be set up very carefully. Once they are allocated and in use you have no easy way to alter the config. For example, I suspect that some of my ldevs on a busy LUSE are on the same array group as another busy LUN on another application/host. Collating this information will be a time-consuming business.
My next course of action is to lay out an alternate disk configuration, arrange some application outage time and run some load tests and compare the results between the 2 layouts.
Things were simpler back in the days of SCSI attached storage.
05-08-2003 04:06 PM
Re: Bad I/O performance help
Also regarding stripes: use PV groups with 4 disks per group, and add more groups as you expand. See /etc/lvmpvg:
vg01
pvg1
c1t9d0, etc.
Regarding alternate PV links after reboot: use pvchange and vgexport / vgimport to get the disks aligned in a pri > alt > alt > pri sequence.
pvchange -s n/y /dev/dsk/cXtYdZ
vgexport -p -v -s -m /tmp/mapfile /dev/vg##
vi /tmp/mapfile (* order your disks pri alt alt pri *)
mkdir /dev/vg##
mknod ....
vgimport -p -v -s -m /tmp/mapfile /dev/vg## (* -p for preview *)
05-08-2003 04:37 PM
Re: Bad I/O performance help
Assume that the fastest way to the LUN is through the controller that owns it. If so, do you get the same performance if you attempt to access the disk through the other controller? Consider this link:
http://forums.itrc.hp.com/cm/QuestionAnswer/1,,0x840e31ec5e34d711abdc0090277a778c,00.html
When the even-disk processor gets a transaction intended for an odd disk on the XP512, does it attempt the write and then query the other processor when it fails to handle the transaction?
Here's a good LVM example:
http://forums.itrc.hp.com/cm/QuestionAnswer/1,,0xd770abe92dabd5118ff10090279cd0f9
05-09-2003 01:38 AM
Re: Bad I/O performance help
Example: if the block size of your DB is 8K (the default) and the block size of your lvol's filesystem is 1K, you will have 8 I/O reads. If your filesystem block size were also 8K, you would have only 1 I/O read. Does this make sense?
Read the man pages for newfs_vxfs and mkfs_vxfs and have a look at the parameter -b (newfs) and bsize (mkfs) for the block size.
So your Oracle DB files should always reside on a filesystem which has the same block size as you have defined for your database. As we all know, you can't change the block size after you have created the DB or filesystem. So my advice is: back up your DB, create a new filesystem with the same block size as the DB, and then restore the data to that filesystem. It will increase your performance a lot.
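The 8-to-1 read amplification in that example is just this division (figures as quoted: 8K DB block, 1K filesystem block):

```shell
# An 8KB Oracle block on a 1KB-block filesystem turns one logical read
# into 8 filesystem reads; matching block sizes reduces it to 1.
db_block_kb=8; fs_block_kb=1
reads=$((db_block_kb / fs_block_kb))
echo "one ${db_block_kb}KB DB read -> ${reads} filesystem reads"
```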
Regards
Roland
PS: As you can see in the man page for mkfs_vxfs, the default block size depends on the size of the file system.
05-11-2003 05:34 PM
Re: Bad I/O performance help
10-28-2004 06:09 AM
Re: Bad I/O performance help
Just stumbled across this old thread, and it was good reading.
Curious how things are going many, many months later.
From this reading I am glad we went with EMC for our disk array.