scsi queue depth and hpux
07-09-2008 01:11 PM
We recently moved our storage from NetApp to an XP SAN (10K). We are facing some I/O issues, possibly related to large LUNs and the OS tunable kernel parameters scsi_max_qdepth and scsi_maxphys.
We are seeing a high number of blocked processes with vmstat (vmstat 1). The disks have a very low average queue (sar -d 1 3) and an acceptable response time (15 ms on average). Most LUNs are 100% busy, though, which could spell trouble.
Reading HP's forums and manuals, I found that we can tune the kernel parameter scsi_max_qdepth, which defaults to 8. I changed it to 128 to see if we could relieve at least some of the blocked processes.
However, we have not seen any change in the number of blocked processes at all.
Would you be able to comment on this problem?
Thanks,
-S
References:
http://publib.boulder.ibm.com/infocenter/dsichelp/ds8000ic/index.jsp?topic=/com.ibm.storage.ssic.help.doc/f2c_aghpfqudep_laxtvw.html
http://forums12.itrc.hp.com/service/forums/questionanswer.do?admit=109447627+1215615388033+28353475&threadId=634542
http://forums11.itrc.hp.com/service/forums/questionanswer.do?threadId=1057014
http://docs.hp.com/en/B3921-60631/scsi_max_qdepth.5.html
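For anyone following along, the commands involved look roughly like this on HP-UX (a sketch: the device file name is hypothetical, and whether you use kmtune or kctune depends on the 11i release):

```shell
# Global default queue depth (all devices without a per-device override):
kmtune -q scsi_max_qdepth          # 11i v1; on 11i v2/v3 use: kctune scsi_max_qdepth
kmtune -s scsi_max_qdepth=128      # dynamic tunable, takes effect without a reboot

# Per-device override for one hot LUN (example device file):
scsictl -m queue_depth /dev/rdsk/c4t0d1       # display the current depth
scsictl -m queue_depth=64 /dev/rdsk/c4t0d1    # set it for this device only
```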
07-09-2008 01:28 PM
Re: scsi queue depth and hpux
The vmstat "b" column does not necessarily mean disk: it counts processes blocked on any resource (paging, memory I/O, disk, semaphores, etc.), not just disk I/O.
What do your disk queues look like ?
How many HBAs ?
Bottle neck in the SAN ?
What does the array performance look like from the array's perspective ? If there is a performance issue on the array you can tune the OS until you are blue and never get anywhere.
07-09-2008 02:09 PM
Re: scsi queue depth and hpux
I don't - on my 10K XPs I use the "lotsa luns" theory. Reason: with one large LUN, you've got only one OS structure (the SCSI queue) to push all that I/O down. When you use "lotsa luns", you've got lots of SCSI structures to push data down, and with that I rarely have to increase the SCSI queue depth.
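A minimal sketch of what that layout looks like with HP-UX LVM, assuming a volume group already built from many small LUNs (the VG/LV names, sizes, and stripe count here are hypothetical):

```shell
# Build one logical volume striped across 8 physical volumes (LUNs),
# so 8 SCSI queues share the I/O instead of one.
# -i = number of stripes (PVs), -I = stripe size in KB, -L = size in MB
lvcreate -i 8 -I 1024 -L 102400 -n lv_data /dev/vg_xp
```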
However, before you begin to think about re-laying that stuff out: what about the application you're running - can it cache more disk I/O for you? What about the OS buffers - what are they set at? If they are low, it may be feasible to raise them. Also, what about the layout of the disks in the XP - is it all RAID 5? Some applications don't do so well with R5 for performance. What about the cache in the XP itself? Do you just have the default amount? You can install lots and lots of cache in your XP, and I generally recommend that sysadmins put as much in there as possible. Why? Well, if you didn't think what you were doing was I/O intensive (and therefore sensitive), you wouldn't have bought an XP storage server. Following that line of thought: if you didn't need throughput, you could have gone cheaper.
So I think you need to think about *WHY* you're using up so much I/O, and what you can do to reduce it, before you decide that what you've laid out isn't good enough.
All of that being said, 10K XPs are nothing less than awesome in most any layout, so I'm afraid that messing with it may not yield much gain; I already feel you're at the point of diminishing returns for incremental configuration changes. So I'd see if I could reduce I/O through tuning and buffering before I'd go tearing apart the storage configuration. And lastly, did you get any help from HP in laying out the disks in that server at least somewhat optimally for that style of hardware, or did you cut it up on your own? I'm assuming you've already had some best practices applied in the layout of the storage server.
07-09-2008 02:44 PM
Re: scsi queue depth and hpux
To check on process wait states system-wide, use UNIX95 and the -o argument for 'ps'. For example:
UNIX95= ps -ef -o etime,pcpu,vsz,rsz,pid,ppid,comm,state | sort -rn | head
Note state, etime, pcpu, and vsz especially. Also, move the arguments around, since you're sorting on the first column only.
Paste in anything you find interesting, i.e. the highest consumers. What I've seen in the past is that one or two processes will appear at the top of the highest consumers in several areas.
07-09-2008 03:52 PM
Re: scsi queue depth and hpux
We have a single HBA in our server. (fc 1 0/0/4/1/1 fcd CLAIMED INTERFACE HP 2Gb Dual Port PCI/PCI-X Fibre Channel Adapter (Port 2))
The disk queues are only about 0.5 to 1, but on some LUNs they can go as high as 10-15 during peak activity. Like I mentioned in my previous post, the utilization for 5 of these LUNs (datafiles/redo logs/undo and temp) is about 100% all the time. Something seems fishy here, doesn't it?
There are about 4 large LUNs on this server: 2x600GB and 2x1TB. The rest are 100 GB LUNs.
@Two Proc
Originally the plan was to get all 100 GB LUNs and use them; however, things did not go according to plan, and we are stuck with what we have at this time. The server in question is running an Oracle 9i database.
Our storage engineer carved the LUNs himself. I am sure hoping he followed all the best practices for doing that. I think we have 4GB of cache on one controller, and this is a 2-controller XP 10K. So I doubt that is the issue; however, I am not saying it cannot be.
@Michael
I am not sure I follow you when you say "use UNIX95". See, I am a new inductee into the HP-UX world and consequently may not understand a whole lot of the jargon. I will research it on the web, though. Also, the ps on my system does not have a "-o" switch, and the man page doesn't mention it either. Am I missing something?
Thanks for your inputs. BTW, did I mention this forum website is really cool? But I hate it when it times out on you after you have typed a loooong, never-ending response to the posts, and you have to type the whole thing again from memory.
-S
07-09-2008 04:06 PM
Re: scsi queue depth and hpux
You use UNIX95, which is an environment variable, to activate the -o switch.
Trust me Luke
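To illustrate the idea: setting UNIX95 with an empty value only for the duration of the one command enables the XPG4 (UNIX95) personality of ps, which accepts -o, without changing the rest of your shell session.

```shell
# UNIX95= (empty, on the same line) applies only to this command.
# -o selects the output columns; sort -rn orders by the first column (vsz here).
UNIX95= ps -ef -o vsz,pid,comm | sort -rn | head -5
```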
07-11-2008 01:52 PM
Re: scsi queue depth and hpux
1) Single-pathed server (multiple paths with load-balancing software may relieve the stress, or at least move it somewhere more visible).
2) Small number of large LUNs (large LUNs may impede: one fs stack, one lvol stack, one device stack - lots of 1s here). Too many small LUNs are a nightmare as well; aim for a happy medium.
3) Disk queues (caused by the single path, or by disk performance at the array?)
Everything together may each be adding a little bit to your issue.
Any stats from the array to report (sorry if I missed them)?
Are these reads or writes, or what percentage of each? A large number of writes filling the 2GB array cache may effectively disable the cache (need to review array stats to be sure).
Other options:
If you cannot make the hardware faster, then reduce its load by tuning the application. Maybe this is already done?
As you can see, there is an open road ahead that takes a lot of monitor, tweak, monitor.
07-14-2008 01:12 PM
Re: scsi queue depth and hpux
Are these LUNs in R0/1 or in R5? If any of this is R5, you're probably getting hurt big time. On the data areas you may be hitting disk hard because of sorts going to disk. Are you seeing lots of sorts on disk? (Ask the DBAs.) If so, you've probably got some statements out there that need tuning; get the DBA team to run statspack to start getting some tuning targets. You could also use Enterprise Manager Console for an easy way to look at the top-running SQL or the top-running sessions on the box. All of this will give you an idea of which processes are taxing the server.
For hot disks, try moving THOSE to R0/1 instead of R5. You'd see evidence of this if your cache hit ratio is low, OR your I/Os per second across the database are high. If this is the case, once again it is time to start tuning the worst-offending statements in the database. You'd probably also do well with more cache in the database buffer cache. However, review the tuning targets thoroughly FIRST before making this step.
07-14-2008 04:50 PM
Re: scsi queue depth and hpux
vgdisplay -v | grep -i -e 'PV Name' -e 'PV Status'
And also:
strings /etc/lvmtab
Thanks!
07-14-2008 06:45 PM
Re: scsi queue depth and hpux
That's just a horrible response time.
A random seek on a single disk very roughly equates to the rotation time. For a 10K rpm drive with no cache, that is roughly 1000*60/10000 = 6 ms, and for 15K rpm it would be 1000 (milli) * 60 (seconds) / 15000 (rpm) = 4 ms.
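That rotation-time arithmetic is easy to sanity-check in the shell (integer math is close enough here):

```shell
# One full revolution in milliseconds: 1000 ms * 60 s / rpm
rot_ms() { echo $(( 1000 * 60 / $1 )); }
rot_ms 10000   # -> 6  (ms per revolution at 10K rpm)
rot_ms 15000   # -> 4  (ms per revolution at 15K rpm)
```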
>> The server in question is running an Oracle database 9i.
So what does Oracle tell you about the reason for the blocked processes? Maybe they are waiting for the client processes to ask for more work (SQL*Net waits)? They would show as blocked. Maybe they are waiting for transactions to commit (log file sync)? Those processes would be blocked too.
For now, _please_ forget about the system management tools, other than IO/sec and MB/sec.
First and foremost, ask Oracle: what are its top wait events, and what I/O response time is it seeing (reads? writes?)?
Later, you may go back to the system tools to help understand, explain and correct what Oracle is reporting.
>> possibly related to large LUNs
Small vs. large LUNs is interesting... eventually.
That's likely a tweak (less than 10% impact). I suspect there are bigger issues.
>> We have a single HBA in our server.
Now that would worry me a lot from a performance perspective, and even more so from a sales/design perspective. Who made that decision, based on what? You may need a hose, but you have been given a straw to empty a keg!
A 2gb HBA can do 250 MB/sec on a good day.
Just 10 busy discs can deliver that.
You have 200+ disks right? 10x more!
The Data Bandwidth for an XP1000 is 8500 MB/sec... 40x more than a 2gb fibre.
250MB/sec @ 8KB/IO = max 30,000 IO/sec (much less with protocol overhead).
XP1000 cache performance is rated @ 700,000 IO/sec... 20x more.
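The bandwidth-to-IOPS arithmetic above can be checked with one line of shell:

```shell
# Ceiling on IO/sec for a 250 MB/sec HBA at 8 KB per IO (no protocol overhead):
echo $(( 250 * 1024 / 8 ))   # -> 32000, the "max 30,000 IO/sec" ballpark above
```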
Hope this helps some,
Hein van den Heuvel (at gmail dot com)
HvdH Performance Consulting