Operating System - HP-UX

Re: very poor performance, prealloc command

 
Rajeev jain
Advisor

very poor performance, prealloc command

I ran the prealloc command "$ time prealloc test $((1000*1024*1024))" to write a 1GB file on RAID-10 internal drives and on SAN drives (8GB cache + RAID5), and the response times were very unsatisfactory. The local disk write completed in 10 seconds and the SAN disk write in 30 seconds. I have a Sun server connected to the same RAID group that writes a 1GB file in 8 seconds. These systems have no application or database running on them.

HP support has pretty much thrown up their hands, as they couldn't find any errors.

I have a rx3600 with Hitachi AMS200 Storage with 2 X 2GB FCs.

I would highly appreciate it if someone could post similar results from their environment, along with their hardware config at the level of detail I have listed above.

If anyone has experienced a similar issue and knows of a fix, suggestions would be appreciated as well, but I am really interested in seeing results from the prealloc command.

Thanks

Bill Hassell
Honored Contributor

Re: very poor performance, prealloc command

prealloc is probably the slowest disk writing program I have ever seen. There is nothing to fix -- that's the way it works (or crawls). It proves the concept that asking HP-UX to run faster doesn't solve badly written code. Use dd and /dev/zero like this:

writing:
timex dd if=/dev/zero of=/var/tmp/test bs=1024k count=1000

reading:
timex dd if=/dev/rdsk/cZZtYYdXX of=/dev/null bs=1024k count=1000

Note that dd is by far the fastest method to read or write (as long as you override the default 512-byte block size) but is a lousy test for performance, especially for smart disks (RAID, arrays, virtualized storage), as dd is single threaded. Only one CPU can run the code and only one channel will get through to the disk. And of course, large cache sizes in arrays will make the measurement unstable; that is, the first run will be much longer than subsequent runs.

A much better test is to run 10 or 20 copies, or run the xdd freeware program to generate multiple tasks.
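As a rough illustration of the "run 10 or 20 copies" idea, here is a minimal sketch that starts several dd writers in the background and waits for all of them. The helper name parallel_dd_write and the ddtest.N file names are made up for this example, and nothing here has been tuned for HP-UX.

```shell
#!/bin/sh
# Sketch only: start several dd writers at once and wait for all of them.
# parallel_dd_write DIR WRITERS MB_PER_WRITER   (hypothetical helper name)
parallel_dd_write() {
    dir=$1
    writers=$2
    mb=$3
    i=1
    while [ "$i" -le "$writers" ]; do
        # Each writer gets its own file so the streams do not collide.
        dd if=/dev/zero of="$dir/ddtest.$i" bs=1024k count="$mb" 2>/dev/null &
        i=$((i + 1))
    done
    wait    # block until every background dd has exited
    sync    # ask the OS to flush what is still sitting in the file cache
}
```

Something like "time parallel_dd_write /var/tmp 10 1000" would then time ten concurrent 1000MB writes as one batch. The target directory needs enough free space for all of the files, and they have to be removed by hand afterwards.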


Bill Hassell, sysadmin
Rajeev jain
Advisor

Re: very poor performance, prealloc command

prealloc was my first test. I also ran "$ time cp test test1", where test is the 1GB file.

On HP it takes a little over 2 minutes and on Sun about 15 seconds.

I ran dd, which shows poor performance compared to prealloc.

It would be very helpful if you could run this in your environment so I have something to compare.

NESTER:root(/vm/guest/kalimdor)# timex dd if=/dev/zero of=/vm/guest/kalimdor/test bs=1024k count=1000
1000+0 records in
1000+0 records out

real 45.75
user 0.00
sys 0.16
Hein van den Heuvel
Honored Contributor

Re: very poor performance, prealloc command


prealloc is a utility command, and it may or may not have been implemented as a high-performance command. Its stated goal is NOT to write fast, but to create a file optimized for fast sequential reads and writes. It is probably using SYNC commands to guarantee that the IOs made it out to the storage, and that the storage actually allocated disk chunks, for 'smart' controllers like an EVA which only promise space but postpone allocation.

For prealloc only the end goal counts, not the path!

I suspect you are using prealloc as a method to evaluate the storage / filesystem performance potential. Correct?
As you may have discovered, this is a treacherous method. The only proper way to measure performance is under actual load. Anything else may or may not hit or avoid good or bad attributes.

Surely it does not matter how fast your 'dd' or prealloc or tar or xyz goes, unless that's all your application is doing.

Now I'll admit that the behavior would concern me also, but I'd be more inclined to look for explanations and alternatives than to label it 'poor performance'.

Things I would check
- comparative dd results with if=/dev/zero for bs=8k count=128000 and for bs=1024k count=1000... but 1GB is not enough!
- compare with RAW IO
- scsi queue_depth settings
- file system fragmentation ( only test on a clean file-system )
- LVM settings... just a single PV I hope?
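The first item on the list can be sketched as below. Both block sizes write the same total, since 8192 * 128000 = 1048576 * 1000 = 1,048,576,000 bytes. The helper names count_for and compare_block_sizes are invented for this example, and it assumes the shell provides a "time" keyword (on HP-UX, timex could be substituted).

```shell
# Sketch: write the same number of bytes with two different dd block sizes.
TOTAL=$((1000 * 1024 * 1024))    # 1,048,576,000 bytes, as used in the thread

# count_for BLOCK_BYTES: number of blocks of that size that make up TOTAL
count_for() {
    echo $((TOTAL / $1))
}

# compare_block_sizes FILE: hypothetical helper, one timed dd run per size
compare_block_sizes() {
    for bs in 8192 1048576; do
        rm -f "$1"    # pre-delete so each run starts from a fresh file
        echo "bs=$bs count=$(count_for "$bs")"
        time dd if=/dev/zero of="$1" bs="$bs" count="$(count_for "$bs")"
    done
}
```

Calling "compare_block_sizes /var/tmp/test" prints the dd parameters before each run. As noted in the list, 1GB is likely too small to get past the file cache, so TOTAL would need to be raised for a meaningful comparison.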

On the SUN side...

When the pre-allocate returns, is all the IO actually done? (sync).

For example, when I use a simple 'time dd' to write 1GB, it finishes in 3 seconds. That is, the command returns. But the actual IO still has to start! Looking with glance, on the 'u' page, I can see the IO kick in several seconds later and run for a minute (slow single drive).
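One way to make a dd timing account for that deferred IO is to include a sync in what gets timed. This is only a sketch: the helper name write_and_flush is made up, sync flushes every filesystem rather than just this file, and whether sync waits for the writes to complete differs between platforms, so treat the result as an approximation.

```shell
# Sketch: write a file with dd, then ask the OS to flush the file cache.
# write_and_flush FILE MB   (hypothetical helper name)
write_and_flush() {
    # The sync only runs if dd succeeded.
    dd if=/dev/zero of="$1" bs=1024k count="$2" 2>/dev/null && sync
}
```

Comparing "time write_and_flush /var/tmp/test 1000" against a plain "time dd ..." of the same size gives a rough idea of how much of the work was still sitting in the cache when dd returned.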

>> knowing the result of prealloc command.

On a single, clean U160 drive on my RX2600
# time prealloc /blah/test.tmp "$((1000*1024*1024))"
real 1:04.5

And GLANCE shows the same IO all along while busy, which drops to zero when done, unlike the prior dd experiment.


Hope this helps some,
Hein van den Heuvel ( at gmail dot com )
HvdH Performance Consulting.


Rajeev jain
Advisor

Re: very poor performance, prealloc command

Did you run dd and prealloc in the same directory?
Hein van den Heuvel
Honored Contributor

Re: very poor performance, prealloc command

Hmm, I don't understand that question.
Directory is utterly irrelevant.

But yes, I did run them on the same file system, which is what matters.

And I did pre-delete the file before re-running

Also, I of course had 4GB of filesystem cache.

Did you check the Sun cache and HP cache settings?

When I trimmed the file cache down to min=200MB, max=250MB, then the very same dd on the very same directory ( :-) ) took 40 seconds, with almost no IO after the dd command returned.

I'm sure you can figure out your lesson from there.

Hein.







Dennis Handly
Acclaimed Contributor

Re: very poor performance, prealloc command

>Hein: prealloc is a utility command and it may, or might not have been implemented as a high-performance command.

It calls prealloc(2), which writes 8KB chunks (the filesystem block size), then calls fsync.

>It is probably using SYNC commands

Yes, one fsync(2) at the end.
Steven E. Protter
Exalted Contributor

Re: very poor performance, prealloc command

Shalom,

Maybe an OS patch will help with malloc.

Memory leak detector:
http://www.hpux.ws/?p=8

Performance monitor scripts
http://www.hpux.ws/?p=6

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
Laurent Menase
Honored Contributor

Re: very poor performance, prealloc command

prealloc is better than dd in that it writes 8KB blocks from the kernel, without needing to copy data to user level. It is a syscall.


You will probably need to look at the queue lengths of your FC and the number of paths to the LUN.

But you should pursue with support and ask them to elevate.

Hein van den Heuvel
Honored Contributor

Re: very poor performance, prealloc command

>> But you should pursue with support and ask them to elevate.

WHY?
There is nothing broken except the end user expectation. Hire a consultant yes, but support no.
There is a misguided belief that a simple system tool can provide useful performance information across vendors without an understanding of all the parameters involved. Specifically, the size of the file system cache was not mentioned, yet it is critical for dd experiments.


Just a thought... how does the filesystem cache get flushed when left on its own, without fsync instructions? Will it strictly write out in order of arrival, sweep low to high, or take a random approach? If it is not ordered, then a single fsync at the end is not good enough to guarantee the intended effect of prealloc. The storage subsystem may end up allocating storage segments for the file out of order.

Hein.