
Re: raw vs filesystem

 
jok llamera
Advisor

raw vs filesystem

hi all,

I have gone through a series of statistics on my 8-way HP server and found that writing to a raw device takes 90 times longer than writing to a filesystem. I have tried it on a local drive and on an EMC subsystem, and both give the same result. Could somebody tell me why this happens?

regards,
joks
Excellence is not an act but a hobby
Sridhar Bhaskarla
Honored Contributor

Re: raw vs filesystem

I think it really depends on how we access it. Raw access bypasses the kernel buffers and provides fairly direct access to the device. Oracle does its own caching, so it has more control over the timing of the I/O and hence benefits from raw access.

By the Way, how did you test it?

-Sri
You may be disappointed if you fail, but you are doomed if you don't try
Stefan Farrelly
Honored Contributor

Re: raw vs filesystem

Your results are the complete opposite of what would be expected. Accessing raw volumes should be 2+ times faster than accessing a filesystem.

Here's an example from one of our servers:

time dd if=/dev/vgemc/rlvol1 of=/dev/null bs=1024k count=50
50+0 records in
50+0 records out
real 1.3
user 0.0
sys 0.0

time dd if=/dev/vgemc/lvol1 of=/dev/null bs=1024k count=50
50+0 records in
50+0 records out
real 8.9
user 0.0
sys 2.8

Raw was the first test, at 1.3s, and non-raw the second, at 8.9s. That's what you should get. Exactly how did you do your tests?
I'm from Palmerston North, New Zealand, but somehow ended up in London...
Ravi_8
Honored Contributor

Re: raw vs filesystem

Hi,

The system reads and writes raw devices through the character device interface and filesystems in terms of blocks, so character reads and writes should be faster than block ones. I don't know why it is the reverse in your case.

never give up
jok llamera
Advisor

Re: raw vs filesystem

hi fellas,
I used timex and wrote with dd to both the raw device and the filesystem. As for the reversed ratio, that is exactly what I am asking you guys about, because I am also confused.
Excellence is not an act but a hobby
Stefan Farrelly
Honored Contributor

Re: raw vs filesystem


Can you send us the exact output from your tests, both the commands and the results? Then we can take a look at it.
I'm from Palmerston North, New Zealand, but somehow ended up in London...
Santosh Nair_1
Honored Contributor

Re: raw vs filesystem

Just a suggestion, but have you tried enabling async i/o in the kernel? That might help speed things up.

-Santosh
Life is what's happening while you're busy making other plans
jok llamera
Advisor

Re: raw vs filesystem

attached statistics..
Excellence is not an act but a hobby
Stefan Farrelly
Honored Contributor

Re: raw vs filesystem

Hi Joks,

Well, your stats document confirms what you're saying about writing to raw being slower than writing to a filesystem. Still no clues as to why.

Can you please provide the exact commands you use to test it?

Can you try a dd from the lvols (raw and filesystem) as listed in an earlier reply of mine, and let's see the times from that.

Cheers,

Stefan
I'm from Palmerston North, New Zealand, but somehow ended up in London...
A. Clay Stephenson
Acclaimed Contributor

Re: raw vs filesystem

Hi Jok:

On newer boxes, and especially under HP-UX 11.11, I find that the differences in I/O rates between raw and cooked are typically very small, and cooked typically does outperform raw in dd-based tests. I've never seen rates that differ as much as yours, and I suspect some flaw in your testing method. You really need to look at the transfer rates with Glance.

I should note that Stefan's test is bogus in that it is using /dev/null and thus is reading 0 bytes and creating a zero length file. A better test is to use /dev/zero, which will produce an unlimited supply of ASCII NULs.

To do this test fairly, you have to make sure that the raw device is striped like the cooked filesystem. For example, if you are writing to a cooked filesystem whose underlying logical volume is actually a striped device and comparing that to a single raw disk, then you will get very skewed results.
You have to make sure that you are comparing apples to apples.

The other thing to consider under Oracle is that you are very rarely doing sequential I/O, so your tests should take random I/O into account. Typically the real advantage of raw I/O in Oracle applications is the avoidance of double-buffering in both the SGA and the UNIX buffer cache. It is generally better to use raw I/O, reduce the buffer cache, and use the freed memory to increase the SGA, where Oracle really likes to do its buffering.
By far the easiest way to test both raw and cooked I/O is to use the OnlineJFS mount options convosync=direct,mincache=direct,delaylog,nodatainlog. These options bypass the UNIX buffer cache, and I have never been able to measure a performance difference between this and true raw I/O. To convert to cooked I/O, remove the convosync=direct,mincache=direct options and you are good to go.
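
For reference, a minimal sketch of the two mount variants described above, assuming HP-UX `mount -F vxfs` syntax. The logical volume /dev/vg01/lvtest and the mount point /mnt/iotest are hypothetical placeholders, and the function only prints each command so it can be checked before being run as root.

```sh
#!/bin/sh
# Hypothetical names: replace with your own logical volume and mount point.
LV=/dev/vg01/lvtest
MNT=/mnt/iotest

# Print the mount command for a VxFS filesystem.
# Mode "direct" adds the OnlineJFS options that bypass the buffer cache;
# any other mode keeps only delaylog,nodatainlog (cooked I/O).
vxfs_mount_cmd() {
    mode=$1
    opts="delaylog,nodatainlog"
    if [ "$mode" = "direct" ]; then
        opts="convosync=direct,mincache=direct,$opts"
    fi
    echo "mount -F vxfs -o $opts $LV $MNT"
}

vxfs_mount_cmd direct
vxfs_mount_cmd cooked
```

Unmount the filesystem between the two runs so that the second set of options actually takes effect.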

Regards, Clay

If it ain't broke, I can fix that.
Stefan Farrelly
Honored Contributor

Re: raw vs filesystem


Clay said;

"note that Stefan's test is bogus in that it is using /dev/null and thus is reading 0 bytes and creating a zero length file."

Clay - I hope you are going to apologise for saying the above when you are clearly mistaken. Read my earlier reply in full and see the dd results: it is in fact reading 50 blocks of 1024k! It's not writing any, but that's because it's a read test, not a write test.

I hope in future you will read questions and replies properly before claiming other people's replies are bogus!

Stefan
I'm from Palmerston North, New Zealand, but somehow ended up in London...
Wodisch
Honored Contributor

Re: raw vs filesystem

The times they are a-changing,

well, bad times, really, if our kings are fighting ;-)

Stefan, you could use output redirection for "dd" instead
of "of=" to make reading easier for Clay;
Clay, I do know a place where they sell glasses;

...Wodisch, trying to use the "clown hat" to calm our
kings a little bit...
Dear kings, we love you all!
jok llamera
Advisor

Re: raw vs filesystem

Hi all, to clarify: I do the test by writing only.

For raw:
timex dd if=/tmp/sample_file of=/dev/vgname/rtest_lv bs=8192

For the filesystem, I converted the LV to vxfs and then mounted it with the delaylog option only.
timex dd if=/tmp/sample_file of=/tmp/new_mounted_dir/sample_file bs=8192

I will try the mount options to simulate no caching.

joks



Excellence is not an act but a hobby
Stefan Farrelly
Honored Contributor

Re: raw vs filesystem

Hi Joks,

How big are the sample files you are using? To get meaningful results you need to use files of 50MB to 100MB, and when you do the tests, only do them once, because on the second and subsequent runs the data is in cache somewhere and thus faster.

eg.

For testing a mounted filesystem use;

prealloc 50000000

Then to test out copy times only do this test once;

time cp /

Then divide 50MB by the result in seconds (should be around 1-2 secs) to get the transfer rate. If you repeat the cp multiple times it will be faster on all runs after the first one, as the file is now in the Unix buffer cache. So if you want to repeat the test, rm the file, prealloc a new one at a different size, and then time the cp of it.

As for testing raw times, using dd from a file to a raw lvol (unmounted) gives an I/O error every time, so it is not a good test to do. Instead, do a raw dd from one lvol to another, and ensure each lvol is on a different disk and controller path.
Then time the dd (do at least 50-100MB) and again, run it only once, because on the second and subsequent runs the data is in cache on your disk subsystem, so the results are not accurate. Divide the copy size by the result in seconds, and I'm sure you will get a faster transfer rate. I've never seen a site/setup where raw dd is slower than doing a cp!
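
The rate arithmetic above can be sketched as a small shell function. The name rate_mb_per_s and the 1.5 s figure are illustrative only, and 1 MB is taken here as 1,000,000 bytes.

```sh
#!/bin/sh
# Transfer rate in MB/s (1 MB = 1,000,000 bytes) from a byte count
# and the "real" seconds reported by time or timex.
rate_mb_per_s() {
    awk -v bytes="$1" -v secs="$2" 'BEGIN {
        # Guard against a zero or negative elapsed time
        if (secs <= 0) { print "n/a"; exit }
        printf "%.1f\n", bytes / secs / 1000000
    }'
}

# Example: a 50000000-byte file copied in 1.5 s (illustrative value)
rate_mb_per_s 50000000 1.5
```

Feed it the size you actually copied and the real time from a single, first run, since repeat runs may be served from cache.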
I'm from Palmerston North, New Zealand, but somehow ended up in London...
A. Clay Stephenson
Acclaimed Contributor

Re: raw vs filesystem

Jok:

Your test is skewed by reading from a filesystem and then writing to a raw device. A much better test is to read directly from a pseudo device (/dev/zero), which removes one component of the I/O so that you are only testing the write component of disk I/O.
I would do something like this:

dd if=/dev/zero bs=64k count=256 of=/dev/rdsk/c3t5d0

(or of=/var/tmp/myfile for cooked).
If you do not have a /dev/zero device, create one with
mknod /dev/zero c 3 0x000003
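
A minimal sketch of that write-only test wrapped in a script, so the whole run can be timed with timex or time. The TARGET variable and the write_test function are hypothetical names; the default target is the cooked file path mentioned above.

```sh
#!/bin/sh
# Write-only test: /dev/zero supplies the data, so no disk read is involved.
# TARGET is a hypothetical variable. Point it at a raw device only if the
# data on that device may be destroyed.
TARGET=${TARGET:-/var/tmp/myfile}

write_test() {
    target=$1
    # 256 blocks of 64 KB = 16 MB written to the target
    dd if=/dev/zero of="$target" bs=64k count=256
}

write_test "$TARGET"
```

Saved as, say, write_test.sh (a hypothetical file name), it could be run as `timex sh write_test.sh` for the cooked case, and again with TARGET set to the raw device for comparison.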

Stefan:
You are correct, I mentally interposed Jok's request for a write test and your read test. If one were testing read i/o rates, /dev/null is the correct device, but for testing write i/o rates as Jok requested, /dev/zero is the device to use.

Regards, Clay


If it ain't broke, I can fix that.
Daniel Galante
Occasional Advisor

Re: raw vs filesystem

Hi all,

I have the same problem in my RAC environment.
I did the tests using this script:

echo "WRITE test:"
echo ""
echo "Filesystem:"
time dd if=/dev/zero bs=64k count=256 of=/daniel/write_test
echo ""
echo "Raw Device:"
time dd if=/dev/zero bs=64k count=256 of=/dev/vg00/rlv_testedaniel_raw
echo ""
echo "READ test:"
echo ""
echo "Filesystem:"
time dd if=/dev/vg00/lv_testedaniel bs=64k count=256 of=/dev/null
echo ""
echo "Raw Device:"
time dd if=/dev/vg00/rlv_testedaniel bs=64k count=256 of=/dev/null

The script results are as follows:

WRITE test:

Filesystem:
256+0 records in
256+0 records out

real 0.1
user 0.0
sys 0.0

Raw Device:
256+0 records in
256+0 records out

real 2.0
user 0.0
sys 0.0

READ test:

Filesystem:
256+0 records in
256+0 records out

real 0.8
user 0.0
sys 0.1

Raw Device:
256+0 records in
256+0 records out

real 0.4
user 0.0
sys 0.0
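
As a rough cross-check of the figures above, each test moves 256 blocks of 64 KB, which is 16 MB, so the rates can be derived from the real times. This is only a sketch: report_rates is a hypothetical helper, the test labels are made up for readability, and with a timer resolution of 0.1 s the fastest result is very approximate.

```sh
#!/bin/sh
# Input: one line per test, "label real_seconds".
# Each test transfers 256 * 64 KB = 16 MB (1 MB = 1024 KB here).
report_rates() {
    awk '{ printf "%s %.1f MB/s\n", $1, (256 * 64 / 1024) / $2 }'
}

# Real times copied from the results above
report_rates <<'EOF'
write_filesystem 0.1
write_raw 2.0
read_filesystem 0.8
read_raw 0.4
EOF
```

On these numbers the filesystem write comes out about 20 times faster than the raw write, while the raw read is about twice as fast as the other read test.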

My environment:

Server: RP7410
OS: HPUX B.11.11
Storage: VA7400 (Autoraid)


TKS,


Daniel Galante
Steven E. Protter
Exalted Contributor

Re: raw vs filesystem

This thread, based on my question, will be helpful.

http://forums1.itrc.hp.com/service/forums/questionanswer.do?threadId=234347

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com