Operating System - HP-UX

big io difference between raw device and file system

 
Frank Eberlein
Occasional Contributor

Hi,

I have problems with I/O performance on Oracle 9.2 running on HP-UX 11i, specifically with read access, e.g. full table scans.
After some tests I found a big difference between read access to a raw device and read access to a file system. From my experience, and according to other threads, raw devices are 15-20% faster than file systems, but in my case the difference is up to 400%.

Does anyone have an idea what the reason could be?
If you need additional information, just let me know.

Thanks in advance.
9 REPLIES
Frank Eberlein
Occasional Contributor

Re: big io difference between raw device and file system

Additionally, I should add that the storage behind it is an XP1024, accessed via Secure Path software, but I am not sure if that is relevant to the problem.
spex
Honored Contributor

Re: big io difference between raw device and file system

Frank,

How big is your OS buffer cache? Oracle buffer cache? What type of fs were you using? Which 'mkfs' options were used to construct it? Which 'mount' options were used to mount it?

PCS
Frank Eberlein
Occasional Contributor

Re: big io difference between raw device and file system

First, some additional specs of my system:
It's an rp8400 with 8 CPUs and 48 GB of memory.

The OS buffer cache in Glance shows 9.57 GB, which seems a little high; the Oracle buffer cache is 6 GB.

The kernel parameter for the buffer cache is dbc_max_pct = 20.

We are talking about VxFS filesystems; they were created with the "newfs -F vxfs -o largefiles /dev/$VG_name/rlvol1" command.

I used two different ways of mounting: the first with only the "rw" option, the second with "rw,delaylog,nodatainlog,largefiles,mincache=direct,convosync=direct". Both showed the same bad performance.
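
For anyone who wants to reproduce this kind of comparison, here is a minimal sketch of a sequential read timer. It is not the test used above; the function name read_throughput and the example paths are made up for illustration, and it assumes a Python 3 interpreter is available on the host. Keep in mind that a second run against a buffered filesystem may be served from the OS buffer cache, so first and repeated runs can differ a lot.

```python
import os
import time

MIB = 1024 * 1024


def read_throughput(path, block_size=MIB, max_bytes=None):
    """Read `path` sequentially and return (bytes_read, MiB_per_second).

    `path` can be a regular file or a raw (character) device node.
    `max_bytes` optionally caps how much is read, which is useful for devices.
    """
    total = 0
    fd = os.open(path, os.O_RDONLY)
    start = time.perf_counter()
    try:
        while max_bytes is None or total < max_bytes:
            chunk = os.read(fd, block_size)
            if not chunk:  # end of file or device
                break
            total += len(chunk)
    finally:
        os.close(fd)
    elapsed = time.perf_counter() - start
    rate = (total / MIB) / elapsed if elapsed > 0 else float("inf")
    return total, rate


# Hypothetical usage (placeholder paths, not taken from this system):
#   read_throughput("/dev/vgXX/rlvol1", max_bytes=1024 * MIB)
#   read_throughput("/oradata/some_datafile.dbf", max_bytes=1024 * MIB)
```

Reading the same amount of data with the same block size from both targets keeps the comparison fair; the 1 MiB block size here is an arbitrary choice.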
A. Clay Stephenson
Acclaimed Contributor

Re: big io difference between raw device and file system

You said: "The OS buffer cache in Glance shows 9.57 GB, which seems a little high; the Oracle buffer cache is 6 GB."

20% of 48GiB = 9.6GiB so your buffer cache looks dead on. I think your fundamental problem is that your buffer cache is much too large. In general, an 11.11 buffer cache should be no larger than about 1600MiB, and yours is 6X larger than that.

I would set bufpages to a non-zero value to disable dynamic buffer cache and then your buffer cache will be pinned at a constant value regardless of the amount of installed memory. bufpages=409600 will set a static buffer cache of 1600MiB.
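
The bufpages arithmetic can be checked with a small sketch like the one below. It assumes bufpages is counted in 4 KiB (4096-byte) pages; the helper names are made up for illustration.

```python
PAGE_SIZE_BYTES = 4096  # assumption: one buffer page is 4 KiB
MIB = 1024 * 1024


def bufpages_to_mib(bufpages):
    """Size in MiB of a static buffer cache of `bufpages` pages."""
    return bufpages * PAGE_SIZE_BYTES / MIB


def mib_to_bufpages(size_mib):
    """Number of pages needed for a static buffer cache of `size_mib` MiB."""
    return size_mib * MIB // PAGE_SIZE_BYTES


if __name__ == "__main__":
    print(bufpages_to_mib(409600))  # 1600.0
    print(mib_to_bufpages(1600))    # 409600
```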

You also should increase your scsi_max_qdepth especially if you are running the default value of 8.

Finally, (and just to avoid extremely Bizarro world behavior) make sure that your timeslice has not been set to 1.

If it ain't broke, I can fix that.
Frank Eberlein
Occasional Contributor

Re: big io difference between raw device and file system

Thanks for the information about the buffer cache.

Can you explain what exactly the kernel parameter scsi_max_qdepth does? I only found a one-sentence explanation, which is not entirely clear to me.

The parameter timeslice is set to 10.

Additional parameters are:
bufpages 0 - (NBUF*2)
nbuf 0 - 0

Should I set only bufpages, or does it make sense (given the original formula) to change the nbuf parameter as well?
Alzhy
Honored Contributor

Re: big io difference between raw device and file system

Frank,

Despite what's been said in various forums, RAW storage will still come out on top over Filesystems.

BTW, in your Filesystem based tests - do the filesystems mount with forced directIO option?

log,largefiles,mincache=direct,convosync=direct

The above supposedly negates the effect/need for a filesystem buffer cache as it makes your filesystems appear/near raw...

But true, RAW or not - you will need to have that huge buffer cache trimmed down. How low? It depends on what "other stuff" you are running on your server aside from DB serving -- perhaps multiple SAMBA mounts?


Hakuna Matata.
Bill Hassell
Honored Contributor

Re: big io difference between raw device and file system

The massive buffer cache (at 11.11 I am assuming) is creating huge system overhead while searching for records. This is very likely the biggest difference between raw and filesystem I/O. Raw will be faster but a lot depends on what is raw and what is still filesystem.

Now bufpages (and nbuf) are normally set to zero (Glance will report their internal values; use kmtune -q bufpages -q nbuf to see the initial values). When they are both zero, the dynamic buffer cache is enabled and will vary between the min and max percentages. At 11.11 and earlier, setting bufpages (never nbuf) to a fixed value keeps the buffer cache at a fixed amount of RAM. While you could also make dbc_min_pct = dbc_max_pct and get the same result, it is still a percentage and any change to installed RAM will change the buffer cache.
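
To illustrate the difference between a percentage-based and a fixed buffer cache, here is a rough sketch. It assumes bufpages is counted in 4 KiB pages; the function names are made up, and the 96 GiB case is only a hypothetical memory upgrade.

```python
PAGE_SIZE_BYTES = 4096  # assumption: bufpages is counted in 4 KiB pages
GIB = 1024 ** 3


def percent_cache_gib(ram_gib, pct):
    """Cache size in GiB when it is defined as a percentage of installed RAM."""
    return ram_gib * pct / 100


def static_cache_gib(bufpages):
    """Cache size in GiB when bufpages is fixed; independent of installed RAM."""
    return bufpages * PAGE_SIZE_BYTES / GIB


if __name__ == "__main__":
    for ram_gib in (48, 96):  # 96 GiB is a hypothetical upgrade
        print(ram_gib, percent_cache_gib(ram_gib, 20), static_cache_gib(409600))
```

With 20%, the cache grows from 9.6 GiB to 19.2 GiB when RAM doubles, while bufpages=409600 stays at 1.5625 GiB (1600 MiB) in both cases.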

Now at 11.23 (aka 11i v2), *everything* changes. dbc_min_pct and dbc_max_pct can be changed on the fly (and you can watch it change with Glance). Also, the buffer cache code was heavily rewritten to improve performance, especially for multi-GB buffer cache sizes. In part, the changes include less overhead in locating and locking buffers, and the syncer is now multi-threaded (one per CPU). In actual tests, buffer caches between 8 and 16 GB have performed extremely well on IA64 boxes with 11.23.

The debate about raw versus filesystem data will continue on but improvements in the buffer cache may narrow the differences. The real big question is backup and restore methods. (hint: dd is not a solution)


Bill Hassell, sysadmin
Alzhy
Honored Contributor

Re: big io difference between raw device and file system

But Amigos, would a LARGE buffer cache matter IF the cooked filesystems are mounted so as to be near-raw (via the vxfs directIO directives)? Will Oracle actually use that large buffer cache if it "sees" the cooked filesystems as if they were RAW?

I have very large HP-UX 11i environments, averaging 16 CPUs x 96 GB RAM, that are both database servers and file servers, and my studies have shown that I really need a large buffer cache... Some of these environments are still cooked (but with DirectIO enabled), and I have not seen my large buffer caches actually get in the way between Oracle and the filesystems.


Hakuna Matata.
Frank Eberlein
Occasional Contributor

Re: big io difference between raw device and file system

Nelson,



I ran these filesystem tests with the filesystems mounted with the following options: "rw,delaylog,nodatainlog,largefiles,mincache=direct,convosync=direct".



When I get a downtime window from the customer on Sunday morning, I will reduce the OS buffer cache and see how much the performance improves.

I am currently investigating the best value for these systems. I have a complex environment: a 4-node cluster, with 4-6 packages on every server, each running 1-2 databases plus Samba.

But the applications have been split off to another cluster environment, so we only need to look at the Oracle DB.



Our backup and restore environment is Data Protector 5.2; if you need more information on that, please let me know.