Re: Achieving Extreme Throughput

Stephen Andreassend · ‎05-28-2002

I want to do as many commits as possible per second in Oracle.

In TPC benchmarks, they are apparently getting thousands of commits per second with async IO.

I cannot get anywhere close to 1000. If I am using async IO, I shouldn't need to use a disk cache.

Are these TPC benchmarks distorting the rate of transactions (i.e. commits) by using batch commits in message queuing servers (e.g. Tuxedo), or do I have an issue with my async IO?

With regard to this info:

# mknod /dev/async c 101 0x00000#
# = the minor number, which can be one of the following values:

0x000000 default
0x000001 enable immediate reporting
0x000002 flush the CPU cache after reads
0x000004 allow disks to timeout
0x000005 is a combination of 1 and 4
0x000007 is a combination of 1, 2 and 4
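
The combined values in the list above are bitwise ORs of the three base flags. A minimal sketch of that decoding, assuming only the flag meanings quoted above (the function and dictionary names are hypothetical, not part of any HP-UX or Oracle tool):

```python
# Flag bits as listed in the post above. The names here are illustrative
# labels only; they do not come from any HP-UX header or utility.
ASYNC_FLAGS = {
    0x1: "enable immediate reporting",
    0x2: "flush the CPU cache after reads",
    0x4: "allow disks to timeout",
}


def decode_minor(minor):
    """Return the descriptions of the flag bits set in a /dev/async minor number."""
    unknown = minor & ~0x7
    if unknown:
        # Anything outside bits 0-2 is not covered by the list in the post.
        raise ValueError("unknown flag bits: %#x" % unknown)
    names = [name for bit, name in ASYNC_FLAGS.items() if minor & bit]
    return names or ["default"]


if __name__ == "__main__":
    for value in (0x0, 0x1, 0x5, 0x7):
        print("0x%06x -> %s" % (value, ", ".join(decode_minor(value))))
```

This only decodes the number; it says nothing about whether a given combination is safe for Oracle redo logs.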

Oracle specifies 0x000000 in all its async IO guides, but for extreme throughput tests, is it feasible/valid/safe to try 0x000001? If I have a highly available XP256 or EMC with redundant multiple write-back caches, surely my IOs would be safer but not necessarily faster to the Oracle log writer?

I want to ask here before I risk corrupting our test bed databases due to lost IOs.

Thx
Steve

Vincent Fleming · ‎05-28-2002

First off, there has never been an EMC Symmetrix with a mirrored cache - they keep only one copy of data in the cache.

Forget about tweaking your async I/O - if you want real performance you have two real choices... Cache LUN (on your XP) or SSD.

If you put your logs into cache via Cache LUN (it locks them there), you will see up to an 8x performance increase. (That's a real-world number.)

If you have an XP256 now, you're limited in how much cache you can dedicate to Cache LUN: it holds only 16GB (did I just say "only"?), where the XP512 has 32GB and the XP1024 has 64GB. You can do a lot with 64GB of cache.

An SSD is a Solid State Disk; they are available from HP and other vendors. I believe HP supports the Solid Data brand. I've personally used Imperial Technologies, and can recommend their products.

The problem with SSDs is that they do not mirror the data in memory like the XP. I think they are a bit faster, though.

Either one will produce sub-millisecond I/O response times consistently, which is really what gives you serious performance.

Good luck!

No matter where you go, there you are.

Stephen Andreassend · ‎05-28-2002

OK thanks for that response!

So when you say 8X speed increase, did that actually translate into 8x as many commits per second?

A FC60 cache, XP256 cache, VA7400 cache, Model 12H cache, EMC Symmetrix cache, all of these can act as "cached LUNs" right? I have always referred to the term "controller cache" rather than a "cached LUN", is there a difference?

The HVD10 (or was it the SC10) can take Solid State disks, and can apparently do 30K IOs/sec, but whether that translates into 30K commits/sec is questionable.

Andrew S Babb · ‎05-28-2002

A couple of points on the XP128/256/512/1024 and CacheLUNs.

A CacheLUN sits in front of an LDEV. A CacheLUN can either cover a complete LDEV, in which case the entire LDEV is cached in memory permanently, or have an amount of memory assigned to cache a particular LDEV. I would suggest the former approach.

As Vincent alluded to, the XP256 has only 16GB of memory. This is an issue because the CacheLUN is mirrored, as is any writeable memory on the XP.

Therefore, if you have 2 redo log groups with 2 members per group, and each log file is 500MB, then (500MB log file * 2 members * 2 groups * 2 for the mirrored cache) = (500 * 2 * 2 * 2) = 4000MB, or roughly 4GB, of memory must be assigned to the CacheLUN.

I am currently working with an XP512, and one of the tests we want to perform is to evaluate the effect of CacheLUNs on online log files. We will require an extra 16GB of memory in the XP512 for this test, as we have a 4-node RAC cluster, with each instance requiring 4GB for log files.
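
The sizing arithmetic above can be sketched as a small helper. This is a rough estimate only: the function name and the `instances` parameter are hypothetical, and it assumes every instance has the same log layout and that mirroring simply doubles the cache needed.

```python
def cache_lun_mb(log_file_mb, members_per_group, groups,
                 mirror_copies=2, instances=1):
    """Estimate the cache (in MB) needed to hold all redo log files.

    mirror_copies=2 reflects the statement that writeable cache on the XP
    is mirrored; instances multiplies the total for a multi-node cluster
    where each instance has its own, identically sized, set of logs.
    """
    per_instance = log_file_mb * members_per_group * groups * mirror_copies
    return per_instance * instances


if __name__ == "__main__":
    # The single-instance example: 2 groups, 2 members each, 500MB files.
    print(cache_lun_mb(500, 2, 2))               # 4000 (about 4GB)
    # The 4-node RAC case with the same layout per instance.
    print(cache_lun_mb(500, 2, 2, instances=4))  # 16000 (about 16GB)
```

Against the 16GB total of an XP256, the single-instance figure already takes roughly a quarter of the cache, which is why the limit matters.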

To go to the other question, as to how the TPC-C results are achieving their figures: you will always see a TP monitor listed. The other thing that they all have in common (at least at the Unix level) is that they have a _serious_ number of controllers addressing the storage. For instance, HP's latest figures on tpcc.org use 25 fibre channel controllers. The other thing they need to get these results is a large number of clients (28 for the Superdome test).

Hope this helps

Stephen Andreassend · ‎05-28-2002

Thx for that reply also.

If all our testing goes successfully, we will be deploying a 2-node RAC cluster using an EMC Symmetrix 8000 series (24GB redundant cache). In future we may want to deploy more nodes to do real-time billing operations, hence we will need to do a lot of commits/sec. We will most likely not use TP monitors or Oracle XA.

Unfortunately, our testing does not use a disk array (dual FC10 with Brocade switch), but I had hoped async IO would overcome this performance bottleneck. So I am forced to do batch commits.

Can you make any comment on the /dev/async minor number? There is a hidden Oracle parameter to turn off Commit sync waits, but this could be even more risky (esp. if we are talking about billing operations).

BTW:
In case you have any doubts about RAC, the active/active performance is excellent, especially if you load balance by using 9i list partitioning to avoid pinging. Pinging is now do-able (0.8ms average) but you still want to avoid it. For availability, do not link Oracle to the HMP drivers unless you have dual HF switches: Oracle has no error handling for the HF clics going down. In actual fact, Oracle linked normally (UDP over HF) was just as fast as HMP for pinging.

Stephen Andreassend · ‎05-28-2002

Btw, do these Cache LUN options offer better commit performance for redo logs than the high-end EMC Symmetrix 8000?

Vincent Fleming · ‎05-30-2002

To answer your last question,
yes - significantly.

The figure I quoted as 8x performance really is 8x transactions per second. Really, truly.

It's pretty common knowledge now that the XP is a significantly better performer than the EMC 8000 series - even without Cache LUN.

Look into the new XP1024 - it has nearly 10x the internal bandwidth of the best Sym out there. (15GB/s vs 1.6GB/s for the Sym!) The caching algorithms are way better, too, so you really get a performance kick with the XP.

Even the new little XP (XP128) has way more bandwidth... 7.5GB/s vs. EMC's 1.6GB/s. You should see the benchmark data on these guys!

This must be *sooo* embarrassing for EMC. ;-}

As for the SSDs, Imperial Technologies quotes a consistent access time of 0.035ms per I/O (reads and writes). That's pretty sweet, if you ask me. I still have some worries about how safely the data is kept... I know it's safe in the XP, but these SSDs aren't built with the same fault tolerance.
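
To put that access time in perspective, here is a back-of-the-envelope sketch. It assumes, purely for illustration, that log writes are issued strictly one after another and that each write hardens a fixed number of commits; real log writer behaviour is more complicated, and the function names are hypothetical.

```python
def max_serial_ios_per_sec(latency_ms):
    """Upper bound on I/Os per second when each I/O waits for the previous one."""
    if latency_ms <= 0:
        raise ValueError("latency must be positive")
    return 1000.0 / latency_ms


def max_commits_per_sec(latency_ms, commits_per_write=1):
    """Upper bound on commits per second if each log write covers
    commits_per_write commits (1 means no batching at all)."""
    return commits_per_write * max_serial_ios_per_sec(latency_ms)


if __name__ == "__main__":
    # A 1 ms write gives at most 1000 serial writes per second.
    print(max_serial_ios_per_sec(1.0))
    # The quoted 0.035 ms access time gives at most about 28,571 per second.
    print(round(max_serial_ios_per_sec(0.035)))
    # Batching 10 commits into each 1 ms write raises the ceiling tenfold.
    print(max_commits_per_sec(1.0, commits_per_write=10))
```

Under these assumptions, response time per write sets the ceiling, and batching commits is the only other lever.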

Good luck!

No matter where you go, there you are.

Telia BackOffice · ‎06-04-2002

Do you have any experience with Compaq StorageWorks EVA cabinets? How is the performance compared with XPs or EMCs?
Compaq assures us that it is better than the XP512.

Stephen Andreassend · ‎06-04-2002

Here's one aspect I need to investigate.

Can I do Oracle commits to normal disks as fast as with an XP512 with Cache LUNs if I am using asynchronous IO?

Doesn't control return to Oracle as soon as the request reaches the async driver? Or is the driver not truly asynchronous, so that Oracle must still wait for the disk IO?

If control does return to Oracle after hitting the async driver, then I should be able to do commits as fast as an XP512.

Thx
Steve
