Around the Storage Block
ToddPrice

Comparing Oracle I/O workloads for OLTP and DSS with HPE Primera and 3PAR Storage

Enterprise databases must be available to process transactions and service them quickly. Data centers often run many databases simultaneously, so reliability and performance are a must. Even in the new paradigm of AI and Big Data, transactional and relational databases remain critically important, both in their own right and as data sources for newer technologies such as machine learning.

Deployments that demand reliable, high-performance storage are growing every day. The OLTP test demonstrated how HPE Primera storage can easily handle hundreds of thousands of IOPS from multiple database servers while keeping latencies at the array below one millisecond. This enables database servers to be scaled up as well as scaled out. The throughput test demonstrated that an HPE Primera array is capable of massive data transfers for large payloads.

HPE Primera comparison to HPE 3PAR arrays

Figure 1 shows the overall results of the OLTP tests. Two main I/O performance tools were used. The first tool was vdbench, which was used to validate the server/storage setup with some simple 100% small-block reads (8k) for maximum IOPS and large-block reads (256k) for maximum throughput. The results were compared to the expected values to validate the setups.


Figure 1. HPE Primera 600 series compared to HPE 3PAR 8450 for Oracle on Synergy Gen10 compute nodes

In our vdbench tests, we observed 8k random read IOPS twice those specified for an HPE 3PAR 8450 array with a similar configuration and workload profile. Large-block 256k throughput was 1.75 times the specified throughput of the HPE 3PAR 8450 array.
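For reference, here is a minimal vdbench parameter sketch for the two read profiles described above. The multipath device path, run lengths, and file name are illustrative, not the exact settings used in our tests; real runs spread the I/O across many volumes and servers.

```
# Write a vdbench parameter file (sketch; device path is hypothetical)
cat > oltp_reads.vdb <<'EOF'
* Storage definition: one LUN shown; the tests used many volumes per node
sd=sd1,lun=/dev/mapper/oradata1,openflags=o_direct
* 8k 100% random read workload (maximum IOPS)
wd=wd8k,sd=sd*,xfersize=8k,rdpct=100,seekpct=100
* 256k 100% random read workload (maximum throughput)
wd=wd256k,sd=sd*,xfersize=256k,rdpct=100,seekpct=100
* Run each workload at the maximum achievable I/O rate
rd=rd8k,wd=wd8k,iorate=max,elapsed=600,interval=5
rd=rd256k,wd=wd256k,iorate=max,elapsed=600,interval=5
EOF

# Run vdbench against the parameter file
./vdbench -f oltp_reads.vdb
```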

The maximum throughput is close to the limit of the HPE Synergy chassis we configured for the test but not near the HPE Primera limit. To push the HPE Primera array to its throughput limits, we introduced a single HPE ProLiant DL580 Gen9 server with sixteen 32 Gb Fibre Channel ports. Using the ProLiant DL580 Gen9 server, we were able to generate 3.5 times the throughput of the HPE 3PAR 8450 array.

The second tool was the Silly Little Oracle Benchmark (SLOB), a publicly available I/O characterization tool that drives its workload through an Oracle database. It is generally used for OLTP workload profiling: although the tool has "benchmark" in its name, it is more commonly used to calibrate or characterize I/O in a particular server/storage environment.

Some of the vdbench tests used up to seven HPE Synergy 480 Gen10 compute nodes at the same time. Each Synergy compute node had 16 cores; hyper-threading was disabled because its benefit is minimal at high I/O rates. The two SLOB workloads used for maximum IOPS were 100% read and an 80/20 read/write mix.
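As a sketch, the two SLOB profiles map to the UPDATE_PCT setting in slob.conf. The other values shown here are illustrative, not our exact test settings:

```
# Fragment of slob.conf for the 80/20 read/write profile
# (set UPDATE_PCT=0 for the 100% read profile)
UPDATE_PCT=20         # ~20% of SLOB work units perform updates
RUN_TIME=600          # run length in seconds (illustrative)
SCALE=34G             # per-schema data size (illustrative)
THREADS_PER_SCHEMA=1

# Then launch one SLOB session per schema, e.g. for 64 schemas:
# ./runit.sh 64
```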

Figure 2 shows the single-server configuration for the HPE ProLiant DL580 Gen9 server. Eight dual-ported 32 Gb Fibre Channel host bus adapters (HBAs) were installed in the server, connected to an HPE Primera array with 32 Fibre Channel host ports.


Figure 2. HPE ProLiant DL580 Gen9 server and HPE Primera array configurations

 

The details

  • The vdbench tool was used to validate the HPE Synergy and HPE Storage infrastructure. The 100% read tests validated the maximum IOPS and throughput for the HPE 3PAR 8450 array. The 80/20 read/write tests approximated a typical IOPS profile for an OLTP workload. All volumes were thin provisioned with RAID 6.
  • The HPE 3PAR 8450 array configuration: four nodes, 24 host ports, 48 SSDs, nine 2.25 TB virtual volumes for Oracle data, four 15 GB virtual volumes for redo logs, and eight dual-ported 16 Gb Fibre Channel HBAs for a total of sixteen 16 Gb ports.
  • The HPE Primera array configuration with Synergy: four nodes, 16 host ports, 48 SSDs, nine 2.25 TB virtual volumes for Oracle data, four 15 GB virtual volumes for redo logs, and eight dual-ported 16 Gb Fibre Channel HBAs.
  • The HPE Primera array configuration with the DL580 Gen9: four nodes, 32 host ports, 48 SSDs, forty 250 GB virtual volumes for Oracle data, four 15 GB virtual volumes for redo logs, and eight dual-ported 32 Gb Fibre Channel HBAs.

The maximum IOPS and throughput (read only) for the HPE 3PAR 8450 array were used as a baseline. Figure 3 shows the test procedure for the OLTP characterizations with vdbench and the SLOB tests.


Figure 3. Testing procedure

First, the vdbench tests were performed. Then the Oracle database was installed on four Synergy compute nodes. SLOB was also installed on each of the four servers. SLOB tests were run for an 8k OLTP I/O profile.

The SLOB schema objects consisted of 64 schemas of 34 GB each on every server, giving each server a data set of about 2.25 TB. Running across all four database Synergy compute nodes, the total data set was roughly 9 TB.
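As a sketch, SLOB schema creation is driven by its setup.sh script; the tablespace name below is illustrative, and the per-schema size is governed by SCALE in slob.conf:

```
# Create 64 SLOB schemas of ~34 GB each (SCALE=34G in slob.conf)
# in a tablespace named SLOB (name illustrative), on each database node
./setup.sh SLOB 64
```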


Figure 4. SLOB tablespace within the ASM DATA group striped across four HPE virtual volumes; each square is a schema defined within the tablespace
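A minimal sketch of how such a layout is built, assuming hypothetical multipath device names and an illustrative tablespace size; ASM stripes the tablespace extents across all volumes in the disk group:

```
# Build the ASM DATA disk group from four virtual volumes
# (device names are hypothetical; run against the ASM instance)
sqlplus / as sysasm <<'EOF'
CREATE DISKGROUP data EXTERNAL REDUNDANCY
  DISK '/dev/mapper/pri_vv1', '/dev/mapper/pri_vv2',
       '/dev/mapper/pri_vv3', '/dev/mapper/pri_vv4';
EOF

# Create the SLOB tablespace; its extents are striped across the group
sqlplus / as sysdba <<'EOF'
CREATE BIGFILE TABLESPACE slob DATAFILE '+DATA' SIZE 2300G;
EOF
```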

The following table lists the results of the vdbench tests; the SLOB results are discussed in the summary.

Table 1. Test results

| Test | Storage array | I/O baseline and result | Service time |
|------|---------------|-------------------------|--------------|
| vdbench 100% random read 8k (four compute nodes) | HPE 3PAR 8450 | Maximum IOPS (storage configuration limit) | 0.48 ms |
| vdbench 100% random read 256k (four compute nodes) | HPE 3PAR 8450 | Maximum GB/sec (storage configuration limit) | NA |
| vdbench 100% random read 8k (seven compute nodes) | HPE Primera | Twice the IOPS of the HPE 3PAR 8450 array (storage configuration limit) | 0.78 ms |
| vdbench 100% random read 256k (seven compute nodes) | HPE Primera | 1.75 times the GB/sec of the HPE 3PAR 8450 array (Synergy uplink interconnect limit) | NA |
| vdbench 100% random read 256k (DL580 with 16 x 32 Gb FC) | HPE Primera | 3.5 times the GB/sec of the HPE 3PAR 8450 array | NA |
 

Considerations for read cache on the HPE Primera array

You can use the HammerDB tool (www.hammerdb.com) to generate high amounts of throughput on the storage array with certain queries. Increasing the number of parallel virtual users multiplies the throughput and can also take advantage of the massive amount of cache in the HPE Primera array.

Up to 3.5 times more throughput was observed as a result of read cache hits on the array. The HPE Primera cache, with its new centralized memory architecture, can boost performance significantly: it reduces the need to fetch the same data repeatedly from flash and assists with read-ahead on large sequential data transfers.

Figure 5 shows which queries are best for generating storage throughput. This example uses only one virtual user, but many can be run in parallel. In general, queries 14, 6, 4, 15, and 2 (in that order) are the five best choices.

The TPC-H results were used for I/O characterization, not for benchmarking.


Figure 5. TPC-H queries—One virtual user
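For reference, query 14, the best throughput generator in these tests, is a scan-heavy join of LINEITEM and PART. The standard TPC-H query text is shown below; the login and the Oracle parallel hint degree are illustrative, and each additional HammerDB virtual user runs its own copy of such queries:

```
sqlplus tpch/tpch <<'EOF'
-- TPC-H Q14 ("promotion effect"): a scan-heavy join of LINEITEM and PART
SELECT /*+ PARALLEL(8) */
       100.00 * SUM(CASE WHEN p_type LIKE 'PROMO%'
                         THEN l_extendedprice * (1 - l_discount)
                         ELSE 0 END)
              / SUM(l_extendedprice * (1 - l_discount)) AS promo_revenue
FROM   lineitem, part
WHERE  l_partkey = p_partkey
  AND  l_shipdate >= DATE '1995-09-01'
  AND  l_shipdate <  DATE '1995-10-01';
EOF
```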

The tables most used for high throughput are the ORDERS table and the LINEITEM table. Figure 6 shows how the data is distributed in the TPC-H schemas.


Figure 6. TPC-H object distribution (space in gigabytes)

 

 

Useful HPE Primera CLI commands for monitoring throughput and cache activity

The following command line utilities are helpful in determining the nature of the throughput on an HPE Primera array (a usage sketch follows the list):

  • statport -host (show port throughput from the host-port perspective)
  • statport -disk (show port throughput from the back-end SSD perspective)
  • statcache (view data access and cache hits)
  • statcache -v (view more detail on cache hits from a volume perspective)
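A usage sketch, assuming the standard -d (sampling interval in seconds) and -iter (number of samples) options of the HPE stat commands and an illustrative array hostname:

```
# Sample host-port throughput every 10 seconds, five times
ssh admin@primera statport -host -d 10 -iter 5

# Watch cache activity while a throughput test is running
ssh admin@primera statcache -d 10 -iter 5
```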

You can use these commands to monitor the throughput of the array as well as to see the effect cache memory has on the workload. HPE Primera centralized cache can provide additional benefits to the workload.

Consult your HPE Primera product specialist for assistance in using the command line utilities or refer to the HPE Primera 4.0: Installing the Command Line Interface reference guide.

Summary

Using the SLOB tool, the HPE Primera array delivered a 112% improvement over the HPE 3PAR 8450 array in the 8k mixed (80/20 read/write) workload.

The 8k read I/O characterizations using vdbench showed twice the maximum IOPS the HPE 3PAR 8450 array is capable of.

The vdbench large-block read comparison showed the HPE Primera array delivering up to 3.5 times the maximum throughput of the HPE 3PAR 8450 array.

HPE Primera storage arrays are ideal for Oracle workloads. They offer greatly improved performance, a 100% Uptime Guarantee, and other features such as ease of use and support for HPE InfoSight. For a full list of features and other details of HPE Primera, refer to https://www.hpe.com/us/en/storage.html.

Please put any questions you might have in the comments section of this blog. Any general discussions are always welcome!

Best regards,

Todd – HPE Technical Marketing Engineering – Oracle Solutions

 

About the Author

ToddPrice

Expertise in Oracle Database Technologies on HPE Storage

Comments
nbarry

Why wasn't a Synergy frame with 32 Gb/s FC configured? The new Brocade 32 Gb FC switch module would have easily eliminated that bottleneck. Which FC connectivity option was used: VC-FC, or 16 Gb Brocade modules?

Hi nbarry,

Our goal for the Synergy portion of the testing was the OLTP workload only. We did push the I/O to observe the limits of the Synergy setup, and we also looked into installing 32 Gb FC, but in the end we could not justify ordering the additional FC modules when our intent for the additional throughput testing was to measure maximum throughput on the DL580.

We tested OLTP on Synergy and maximum throughput on the DL580 Gen9. The main focus of the project was the new Primera storage. I hope this helps, and thanks for providing your input!

Best Regards

Todd