
Re: DL385G7 offering poor performance

 
Gary_Richards
Occasional Advisor

DL385G7 offering poor performance

Hi,

We've recently purchased a number of DL385 G7s, each with a single 12-core CPU running at 2.33GHz, 32GB of RAM and an additional P410 storage controller (with 1GB cache and battery unit). 2x 146G SAS disks are hooked up to the onboard controller in RAID1, and 4x 146G 15k SAS disks are hooked up to the additional controller in RAID10.

Initially, the system I/O performance seemed very bad when compared to some DL360G7's purchased at the same time. Simply installing Ubuntu on these systems took close to 15 mins, compared to 5 mins on the DL360's (the DL360's are single 6 core running at 2.66GHz, 12GB RAM, 2x 146G SAS disks in RAID1).

After some searching of the Internet, I realised that our DL385's didn't come with the battery unit for the storage controllers. On swapping some into the DL385's temporarily for testing, I/O performance was much better and installing our OS went down to a much more sensible time. Not quite as fast as the DL360's, but much better than before. So I thought I'd solved the problem. Slower cores for a fairly single-threaded install process would explain a slight difference in install time.

At this point we began testing the DL385's as MySQL servers. MySQL was installed and our database was loaded. We had problems at this stage too. Loading our database on the DL385's (whether onto the RAID10 with the 15k disks, the RAID10 reconfigured as a RAID5 with 4 disks, or a RAID1 with 2 disks) takes considerably longer than on the DL360's. For example, one of our databases loads in approximately 4m20s on the DL360's but takes over 6m20s on the DL385's. This was slightly worrying: we'd already seen I/O problems with the DL385's that we thought had been resolved by the installation of the battery units. However, the DL385's still seem significantly slower on various I/O operations, which is especially surprising considering the DL385's have 4x 15k disks, compared to 2x 10k disks in the DL360's. It takes almost 50% longer to load the database.
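To put a number on the load times quoted above, here is a quick sketch of the arithmetic. It uses only the two timings from this post; the helper names are made up for illustration.

```python
def to_seconds(minutes: int, seconds: int) -> int:
    """Convert a minutes/seconds timing into whole seconds."""
    return minutes * 60 + seconds


def percent_longer(baseline_s: float, observed_s: float) -> float:
    """Return how much longer observed_s is than baseline_s, as a percentage."""
    return (observed_s - baseline_s) / baseline_s * 100.0


dl360_load = to_seconds(4, 20)  # load time quoted for the DL360's
dl385_load = to_seconds(6, 20)  # load time quoted for the DL385's

print(f"DL385 load takes {percent_longer(dl360_load, dl385_load):.1f}% longer")
```

That works out at roughly 46% longer, which is where the "almost 50%" figure comes from.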

Anyhow, we went on: we configured some of the other servers to run our applications and pointed them at one of the DL385's for its database. Our QA dept then performed various tests against our application. The results that came back were quite shocking. Our testing has a way to work out the time spent doing SQL operations, and the results suggested (similarly to the loading of the database) that the time spent executing SQL statements was almost 50% more on the DL385 than on the current production systems that we're migrating away from.

One of the database servers in the system that we're migrating away from has a pair of dual-core AMD Opterons running at 2.6GHz, with 16GB RAM and 4x 146G SAS disks in RAID10.

Obviously I was quite concerned at this point: loading a database dump takes almost 50% longer on the DL385's, and our applications are taking around 50% longer during any SQL than on our older systems. I wondered how to proceed. The obvious solution was to try one of the DL360's; I'd already tried loading a database dump into one of the DL360's and the timing was fairly good. I pointed our application at one of the DL360's as its DB server and had our QA team re-run their tests.

As expected by this stage, the results from the tests were much much better than the results obtained with the DL385's. When compared with the systems we're migrating away from, results were mostly better than the old systems, with a few results being similar.

So what's the problem?

I understand that the 12 core CPUs are slightly slower than our old Opterons or the new Xeons in the DL360's. But surely a small drop in CPU speed doesn't warrant a 50% drop in performance? The DL385 has more RAM, has faster disks, etc. But I cannot make the DL385 perform anything like I would expect it to when compared with the other servers.

I've read suggestions online that consolidating database servers onto a DL385 would be an ideal way to use it, which was partially my plan for a development environment. However I can't even make one instance of MySQL on the database server perform anywhere near our old servers or our DL360's. So currently I have a fairly expensive potentially very powerful server that doesn't perform anywhere near as fast as I expect it to.

So far all I can conclude is that there's something very wrong with I/O on the DL385's and I don't know what. The storage controller batteries are charged, the OS reports that the cache is now 25% read/75% write compared to 100% read/0% write like it did before installation of the battery units. I've tried different DL385's and they all seem to offer the same performance.

Suggestions are very welcome on what I may be doing wrong, as I'm currently at a loss.

Thanks
23 REPLIES
Michael A. McKenney
Respected Contributor

Re: DL385G7 offering poor performance

I would start by verifying the caching features are enabled for the RAID arrays. I would upgrade the firmware on the server and the drivers. How much RAM is being used? How large is your paging file? Do you have DNS configured properly and resolving? DNS issues can really slow a server. Number of cores is meaningless unless your applications support them.

12 x 2.33 vs 6 x 2.66 will be slower. Many Linux applications are not SMP enabled. You did slow your CPUs down. Did you see about enabling the L3 cache in your OS? Did you install MySQL on a separate partition from the OS?

You might need to do RAID 1 for OS and RAID 10 for apps/data to get better performance.
Gary_Richards
Occasional Advisor

Re: DL385G7 offering poor performance

Hi,

I'm pretty sure the caching features are enabled:
=> controller all show config detail

Smart Array P410 in Slot 5
Bus Interface: PCI
Slot: 5
Serial Number: XXXXXXXXXXXXXXX
Cache Serial Number: XXXXXXXXXXXXXXX
RAID 6 (ADG) Status: Disabled
Controller Status: OK
Chassis Slot:
Hardware Revision: Rev C
Firmware Version: 3.66
Rebuild Priority: Medium
Expand Priority: Medium
Surface Scan Delay: 15 secs
Surface Scan Mode: Idle
Queue Depth: Automatic
Monitor and Performance Delay: 60 min
Elevator Sort: Enabled
Degraded Performance Optimization: Disabled
Inconsistency Repair Policy: Disabled
Wait for Cache Room: Disabled
Surface Analysis Inconsistency Notification: Disabled
Post Prompt Timeout: 0 secs
Cache Board Present: True
Cache Status: OK
Accelerator Ratio: 25% Read / 75% Write
Drive Write Cache: Disabled
Total Cache Size: 1024 MB
No-Battery Write Cache: Disabled
Cache Backup Power Source: Capacitors
Battery/Capacitor Count: 1
Battery/Capacitor Status: OK
SATA NCQ Supported: True

Smart Array P410i in Slot 0 (Embedded)
Bus Interface: PCI
Slot: 0
Serial Number: XXXXXXXXXXXXXXXX
Cache Serial Number: XXXXXXXXXXXXXXX
RAID 6 (ADG) Status: Disabled
Controller Status: OK
Chassis Slot:
Hardware Revision: Rev C
Firmware Version: 3.66
Rebuild Priority: Medium
Expand Priority: Medium
Surface Scan Delay: 15 secs
Surface Scan Mode: Idle
Queue Depth: Automatic
Monitor and Performance Delay: 60 min
Elevator Sort: Enabled
Degraded Performance Optimization: Disabled
Inconsistency Repair Policy: Disabled
Wait for Cache Room: Disabled
Surface Analysis Inconsistency Notification: Disabled
Post Prompt Timeout: 0 secs
Cache Board Present: True
Cache Status: OK
Accelerator Ratio: 25% Read / 75% Write
Drive Write Cache: Disabled
Total Cache Size: 512 MB
No-Battery Write Cache: Disabled
Cache Backup Power Source: Capacitors
Battery/Capacitor Count: 1
Battery/Capacitor Status: OK
SATA NCQ Supported: True

Without the battery units, Accelerator Ratio would read: 100% Read / 0% Write. Whereas you can see above both of my controllers now read 25% Read / 75% Write.
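In case it's useful for checking several servers, here is a rough sketch of pulling the write-cache share out of that output. The sample text is a trimmed copy of the listing above, and the function name is hypothetical; it only looks at the "Smart Array" heading lines and the "Accelerator Ratio" lines, nothing else.

```python
import re

# Trimmed copy of the controller listing above (only the lines the sketch reads).
SAMPLE = """\
Smart Array P410 in Slot 5
Accelerator Ratio: 25% Read / 75% Write
Battery/Capacitor Status: OK
Smart Array P410i in Slot 0 (Embedded)
Accelerator Ratio: 25% Read / 75% Write
Battery/Capacitor Status: OK
"""

RATIO_RE = re.compile(r"Accelerator Ratio:\s*(\d+)% Read\s*/\s*(\d+)% Write")


def write_cache_report(text: str) -> dict:
    """Map each controller heading to its write-cache percentage."""
    report = {}
    controller = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if line.startswith("Smart Array"):
            controller = line  # remember which controller we are inside
            continue
        match = RATIO_RE.match(line)
        if match and controller is not None:
            report[controller] = int(match.group(2))
    return report


for name, write_pct in write_cache_report(SAMPLE).items():
    state = "write cache active" if write_pct > 0 else "read-only cache"
    print(f"{name}: {write_pct}% write ({state})")
```

A controller without a working battery unit would show up here as 0% write, matching the 100% Read / 0% Write case described above.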

As I understand it, Drive Write Cache is something different: it is a write cache on the drives themselves, and it's probably not a good idea to turn it on. Regardless, the storage controller settings are no different to those on the DL360's that are outperforming them.

The above also shows that I'm on the newest version of the storage controller firmware. I upgraded all firmware on all servers before I began any testing.

Despite the DL385 having significantly more RAM than the DL360's, I've configured them with the same MySQL config, allowing them to use up to about 10GB of RAM if they need it. Even when everything's been running for a few days I'm not actually using anywhere near all of the RAM on either server with this configuration.

DNS is fine on all my servers.

I understand that number of cores is meaningless if you're not using them. I also understand that MySQL 5.1 isn't very SMP capable either. However my tests clearly show that MySQL running on a 6 core 2.6 GHz Xeon system vs a 12 core 2.3GHz Opteron system aren't even close to each other. I would have expected perhaps a slight drop in performance, but for almost all SQL to take 50% longer on the 385 vs the 360 just doesn't seem right at all.

Monitoring the servers whilst we run the tests, there's barely any CPU usage on either machine, so it's not like I'm maxing out even a single 2.3GHz core of the Opteron (or a single core of the Xeon). Therefore it seems that the speed of the CPUs themselves is not the actual problem. It seems that something else is wrong.

On the 385's MySQL is installed on the same partition as the OS (2x 10k 146GB SAS Disks RAID1, connected to the onboard storage controller). However, MySQL's data (/var/lib/mysql) is on a separate partition (4x 15k 146GB SAS Disks RAID10, connected to the offboard storage controller with 1GB cache).

On the 360, MySQL is installed on the same partition as the OS (2x 10k 146GB SAS Disks RAID1). /var/lib/mysql is on the same partition as the OS.

I even tried with /var/lib/mysql on the same RAID1 as the OS on the DL385's (so that it matched the DL360) but performance was slightly worse than on the RAID10.

No matter what I do, the DL360 significantly outperforms the DL385.

I haven't looked at enabling L3 cache, assuming that it would already be enabled? I can't seem to find any information on how to tell whether it's enabled or not. Nor any information on how you would go about enabling it.

I do understand that some applications do not really benefit from multi-threading. That is why I had hoped this server would provide a nice way to consolidate anywhere from 6 to 12 development database instances on one piece of hardware; however, I can't even make one instance perform anywhere near the environment we're moving away from.

I'm really struggling at the moment. As suggested it seems that there's something very wrong with the DL385's. Because there's no way I can currently use this as a single DB server, let alone virtualise a number of development DB servers on this hardware. So it's currently next to useless for me :(
Gary_Richards
Occasional Advisor

Re: DL385G7 offering poor performance

For what it's worth, comparing dmesg on a DL385 and a DL360, I see:

DL360:
[ 0.005062] CPU: Physical Processor ID: 0
[ 0.005063] CPU: Processor Core ID: 0
[ 0.005065] CPU: L1 I cache: 32K, L1 D cache: 32K
[ 0.005067] CPU: L2 cache: 256K
[ 0.005068] CPU: L3 cache: 12288K
8< snip >8

DL385:
[ 0.181191] CPU0: AMD Opteron(tm) Processor 6176 SE stepping 01
[ 0.190000] Booting processor 1 APIC 0x11 ip 0x6000
[ 0.040000] Initializing CPU#1
[ 0.040000] CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
[ 0.040000] CPU: L2 Cache: 512K (64 bytes/line)
[ 0.040000] CPU 1/0x11 -> Node 0
[ 0.040000] CPU: Physical Processor ID: 0

Which suggests that if the Opterons have an L3 cache, then the Linux kernel isn't detecting it.
Gary_Richards
Occasional Advisor

Re: DL385G7 offering poor performance

I stand corrected, dmidecode --type cache tells me:

Handle 0x0730, DMI type 7, 19 bytes
Cache Information
Socket Designation: Processor 1 Internal L3 Cache
Configuration: Enabled, Not Socketed, Level 3
Operational Mode: Varies With Memory Address
Location: Internal
Installed Size: 10240 KB
Maximum Size: 10240 KB
Supported SRAM Types:
Burst
Installed SRAM Type: Burst
Speed: Unknown
Error Correction Type: Unknown
System Type: Unknown
Associativity:

Which suggests the L3 cache is detected/enabled.
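For anyone wanting to run the same check across several machines, here is a minimal sketch that reads dmidecode-style text and lists the level 3 cache records. The sample is a trimmed copy of the output above and the function name is made up; it assumes each record starts with a "Cache Information" line and that the size is reported in KB, as it is here.

```python
# Trimmed copy of the `dmidecode --type cache` record shown above.
SAMPLE = """\
Handle 0x0730, DMI type 7, 19 bytes
Cache Information
Socket Designation: Processor 1 Internal L3 Cache
Configuration: Enabled, Not Socketed, Level 3
Installed Size: 10240 KB
Maximum Size: 10240 KB
"""


def l3_caches(text: str) -> list:
    """Return (socket designation, enabled?, installed KB) per level 3 record."""
    results = []
    record = {}
    # The extra sentinel line flushes the final record in the text.
    for raw_line in text.splitlines() + ["Cache Information"]:
        line = raw_line.strip()
        if line == "Cache Information":
            config = record.get("Configuration", "")
            if "Level 3" in config:
                size_kb = int(record.get("Installed Size", "0 KB").split()[0])
                results.append(
                    (record.get("Socket Designation", ""),
                     config.startswith("Enabled"),
                     size_kb)
                )
            record = {}
        elif ":" in line:
            key, _, value = line.partition(":")
            record[key.strip()] = value.strip()
    return results


for socket, enabled, size_kb in l3_caches(SAMPLE):
    print(f"{socket}: enabled={enabled}, installed={size_kb} KB")
```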
Joshua Small_2
Valued Contributor

Re: DL385G7 offering poor performance

Have you done any tests with an application like iometer?

MySQL's performance is patchy at best.
M. Meckel
Occasional Advisor

Re: DL385G7 offering poor performance

Hi,

just to throw in some thoughts:

Is the filesystem type for /var/lib/mysql the same? xfs? ext3?

Really really important is the right partition layout for best IO performance.

If you did partition your RAID10 then you have to make sure the partition starts at a multiple of the RAID's stripe size.

Best way to do that is using fdisk with -uc option or when -c isn't available use -u and select a multiple of 2048 as starting unit (2048 x 512 bytes block size gives 1MB).

Google for partition alignment.

e.g. RAID10 with 4x 146GB disks:

% fdisk -luc /dev/cciss/c0d0

Disk /dev/cciss/c0d0: 293.6 GB, 293564211200 bytes
255 heads, 32 sectors/track, 70265 cylinders, total 573367600 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x0638ab25

           Device Boot      Start         End      Blocks   Id  System
/dev/cciss/c0d0p1   *        2048      307199      152576   83  Linux
/dev/cciss/c0d0p2          307200    42250239    20971520   8e  Linux LVM
/dev/cciss/c0d0p3        42250240   573367599   265558680   8e  Linux LVM

*maybe* this is an issue.
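The alignment rule above can be checked with a few lines of arithmetic. This is only a sketch: it takes the start sectors from the fdisk listing above, uses the 512-byte sector size that fdisk reports, and assumes a 1MB alignment boundary (which only helps if your stripe size divides evenly into 1MB). The function name is made up.

```python
SECTOR_BYTES = 512  # logical sector size reported by fdisk above


def is_aligned(start_sector: int, alignment_bytes: int = 1024 * 1024) -> bool:
    """True if the partition's byte offset is a multiple of alignment_bytes."""
    return (start_sector * SECTOR_BYTES) % alignment_bytes == 0


# Start sectors copied from the fdisk listing above.
starts = {
    "/dev/cciss/c0d0p1": 2048,
    "/dev/cciss/c0d0p2": 307200,
    "/dev/cciss/c0d0p3": 42250240,
}

for device, start in starts.items():
    verdict = "aligned" if is_aligned(start) else "NOT aligned"
    print(f"{device}: start sector {start} is {verdict} to 1MB")
```

A partition starting at the old default of sector 63 would fail this check, which is the classic misalignment case.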

You should try iozone benchmark as well.

Greetings,
Marcel.
Gary_Richards
Occasional Advisor

Re: DL385G7 offering poor performance

Hi,

I don't have any stats from iometer, but I do have some from bonnie++

Currently I have 3 test systems

System1: DL360
1x 2.66GHz 6 core X5650 Xeon
12 GB RAM
2x 146G 10k SAS RAID1, 512MB cache

System2: DL385
1x 2.33GHz 12 core 6176 SE Opteron
32GB RAM
2x 146G 10k SAS RAID1, 256MB cache

Filesystems on both system 1 & 2 are exactly the same, they have 3 partitions. /boot (256M ext2) and the rest LVM. LVM filesystems are swap and / (ext3).

System3:
DL385 2x 146G 10k SAS RAID1, 256MB cache
4x 146G 15k SAS RAID10, 1024MB cache

System 3 has the same config as system 1 & 2 on its RAID1. The RAID10 is then mounted as /var/lib/mysql (ext3)

No special things have been done to the filesystem yet. It seemed pointless trying to make tweaks to get a performance benefit when the two systems built (almost) the same as each other still don't perform anywhere near close to each other.

Various outputs from bonnie++ attached. As far as I can tell, they look fine and suggest that when comparing system 1 & 2, results are similar. When comparing system 3's RAID1, results are similar. When comparing the RAID10, results are further improved.

Which suggests I/O isn't the problem here?
Joshua Small_2
Valued Contributor

Re: DL385G7 offering poor performance

I would agree with that suggestion.

Unfortunately MySQL performance varies dramatically depending on the storage system used, its individual instance configuration settings, the specific data in the database and how that shaped the index statistics, OS patches and the phase of the moon.
Gary_Richards
Occasional Advisor

Re: DL385G7 offering poor performance

I've tried the partition alignment suggestions, they appear to have made no difference whatsoever to performance.

I've even read up on configuring the RAID stride extended option of ext3 filesystems. Again, this doesn't seem to have achieved anything.
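For reference, this is roughly how the stride values are usually worked out, as I understand the mke2fs extended options: stride is the RAID strip size divided by the filesystem block size, and stripe_width is stride multiplied by the number of data-bearing disks. The 256KB strip size and 4KB block size below are assumptions for illustration, not values read from these controllers, and the function names are made up.

```python
def ext_stride(strip_kib: int, block_kib: int = 4) -> int:
    """Filesystem blocks per RAID strip (the 'stride' value)."""
    if strip_kib % block_kib:
        raise ValueError("strip size must be a multiple of the block size")
    return strip_kib // block_kib


def ext_stripe_width(stride: int, data_disks: int) -> int:
    """Filesystem blocks per full stripe across all data-bearing disks."""
    return stride * data_disks


STRIP_KIB = 256  # ASSUMPTION: check the real strip size on the controller
DATA_DISKS = 2   # a 4-disk RAID10 stripes across 2 mirrored pairs

stride = ext_stride(STRIP_KIB)
width = ext_stripe_width(stride, DATA_DISKS)
print(f"stride={stride}, stripe_width={width}")
```

With those assumed sizes this gives stride=64 and stripe_width=128; a different strip size on the controller would change both numbers.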