
Re: Performance Problems

 
David Bell_1
Honored Contributor

Performance Problems

All,

I'm having some performance issues with a particular HP9000/L2000. I've looked at a lot of things but haven't found anything in particular; perhaps I'm missing something! I know that running 32 bit Oracle on a 64 bit OS is part of the problem, along with the memory implications; however, I believe the performance should be better. Could you have a look at the parameters in the attached file and see if anything glaring stands out? By the way, each instance is running in its own memory window. The problem is users reporting sluggish response even when the SPU shows a high percentage idle. In addition, they occasionally get booted off the host (session closed). Thanks for your time,

Dave
22 REPLIES
Ken Hubnik_2
Honored Contributor

Re: Performance Problems

What speed are your network interface cards set to, 10 or 100? Half or full duplex? It could be a network issue.
Jeff Schussele
Honored Contributor

Re: Performance Problems

Hi Dave,

One thing I'd do is change swap priority.

Make the vg00 swap device priority = 1 & the external devices priority = 0 & let them round-robin if they need to.
You may be slowing normal OS activity by swapping on the same device the OS resides on.
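
As a rough illustration of the priority idea, here is a toy model in Python. It assumes the usual HP-UX behaviour that lower priority numbers are used first and that devices sharing a priority are interleaved. The device names and chunk counts are hypothetical, and this is a simplified sketch, not how the kernel actually allocates swap.

```python
def allocation_order(devices, requests):
    """Return the device chosen for each swap chunk request.

    devices maps a name to (priority, free_chunks). Lower priority
    numbers are used first; equal priorities are used round-robin.
    """
    free = {name: chunks for name, (_prio, chunks) in devices.items()}
    order = []
    for prio in sorted({p for p, _ in devices.values()}):
        group = [n for n, (p, _) in devices.items() if p == prio]
        # Interleave across the group until it is full or we are done.
        while len(order) < requests and any(free[n] for n in group):
            for name in group:
                if free[name] and len(order) < requests:
                    free[name] -= 1
                    order.append(name)
    return order


# Hypothetical layout: external devices at priority 0, vg00 swap at 1.
DEVICES = {
    "vg00/lvol2": (1, 4),
    "ext_a": (0, 2),
    "ext_b": (0, 2),
}

if __name__ == "__main__":
    print(allocation_order(DEVICES, 6))
```

In this model the vg00 device is only touched once both external devices are exhausted, which is the effect the priority change is after.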

Rgds,
Jeff
PERSEVERANCE -- Remember, whatever does not kill you only makes you stronger!
David Bell_1
Honored Contributor

Re: Performance Problems

Ken,

The network interface is set to 100FD manually on the host as well as the switch, and it is running at 100FD. I'm not ruling out network problems at this time; however, I'm looking elsewhere first. This could potentially be a layer2/layer3 problem, but there are several other hosts of the same type on this switch. Further, I've changed from LAN 0 to LAN 1 with the same problem. The LAN patches are all up to date per HP.

Thanks,

Dave
David Bell_1
Honored Contributor

Re: Performance Problems

Jeff,

Thanks for the tip. I will make that change as soon as I can schedule a reboot. I see your point; however, the primary swap (lvol2) is typically configured as part of the root volume group during installation. Are you saying that the swap should be installed on a different disk (away from the OS) when possible?

Thanks,

Dave
Vincent Fleming
Honored Contributor

Re: Performance Problems

Dave,

You look a bit I/O bound, too - your wio% is too high. Best is < 10.

You could probably help this by striping the drives; it looks like you're pounding 2 drives only. If you can spread this load out, you'll see lower wio times and better performance overall.
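
A minimal Python sketch of checking %wio against the 10% rule of thumb above. It assumes `sar -u` style output with columns %usr, %sys, %wio and %idle; the sample lines are invented for illustration and are not from the attached file.

```python
# Invented sample in the shape of `sar -u` output (assumption).
SAMPLE = """\
10:00:00    %usr    %sys    %wio   %idle
10:05:00       8       4      22      66
10:10:00       6       3       7      84
10:15:00      10       5      31      54
"""

WIO_LIMIT = 10  # threshold quoted in the post above


def high_wio_intervals(report, limit=WIO_LIMIT):
    """Return (timestamp, wio) pairs whose %wio exceeds the limit."""
    flagged = []
    for line in report.splitlines():
        fields = line.split()
        # Skip blank lines, the header, and anything without five columns.
        if len(fields) != 5 or not fields[1].isdigit():
            continue
        timestamp, _usr, _sys, wio, _idle = fields
        if int(wio) > limit:
            flagged.append((timestamp, int(wio)))
    return flagged


if __name__ == "__main__":
    for timestamp, wio in high_wio_intervals(SAMPLE):
        print(f"{timestamp}  %wio={wio}")
```

Run against a day of real samples, a filter like this would show whether the high wio periods line up with the times users report sluggish response.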

Can you tell us about how you have your LVM configured and what you're using for disks?

-Vince
No matter where you go, there you are.
Jeff Schussele
Honored Contributor

Re: Performance Problems

Hi Dave,

No, you're correct, you must have a swap partition in vg00, but there's nothing that says you can't adjust priorities such that it's the last used.

The fact that users occasionally get dropped though, does tend to point to the network being involved in some fashion however. I just wanted to point out to you that you are swapping to vg00 & that alone can have an impact.

Cheers,
Jeff
PERSEVERANCE -- Remember, whatever does not kill you only makes you stronger!
RAC_1
Honored Contributor

Re: Performance Problems

I would do pri=1 for secondary swaps. (Not all but at least two)

What do I achieve with this?
Interleaving should start with this. The swapping load gets divided between the primary swap under vg00 and the secondary swap.

I would also increase dbc_max_pct to 10 and watch.

As mentioned, don't forget about the NIC and switch speed settings.
There is no substitute to HARDWORK
Vincent Fleming
Honored Contributor

Re: Performance Problems

Dave,

I just noticed that you have 2 VA7100's... Would they be c9t0d2 and c15t0d2?

Are you sharing these arrays with other hosts?

Is it in AutoRAID mode?

-Vince
No matter where you go, there you are.
David Bell_1
Honored Contributor

Re: Performance Problems

Vince,

I believe I may have some I/O bottlenecking going on. I agree with your assumptions about %wio, hence the start of this thread. The configuration is as follows:

Array 1:

(15) 73GB mechanisms
1024 cache
(6)LUNS - (4)140GB + (2)75GB

Array 2:

(12) 73GB mechanisms
1024 cache
(4)LUNS - (4)140GB

The volume groups are configured to utilize one LUN from each array using a 64K stripe for all LVs. There is the primary link and three alternate links. The exception is that the two 75GB LUNs are configured in the same VG, as they are not often utilized. The reason you see higher I/O on those particular LUNs is that their VG represents a particular instance of Oracle, and the others are either not utilized or nearly idle at this time. I'll admit that the free space (not assigned to LUNs) is minimal and could stand some improvement; however, it does have the Hot Spare for overhead of 66GB.
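
To picture the layout described above, here is a small Python sketch of how a 64K stripe alternates between two LUNs, one per array. The simple modulo mapping is the textbook striping scheme and is an assumption here; actual LVM extent placement has more detail than this shows.

```python
STRIPE_SIZE = 64 * 1024   # 64K stripe, as stated above
LUN_COUNT = 2             # one LUN from each array


def lun_for_offset(offset, stripe_size=STRIPE_SIZE, luns=LUN_COUNT):
    """Return the index (0-based) of the LUN holding a byte offset."""
    stripe_index = offset // stripe_size
    return stripe_index % luns


if __name__ == "__main__":
    # Walk the first 256K of a logical volume in 64K steps.
    for offset in range(0, 256 * 1024, STRIPE_SIZE):
        print(f"offset {offset:>7}: LUN {lun_for_offset(offset)}")
```

With this mapping, any sequential read larger than 64K touches both arrays, so a busy Oracle instance shows up as load on one LUN per array.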

Jeff,

Thanks for the clarification on the "swap".

Anil,

I have thought about the increase of dbc_max_pct, however, after reading many posts on this subject, I tend to lean closer to the 5 and below.

Thanks for all the replies so far.

Dave
Vincent Fleming
Honored Contributor

Re: Performance Problems

Dave,

Have you made sure that the primary PV-link is controller MC/1 for all LUNs on both arrays?

You can also increase your queue depth and see what happens - the VA can handle a higher queue depth than other arrays. ('scsictl -m queue_depth=16 /dev/rdsk/c5t0d0') I would suggest between 9 and 12, but I think you could go as high as 30, if your access patterns are bursty.
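
If the change needs to go onto several device files, a short Python helper can generate the command lines for review before anything is run. The device paths below are assumptions based on the two devices named earlier in the thread, and the 1 to 30 range check just mirrors the ceiling suggested above. The script only prints commands; it does not execute them.

```python
import shlex

MAX_DEPTH = 30  # upper bound suggested in the post above


def queue_depth_commands(devices, depth):
    """Build one scsictl command line per raw device file."""
    if not 1 <= depth <= MAX_DEPTH:
        raise ValueError(f"queue depth {depth} outside 1..{MAX_DEPTH}")
    return [
        f"scsictl -m queue_depth={depth} {shlex.quote(dev)}"
        for dev in devices
    ]


# Hypothetical raw device paths for the two busy LUNs.
DEVICES = ["/dev/rdsk/c9t0d2", "/dev/rdsk/c15t0d2"]

if __name__ == "__main__":
    for command in queue_depth_commands(DEVICES, 12):
        print(command)
```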

Could you post the output from armdsp -e and armdsp -a for the arrays?

Also... if I get what you're saying, the load changes as the DBs become active - so the two I pointed out were just coincidence... others become active when/if these subside?

Thanks,

Vince
No matter where you go, there you are.
David Bell_1
Honored Contributor

Re: Performance Problems

Vince,

First, since I missed it on my last post, NO I'm not sharing these arrays with any other hosts.

The Primary PV-Link for each LUN is NOT MC/1. This is because I spoke with several folks about this including some HP folks. I know that all writes occur on MC/1, however, it was felt that the I/O going through MC/2 across the 800MB internal backplane would be faster than having ALL data (including reads) going through MC/1 only. If this is incorrect, I have some changes to make. As to the Q-Depth, I've heard it tossed around but I've never actually made any adjustments to it.

Our access is based upon 3 development instances of Oracle. Some days are relatively light and other days are excessively heavy. This makes it difficult to gauge where the difficulties lie. Basically, the instances are active but may not be accessed all the time. So if DEV is being accessed, the devices you saw (LUN2/Array1 + LUN2/Array2) may be relatively high while LUNs 3, 4, 5, and 6 are relatively low. The next time it could be CRM that is fairly high, so LUN 4 from each array may be high. Sometimes they're all being used, so it's all quite high. When that happens, typically the CPUs will max out. No one expects perfect performance out of these instances as they are for "development". However, they need them to be relatively proficient, which apparently they have not been!

Can I adjust the q-depth while the array is accessed or should it be quiesced?

I appreciate your insight, Vince. I've attached the requested information. Let me know if there are problems with it.

Thanks,

Dave
David Bell_1
Honored Contributor

Re: Performance Problems



Dave
Dave Chamberlin
Trusted Contributor

Re: Performance Problems

Hi Dave,

I agree that you do have some IO issues. I am not sure what you are running, though - Oracle 11.5.8 is an Oracle Applications level, not a database instance. If you are running database instances, what do the initORA parameters look like? The 32 bit Oracle may not be an issue. I run 32 bit Oracle on my machine with 3.5G RAM, only using about 900M for my SGA. I would recommend adding more disk cache buffering - bringing it closer to 300M (but not higher) by changing dbc_max_pct to 5.
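
For a rough sense of the numbers: dbc_max_pct caps the dynamic buffer cache as a percentage of physical memory. The installed RAM is not stated in this thread, so the 6144 MB used in this Python sketch is a hypothetical figure, chosen only because 5% of it lands near the 300M mentioned above.

```python
def buffer_cache_ceiling_mb(physical_ram_mb, dbc_max_pct):
    """Upper bound of the dynamic buffer cache, in MB."""
    return physical_ram_mb * dbc_max_pct / 100


ASSUMED_RAM_MB = 6144  # hypothetical; substitute the real installed memory

if __name__ == "__main__":
    for pct in (3, 5, 10):
        ceiling = buffer_cache_ceiling_mb(ASSUMED_RAM_MB, pct)
        print(f"dbc_max_pct={pct:>2}: up to {ceiling:.1f} MB")
```

Plugging in the real memory size shows quickly how much RAM each candidate setting could take away from the Oracle instances.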
James Odak
Valued Contributor

Re: Performance Problems

One thing that may or may not help is your max inodes. It's huge at 18k, and you are using only 5k. From what I have gathered, inodes are only needed for NFS file systems, and since you are using VxFS (except for /stand, of course) you do not need this so high; 8k would be better. As explained to me, having such a high max may affect performance, as it reserves some resources in the event that the max is reached.

Good luck

Jim


FYI, I believe that is controlled by the ninode kernel parameter.
David Bell_1
Honored Contributor

Re: Performance Problems

Dave,

With regard to Oracle, excuse my ignorance with the numbers. While I have some exposure, the DBAs do the primary work here. The Oracle Database being utilized is 8.1.7.4. That being said, I'm not sure what the initORA parameters look like, as they were set up prior to my arrival by the DBAs and the previous admin. I will look into this as well. Thanks for the insight.

As to your point about dbc_max_pct, I will give the 5% marker a try. As I explained to Anil in a previous response to this thread, I have read a lot on this and the answer seems to vary a bit. However, overwhelmingly it appears that 5 and below is best. I will bring it up to 5 and see what benefit it has. Sorry about the "4" points; it should have been "5". I didn't notice it until it was already in.


James,

Thanks for the tip on max inodes. I know it's pretty big as well. It does run quite a bit higher when all the databases are active. I don't know at this point whether it bumps the 18K or not; I'm guessing the limit is a bit high. I'll check that as I go, once I have the opportunity to make some changes and reboot.

Dave
Vincent Fleming
Honored Contributor

Re: Performance Problems

Dave;

Sorry it's taken a while to get back to this;


OK... start out by getting the VAs to the current FW revision; it has performance enhancements in it. (I think it's HP18 this week).

Any LUNs that you access via MC/2 will be slower than the ones accessed via MC/1. I would try all the LUNs through MC/1 and see how it goes. If, as you say, the databases are bursty (usually not becoming active at the same time), then your performance will probably improve.

Load balancing VA controllers should only be used when throughput is more important than latency (not usually the case with databases). (ie: use both controllers for sequential bandwidth, but not for databases)

Let us know how you make out.

Vince
No matter where you go, there you are.
Steven E. Protter
Exalted Contributor

Re: Performance Problems

maybe a little bigger
msgseg 2048

maybe a lot bigger
nflocks 3000

maybe a lot more shared memory segments for oracle
shmseg 300

Oracle is a MASSIVE user of shared memory resources.

Long term, think about a 64 bit version of Oracle. I know, I have done it and it hurts, but ours is purring quite nicely during load testing.

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
Giri Sekar.
Trusted Contributor

Re: Performance Problems

Hi:

OK. Here is one thought. Sometimes the switch OR SSR may be set to ip-level instead of data level. Make sure that is not the case.

Thanks

Giri Sekar.
"USL" Unix as Second Language
David Bell_1
Honored Contributor

Re: Performance Problems

Vince,

Sorry about the points. That scrolling mouse seems to get me in trouble in these windows. I will try switching them over after I have performed the upgrades to firmware. Thanks very much for the insight.

Steven,

Thank you for the kernel suggestions, I'll be making some tuning changes hopefully this weekend.

Giri,

I'm not too sure which switch you were referencing. If you could be a bit more detailed I would appreciate it.

Thanks all,

Dave
Chris Vail
Honored Contributor

Re: Performance Problems

I'm going to concur with Steve Protter here, and really urge you to upgrade to 64 bit Oracle. The CPUs have too much idle time: you need to put them to work.
I didn't see that disk I/O was much of a problem: the most I saw was ~6500 blocks/sec, which is NOT very fast. Fiber attached drives should be double that speed or more when fully loaded. I've seen speeds of 55,000 blocks/sec on a Clariion array, so 6500 is underwhelming.
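
To put those block rates in more familiar units, here is a quick conversion in Python. It assumes 512-byte blocks, which is the unit `sar -d` uses for blks/s; if the figures came from a tool with a different block size, the results scale accordingly.

```python
BLOCK_BYTES = 512  # assumption: 512-byte blocks, as reported by sar -d


def blocks_to_mb_per_s(blocks_per_s, block_bytes=BLOCK_BYTES):
    """Convert a block rate to decimal megabytes per second."""
    return blocks_per_s * block_bytes / 1_000_000


if __name__ == "__main__":
    for rate in (6500, 55000):
        print(f"{rate:>6} blocks/s = {blocks_to_mb_per_s(rate):.2f} MB/s")
```

Under that assumption the observed peak is a little over 3 MB/s, against roughly 28 MB/s for the Clariion figure quoted above.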
You didn't post the size of the Oracle SGA. This is a big deal, and may need some fine tuning by you and the DBA's. It should probably go up (you may need to add RAM).
But really, the CPU's aren't doing any real work, because Oracle is only working in 32 bit mode.
I had a similar problem on a 12 processor Sun box a few years ago. The DBA's fussed and fought it, but the boss made 'em do it, and system throughput did increase a lot: processor idle went down, and disk I/O went up--on the same hardware.
This is particularly important in multi-processor systems. Remember that the CPUs have to communicate amongst themselves (called mutex locking). Oracle is not taking advantage of the entire available bandwidth, so of course it's going to be both slow and idle--it's only using a 32 bit word.

FWIW: this time my advice is free.....
I usually charge for it.

Chris
David Bell_1
Honored Contributor

Re: Performance Problems

Chris,

Thanks for taking the time to review the information. The SGA is approximately 1.75GB as each instance is using memory windows.

I'll make the adjustments, again, hopefully this weekend. If not, it may be next. At any rate, thanks to all who've helped out and I'll post my results when the changes are completed.

Thanks again,

Dave
David Bell_1
Honored Contributor

Re: Performance Problems

All,

I have completed the changes with less than overwhelming results. The firmware upgrade appears to have had the largest impact since it changes the end to end checksum.

I'm still of the belief, and some have concurred, that going to the 64 bit Oracle will provide a significant boost. Until the DBAs are in a position to perform this upgrade, I'll simply have to take metrics for comparison and wait.

Thanks to all who replied, the information was extremely helpful. Consider this thread closed.

Dave