11-02-2009 07:43 PM
Re: Need help troubleshooting performance issue
The output shows:
kernel (pid=12326) --> top CPU consumer
java process (pid=18018) --> top memory consumer
swap utilization --> normal
disk I/O --> needs to be measured at the exact time of the issue, or historically while heavy jobs are running.
- Also, this data was taken when CPU utilization was around 55%, not during the 100% period.
You can prepare one or more scripts in advance and have them ready to run during the performance crunch to pinpoint the cause.
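A rough sketch of such a prepared script follows. It assumes Python 3.7 or later is available on the host; the command list (`ps -ef`, `sar -d`, `vmstat`, `swapinfo -tam`), the function names, and the log path in the usage note are illustrative choices, not something taken from the attached output.

```python
import subprocess
import time
from datetime import datetime

# Assumed command set (not from the thread); adjust to what is installed.
COMMANDS = {
    "ps": ["ps", "-ef"],
    "sar_disk": ["sar", "-d", "5", "1"],
    "vmstat": ["vmstat", "5", "2"],
    "swapinfo": ["swapinfo", "-tam"],
}


def collect_snapshot(commands=COMMANDS):
    """Run each command once and return {name: captured output}."""
    results = {}
    for name, argv in commands.items():
        try:
            proc = subprocess.run(argv, capture_output=True, text=True, timeout=120)
            results[name] = proc.stdout + proc.stderr
        except (OSError, subprocess.TimeoutExpired) as exc:
            # A missing or hung tool should not stop the other collectors.
            results[name] = "FAILED: {}".format(exc)
    return results


def append_snapshot(results, path):
    """Append one timestamped snapshot to a log file."""
    stamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
    with open(path, "a") as fh:
        for name, text in results.items():
            fh.write("===== {} {} =====\n{}\n".format(stamp, name, text))


def run_every(seconds, path):
    """Collect snapshots until interrupted (Ctrl-C)."""
    while True:
        append_snapshot(collect_snapshot(), path)
        time.sleep(seconds)
```

Starting something like `run_every(60, "/var/tmp/perf_snapshots.log")` when the slowdown begins (the path is hypothetical) would leave a timestamped record to compare against a normal period.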
Hth,
Raj.
11-02-2009 08:25 PM
Re: Need help troubleshooting performance issue
Here the question would be:
- Did you see any increased load at that time, i.e. more Oracle processes, more Java processes, more applications than in the usual scenario, or more batch jobs being executed?
No increase. Every process that was running during the problem had been running earlier in the day.
- How many CPUs do you have? What is the model of the server?
16, on a Montecito-based Superdome.
- How many processes were running during that time, and how many processes run at usual load?
A modest increase in active processes. For most of the day there were 1800 ~ 2000 active processes. During the 30-minute problem the count jumped to 2400 ~ 2500, then dropped back to 2000.
- What was the load factor at that time? Obviously it would be more than 1, 2 ..
A big increase in load, to >6.
- What does the MeasureWare 'extract' report show for the historical data of cpu/mem/io/swap/network in/out, etc.?
From the above we can narrow down the cause.
I have attached a text file of global metrics for a 30-minute period during which the problem happened.
11-02-2009 08:36 PM
Re: Need help troubleshooting performance issue
What is this process?
1049892 R 18018 1 java : First in virtual memory, and its parent is init. Is it normal for it to be reparented to init, or should it have a parent PID?
This is a SAP NetWeaver process. I don't know if it's normal, but whenever I look at that process its PPID is always init.
What is this process?
90.82 R 18669 18375 jlaunch : 2nd in cpu activity only behind the kernel.
It's a second NetWeaver process; both have VM profiles of > 6 GB.
Question to Others:
Is it normal for 'kernel' to be consuming the most CPU time?
kernel is a SAP application process. Yes, it's normal. SAP and Oracle consume a lot of this server. It's normal for most system resources to be high, ~80%. I'm pretty sure it's one of 5 processes that pushed the server over the edge: the 3 SAP processes, an Oracle Enterprise Manager process, or a backup process. The server goes back to normal when the OEM process is stopped.
So was it that one process, and if so, what did it do to overconsume the server? Or was it a bad combination of 5 processes that all decided at that moment to increase their load?
11-02-2009 08:58 PM
Re: Need help troubleshooting performance issue
Would you attach the totals of the sar -d report?
11-02-2009 09:18 PM
Re: Need help troubleshooting performance issue
>> a modest increase in active processes, for most of the day active processes were 1800 ~ 2000. During the 30 minute problem the processes jumped up to 2400 ~ 2500, then back down to 2000.
- Well, an increase from 2000 to 2400 processes is a sizable bump, and it will consume a large amount of resources. In this case the processes are CPU-intensive, as they are consuming more CPU.
>> A big increase in load >6,
- This is a huge load for an HP-UX system; I have seen a load factor of 3 to 4 make a server freeze.
- From 16:30 to 16:55 CPU utilization was 100%.
- At that time the only noticeable change is a slight increase in swap usage: 4%.
That means the increased number of processes is consuming more CPU.
- The next step would be to track down the process details and application details, and try to figure out whether it is normal for those extra processes to consume 70% of the CPU.
As it was bumped 30% to 70%.
I have seen a 128-CPU Montecito SD perform poorly as load increases. So the team putting the load on that server keeps asking us what the load is, and increases its load accordingly.
- If you find the difference between the usual processes and the additional processes (ps -ef), notify the application team that these 400 processes caused the CPU to go from 30% to 70%, and verify whether that is normal. If it is normal, then the system may need more 'horse power'.
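One way to get that difference is to save `ps -ef` output during normal load and again during the problem, then count processes per user and command in each. The sketch below assumes the usual `UID PID PPID C STIME TTY TIME COMMAND` column layout and guesses that older processes show a two-token start date such as `Oct 28`; the function names are made up for illustration.

```python
from collections import Counter


def count_processes(ps_output):
    """Count processes per (user, command name) in saved `ps -ef` text."""
    counts = Counter()
    for line in ps_output.splitlines()[1:]:  # first line is the header
        tokens = line.split()
        if len(tokens) < 8:
            continue
        cmd_index = 7
        # Assumption: an old process shows STIME as "Mon DD" (two tokens),
        # which shifts every later column right by one.
        if tokens[4].isalpha() and tokens[5].isdigit():
            cmd_index = 8
        if len(tokens) <= cmd_index:
            continue
        command = tokens[cmd_index].rsplit("/", 1)[-1]  # drop the directory part
        counts[(tokens[0], command)] += 1
    return counts


def extra_processes(baseline_ps, problem_ps):
    """Return [((user, command), extra_count), ...], largest increase first."""
    extra = count_processes(problem_ps) - count_processes(baseline_ps)
    return extra.most_common()
```

Counter subtraction keeps only positive differences, so the result lists just the user and command pairs that gained processes, which is the list to hand to the application team.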
Hth,
Raj.
11-02-2009 09:24 PM
Re: Need help troubleshooting performance issue
bumped 30% to 70% --> to be read as "30% to 100%"
[That means a sudden 70% increase in CPU resources.]
11-03-2009 02:17 AM
Re: Need help troubleshooting performance issue
total 1251063 691602 559321 55%
You actually have 524 GB of memory and an extra 700 GB of device swap?
You should probably remove a lot of that device swap.
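For reference, the quoted total line can be checked with a few lines of code. This is only a sketch, assuming the line comes from `swapinfo -tam` and that its columns are MB available, MB used, MB free, and percent used; the function name is hypothetical.

```python
def parse_swapinfo_total(line):
    """Split a swapinfo-style 'total' line into its numeric fields."""
    fields = line.split()
    avail_mb, used_mb, free_mb = (int(value) for value in fields[1:4])
    return {
        "avail_mb": avail_mb,
        "used_mb": used_mb,
        "free_mb": free_mb,
        "pct_reported": int(fields[4].rstrip("%")),
        # Recompute the percentage as a sanity check on the column guess.
        "pct_computed": round(100 * used_mb / avail_mb),
        "avail_gb": round(avail_mb / 1024),
    }


total = parse_swapinfo_total("total 1251063 691602 559321 55%")
```

Under that assumption the total comes to about 1222 GB, roughly in line with 524 GB of memory plus 700 GB of device swap, and the recomputed percentage matches the reported 55%.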
11-03-2009 05:23 AM
Re: Need help troubleshooting performance issue
So you've got a process bottleneck, a cpu bottleneck and a disk bottleneck. But which process caused it?
11-03-2009 05:47 AM
Re: Need help troubleshooting performance issue
16:50 - cpu
17:05 - free memory drops from 651 x 10**6 MB to 9 x 10**6 MB, a drop of about 99% if 651 is normal.
So the first thing to happen was a disk bottleneck, based upon your MWA data.
Since paging jumped astronomically, this explains the disk bottleneck.
A high-priority page-in request will cause processing to stop until the page is found.
This is most certainly an application issue -
What in the application causes high-priority requests?
What was running, what was happening, and what were the users doing at 16:45?
Note: 16:45 - end of day - some monster report / select statement
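The "which bottleneck came first" reading can be repeated on any exported metrics with a small helper like the one below. It is a sketch only: the metric names and thresholds are placeholders to be chosen by whoever runs it, not MWA's own alarm definitions.

```python
def first_crossings(samples, limits):
    """Return [(timestamp, metric), ...] in the order limits were first crossed.

    samples: list of (timestamp, {metric: value}) tuples, oldest first.
    limits:  {metric: ("above" | "below", threshold)}, chosen by the caller.
    """
    order = []
    seen = set()
    for timestamp, metrics in samples:
        for name, (direction, threshold) in limits.items():
            if name in seen or name not in metrics:
                continue
            value = metrics[name]
            if direction == "above":
                crossed = value > threshold
            else:
                crossed = value < threshold
            if crossed:
                seen.add(name)
                order.append((timestamp, name))
    return order
```

Feeding it the global metrics for the problem window, with limits for disk utilization, CPU, and free memory, gives the sequence of events directly, and the first entry is the place to start asking what the application was doing.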
11-03-2009 06:15 AM
Re: Need help troubleshooting performance issue
I am thinking this is a global select statement. Why? It's not just one high-priority request, it's a lot of high-priority requests. So many that memory filled up and it was still incomplete, still looking for more, when the box crashed.
I am also thinking this was run by a power user in SAP. It fits. Basic users aren't going to have the high-priority privileges. But you can verify user priorities with SAP.
And that leaves an SAP admin. And it's going to be a fight to get it out of a colleague.