03-24-2005 07:21 AM
How to get further info on a process that is hanging a cpu
Two days ago we started getting an Oracle process that suddenly grabs 100% of a cpu and holds onto it. This is hanging the Oracle database and not allowing further Oracle access. We can't connect to Oracle at all until we kill this process at which time the database becomes available again.
I know that there is a ton to look at as far as SQL and things like that. I'm interested to know what other deep, dark info I can pull from the system about a specific process. Nothing is being logged to any system logs about this as it's an Oracle spawned process.
Any hints on getting further information about a process given its PID?
Thanks in advance!
Tim
03-24-2005 07:36 AM
Re: How to get further info on a process that is hanging a cpu
Try running tusc (HP-UX's counterpart to truss) against the process.
http://hpux.connect.org.uk/hppd/hpux/Sysadmin/tusc-7.7/
Regards,
Robert-Jan
03-24-2005 07:38 AM
Re: How to get further info on a process that is hanging a cpu
Do you have glance/measureware?
lsof ?
Is this an application written in house?
live free or die
harry d brown jr
03-24-2005 07:40 AM
Re: How to get further info on a process that is hanging a cpu
03-24-2005 07:43 AM
Re: How to get further info on a process that is hanging a cpu
lsof lists the open files, sockets, etc. for a process.
http://hpux.ee.ualberta.ca/hppd/hpux/Sysadmin/lsof-4.74/
live free or die
harry d brown jr
03-28-2005 12:52 AM
Re: How to get further info on a process that is hanging a cpu
For example, assume the PID causing problems is 2311.
Start a sqlplus session and execute the following:
>select spid, addr, serial#, terminal from v$process where spid = '2311';
Make a note of the value for ADDR. Assume the value returned is 3234.
ADDR is a RAW column, so quote the value as a hex string when you use it. Execute the following:
>select * from v$session where PADDR = '3234';
Record the values for SID and SERIAL#.
Assume SID is 322 and SERIAL# is 31223.
You are now ready to turn on tracing.
Execute the following:
>execute sys.dbms_system.set_sql_trace_in_session (322, 31223, TRUE);
A trace file will now be created in the udump directory under your Oracle admin directory. I think the filename will contain the PID of the Oracle process.
Wait a few minutes and then turn off tracing.
Here's sql for turning off tracing.
>execute sys.dbms_system.set_sql_trace_in_session (322, 31223, FALSE);
Now run tkprof to see the sql that executed.
ex:
% tkprof
For more information on tracing, see the Oracle web site.
When you look at the tkprof report you should notice some SQL statements that are heavy hitters. The next step would be to trace that SQL back to the application.
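As a rough sketch of the steps above, the two lookups can be folded into one join, and the trace file can be located by PID before handing it to tkprof. The function names, the `<instance>_ora_<pid>.trc` file naming, the `.prf` output suffix, and the `sys=no sort=execpu` options are assumptions for illustration, not something from this post; check them against your Oracle version.

```shell
#!/bin/sh
# Sketch only: helper names and the trace file naming are assumptions.

# Print SQL that maps an OS PID ($1) straight to SID and SERIAL#.
# Pipe the output into sqlplus, or paste it at the prompt.
session_lookup_sql() {
  printf "select s.sid, s.serial#, s.username, s.program\n"
  printf "  from v\$session s, v\$process p\n"
  printf " where s.paddr = p.addr\n"
  printf "   and p.spid = '%s';\n" "$1"
}

# Print the first trace file in directory $1 whose name ends in
# _ora_<pid>.trc, where $2 is the PID. Prints nothing if none is found.
find_trace() {
  ls "$1"/*_ora_"$2".trc 2>/dev/null | head -n 1
}

# Print (do not run) a tkprof command line for trace file $1.
# The report goes next to the trace file with a .prf suffix.
tkprof_cmd() {
  echo "tkprof $1 ${1%.trc}.prf sys=no sort=execpu"
}
```

For example, `tkprof_cmd "$(find_trace /path/to/udump 2311)"` prints the command to run, where the directory is whatever your user dump destination is. Printing rather than executing keeps the sketch safe to try on a box where the database is already struggling.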
jack
04-03-2005 06:18 PM
Re: How to get further info on a process that is hanging a cpu
I have had exactly the same issue here.
Previous replies have asked you to do this or that to the database, but, as you wrote, the database is hung while the process is spinning.
I have used truss only to find no system call activity at all. Oracle logging has not produced any smoking gun.
The only solution I have found is to bounce the database. Unfortunately, the problem will occur again.
04-03-2005 10:15 PM
Re: How to get further info on a process that is hanging a cpu
What you get with the combination is a case of tusc showing:
write(18,.......
now using "lsof -p
Depending on what's going on, you may want to see what it's doing using the same commands.
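To make the tusc-plus-lsof pairing concrete, here is a small sketch that works on saved output from the two tools: it counts which descriptors show up in read/write calls, then pulls the matching rows from an lsof listing. The function names and the idea of saving both outputs to files are assumptions; the parsing assumes trace lines of the form `write(18, ...` and lsof's usual layout with the descriptor in the fourth column.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical.

# Count how often each descriptor appears as the first argument of a
# read( or write( call in saved trace file $1. Prints "fd count" lines.
fds_in_trace() {
  awk '
    match($0, /(read|write)\([0-9]+,/) {
      s = substr($0, RSTART, RLENGTH)
      sub(/^[a-z]*\(/, "", s)   # drop the call name and "("
      sub(/,$/, "", s)          # drop the trailing comma
      count[s]++
    }
    END { for (fd in count) print fd, count[fd] }
  ' "$1" | sort -n
}

# Print the header plus the rows of saved "lsof -p PID" output ($1)
# whose FD column is descriptor $2. The FD column looks like "18u":
# the number followed by an access-mode letter.
lsof_rows_for_fd() {
  awk -v fd="$2" '
    NR == 1 { print; next }
    { f = $4; sub(/[^0-9].*$/, "", f) }   # keep only the leading digits
    f == fd { print }
  ' "$1"
}
```

With a trace saved as `trace.out` and a listing saved as `lsof.out` (both names made up here), `fds_in_trace trace.out` shows the busiest descriptors and `lsof_rows_for_fd lsof.out 18` shows what descriptor 18 actually is.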
Can't remember where you get tusc offhand (one of the unsupported HP sites), but lsof is available from vic.cc.purdue.edu/pub/outgoing/tools/lsof (from memory).
The main problem you're likely to have with this is that the process is likely to generate a *LOT* of system calls at 100% CPU. As for actually fixing this issue, I'd get back to whoever wrote the process in the first place and ask them what it's doing. The list of open files (lsof) may well be a big help to them in identifying why it's going wrong, since it may be getting screwy information at some point causing it to do all sorts of extra work. I'd also recommend checking out any reads and writes on other sockets since the process may be doing all sorts of work on behalf of another process, and it would be good to know what that's doing.
Get a copy of Ethereal on there as well (www.ethereal.com). This will allow you to look at the contents of the packets which are passed between the various processes. Now that truss has shown you what ports the process is talking on, lsof has told you who it's talking to, Ethereal can show you what it's talking about. Look at the packet contents and try to reconstruct what's going on. You'll probably spot something daft. The only problem is if the only daft stuff is right at the start of the problem and you come into this once the problem's occurred - you may not catch this without some serious effort...