04-14-2003 12:11 PM
diaglogd running away - 90% cpu util; single bit memory error
BTW, after poking around in the logs, I see the STM logs show a single-bit memory error. Could that have set this off? I'm going to call the HP Response Center about the single-bit error.
04-14-2003 12:15 PM
Re: diaglogd running away - 90% cpu util; single bit memory error
04-14-2003 12:18 PM
Re: diaglogd running away - 90% cpu util; single bit memory error
Stuart
04-14-2003 12:19 PM
Re: diaglogd running away - 90% cpu util; single bit memory error
diaglogd may simply be busy logging these events, so fix the memory first. If that does not help, look through the logs again to understand what diaglogd is doing. If you find nothing, try updating the Online Diags, GR and HWE patch bundles to at least Sep 2002.
Eugeny
04-14-2003 12:33 PM
Re: diaglogd running away - 90% cpu util; single bit memory error
Start with LOGTOOL, which will look at your I/O, then try lsof:
STM > TOOLS > UTILITY > RUN > LOGTOOL > FILE > VIEW > RAW SUMMARY.
Note the first and last dates of the transactions and calculate the difference. If the span is short, say 4 hours, that is important to note. Now read down the report of hardware addresses and look at the integers in parentheses. Anything over 150 in that 4-hour period should be called in to HP for replacement.
lsof -p pid (* diagmond *)
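A minimal sketch of that threshold check, assuming the raw summary has been saved to a text file with one hardware address per line followed by its count in parentheses. That layout, the function name `flag_noisy_paths` and the sample addresses are assumptions for illustration, not the actual LOGTOOL output format; only the figure of 150 comes from the post above.

```sh
# Print every hardware address whose count exceeds the limit.
# Input (assumed layout, one entry per line): <address> (<count>)
# $1 = limit, defaulting to the 150 quoted above.
flag_noisy_paths() {
    awk -F'[()]' -v limit="${1:-150}" '
        # $1 is the text before "(", $2 is the number inside the parentheses.
        $2 + 0 > limit + 0 {
            sub(/^[ \t]+/, "", $1)   # trim leading whitespace from the address
            sub(/[ \t]+$/, "", $1)   # trim trailing whitespace from the address
            print $1, $2
        }'
}
```

It could be run as `flag_noisy_paths < summary.txt`, where `summary.txt` is a hypothetical saved copy of the report. The check only makes sense once you have confirmed from the first and last dates that the report covers a short window such as the 4 hours mentioned above.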
04-14-2003 12:51 PM
Re: diaglogd running away - 90% cpu util; single bit memory error
The HP Response Center says no problem on the single bit errors. I only had 4 single bit errors. This is no cause for alarm. If we had 100 single bit errors, they would ask us to reboot, and see if they went away.
The way memory works on the V2200, there is spare memory, like spare disk tracks, which the system can vary on or off. During a reboot the system runs memory tests, and if it detects a bad memory area it can vary it offline and vary on a spare memory area in its place, and we would keep right on going. No need to replace DIMMs until the system runs out of spare memory areas.
What probably happened on the diaglogd running away is this:
1. diaglogd, the logger, gets called by diagmond, the monitor, when it needs to log something.
2. Our diagmond had gone away for some reason, after calling diaglogd, and diaglogd got confused.
3. I just killed diaglogd and restarted all diagnostics, (as suggested above):
/sbin/init.d/diagnostic start
and now diagmond is running again, and it calls diaglogd to log.
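A rough sketch of how the situation in steps 1 to 3 could be spotted from a process listing, assuming `ps -e` prints the bare command name in the last column. The function name `diaglogd_state` and its three output words are made up for illustration.

```sh
# Read "ps -e" style output on stdin and report the state of the
# diagnostic daemons:
#   orphaned - diaglogd is running but diagmond is not (the case described above)
#   ok       - diagmond is running
#   stopped  - neither daemon is running
diaglogd_state() {
    awk '
        $NF == "diaglogd" { logger = 1 }    # the logger
        $NF == "diagmond" { monitor = 1 }   # the monitor that calls the logger
        END {
            if (logger && !monitor) print "orphaned"
            else if (monitor)       print "ok"
            else                    print "stopped"
        }'
}
```

Run it as `ps -e | diaglogd_state`. If it prints `orphaned`, the fix from step 3 applies: kill diaglogd, then run `/sbin/init.d/diagnostic start`.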
Thanks for help!
Stuart
04-14-2003 12:56 PM
Re: diaglogd running away - 90% cpu util; single bit memory error
We used to see lots of the single bit memory errors on our V-class boxes and we learned to just not worry about it. I think HP finally released a patch that stopped all the single bit errors from going to the syslog file.
JP
04-14-2003 12:56 PM
Re: diaglogd running away - 90% cpu util; single bit memory error
04-14-2003 01:09 PM
Re: diaglogd running away - 90% cpu util; single bit memory error
This is the procedure that I've run into for I/O errors, but not for single-bit errors.
Perhaps you should try another call to the Response Center, since you're apt to get a different answer depending on who you talk to, especially if it's a front-line engineer. You're more likely to get a backline engineer on the second call, since the front line is obligated to send the case back and escalate after 20 or 30 minutes.
04-14-2003 01:27 PM
Re: diaglogd running away - 90% cpu util; single bit memory error
We had three V-class boxes here for several years, and we got them early in the life cycle, when we had to go through all the problems with replacing the DIMMs and the memory carriers to get the memory problems solved, so I'm real familiar with the single-bit memory errors in the V-classes. I've made several calls to HP about the single-bit memory errors in the V's, and they say that as long as it is just a few errors, don't worry about it. The memory is designed to trap and work around the single-bit errors. If enough of them show up, it will quit using that part of memory. My experience with the errors matches what HP is telling him. If you just have a few, don't worry about it. If you have a hundred or more, you are about to lose some memory. Now, you can hop on the phone and replace the bad DIMM whenever you see a single-bit error, but it really isn't worth the trouble.
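A rough sketch of that rule of thumb, assuming each single-bit error shows up as one line of text containing "single bit" or "single-bit" in whatever log you are reading. The log layout, the function name `single_bit_verdict` and the output words are assumptions; only the threshold of about a hundred comes from this thread.

```sh
# Count the lines on stdin that mention a single-bit error and apply the
# "a few is fine, a hundred or more is trouble" rule.
# $1 = threshold, defaulting to 100.
single_bit_verdict() {
    # grep -c prints 0 and exits non-zero when nothing matches, hence "|| true".
    count=$(grep -ci 'single[- ]bit' || true)
    if [ "$count" -ge "${1:-100}" ]; then
        echo "$count escalate"
    else
        echo "$count ok"
    fi
}
```

Feed it the log text on stdin, for example `single_bit_verdict < saved_log.txt`, where `saved_log.txt` is a hypothetical saved copy of the log.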
JP