Operating System - Tru64 Unix
01-10-2007 10:17 AM
The end of a long rope...
OSF1 section9 V5.1 2650 alpha
So I'm using MRTG in conjunction with rrdtool, running on the above AlphaPC. All of the calls MRTG/rrdtool tracks are SNMP, and I am monitoring ~25 machines--a mix of Alphas, HP & Dell. I've daemonized the MRTG session to run only one instance, and I use a .cfg file full of 'Include:' statements to load the .cfg files for each machine monitored. Each server has disk space (on several different drives/mountpoints), CPU load, Users/Processes, and Memory usage being mapped, and most of them work fine...
Except for memory statistics for the Alphas.
Of the seven Alphas being monitored, two at least show data (though it never changes/updates). The rest show 'NaNQ' where numbers should be. At one time, four of the Alphas had working memory usage graphs, and it was accurately monitoring changes in memory usage on those machines. The statistic I am monitoring specifically on each Alpha is 1.3.6.1.2.1.25.2.3.1.6.1, or 'hrStorageUsed.1' (used kernel memory). I've tried everything to get this to function. I have tried changing where the target for memory is located in the .cfg file(s), I've tried creating a shell script to do the query and multiplication for MRTG and simply feed it numerical values. There are no error messages in the log (which I keep in /var/adm for the master.cfg file). The system flat refuses to map this value. And yet, it will map 1.3.6.1.2.1.25.2.3.1.6.2, or 'hrStorageUsed.2' (used swap).
I thought perhaps that the folder where the cgi and .png files were kept was the issue, so I placed .htaccess files in these directories to specify expirations (5 minutes) to force Apache to refresh in those folders. I have tried restarting every service, I've tried rebooting the machine, I've tried .cfg files with just the memory target, on the outside chance that there were too many targets in each .cfg file (~10-12 per .cfg file). I double- and triple-checked that the user MRTG runs as (httpd) could run the necessary commands, write to the specified folders, and read from the others.
And yet here I am. At wits' end, still no closer to discovering why MRTG/rrdtool will not map memory usage for the Alphas. Memory usage for those servers/PCs running Windows works just fine. New servers added to the network instantly start graphing once the .cfg file for that machine is created...
Except for memory utilization on the Alphas. Some of you may be aware that 'NaNQ' basically means 'not a number quantity', but if the query were returning alphanumeric or string results, this would throw an error to the log file. Which it doesn't. I even changed the target to call an external script that returned garbage, and confirmed that at least error reporting for that target was working (which it was). Does anyone have any thoughts? Ideas? I mean, I'd douse the monitoring server with holy water if I thought it would help...
As mentioned previously, the SNMP query for that value of used kernel memory returns a number that must be multiplied by the hrStorageAllocationUnits.1 value (usually 1024). I tried creating a script that would do all that outside MRTG, and I formatted the script to feed MRTG data the way it likes it, namely four lines, 'in', 'out', 'server uptime' and 'hostname'...nuthin'. No errors, no complaints, no cursing (except from me), just 'NaNQ'. I'm starting to hate Alphas...
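For reference, the four-line external-script output described above can be sketched like this. It is only a minimal sketch, assuming a POSIX-style shell with `$(( ))` arithmetic; the function name `mrtg_report` and its arguments are made up for illustration, and the two input numbers (hrStorageUsed.1 and hrStorageAllocationUnits.1) are assumed to have been fetched over SNMP already.

```shell
#!/bin/sh
# Hypothetical helper: print the four lines MRTG reads from an external script.
#   $1 = hrStorageUsed value (allocation units in use)
#   $2 = hrStorageAllocationUnits value (bytes per unit)
#   $3 = uptime text
#   $4 = target (host) name
mrtg_report() {
    # Refuse anything that is not a plain non-negative integer, so a bad
    # SNMP reply is never passed on to MRTG as if it were data.
    for n in "$1" "$2"; do
        case "$n" in
            ''|*[!0-9]*) return 1 ;;
        esac
    done

    # Used memory in bytes = units in use * bytes per unit.
    bytes=$(( $1 * $2 ))

    echo "$bytes"   # line 1: 'in' value
    echo "$bytes"   # line 2: 'out' value (same number, single-valued target)
    echo "$3"       # line 3: uptime
    echo "$4"       # line 4: target name
}
```

Called as `mrtg_report 3476136 1024 "12 days" alpha1`, it prints the byte count twice, then the uptime text and the name. Repeating the value for 'in' and 'out' is just one way to handle a target with a single number; the function returns non-zero without printing anything if either input is not an integer.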
3 REPLIES
01-11-2007 04:26 PM
Re: The end of a long rope...
>> I'm starting to hate Alphas...
I'm sorry to hear that.
And I appreciate one could get annoyed by a function not working as desired. That can be aggravating.
However...
Is the Alpha doing fine otherwise?
And this problem is NOT a Tru64/OSF native function, is it?
And it is just a freaking monitor function to generate data that in all likelihood no one will really ever care to look at, because the Alpha itself is hopefully doing just fine.
What's the priority here?
A good monitor or a good production system?
Yeah, I like my Alpha, and no, it's nothing personal; it's just another box. Albeit a promising box shamefully lost through poor management decisions.
Cheers!
Hein.
01-16-2007 04:48 AM
Re: The end of a long rope...
Um...okay...
Thanks?
01-17-2007 06:14 AM
Re: The end of a long rope...
The first thing you need to do is determine where the problem is. Try manually fetching that OID from the command line and see if it returns a sane value. Here's the output from my server:
$ snmp_request localhost public get 1.3.6.1.2.1.25.2.3.1.6.1
1.3.6.1.2.1.25.2.3.1.6.1 = 3476136
If it returns a number, then the snmp daemon is off the hook. Things to try at this point would be: tcpdumping port 161 to verify that mrtg is actually sending the right query to the right server, looking really closely at your mrtg config file, and running mrtg with the --debug flag to see what it does with the value once it has it. Perl/mrtg/rrdtool updates might be useful too.
If snmp_request doesn't return a number, then I'd try bouncing snmpd or looking for errors in syslog. As a last resort you could install net-snmp and run it on an alternate port, and point mrtg at that.
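The "does it return a number" check can also be scripted, which helps when repeating it across several Alphas. A rough sketch, assuming the reply is a single line in the `OID = value` form shown above; the function name `classify_reply` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: classify one line of snmp_request output ("OID = value").
# Prints "number" when the value is a non-negative integer,
# and "not-a-number" for anything else (empty reply, error text, NaNQ, ...).
classify_reply() {
    # Strip trailing blanks, then drop everything up to and including " = ".
    # Lines without " = " produce no output, so $value ends up empty.
    value=$(printf '%s\n' "$1" | sed -n -e 's/ *$//' -e 's/^.* = //p')

    case "$value" in
        ''|*[!0-9]*) echo "not-a-number" ;;
        *)           echo "number" ;;
    esac
}
```

For example, `classify_reply "$(snmp_request localhost public get 1.3.6.1.2.1.25.2.3.1.6.1)"` would report which branch of the advice above applies: "number" points at mrtg or its config, "not-a-number" points at snmpd.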