All About the Apps

2 tips and tricks for effective system performance analysis

HPE-SW-Guest ‎08-06-2013 02:18 PM - edited ‎06-09-2015 01:08 PM

Guest post by Ramakrishna Baipadithaya Kenchabhatre and Sunil Lingappa, HP Operations Agent R&D Leads 


Andrew is a system administrator at a large enterprise IT company. He uses HP GlancePlus, HP Operations Agent and HP Operations Manager to monitor and troubleshoot system performance issues in his environment. However, two tricky problems haunt him once in a while.


Problem 1:


Andrew arrives at the office one morning and sees that his HP Operations Manager message browser is full of alarms generated by two systems in his environment. Concerned, he gets down to the issue at hand. Using his debugging tools, he finds that these two systems are critical database systems in the infrastructure and that the alarms were generated between 2 a.m. and 4 a.m., so he begins his investigation there.


Andrew opens the alarmdef file and sees alarm definitions that raise an alert when CPU or disk utilization crosses 70 percent. This is expected behavior, so he wonders where the gap is. He then works out that these systems handle business database transactions every day between 9 a.m. and 3 p.m., and that they also back up their databases to a backend server between 2 a.m. and 4 a.m. The alarms coincide with the nightly backup window, so it is most likely the backup load that trips the thresholds.


Problem: The need for ‘Intelligent Alarming’ wherein alarms can be masked at customizable times of the day.


We can define ‘shifts’ in alarm definitions to define the alarm thresholds. Let’s see an example of how this can be achieved.


Sample alarm definition:


In this alarm definition, I have defined two time-of-day parameters, start_shift and end_shift. I have defined an alarm for a CPU bottleneck that takes these shifts into account. The alarm is generated only if:

- The alarm threshold condition is met, and
- The time of day is between 04:20 and 04:23
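The two conditions above can be sketched in a few lines of Python. This only illustrates the gating logic and is not how the agent evaluates alarmdef; the function names `in_shift` and `should_alarm` and the 70 percent threshold in the usage lines are assumptions made for the example.

```python
from datetime import time


def in_shift(now: time, start_shift: time, end_shift: time) -> bool:
    """True when `now` lies strictly between the two shift boundaries.

    Mirrors the `>` / `<` comparisons used in the alarm condition; a window
    that wraps past midnight is not handled in this sketch.
    """
    return start_shift < now < end_shift


def should_alarm(value: float, threshold: float, now: time,
                 start_shift: time, end_shift: time) -> bool:
    """Raise the alarm only if the threshold is crossed AND we are in shift."""
    return value > threshold and in_shift(now, start_shift, end_shift)


# Hypothetical readings, using the 04:20 to 04:23 window from the example.
start, end = time(4, 20), time(4, 23)
print(should_alarm(85.0, 70.0, time(4, 21), start, end))  # inside window, over threshold
print(should_alarm(85.0, 70.0, time(3, 0), start, end))   # over threshold, outside window
```

The same utilization value alarms inside the window and is ignored outside it, which is the masking behavior the shift parameters are meant to provide.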





 Message Browser window of HP Operations Manager:


As the Message Browser output shows, alarms are generated only within the customized window (start_shift to end_shift). The message with severity “normal” indicates that shift-window processing is complete. In this way, Andrew can use a common alarm definition to monitor his environment intelligently.


Problem 2:


Andrew has a system in his IT infrastructure where upgrades and patches for a DBServer application are applied on a regular basis. He wants to monitor the resource utilization of the application across these upgrades, in particular its memory growth, because the system has barely enough memory. He wants to proactively detect a potential memory leak and take corrective action, such as rolling back the patch, if the need arises.


Andrew then spends the next few hours racking his brain thinking of an easy and automated way to do this.


Problem: Proactively detect a memory leak in an application


Andrew uses the alarmdef file that comes with HP Operations Agent software to achieve his purpose. He sets up a customized alarm definition as below.


Customized Alarmdef:



He has defined an application in the parm file that groups all processes belonging to the DBServer application. He has set a threshold of 100 MB as the memory growth limit. APP_MEM_VIRT gives the virtual memory growth of the defined application over time.


Unfortunately, the alert messages keep arriving in Andrew’s inbox every five minutes. That’s a lot of messages to deal with overnight or during the weekend. To reduce the frequency of these messages, he uses the repeat keyword in the alarm definition. The repeat interval is configurable and works like a ‘snooze’ timer: the alert is re-evaluated after the interval, and only if the condition is still true is another email sent to the administrator indicating that the issue still exists.
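The snooze behavior can be sketched in Python as follows. This is a simplified model, not the agent's implementation: it ignores the "for N minutes" duration that an alarm condition normally carries, and the function name `notifications`, the sample values and the 5-minute sampling are assumptions for illustration.

```python
def notifications(samples, threshold, repeat_every):
    """Return the (minute, kind) notifications a snoozed alarm would send.

    samples      -- list of (minute, value) readings in time order
    threshold    -- value above which the alarm condition is true
    repeat_every -- snooze interval in minutes between repeat messages
    """
    sent = []
    active = False
    last_sent = None
    for minute, value in samples:
        if value > threshold:
            if not active:
                # Condition just became true: send the initial alert.
                active = True
                last_sent = minute
                sent.append((minute, "start"))
            elif minute - last_sent >= repeat_every:
                # Still true after the snooze interval: send a reminder.
                last_sent = minute
                sent.append((minute, "repeat"))
        elif active:
            # Condition cleared: close the alarm.
            active = False
            sent.append((minute, "end"))
    return sent


# Readings every 5 minutes for just over two hours, all above the threshold.
readings = [(m, 150000) for m in range(0, 130, 5)]
print(notifications(readings, threshold=100000, repeat_every=60))
```

With a 60-minute repeat interval, 26 consecutive over-threshold samples produce three messages (at minutes 0, 60 and 120) instead of one every five minutes.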


Thus Andrew is able to monitor the DBServer application proactively and take the necessary action before a memory crunch brings down his entire IT system.


The alarm definitions need not be edited on individual systems. You can change the alarmdef policy in HP Operations Manager and deploy it to all relevant nodes.


This is the third part of a three-part series. I encourage you to read part 1 and part 2 here


About the Author


This account is for guest bloggers. The blog post will identify the blogger.

on ‎08-06-2013 05:48 PM

Editing alarm.def files on individual servers is not easy, intuitive, or scalable. Wouldn't it be better to set shift windows and alarm thresholds in Operations Manager directly, rather than by editing files on individual servers? Wouldn't it be easier to just define a process to watch, and thresholds to alarm on, using the Operations Manager policies?


Similarly with editing alarm.def to send email to individual users: this is problematic to manage across multiple systems, and is better handled at the Operations Manager level (probably using the xMatters integration).

Ramkumar Devanathan
on ‎08-06-2013 09:19 PM

Ramki, nice tips for using alarmdef/adviser file effectively. Keep giving us expert tips like this.


It would be helpful if the alarmdef syntax above is copy-able (like a code snippet). Useful for copy-paste freaks - a large population among us. :)



Here's a nice gotcha to turn off alerting from perfalarm component entirely.


# agsysdb -ovo off


Also have a look at the other options for the agsysdb command (located in the same folder as the other perf binaries: /opt/perf/bin or %OvInstallDir%\bin).


I use this when I have deployed OM policies to do the system monitoring and so don't want direct alerts from perfalarm.

on ‎08-08-2013 09:37 AM

Hello Lindsay,


Thank you for the feedback. We agree with you that editing individual alarmdef files is not a scalable solution.

We have updated the blog content suitably. Please note the addition of following lines at the end:

The alarm definitions need not be edited on individual systems. You can change the alarmdef policy on the HP Operations Manager and deploy it on to all interested nodes




Thanks and regards,


Field Service Program
on ‎08-14-2013 11:26 PM

The tips and tricks discussed above by Ramakrishna Baipadithaya Kenchabhatre and Sunil Lingappa are effective. That editing individual alarmdef files is not a scalable solution is a point one should always keep in mind.

‎08-20-2013 04:08 AM - edited ‎08-20-2013 04:09 AM

Hello Ram,


Thanks for the feedback. Here are the sample alarmdefs for both scenarios:

1. Shift-based alarming

# Shift window: alarms are evaluated only between these times of day
start_shift = "08:00"
end_shift = "17:00"

# Each rule that is true adds its probability to the symptom value
symptom CPU_Bottleneck type=CPU
rule GBL_CPU_TOTAL_UTIL > 75 prob 25
rule GBL_CPU_TOTAL_UTIL > 85 prob 25
rule GBL_CPU_TOTAL_UTIL > 90 prob 25
rule GBL_RUN_QUEUE > 2 prob 25

ALARM CPU_Bottleneck > 80 AND GBL_STATTIME > start_shift AND GBL_STATTIME < end_shift for 10 minutes
  type = "CPU"
  start
    if CPU_Bottleneck > 90 then
      red alert "CPU Bottleneck probability= ", CPU_Bottleneck, "%"
    else
      yellow alert "CPU Bottleneck probability= ", CPU_Bottleneck, "%"
  repeat every 10 minutes
    if CPU_Bottleneck > 90 then
      red alert "CPU Bottleneck probability= ", CPU_Bottleneck, "%"
    else
      yellow alert "CPU Bottleneck probability= ", CPU_Bottleneck, "%"
  end
    reset alert "End of CPU Bottleneck Alert"
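
To make the symptom arithmetic concrete, here is a small Python sketch of how the four rules above combine, assuming (as described for alarmdef symptoms) that the probabilities of all rules that are true are added together. The `RULES` table and `symptom_value` function are illustrative names, not part of the agent.

```python
# (metric name, limit, probability) for each rule in the CPU_Bottleneck symptom
RULES = [
    ("GBL_CPU_TOTAL_UTIL", 75, 25),
    ("GBL_CPU_TOTAL_UTIL", 85, 25),
    ("GBL_CPU_TOTAL_UTIL", 90, 25),
    ("GBL_RUN_QUEUE", 2, 25),
]


def symptom_value(metrics: dict) -> int:
    """Sum the probabilities of every rule whose comparison is true."""
    return sum(prob for name, limit, prob in RULES if metrics[name] > limit)


# Hypothetical sample: 88% CPU with a run queue of 3
print(symptom_value({"GBL_CPU_TOTAL_UTIL": 88, "GBL_RUN_QUEUE": 3}))
```

With 88 percent CPU and a run queue of 3, three of the four rules are true, so the symptom value is 75, which is below the alarm condition of CPU_Bottleneck > 80. The alarm needs all four rules to be true, since any three of them sum to only 75.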

2. Detecting memory leaks:


# Watch for DBServer application using over 100MB memory VSS

# Threshold of 100000 for roughly 100 MB (assumes APP_MEM_VIRT is reported in KB)
VSSthreshold = 100000

alarm DBServer:APP_MEM_VIRT > VSSthreshold for 5 minutes
start {
  yellow alert "DBServer app memory threshold exceeded"
  exec "echo 'DBServer app memory alert' | mail root@adminbox"
}
# Remind the administrator hourly while the condition stays true
repeat every 60 minutes {
  yellow alert "DBServer application still hogging memory"
  exec "echo 'DBServer app alert continuing' | mail root@adminbox"
}
