All About the Apps

Re: 2 tips and tricks for effective system performance analysis

Ramki

Hello Ram,

 

Thanks for the feedback. Here are the sample alarmdefs for both scenarios:

1. Shift based alarming


start_shift = "08:00"
end_shift = "17:00"

symptom CPU_Bottleneck type=CPU
  rule GBL_CPU_TOTAL_UTIL > 75 prob 25
  rule GBL_CPU_TOTAL_UTIL > 85 prob 25
  rule GBL_CPU_TOTAL_UTIL > 90 prob 25
  rule GBL_RUN_QUEUE > 2 prob 25

ALARM CPU_Bottleneck > 80 AND GBL_STATTIME > start_shift AND GBL_STATTIME < end_shift for 10 minutes
  type = "CPU"
  start
    if CPU_Bottleneck > 90 then
      red alert "CPU Bottleneck probability= ", CPU_Bottleneck, "%"
    else
      yellow alert "CPU Bottleneck probability= ", CPU_Bottleneck, "%"
  repeat every 10 minutes
    if CPU_Bottleneck > 90 then
      red alert "CPU Bottleneck probability= ", CPU_Bottleneck, "%"
    else
      yellow alert "CPU Bottleneck probability= ", CPU_Bottleneck, "%"
  end
    reset alert "End of CPU Bottleneck Alert"
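
To make the arithmetic behind this alarm easier to follow, here is a rough Python model of it (Python, not alarmdef syntax). It assumes that a symptom's value is the sum of the prob values of every rule that is true for a sample, and it looks at a single sample only, so the "for 10 minutes" part is left out. The names RULES, cpu_bottleneck, in_shift and alarm_condition are made up for this illustration.

```python
from datetime import time

# Shift window, mirroring start_shift and end_shift in the alarmdef.
START_SHIFT = time(8, 0)
END_SHIFT = time(17, 0)

# (metric name, threshold, probability), mirroring the four symptom rules.
RULES = [
    ("GBL_CPU_TOTAL_UTIL", 75, 25),
    ("GBL_CPU_TOTAL_UTIL", 85, 25),
    ("GBL_CPU_TOTAL_UTIL", 90, 25),
    ("GBL_RUN_QUEUE", 2, 25),
]


def cpu_bottleneck(metrics):
    """Assumed symptom semantics: add up prob for every rule that is true."""
    return sum(prob for name, limit, prob in RULES if metrics[name] > limit)


def in_shift(now):
    """True when the sample time lies strictly inside the shift window."""
    return START_SHIFT < now < END_SHIFT


def alarm_condition(metrics, now):
    """Single-sample version of the ALARM condition (duration not modelled)."""
    return cpu_bottleneck(metrics) > 80 and in_shift(now)


if __name__ == "__main__":
    busy = {"GBL_CPU_TOTAL_UTIL": 92, "GBL_RUN_QUEUE": 3}
    print(cpu_bottleneck(busy), alarm_condition(busy, time(9, 30)))
```

Under that assumption the symptom can only take the values 0, 25, 50, 75 and 100, so CPU_Bottleneck > 80 is reached only when all four rules are true, and the alert raised would then be the red one. If the yellow alert should fire as well, the alarm threshold or the rule probabilities may need adjusting.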



2. Detecting memory leaks

 

# Watch for the DBServer application using more than 100 MB of virtual memory (VSS)

VSSthreshold = 100000

alarm DBServer:APP_MEM_VIRT > VSSthreshold for 5 minutes
start {
  yellow alert "DBServer app memory threshold exceeded"
  exec "echo 'DBServer app memory alert' | mail root@adminbox"
}
repeat every 60 minutes {
  yellow alert "DBServer application still hogging memory"
  exec "echo 'DBServer app alert continuing' | mail root@adminbox"
}
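
For anyone wondering what the "for 5 minutes" clause buys you here, the following Python sketch (again a model, not alarmdef syntax) shows the idea: a short spike above the threshold does not raise the alarm, only a value that stays above it for the whole period does. It assumes the condition must hold continuously for the stated duration; the function name first_alarm_time and the sample values are invented for the example.

```python
def first_alarm_time(samples, threshold, hold_seconds):
    """Return the timestamp at which the alarm would start, or None.

    samples is a time-ordered list of (epoch_seconds, value) pairs.
    Assumption: the value has to stay above the threshold continuously
    for hold_seconds before the alarm starts.
    """
    above_since = None
    for timestamp, value in samples:
        if value > threshold:
            if above_since is None:
                above_since = timestamp  # start of the current run
            if timestamp - above_since >= hold_seconds:
                return timestamp
        else:
            above_since = None  # run broken, start counting again
    return None


if __name__ == "__main__":
    # One hypothetical APP_MEM_VIRT sample per minute.
    values = [90000, 110000, 120000, 95000, 101000,
              102000, 103000, 104000, 105000, 106000]
    samples = [(i * 60, v) for i, v in enumerate(values)]
    print(first_alarm_time(samples, threshold=100000, hold_seconds=300))
```

In the sample data the two-minute spike at the beginning is ignored, and the alarm only starts once the later run has lasted five minutes.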

About the Author

Ramki

Comments
LindsayHill

Editing alarm.def files on individual servers is not easy, intuitive, or scalable. Wouldn't it be better to set shift windows and alarm thresholds in Operations Manager directly, rather than by editing files on individual servers? Wouldn't it be easier to just define a process to watch, and thresholds to alarm on, using the Operations Manager policies?

Similarly with editing alarm.def to send email to individual users: this is problematic to manage across multiple systems, and is better handled at the Operations Manager level (probably using the xMatters integration).

Ramkumar Devanathan

Ramki, nice tips for using alarmdef/adviser file effectively. Keep giving us expert tips like this.

 

It would be helpful if the alarmdef syntax above were copyable (like a code snippet). Useful for copy-paste freaks, a large population among us. :)

 

 

Here's a handy trick to turn off alerting from the perfalarm component entirely.

 

# agsysdb -ovo off

 

Also have a look at the other options of the agsysdb command (located in the same folder as the other perf binaries: /opt/perf/bin or %OvInstallDir%\bin).

 

I use this when I have deployed OM policies to do the system monitoring and so I don't want direct alerts from perfalarm.

HPE-SW-Guest

Hello Lindsay,

 

Thank you for the feedback. We agree with you that editing individual alarmdef files is not a scalable solution.

We have updated the blog content accordingly. Please note the addition of the following lines at the end:

The alarm definitions need not be edited on individual systems. You can change the alarmdef policy on the HP Operations Manager and deploy it to all the relevant nodes.

 

--------------------------------------------------

 

Thanks and regards,

Ramki

Field Service Program

The tips and tricks discussed above by Ramakrishna Baipadithaya Kenchabhatre and Sunil Lingappa are effective. That editing individual alarmdef files is not a scalable solution is a point one must always keep in mind.
