HPE EVA Storage

monitoring the health of an eva5000

 
Tim Killinger
Regular Advisor

I recently assumed admin duties for 2 RP servers using an eva5000 for storage. I'm not particularly familiar with EVA technology, but I want to make sure there is event monitoring keeping tabs on the array. The servers themselves are monitored with EMS. What monitoring tools are available to me?

Thanks in advance!
5 REPLIES
susanta_dutta
Trusted Contributor

Re: monitoring the health of an eva5000

Hello Tim,

HP Remote Support Pack (RSP) can be used to monitor the EVA. See http://g4u0419c.houston.hp.com/en/5900-0064/ch01s01.html to learn more about RSP.

Regards
Susanta
DogBytes
Valued Contributor
Solution

Re: monitoring the health of an eva5000

Hi Tim,

True, you can use HP's tools to monitor the EVA, but you can also supplement them through scripting. If you download sssu for HP-UX, you can run a cron job that extracts EVA details into files. From there it's up to you.

What I do is examine each file for specific phrases like "operationalstate .................: good" and report the number of occurrences (wc -l). I know I should always get 16 for my eva8000 controller, and if I don't, I generate an alarm.

(Don't forget to update sssu.exe on the server whenever you upgrade Command View EVA on the management server).

I can provide more details if you're interested.

- R.
Tim Killinger
Regular Advisor

Re: monitoring the health of an eva5000

- R,

Yes I am interested in a few more details! Your solution sounds simple and just what I need. My main concern is losing a drive and not knowing for a time long enough to get me in trouble. Thanks in advance!
Uwe Zessin
Honored Contributor

Re: monitoring the health of an eva5000

Command View EVA can also turn events into SNMP traps sent to a management system.
DogBytes
Valued Contributor

Re: monitoring the health of an eva5000

First I created a file with the EVA credentials and sssu options.

# cat eva5000
select manager management_server username=authorized_user password=xxxxxx
select system MY_EVA5000
set options command_delay=15 display_time_off display_width=200
#

The authorized_user should be a member of the "HP Storage Users" group on the SAN management server. (This group has enough privs to get data without being a member of the "HP Storage Admins" group).

Once an hour, I run commands to generate data files:

/usr/local/bin/sssu "file eva5000" "ls disk full" > eva5000_disk.txt
/usr/local/bin/sssu "file eva5000" "ls controller full" > eva5000_controller.txt
/usr/local/bin/sssu "file eva5000" "ls diskgroup full" > eva5000_diskgroup.txt
/usr/local/bin/sssu "file eva5000" "ls host full" > eva5000_host.txt
/usr/local/bin/sssu "file eva5000" "ls power full" > eva5000_power.txt
/usr/local/bin/sssu "file eva5000" "ls lun full" > eva5000_lun.txt
/usr/local/bin/sssu "file eva5000" "ls system full" > eva5000_system.txt
/usr/local/bin/sssu "file eva5000" "ls vdisk full" > eva5000_vdisk.txt

In your case you'd want to start with the first one (ls disk full).
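The hourly collection above could be wrapped in a small loop so that adding or dropping an object type is a one-word change. This is only a sketch: the function name collect_eva and the SSSU and OUT_DIR variables are made up for illustration, and the temporary-file step is my own addition, not part of the original script.

```shell
#!/bin/sh
# Sketch only. SSSU and OUT_DIR are hypothetical overrides; the default path
# matches the one used in the commands above.
SSSU=${SSSU:-/usr/local/bin/sssu}
OUT_DIR=${OUT_DIR:-.}

# collect_eva CRED_FILE OBJECT...
# Runs: sssu "file CRED_FILE" "ls OBJECT full" for each OBJECT and stores the
# output in OUT_DIR/CRED_FILE_OBJECT.txt.
collect_eva() {
    eva=$1
    shift
    for obj in "$@"; do
        # Write to a temporary name first so a later check never reads a
        # half-written file; rename only if the command exited with status 0.
        tmp="$OUT_DIR/${eva}_${obj}.txt.tmp"
        "$SSSU" "file $eva" "ls $obj full" > "$tmp" &&
            mv "$tmp" "$OUT_DIR/${eva}_${obj}.txt"
    done
}

# Example call (commented out so the file can be sourced safely):
# collect_eva eva5000 disk controller diskgroup host power lun system vdisk
```

A crontab entry such as `0 * * * * /home/somewhere/collect_eva.sh` (hypothetical path) would run it at the top of every hour. The thread does not say whether sssu returns a nonzero exit status on every failure, so the data integrity check described later in this post is still worth keeping.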

Allow 30 - 45 minutes for completion (it will probably take less than 10 minutes with command_delay at 15), then search through the files and compare the results with established norms. If your EVA5000 is under extreme load, increase the command_delay value to reduce the impact of the sssu commands, and allow a correspondingly longer wait.


##############################
Before testing values, perform a quick data integrity check against the files, just in case the management server was rebooting when your sssu scripts were running:

#
notify="me@mycompany.com"
msg="Unexpected condition"
eva_type=eva5000
eva_folder=/home/somewhere
EMAIL_DATE=`date`
#
# Look for the "no system selected" error in any of the sssu output files
error_test1=`grep "Error: Command not valid until a system is selected (SELECT SYSTEM)" "$eva_folder"/"$eva_type"*.txt`
if [[ -n "$error_test1" ]]
then
    # Send one exception mail and skip the element checks for this run
    echo "SSSU output file data integrity check failed. \nEVA element checks will not proceed until next scheduled window.\nEmail sent at $EMAIL_DATE" |mailx -s "$eva_type monitoring exception report" "$notify"
    exit 1
fi
#############################
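The same integrity check could be written as a function that also rejects missing or empty output files. Treating an empty file as a failure is my assumption about what an interrupted sssu run might leave behind, not something stated above, and the name outputs_look_sane is made up for this sketch.

```shell
# outputs_look_sane FILE...
# Returns 1 and prints the offending file if any FILE is missing, empty, or
# contains the "no system selected" error text quoted in this thread.
outputs_look_sane() {
    err='Error: Command not valid until a system is selected'
    for f in "$@"; do
        if [ ! -s "$f" ]; then
            echo "missing or empty: $f"
            return 1
        fi
        # -F treats the error text as a fixed string, not a regular expression
        if grep -F "$err" "$f" >/dev/null; then
            echo "sssu error in: $f"
            return 1
        fi
    done
    return 0
}

# Example call (commented out):
# outputs_look_sane /home/somewhere/eva5000*.txt || exit 1
```

Checking the files before counting states keeps a broken collection run from being reported as, say, 168 failed disks.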


If your EVA5000 has 168 disks, this command should print "168":
grep "state ........................................................................................: installed_good" eva5000_disk.txt |wc -l
168

Just determine your baseline values once and repeat the tests every hour. If different, generate a warning email.

# Baseline: number of disks expected to report installed_good
expected_var=168
actual_var=`grep "state ........................................................................................: installed_good" eva5000_disk.txt |wc -l`
# Mail a warning when the count differs from the baseline
if ((actual_var != expected_var))
then echo "Expected result: $expected_var \nActual result: $actual_var" |mailx -s "$msg for $eva_type " "$notify"
fi
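If you end up testing several files, the compare step above can be folded into one reusable function. This is a sketch with made-up names (check_baseline and its arguments). The pattern in the example call uses `\.*` to match a run of dots of any length, which assumes the sssu output looks like the lines quoted in this thread.

```shell
# check_baseline LABEL FILE PATTERN EXPECTED
# Counts the lines in FILE matching PATTERN. Prints a one-line warning and
# returns 1 if the count differs from EXPECTED or FILE cannot be read.
check_baseline() {
    if [ ! -r "$2" ]; then
        echo "$1: cannot read $2"
        return 1
    fi
    # grep -c prints the number of matching lines (0 when none match)
    actual=$(grep -c "$3" "$2")
    if [ "$actual" -ne "$4" ]; then
        echo "$1: expected $4, got $actual"
        return 1
    fi
    return 0
}

# Example call (commented out; notify and eva_type as defined earlier):
# warn=$(check_baseline disks eva5000_disk.txt 'state \.*: installed_good' 168) ||
#     echo "$warn" | mailx -s "Unexpected condition for $eva_type" "$notify"
```

Keeping the mail step outside the function makes it easy to collect several warnings into one message instead of one mail per check.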

Don't use this method for the cache batteries, because their state can change from "holding charge" to "charging", but it works fine for all the other parameters I've tried. I know this isn't fancy Perl scripting, but it does the trick.
Note: you can change the value of display_width=200, but your output will change, so set a value and stick with it.

Note: The reason I've gone through this process is that I've found the HP monitoring to be unreliable. The desta service can be running while the director process is hung, and no alarms get out. WEBES may be better now, but I still rely on the primitive scripts. :)

Good luck.