How to monitor Thin Provisioned LUNs

kghammond · 09-22-2009

We are starting to deploy our vSphere 4 environment using some thin provisioned VMs and thin provisioned LUNs.

We have alerts set up in vSphere that notify us when VMFS volumes are running out of space.

I have been looking for an alert that will trigger when a thin provisioned LUN starts running out of space, but I have not found that yet.

Does anyone know of a way, using CMC or third-party tools, to trigger email warnings and alerts when a thin provisioned LUN exceeds 85% of its provisioned capacity?

Thank You,
Kevin

teledata · 09-22-2009

If you have a Nagios server, I've built a script that returns volume and cluster utilization performance metrics.

These can then be graphed and alerted on based on percentage utilization.

I can post the scripts here; I just have to go find them.

http://www.tdonline.com

teledata · 09-22-2009

Please be kind. This script is still a work in progress, and I appreciate any community input. I suspect it could certainly be simplified.
Apologies for the lack of comprehensive commenting in the shell script. This assumes you have your Nagios server running and SNMP (with the LeftHand MIBs) configured properly.

Nagios Check Command: $USER1$/check_nsmvolume2 $HOSTADDRESS$ $USER7$ $ARG1$ "$ARG2$" "Space Used" $ARG3$ $ARG4$

ARG1 = cluster number (which cluster in the management group, starting at 1)
ARG2 = Name of the volume (this is case sensitive)
ARG3 = warning threshold (% of allocated space that has been provisioned)
ARG4 = critical threshold (% of allocated space that has been provisioned)
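
To make the threshold arguments concrete: both are percentages of the volume's allocated size. Here is a minimal sketch of that comparison using plain shell integer arithmetic. The helper name `classify_usage` and the 750 GB volume size are made up for illustration and are not taken from the script further down; the two sizes can be in any unit as long as it is the same one.

```shell
#!/bin/sh
# Hypothetical helper, not part of check_nsmvolume2.
# Usage: classify_usage <used> <allocated> <warn_pct> <crit_pct>
# Prints OK, WARNING, CRITICAL or UNKNOWN and returns the matching
# Nagios exit code (0, 1, 2 or 3).
classify_usage() {
    used=$1
    allocated=$2
    warn=$3
    crit=$4
    if [ "$allocated" -le 0 ]; then
        echo "UNKNOWN"
        return 3
    fi
    # Integer percentage, rounded down
    pct=$(( used * 100 / allocated ))
    if [ "$pct" -ge "$crit" ]; then
        echo "CRITICAL"
        return 2
    elif [ "$pct" -ge "$warn" ]; then
        echo "WARNING"
        return 1
    fi
    echo "OK"
    return 0
}

# 509 GB used on an assumed 750 GB volume, thresholds 95 / 99
classify_usage 509 750 95 99
```

With thresholds of 95 and 99, the volume stays OK until usage reaches 95% of its allocated size.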

Service check_command example: check_nsm_volume2!2!SQL_Backup!95!99

Run by hand from the command line, that looks like:

/usr/local/groundwork/nagios/libexec/check_nsmvolume2 10.0.1.250 public 2 "SQL_Backup" "Space Used" 95 99

Sample output:

OK - 509 GB (67% used)| SQL_Backup=509 snapshots=28 total=509
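
In plugin output like this, Nagios treats everything before the `|` as status text and everything after it as performance data. A small sketch of splitting the sample line with POSIX parameter expansion follows; the variable names are only illustrative.

```shell
#!/bin/sh
# Sample plugin output copied from the example above
line='OK - 509 GB (67% used)| SQL_Backup=509 snapshots=28 total=509'

status_text=${line%%|*}      # everything before the first "|"
perfdata=${line#*|}          # everything after the first "|"
perfdata=${perfdata# }       # drop the single leading space

echo "status:   $status_text"
for pair in $perfdata; do
    # each pair has the form label=value
    echo "perfdata: ${pair%%=*} -> ${pair#*=}"
done
```

This prints the status text followed by one line per label (SQL_Backup, snapshots, total), which is handy for checking by eye that the values being graphed are the ones you expect.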

Nagios Performance Graph settings:

Graph Label: Lefthand Volume Utilization
Service: nsm_volumesize_
Use Service as a Regular Expression ON
Host: *
Status Text Parsing Regular Expression:
Use Status Text Parsing instead of Performance Data OFF
RRD Name /usr/local/groundwork/rrd/$HOST$_$SERVICE$.rrd
RRD Create Command $RRDTOOL$ create $RRDNAME$ --step 300 --start n-1yr DS:MB:GAUGE:1800:U:U RRA:AVERAGE:0.5:1:8640 RRA:AVERAGE:0.5:12:9480
RRD Update Command $RRDTOOL$ update $RRDNAME$ $LASTCHECK$:$VALUE1$ 2>&1
Custom RRDtool Graph Command '/nagios/cgi-bin/label_graph.cgi'
Enable ON
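
For reference, the retention implied by the RRD create command above can be worked out from the step and the two RRA definitions: each RRA keeps a fixed number of rows, and each row covers step × steps-per-row seconds. A quick sketch of that arithmetic, where `rra_days` is a hypothetical helper name:

```shell
#!/bin/sh
# rra_days <step_seconds> <steps_per_row> <rows>
# Prints how many whole days of history one RRA holds.
rra_days() {
    echo $(( $1 * $2 * $3 / 86400 ))
}

echo "RRA:AVERAGE:0.5:1:8640  -> $(rra_days 300 1 8640) days of 5-minute averages"
echo "RRA:AVERAGE:0.5:12:9480 -> $(rra_days 300 12 9480) days of 1-hour averages"
```

So these settings keep 30 days at full 5-minute resolution and 395 days of hourly averages.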

The shell script used on the Nagios server (check_nsmvolume2).
On my Nagios (GroundWork) install it lives in /usr/local/groundwork/nagios/libexec/check_nsmvolume2

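#!/bin/sh
#
# Nagios plugin: reports provisioned space for one LeftHand volume via SNMP.
# Args: $1 host, $2 SNMP community, $3 cluster number, $4 volume name,
#       $5 label ("Space Used"), $6 warning %, $7 critical %
# Note: $3 and $5 are accepted but not referenced below.

# Standard Nagios plugin exit codes
STATE_OK=0
STATE_WARNING=1
STATE_CRITICAL=2
STATE_UNKNOWN=3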

volumecount=$(/usr/local/groundwork/nagios/libexec/check_snmp -P 2c -H $1 -C $2 -o LEFTHAND-NETWORKS-NSM-CLUSTERING-MIB::clusVolumeCount.0|cut -d" " -f4)
volumecount=$(echo "$volumecount+1" |bc)

findvol=1

# Walk the volume table until the name matches $4
while [ "$findvol" -lt "$volumecount" ]
do
    volname=$(/usr/local/groundwork/nagios/libexec/check_snmp -P 2c -H $1 -C $2 -o LEFTHAND-NETWORKS-NSM-CLUSTERING-MIB::clusVolumeName.$findvol|cut -d" " -f4|cut -d'"' -f2)
    if [ "$volname" = "$4" ] ; then
        volinstance=$findvol
        break
    fi
    findvol=$(($findvol+1))
done

if [ "$volinstance" = "" ] ; then
echo "problem - volume '$4' is not a volume known to the management group."
exit $STATE_UNKNOWN
fi

volspace=$(/usr/local/groundwork/nagios/libexec/check_snmp -P 2c -H $1 -C $2 -o LEFTHAND-NETWORKS-NSM-CLUSTERING-MIB::clusVolumeProvisionedSpace.$volinstance|cut -d" " -f4)

allocated=$(/usr/local/groundwork/nagios/libexec/check_snmp -P 2c -H $1 -C $2 -o LEFTHAND-NETWORKS-NSM-CLUSTERING-MIB::clusVolumeSize.$volinstance|cut -d" " -f4)

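# $? here would only reflect the exit status of cut (the last command in the
# pipeline), so test for empty results instead.
if [ -z "$volspace" ] || [ -z "$allocated" ]
then
echo "query problem - No data received from host"
exit $STATE_UNKNOWN
fi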

snapcount=$(/usr/local/groundwork/nagios/libexec/check_snmp -P 2c -H $1 -C $2 -o LEFTHAND-NETWORKS-NSM-CLUSTERING-MIB::clusVolumeSnapshotCount.$volinstance|cut -d" " -f4)

repllevel=$(/usr/local/groundwork/nagios/libexec/check_snmp -P 2c -H $1 -C $2 -o LEFTHAND-NETWORKS-NSM-CLUSTERING-MIB::clusVolumeReplicaCount.$volinstance|cut -d" " -f4)
repllevel=$(echo "$repllevel / 1" | bc)
#echo "Got a rep level of $repllevel"

if [ "$snapcount" -ge 1 ] ; then
    varsnap=0
    totalspace=$volspace
    # Add up the provisioned space of every snapshot of this volume
    while [ "$varsnap" -lt "$snapcount" ]
    do
        varsnap=$(($varsnap+1))

        snapspace=$(/usr/local/groundwork/nagios/libexec/check_snmp -P 2c -H $1 -C $2 -o LEFTHAND-NETWORKS-NSM-CLUSTERING-MIB::clusVolumeSnapshotProvisionedSpace.$volinstance.$varsnap|cut -d" " -f4)

        totalspace=$(expr $totalspace + $snapspace)
        volonly=$(echo "$snapspace + $volspace" | bc)
        # was "-volspace" (missing $), which bc treats as an unset variable equal to 0
        snaponly=$(echo "($totalspace - $snapspace) - $volspace" |bc)
    done
else
    volonly=$(echo "$volspace /1" |bc)
    snaponly=$(echo "0" |bc)
    totalspace=$(echo "$volspace /1" |bc)
fi

volonly=$(echo "$volonly / $repllevel" | bc)
snaponly=$(echo "$repllevel * $snaponly" |bc)
totalspace=$(echo "$repllevel * $totalspace" |bc)
totalspace=$(echo "$totalspace / $repllevel" |bc)

mbspace=$(echo "$totalspace / 1024" |bc)
gigspace=$(echo "$mbspace / 1024" |bc)
gigspace=$(echo "$gigspace / $repllevel" |bc)

volonly=$(echo "$volonly / 1024" |bc)
volonly=$(echo "$volonly / 1024" |bc)

if [ $snaponly -gt 0 ] ; then
snaponly=$(echo "$snaponly / 1024" |bc)
snaponly=$(echo "$snaponly / 1024" |bc)
snaponly=$(echo "$snaponly/ $repllevel" |bc)
fi

if [ "$allocated" = 0 ] ; then
perct="N/A"
# Report the GB figure ($totalspace is not in GB at this point) and exit explicitly
echo "OK - (remote snapshot) $gigspace GB | $volname=$volonly snapshots=$snaponly total=$gigspace"
exit $STATE_OK
else
warning=$(echo "scale=2; $6 * .01" | bc)
critical=$(echo "scale=2; $7 * .01" |bc)

warning=$(echo " ($warning * $allocated)/1024" |bc)
critical=$(echo " ($critical * $allocated)/1024" |bc)
allocated=$(echo "$allocated / 1024" |bc)
# allocated=$(echo "$allocated * $repllevel" |bc)
checksize=$(echo "$volonly * 1024" |bc)

perct=$(echo "scale=2; ($volonly*1024) / $allocated" |bc)
perct=$(echo "($perct * 100)/1" |bc)

if [ $checksize -ge $critical ] ; then
echo "CRITICAL - *$gigspace GB ($perct% used)* | $volname=$volonly snapshots=$snaponly total=$gigspace"
exit $STATE_CRITICAL

elif [ $checksize -ge $warning ] ; then
echo "WARNING - *$gigspace GB ($perct% used)* | $volname=$volonly snapshots=$snaponly total=$gigspace"
exit $STATE_WARNING

elif [ $checksize -lt $warning ] ; then
echo "OK - $gigspace GB ($perct% used)| $volname=$volonly snapshots=$snaponly total=$gigspace"
exit $STATE_OK

else
echo "problem - No data received from host"
exit $STATE_UNKNOWN
fi

fi
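
One possible simplification, offered as an idea rather than a drop-in replacement: the script above makes one check_snmp call per volume until it finds the right index. The name table could instead be fetched once with net-snmp's `snmpwalk` and matched in the shell. The sketch below shows only the matching step and feeds it canned lines. It assumes the usual `OID.index = STRING: "name"` output format; `find_volume_index` and the "Exchange" volume are made-up names.

```shell
#!/bin/sh
# find_volume_index <volume_name>
# Reads snmpwalk-style lines on stdin and prints the table index of the
# volume whose name matches exactly (case sensitive). Returns 1 if not found.
find_volume_index() {
    want=$1
    while IFS= read -r row; do
        oid=${row%% = *}          # e.g. ...::clusVolumeName.2
        name=${row#*STRING: }     # e.g. "SQL_Backup" (still quoted)
        name=${name#\"}
        name=${name%\"}
        if [ "$name" = "$want" ]; then
            echo "${oid##*.}"     # index is the last OID component
            return 0
        fi
    done
    return 1
}

# Canned input standing in for the output of:
#   snmpwalk -v 2c -c public 10.0.1.250 LEFTHAND-NETWORKS-NSM-CLUSTERING-MIB::clusVolumeName
sample='LEFTHAND-NETWORKS-NSM-CLUSTERING-MIB::clusVolumeName.1 = STRING: "Exchange"
LEFTHAND-NETWORKS-NSM-CLUSTERING-MIB::clusVolumeName.2 = STRING: "SQL_Backup"'

printf '%s\n' "$sample" | find_volume_index "SQL_Backup"
```

The index it prints could then be used the same way `$volinstance` is used above.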

