HP9000/Smart Array 6402 Event Notification

ALCS · ‎01-11-2008

Dear all,

We need a simple way to be notified when a disk fails on a Smart Array 6402 controller in an RP3440 running HP-UX 11i v2.

Any feedback would be much appreciated.

Regards,
Farid

Keep it simple

Duncan Edmonstone · ‎01-11-2008

Farid,

The standard EMS Event Monitors should already be monitoring the SA6402 controller. If you get a disk failure I would expect a message in syslog and an email to root as a minimum.

If you want to change this in some way, you'll need to use the monconfig command.

man monconfig

for more info.

HTH

Duncan

I am an HPE Employee

ALCS · ‎01-11-2008

Thanks Duncan,

EMS is in place, but it only detected failures of the internal disks, not of the disks attached to the Smart Array (probably because they sit behind a hardware RAID 1, so the OS does not see them).

Regards,
Farid

Keep it simple

Duncan Edmonstone · ‎01-11-2008

Farid,

EMS should be passed info by the RAID controller. Here are the events which can be generated for a RAID controller monitor:

http://docs.hp.com/en/diag/ems/dm_raid_adapter.htm

You will see disk failure etc. listed as potential events.

Check using monconfig that you are actually monitoring for RAID controller events.

HTH

Duncan

I am an HPE Employee

ALCS · ‎01-11-2008

Thanks Duncan,

Through EMS we could set a condition (for example, DOWN) on the internal disks so that an email is generated via sendmail, but there are no such entries for the SA and its attached disks.

In the SA case, where can the events you mentioned be found, and how can we trigger an email notification for an event such as a disk failure?

Thanks again,

Farid

Keep it simple

Duncan Edmonstone · ‎01-11-2008

Farid,

Let's check that EMS can see your RAID card.

Run the following on your system:

resls /adapters/events

resls /adapters/events/raid_adapter

What output do you get?

HTH

Duncan

I am an HPE Employee

ALCS · ‎01-11-2008

Thanks a lot Duncan,
I will provide feedback on Monday.

Best regards,
Farid

Keep it simple

ALCS · ‎01-14-2008

Hi Duncan,
Please find below the output of some EMS commands:
# resls /adapters/events
Contacting Registrar on bbacist2
NAME: /adapters/events
DESCRIPTION: Events for Adapter Resources
TYPE: /adapters/events is a Resource Class.

There are 4 resources configured below /adapters/events:
Resource Class
/adapters/events/raid_adapter
logout root
/adapters/events/ql_adapter
/adapters/events/iscsi_adapter

# resls /adapters/events/raid_adapter
Contacting Registrar on bbacist2
NAME: /adapters/events/raid_adapter
DESCRIPTION: RAID Smart Array (SA) Controller Event Monitor
This resource monitors events for RAID SA Controller products. The monitor can be configured to report events directly using the Monitoring Request
Manager, or as a change in device status using the Event Monitoring
Service (EMS).
For more information, see the monitor man page, dm_raid_adapter(1m)
TYPE: /adapters/events/raid_adapter is a Resource Class.
There is one resource configured below /adapters/events/raid_adapter:
Resource Class
/adapters/events/raid_adapter/0_2_1_0_4_0

# resls /adapter/status/raid_adapter
Contacting Registrar on bbacist2
NAME: /adapter/status/raid_adapter
DESCRIPTION: None
TYPE: /adapter/status/raid_adapter is not a valid resource name

# resls /adapters/status/raid_adapter
Contacting Registrar on bbacist2
NAME: /adapters/status/raid_adapter
DESCRIPTION: RAID Smart Array (SA) Controller Event Monitor
This resource monitors events for RAID SA Controller products. The monitor can be configured to report events directly using the Monitoring Request
Manager, or as a change in device status using the Event Monitoring
Service (EMS).
For more information, see the monitor man page, dm_raid_adapter(1m)
TYPE: /adapters/status/raid_adapter is a Resource Class.
There is one resource configured below /adapters/status/raid_adapter:
Resource Class
/adapters/status/raid_adapter/0_2_1_0_4_0

# resls /storage/status/disks/default
Contacting Registrar on bbacist2
NAME: /storage/status/disks/default
DESCRIPTION: Disk Event Monitor
This resource monitors events for stand-alone disk drives. Event
monitoring requests are created using the Monitoring Request
Manager. Monitoring requests to detect changes in device status are
created using the Peripheral Status Monitor (psmmon(1m)) and Event
Monitoring Service (EMS).
For more information see the monitor man page, (disk_em(1m)).
TYPE: /storage/status/disks/default is a Resource Class.
There are 2 resources configured below /storage/status/disks/default:
Resource Class
/storage/status/disks/default/0_1_1_0.0.0
/storage/status/disks/default/0_1_1_0.1.0

# resls /storage/status/disk_array
Contacting Registrar on bbacist2
NAME: /storage/status/disk_array
DESCRIPTION: None
TYPE: /storage/status/disk_array is not a valid resource name
#
/adapters/events/TL_adapter

Roger

Systems Engineer

Keep it simple

Steven E. Protter · ‎01-14-2008

Shalom Farid,

Based on the data you posted I reach the following conclusions:

1) EMS CAN see the array controller.
2) EMS cannot fully monitor the disks on the controller.
3) Fixing this will require HP to update EMS and possibly hardware enablement on the system.
4) You might be able to write your own monitor script that greps through the raw output and alerts you via a normal email script if there is a problem. I would expect this output:

Resource Class
/storage/status/disks/default/0_1_1_0.0.0
/storage/status/disks/default/0_1_1_0.1.0

to be different if one of the disks failed.

Therefore the problem can possibly be solved by scripting.

You may find this thread with embedded script useful in getting your results out with email.

http://www.hpux.ws/?p=7

SEP

Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com

ALCS · ‎01-14-2008

Hi Steven,

Thanks a lot for your reply.
Could you tell me how I should use the script to get the notification via e-mail?

Roger

Keep it simple

Duncan Edmonstone · ‎01-14-2008

Farid,

I'm not entirely sure how SEP reached his conclusions; I see nothing in the output that supports them.

The two disks seen don't appear to be on the RAID controller (they look like system disks), and I'm not sure that a RAID LUN would even be seen under /storage/status/disks/default.

We do see that EMS sees the RAID adapter. What's the output of:

resls /adapters/status/raid_adapter/0_2_1_0_4_0

HTH

Duncan

I am an HPE Employee

ALCS · ‎01-14-2008

Hi Duncan,

1- We have 2 x internal disks in the server, connected to the internal controller.
2- Through EMS, I already created notification requests (resources) on the status of these disks; they worked fine.
3- I could not detect the disks connected to the MSA storage through EMS, only the controller.
4- Find below the output of the following command:

# resls /adapters/status/raid_adapter/0_2_1_0_4_0
Contacting Registrar on bbacist2

NAME: /adapters/status/raid_adapter/0_2_1_0_4_0
DESCRIPTION: UP:
A state of "UP" means the hardware is available for use.

DOWN:
A state of "DOWN" means the hardware is unavailable for use.
The /opt/resmon/bin/set_fixed command must be used with the resource name once the hardware is fixed to reset the hardware to the "UP" state.

UNKNOWN:
A state of "UNKNOWN" means it is unknown whether the hardware
is unavailable for use or not. The hardware should be considered DOWN.

TYPE: /adapters/status/raid_adapter/0_2_1_0_4_0 is a Resource Instance
whose values are enumerated values.

There are 3 valid states reported for /adapters/status/raid_adapter/0_2_1_0_4_0:

State Value State Name
----------- ----------
0 UP
1 DOWN
2 UNKNOWN

There is 1 active monitor request reported for this resource.

The Monitor Request ID in DECIMAL: 1401225218
The (PID.Request#) is 21381.2
Operation is RM_NOT_EQUAL_TO 0 with no notification options set
Polling Interval is 60 seconds

Regards

Roger

Keep it simple

Duncan Edmonstone · ‎01-14-2008

Roger,

OK one last thing to check then we should be able to create a monitor for the RAID card.

Gather the output of the following command:

/etc/opt/resmon/lbin/monconfig << EOF
C
Q
EOF

It looks like no notification options are set for the RAID adapter, but it would be nice to see exactly what *is* set before we change anything.

HTH

Duncan

I am an HPE Employee

ALCS · ‎01-14-2008

Hi Duncan,

1- We need to create a monitor for the RAID card and for the disks in the MSA30 storage.
2- Find below the output that you requested:

# /etc/opt/resmon/lbin/monconfig

=======================================================================
=================== Event Monitoring Service ===================
Enter selection: [s] Q
=======================================================================
EVENT MONITORING IS CURRENTLY ENABLED.
EMS Version : A.04.20
STM Version : C.54.00
=======================================================================
============== Monitoring Request Manager Main Menu ==============
======================================================================
Note: Monitoring requests let you specify the events for monitors
to report and the notification methods to use.

Select:
(S)how monitoring requests configured via monconfig
(C)heck detailed monitoring status
(L)ist descriptions of available monitors
(A)dd a monitoring request
(D)elete a monitoring request
(M)odify an existing monitoring request
(E)nable Monitoring
(K)ill (disable) monitoring
(H)elp
(Q)uit
Enter selection: [s] C

=======================================================================
============== Check Outstanding Monitoring Requests ==============
=======================================================================
>/StorageAreaNetwork/events/SAN_Monitor ... NOT MONITORING.
(Possibly there is no hardware to monitor.)

>/storage/events/disks/default ... OK.
For /storage/events/disks/default/0_1_1_0.0.0:
Events >= ?? Goto TCP; host=localhost port=49163
Client Configuration File: /var/stm/config/tools/monitor/wbem_disk_em.clcfg
Events >= 1 (INFORMATION) Goto TEXTLOG; file=/var/opt/resmon/log/event.log
Events >= 3 (MAJOR WARNING) Goto SYSLOG
Events >= 3 (MAJOR WARNING) Goto EMAIL; addr=root
Events >= 1 (INFORMATION) Goto TEXTLOG; file=/var/opt/resmon/log/rst.log
Client Configuration File: /var/stm/config/tools/monitor/rst_disk_em.clcfg
Events >= 1 (INFORMATION) Goto TCP; host=localhost port=1402
Comment: RST Request
Client Configuration File: /var/stm/config/tools/monitor/rst_disk_em.clcfg
Events = 5 (CRITICAL) Goto TCP; host=bbacist2 port=49375
For /storage/events/disks/default/0_1_1_0.1.0:
Events >= ?? Goto TCP; host=localhost port=49163
Client Configuration File: /var/stm/config/tools/monitor/wbem_disk_em.clcfg
Events >= 1 (INFORMATION) Goto TEXTLOG; file=/var/opt/resmon/log/event.log
Events >= 3 (MAJOR WARNING) Goto SYSLOG
Events >= 3 (MAJOR WARNING) Goto EMAIL; addr=root
Events >= 1 (INFORMATION) Goto TEXTLOG; file=/var/opt/resmon/log/rst.log
Client Configuration File: /var/stm/config/tools/monitor/rst_disk_em.clcfg
Events >= 1 (INFORMATION) Goto TCP; host=localhost port=1402
Comment: RST Request
Client Configuration File: /var/stm/config/tools/monitor/rst_disk_em.clcfg
Events = 5 (CRITICAL) Goto TCP; host=bbacist2 port=49375

>/adapters/events/TL_adapter ... NOT MONITORING.
(Possibly there is no hardware to monitor.)

>/system/events/chassis ... NOT MONITORING.
(Possibly there is no hardware to monitor.)

>/system/events/core_hw ... OK.
For /system/events/core_hw/core_hw:
Events >= 1 (INFORMATION) Goto TEXTLOG; file=/var/opt/resmon/log/event.log
Events >= 3 (MAJOR WARNING) Goto SYSLOG
Events >= 3 (MAJOR WARNING) Goto EMAIL; addr=root
Events >= 1 (INFORMATION) Goto TEXTLOG; file=/var/opt/resmon/log/rst.log
Client Configuration File: /var/stm/config/tools/monitor/rst_dm_core_hw.clcfg
Events >= 1 (INFORMATION) Goto TCP; host=localhost port=1402
Comment: RST Request
Client Configuration File: /var/stm/config/tools/monitor/rst_dm_core_hw.clcfg
Events >= ?? Goto TCP; host=localhost port=49163
Client Configuration File: /var/stm/config/tools/monitor/wbem_dm_core_hw.clcfg
Events >= 4 (SERIOUS) Goto TCP; host=bbacist2 port=49375

>/connectivity/events/hubs/FC_hub ... NOT MONITORING.
(Possibly there is no hardware to monitor.)

>/connectivity/events/switches/FC_switch ... NOT MONITORING.
(Possibly there is no hardware to monitor.)

>/adapters/events/iscsi_adapter ... NOT MONITORING.
(Possibly there is no hardware to monitor.)

>/system/events/memory ... OK.
For /system/events/memory/8:
Events >= 1 (INFORMATION) Goto TEXTLOG; file=/var/opt/resmon/log/event.log
Events >= 3 (MAJOR WARNING) Goto SYSLOG
Events >= 3 (MAJOR WARNING) Goto EMAIL; addr=root
Events >= 1 (INFORMATION) Goto TEXTLOG; file=/var/opt/resmon/log/rst.log
Client Configuration File: /var/stm/config/tools/monitor/rst_dm_memory.clcfg
Events >= 1 (INFORMATION) Goto TCP; host=localhost port=1402
Comment: RST Request
Client Configuration File: /var/stm/config/tools/monitor/rst_dm_memory.clcfg
Events >= ?? Goto TCP; host=localhost port=49163
Client Configuration File: /var/stm/config/tools/monitor/wbem_dm_memory.clcfg
Events >= 4 (SERIOUS) Goto TCP; host=bbacist2 port=49375

>/adapters/events/ql_adapter ... NOT MONITORING.
(Possibly there is no hardware to monitor.)

>/adapters/events/raid_adapter ... OK.
For /adapters/events/raid_adapter/0_2_1_0_4_0:
Events >= 1 (INFORMATION) Goto TEXTLOG; file=/var/opt/resmon/log/event.log
Events >= 3 (MAJOR WARNING) Goto SYSLOG
Events >= 3 (MAJOR WARNING) Goto EMAIL; addr=root
Events >= 4 (SERIOUS) Goto TCP; host=bbacist2 port=49375

>/storage/events/enclosures/ses_enclosure ... NOT MONITORING.
(Possibly there is no hardware to monitor.)

>/system/events/ups ... NOT MONITORING.
(Possibly there is no hardware to monitor.)

>/storage/events/disk_arrays/FC60 ... NOT MONITORING.
(Possibly there is no hardware to monitor.)

>/system/events/ipmi_fpl ... OK.
For /system/events/ipmi_fpl/ipmi_fpl:
Events >= 1 (INFORMATION) Goto TEXTLOG; file=/var/opt/resmon/log/event.log
Events >= 3 (MAJOR WARNING) Goto SYSLOG
Events >= 3 (MAJOR WARNING) Goto EMAIL; addr=root
Events >= 1 (INFORMATION) Goto TEXTLOG; file=/var/opt/resmon/log/rst.log
Client Configuration File: /var/stm/config/tools/monitor/rst_fpl_em.clcfg
Events >= 1 (INFORMATION) Goto TCP; host=localhost port=1402
Comment: RST Request
Client Configuration File: /var/stm/config/tools/monitor/rst_fpl_em.clcfg
Events >= ?? Goto TCP; host=localhost port=49163
Client Configuration File: /var/stm/config/tools/monitor/default_fpl_em.clcfg

>/storage/events/disk_arrays/FW_SCSI ... NOT MONITORING.
(Possibly there is no hardware to monitor.)

>/system/events/ia64_corehw ... OK.
For /system/events/ia64_corehw/core_hw:
Events >= 1 (INFORMATION) Goto TEXTLOG; file=/var/opt/resmon/log/event.log
Events >= 3 (MAJOR WARNING) Goto SYSLOG
Events >= 3 (MAJOR WARNING) Goto EMAIL; addr=root
Events >= 1 (INFORMATION) Goto TEXTLOG; file=/var/opt/resmon/log/rst.log
Client Configuration File: /var/stm/config/tools/monitor/rst_ia64_corehw.clcfg
Events >= 1 (INFORMATION) Goto TCP; host=localhost port=1402
Comment: RST Request
Client Configuration File: /var/stm/config/tools/monitor/rst_ia64_corehw.clcfg
Events >= ?? Goto TCP; host=localhost port=49163
Client Configuration File: /var/stm/config/tools/monitor/wbem_ia64_corehw.clcfg

>/system/events/cpu/lpmc ... OK.
For /system/events/cpu/lpmc/cache_errors:
Events >= 1 (INFORMATION) Goto TEXTLOG; file=/var/opt/resmon/log/rst.log
Client Configuration File: /var/stm/config/tools/monitor/rst_lpmc_em.clcfg
Events >= 1 (INFORMATION) Goto TCP; host=localhost port=1402
Comment: RST Request
Client Configuration File: /var/stm/config/tools/monitor/rst_lpmc_em.clcfg
Events >= 1 (INFORMATION) Goto TEXTLOG; file=/var/opt/resmon/log/event.log
Events >= 3 (MAJOR WARNING) Goto SYSLOG
Events >= 3 (MAJOR WARNING) Goto EMAIL; addr=root
Events >= ?? Goto TCP; host=localhost port=49163
Client Configuration File: /var/stm/config/tools/monitor/wbem_lpmc_em.clcfg
Events >= 4 (SERIOUS) Goto TCP; host=bbacist2 port=49375

>/system/events/system_status ... OK.
For /system/events/system_status/ui_host/bbacist2:
Events >= 1 (INFORMATION) Goto TEXTLOG; file=/var/opt/resmon/log/rst.log
Client Configuration File: /var/stm/config/tools/monitor/rst_sysstat_em.clcfg
Events >= 1 (INFORMATION) Goto TCP; host=localhost port=1402
Comment: RST Request
Client Configuration File: /var/stm/config/tools/monitor/rst_sysstat_em.clcfg
Events >= 1 (INFORMATION) Goto TEXTLOG; file=/var/opt/resmon/log/event.log
Events >= 3 (MAJOR WARNING) Goto SYSLOG
Events >= 3 (MAJOR WARNING) Goto EMAIL; addr=root

>/storage/events/disk_arrays/MSA1000 ... NOT MONITORING.
(Possibly there is no hardware to monitor.)

>/adapters/events/FC_adapter ... NOT MONITORING.
(Possibly there is no hardware to monitor.)

Regards

Roger

Keep it simple

Duncan Edmonstone · ‎01-14-2008

Roger,

From the info here it looks like EMS is already monitoring and sending e-mails to root for a MAJOR WARNING and anything more serious.

>/adapters/events/raid_adapter ... OK.
For /adapters/events/raid_adapter/0_2_1_0_4_0:
Events >= 1 (INFORMATION) Goto TEXTLOG; file=/var/opt/resmon/log/event.log
Events >= 3 (MAJOR WARNING) Goto SYSLOG
Events >= 3 (MAJOR WARNING) Goto EMAIL; addr=root
Events >= 4 (SERIOUS) Goto TCP; host=bbacist2 port=49375
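
To make it easier to confirm which resources already have an e-mail request, the (C)heck output above can be filtered. The following is only a sketch: the function name list_email_requests is hypothetical, and it assumes the output keeps the "For <resource>:" and "Events ... Goto EMAIL; addr=..." layout seen in this thread.

```shell
#!/bin/sh
# Summarise EMAIL notification requests from monconfig (C)heck output on stdin.
# Assumption: the layout matches the output posted in this thread.
list_email_requests() {
    awk '
        # "For <resource>:" starts the request list of one resource instance.
        /^[ \t]*For \// {
            res = $2
            sub(/:$/, "", res)      # drop the trailing colon
            next
        }
        # Example: "Events >= 3 (MAJOR WARNING) Goto EMAIL; addr=root"
        /Goto EMAIL/ {
            addr = $0
            sub(/.*addr=/, "", addr)    # keep only the address
            # Print: resource, comparison operator, severity number, address
            print res, $2, $3, addr
        }
    '
}
```

It could be fed from the same monconfig invocation used earlier in the thread, for example: printf 'C\nQ\n' | /etc/opt/resmon/lbin/monconfig | list_email_requests. A line for /adapters/events/raid_adapter/0_2_1_0_4_0 in the result means an e-mail request exists for the RAID adapter.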

HTH

Duncan

I am an HPE Employee

ALCS · ‎01-14-2008

Duncan,

Do you mean that, with the current EMS configuration, an e-mail will be sent if we have a problem with the RAID card or with any disk in the MSA30?
NB: I could monitor the internal disks through EMS, but I could not find entries corresponding to the MSA30 disks.

Regards

Roger

Keep it simple

Duncan Edmonstone · ‎01-14-2008

Roger,

Yes, I do mean that. The OS and EMS don't 'know' about the actual physical disks that make up the MSA30, as they don't actually see them. What EMS cares about is the status of the RAID controller. The RAID controller itself is responsible for flagging issues with the physical disks and RAID volumes hung off it; these are then passed on to EMS and raised as events, as detailed in the link I sent last week.

The real way to test this is to pull a disk and see what happens. I appreciate you might not be in a position to do that right now, but that's the only test that would ultimately satisfy me that EMS is functioning correctly.
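
When you do run that test, one rough way to see whether anything was logged is to count the entries for the RAID resource in the EMS text log before and after pulling the disk. This is a sketch only: count_resource_events is a hypothetical helper, and it assumes that entries in /var/opt/resmon/log/event.log mention the resource path that raised the event.

```shell
#!/bin/sh
# Count lines in an EMS text log that mention a given resource.
# Assumption: log entries name the resource path that raised the event.
count_resource_events() {
    log=$1       # e.g. /var/opt/resmon/log/event.log
    resource=$2  # e.g. /adapters/events/raid_adapter
    if [ ! -r "$log" ]; then
        echo "cannot read $log" >&2
        return 1
    fi
    # grep -c prints the number of matching lines; -F matches the path literally.
    # grep exits non-zero when the count is 0, so that status is ignored here.
    grep -c -F -- "$resource" "$log" || :
}
```

For example, run count_resource_events /var/opt/resmon/log/event.log /adapters/events/raid_adapter before the test and again a few minutes after it. If the count has gone up, the controller's event reached EMS; the e-mail to root and the syslog entry should then follow from the requests already configured.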

HTH

Duncan

I am an HPE Employee

Roro_2 · ‎01-15-2008

Hi Duncan,

I appreciate your cooperation.
I will try to simulate the issue using another server (not the customer's server).

Regards

Roger

ALCS · ‎01-15-2008

Hi Duncan,

Just to let you know that we simulated a drive failure and we got a notification.

Many thanks for your great help.

Roger

Keep it simple
