
HP9000/Smart Array 6402 Event Notification

 
SOLVED
ALCS
Regular Advisor


Dear all,

We need a simple way to get a notification when a disk fails behind a Smart Array 6402 controller on an RP3440 running HP-UX 11i v2.

We would very much appreciate any feedback.

Regards,
Farid
Keep it simple
18 REPLIES

Re: HP9000/Smart Array 6402 Event Notification

Farid,

The standard EMS Event Monitors should already be monitoring the SA6402 controller. If you get a disk failure I would expect a message in syslog and an email to root as a minimum.

If you want to change this in some way, you'll need to use the monconfig command.

man monconfig

for more info.

HTH

Duncan

I am an HPE Employee
ALCS
Regular Advisor

Re: HP9000/Smart Array 6402 Event Notification

Thanks Duncan,

EMS was implemented, but it detected only the internal disk failures, not failures of the disks attached to the SA (probably because they form a hardware RAID 1, so the OS does not see them).

Regards,
Farid

Keep it simple

Re: HP9000/Smart Array 6402 Event Notification

Farid,

EMS should be passed info by the RAID controller. Here are the events which can be generated for a RAID controller monitor:

http://docs.hp.com/en/diag/ems/dm_raid_adapter.htm

You will see disk failure etc. listed as potential events.

Check using monconfig that you are actually monitoring for RAID controller events.

HTH

Duncan

I am an HPE Employee
ALCS
Regular Advisor

Re: HP9000/Smart Array 6402 Event Notification

Thanks Duncan,

Through EMS we could apply a condition (for example, DOWN) to the internal disks, so that an email is generated using sendmail, but there are no such entries for the SA and its attached disks.

In the SA case, where can the events you mentioned be found, and how can we trigger an email notification for an event (e.g. a disk failure)?

Thanks again,

Farid



Keep it simple

Re: HP9000/Smart Array 6402 Event Notification

Farid,

Let's check that EMS can see your RAID card.

Run the following on your system:

resls /adapters/events

resls /adapters/events/raid_adapter

What output do you get?

HTH

Duncan

I am an HPE Employee
ALCS
Regular Advisor

Re: HP9000/Smart Array 6402 Event Notification

Thanks a lot Duncan,
I will give you feedback on Monday.

Best regards,
Farid
Keep it simple
ALCS
Regular Advisor

Re: HP9000/Smart Array 6402 Event Notification

Hi Duncan,
Please find below the output of some EMS commands:
# resls /adapters/events
Contacting Registrar on bbacist2
NAME: /adapters/events
DESCRIPTION: Events for Adapter Resources
TYPE: /adapters/events is a Resource Class.

There are 4 resources configured below /adapters/events:
Resource Class
/adapters/events/raid_adapter
/adapters/events/TL_adapter
/adapters/events/ql_adapter
/adapters/events/iscsi_adapter

# resls /adapters/events/raid_adapter
Contacting Registrar on bbacist2
NAME: /adapters/events/raid_adapter
DESCRIPTION: RAID Smart Array (SA) Controller Event Monitor
This resource monitors events for RAID SA Controller products. The monitor can be configured to report events directly using the Monitoring Request
Manager, or as a change in device status using the Event Monitoring
Service (EMS).
For more information, see the monitor man page, dm_raid_adapter(1m)
TYPE: /adapters/events/raid_adapter is a Resource Class.
There is one resource configured below /adapters/events/raid_adapter:
Resource Class
/adapters/events/raid_adapter/0_2_1_0_4_0

# resls /adapter/status/raid_adapter
Contacting Registrar on bbacist2
NAME: /adapter/status/raid_adapter
DESCRIPTION: None
TYPE: /adapter/status/raid_adapter is not a valid resource name

# resls /adapters/status/raid_adapter
Contacting Registrar on bbacist2
NAME: /adapters/status/raid_adapter
DESCRIPTION: RAID Smart Array (SA) Controller Event Monitor
This resource monitors events for RAID SA Controller products. The monitor can be configured to report events directly using the Monitoring Request
Manager, or as a change in device status using the Event Monitoring
Service (EMS).
For more information, see the monitor man page, dm_raid_adapter(1m)
TYPE: /adapters/status/raid_adapter is a Resource Class.
There is one resource configured below /adapters/status/raid_adapter:
Resource Class
/adapters/status/raid_adapter/0_2_1_0_4_0

# resls /storage/status/disks/default
Contacting Registrar on bbacist2
NAME: /storage/status/disks/default
DESCRIPTION: Disk Event Monitor
This resource monitors events for stand-alone disk drives. Event
monitoring requests are created using the Monitoring Request
Manager. Monitoring requests to detect changes in device status are
created using the Peripheral Status Monitor (psmmon(1m)) and Event
Monitoring Service (EMS).
For more information see the monitor man page, (disk_em(1m)).
TYPE: /storage/status/disks/default is a Resource Class.
There are 2 resources configured below /storage/status/disks/default:
Resource Class
/storage/status/disks/default/0_1_1_0.0.0
/storage/status/disks/default/0_1_1_0.1.0

# resls /storage/status/disk_array
Contacting Registrar on bbacist2
NAME: /storage/status/disk_array
DESCRIPTION: None
TYPE: /storage/status/disk_array is not a valid resource name
#
/adapters/events/TL_adapter



Roger

Systems Engineer


Keep it simple
Steven E. Protter
Exalted Contributor

Re: HP9000/Smart Array 6402 Event Notification

Shalom Farid,

Based on the data you posted, I reach the following conclusions:

1) EMS can see the array controller.
2) EMS cannot fully monitor the disks on the controller.
3) Fixing this will require HP to update EMS and possibly hardware enablement on the system.
4) You might be able to write your own monitor script that greps through the raw output and lets you know via a normal email script if there is a problem. I would expect this output:

Resource Class
/storage/status/disks/default/0_1_1_0.0.0
/storage/status/disks/default/0_1_1_0.1.0

to be different if one of the disks failed.

Therefore the problem can possibly be solved by scripting.

You may find this thread with embedded script useful in getting your results out with email.

http://www.hpux.ws/?p=7
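
The scripting idea can be sketched as a comparison of a saved baseline listing against a fresh one. This is only a minimal sketch: the thread does not establish that the resls listing actually changes when a disk fails, and the function name output_changed, the file paths and the mail recipient in the comments are all hypothetical.

```shell
# output_changed BASELINE CURRENT
# Returns 0 (success) when the two saved listings differ,
# and 1 when they are byte-for-byte identical.
output_changed() {
    if cmp -s "$1" "$2"; then
        return 1
    else
        return 0
    fi
}

# Example use from cron (hypothetical paths and recipient):
#   resls /storage/status/disks/default > /tmp/disks.now
#   output_changed /var/tmp/disks.baseline /tmp/disks.now &&
#       mailx -s "disk listing changed" root < /tmp/disks.now
```

The baseline would have to be captured while everything is known to be healthy, and refreshed after any planned hardware change.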

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
ALCS
Regular Advisor

Re: HP9000/Smart Array 6402 Event Notification

Hi Steven,

Thanks a lot for your reply.
Could you tell me how I should use the script to get the notification via e-mail?

Roger
Keep it simple
Solution

Re: HP9000/Smart Array 6402 Event Notification

Farid,

I'm not entirely sure how SEP reached the conclusion he did on this. I see nothing conclusive in the output to suggest it.

The two disks seen don't appear to be on the RAID controller (they look like system disks), and I'm not totally sure that a RAID LUN would be seen under /storage/status/disks/default.

We do see that EMS sees the RAID adapter. What's the output of:

resls /adapters/status/raid_adapter/0_2_1_0_4_0

HTH

Duncan

I am an HPE Employee
ALCS
Regular Advisor

Re: HP9000/Smart Array 6402 Event Notification

Hi Duncan,

1- We have 2 internal disks in the server, connected to the internal controller.
2- Through EMS, I already created notification events (resources) on the status of these disks; they worked fine.
3- Through EMS I could not detect the disks connected to the MSA storage, only the controller.
4- Find below the output of the following command:

# resls /adapters/status/raid_adapter/0_2_1_0_4_0
Contacting Registrar on bbacist2

NAME: /adapters/status/raid_adapter/0_2_1_0_4_0
DESCRIPTION: UP:
A state of "UP" means the hardware is available for use.

DOWN:
A state of "DOWN" means the hardware is unavailable for use.
The /opt/resmon/bin/set_fixed command must be used with the resource name once the hardware is fixed to reset the hardware to the "UP" state.

UNKNOWN:
A state of "UNKNOWN" means it is unknown whether the hardware
is unavailable for use or not. The hardware should be considered DOWN.

TYPE: /adapters/status/raid_adapter/0_2_1_0_4_0 is a Resource Instance
whose values are enumerated values.

There are 3 valid states reported for /adapters/status/raid_adapter/0_2_1_0_4_0:

State Value State Name
----------- ----------
0 UP
1 DOWN
2 UNKNOWN

There is 1 active monitor request reported for this resource.

The Monitor Request ID in DECIMAL: 1401225218
The (PID.Request#) is 21381.2
Operation is RM_NOT_EQUAL_TO 0 with no notification options set
Polling Interval is 60 seconds


Regards

Roger
Keep it simple

Re: HP9000/Smart Array 6402 Event Notification

Roger,

OK, one last thing to check, then we should be able to create a monitor for the RAID card.

Gather the output of the following command:

/etc/opt/resmon/lbin/monconfig << EOF
C
Q
EOF


It looks like no notification options are set for the RAID adapter, but it would be nice to see exactly what *is* set before we change anything.

HTH

Duncan

I am an HPE Employee
ALCS
Regular Advisor

Re: HP9000/Smart Array 6402 Event Notification

Hi Duncan,

1- We need to create a monitor for the RAID card and for the disks in the MSA30 storage.
2- Find below the output that you requested:

# /etc/opt/resmon/lbin/monconfig

=======================================================================
=================== Event Monitoring Service ===================
Enter selection: [s] Q
=======================================================================
EVENT MONITORING IS CURRENTLY ENABLED.
EMS Version : A.04.20
STM Version : C.54.00
=======================================================================
============== Monitoring Request Manager Main Menu ==============
======================================================================
Note: Monitoring requests let you specify the events for monitors
to report and the notification methods to use.

Select:
(S)how monitoring requests configured via monconfig
(C)heck detailed monitoring status
(L)ist descriptions of available monitors
(A)dd a monitoring request
(D)elete a monitoring request
(M)odify an existing monitoring request
(E)nable Monitoring
(K)ill (disable) monitoring
(H)elp
(Q)uit
Enter selection: [s] C

=======================================================================
============== Check Outstanding Monitoring Requests ==============
=======================================================================
>/StorageAreaNetwork/events/SAN_Monitor ... NOT MONITORING.
(Possibly there is no hardware to monitor.)

>/storage/events/disks/default ... OK.
For /storage/events/disks/default/0_1_1_0.0.0:
Events >= ?? Goto TCP; host=localhost port=49163
Client Configuration File: /var/stm/config/tools/monitor/wbem_disk_em.clcfg
Events >= 1 (INFORMATION) Goto TEXTLOG; file=/var/opt/resmon/log/event.log
Events >= 3 (MAJOR WARNING) Goto SYSLOG
Events >= 3 (MAJOR WARNING) Goto EMAIL; addr=root
Events >= 1 (INFORMATION) Goto TEXTLOG; file=/var/opt/resmon/log/rst.log
Client Configuration File: /var/stm/config/tools/monitor/rst_disk_em.clcfg
Events >= 1 (INFORMATION) Goto TCP; host=localhost port=1402
Comment: RST Request
Client Configuration File: /var/stm/config/tools/monitor/rst_disk_em.clcfg
Events = 5 (CRITICAL) Goto TCP; host=bbacist2 port=49375
For /storage/events/disks/default/0_1_1_0.1.0:
Events >= ?? Goto TCP; host=localhost port=49163
Client Configuration File: /var/stm/config/tools/monitor/wbem_disk_em.clcfg
Events >= 1 (INFORMATION) Goto TEXTLOG; file=/var/opt/resmon/log/event.log
Events >= 3 (MAJOR WARNING) Goto SYSLOG
Events >= 3 (MAJOR WARNING) Goto EMAIL; addr=root
Events >= 1 (INFORMATION) Goto TEXTLOG; file=/var/opt/resmon/log/rst.log
Client Configuration File: /var/stm/config/tools/monitor/rst_disk_em.clcfg
Events >= 1 (INFORMATION) Goto TCP; host=localhost port=1402
Comment: RST Request
Client Configuration File: /var/stm/config/tools/monitor/rst_disk_em.clcfg
Events = 5 (CRITICAL) Goto TCP; host=bbacist2 port=49375

>/adapters/events/TL_adapter ... NOT MONITORING.
(Possibly there is no hardware to monitor.)

>/system/events/chassis ... NOT MONITORING.
(Possibly there is no hardware to monitor.)

>/system/events/core_hw ... OK.
For /system/events/core_hw/core_hw:
Events >= 1 (INFORMATION) Goto TEXTLOG; file=/var/opt/resmon/log/event.log
Events >= 3 (MAJOR WARNING) Goto SYSLOG
Events >= 3 (MAJOR WARNING) Goto EMAIL; addr=root
Events >= 1 (INFORMATION) Goto TEXTLOG; file=/var/opt/resmon/log/rst.log
Client Configuration File: /var/stm/config/tools/monitor/rst_dm_core_hw.clcfg
Events >= 1 (INFORMATION) Goto TCP; host=localhost port=1402
Comment: RST Request
Client Configuration File: /var/stm/config/tools/monitor/rst_dm_core_hw.clcfg
Events >= ?? Goto TCP; host=localhost port=49163
Client Configuration File: /var/stm/config/tools/monitor/wbem_dm_core_hw.clcfg
Events >= 4 (SERIOUS) Goto TCP; host=bbacist2 port=49375

>/connectivity/events/hubs/FC_hub ... NOT MONITORING.
(Possibly there is no hardware to monitor.)

>/connectivity/events/switches/FC_switch ... NOT MONITORING.
(Possibly there is no hardware to monitor.)

>/adapters/events/iscsi_adapter ... NOT MONITORING.
(Possibly there is no hardware to monitor.)

>/system/events/memory ... OK.
For /system/events/memory/8:
Events >= 1 (INFORMATION) Goto TEXTLOG; file=/var/opt/resmon/log/event.log
Events >= 3 (MAJOR WARNING) Goto SYSLOG
Events >= 3 (MAJOR WARNING) Goto EMAIL; addr=root
Events >= 1 (INFORMATION) Goto TEXTLOG; file=/var/opt/resmon/log/rst.log
Client Configuration File: /var/stm/config/tools/monitor/rst_dm_memory.clcfg
Events >= 1 (INFORMATION) Goto TCP; host=localhost port=1402
Comment: RST Request
Client Configuration File: /var/stm/config/tools/monitor/rst_dm_memory.clcfg
Events >= ?? Goto TCP; host=localhost port=49163
Client Configuration File: /var/stm/config/tools/monitor/wbem_dm_memory.clcfg
Events >= 4 (SERIOUS) Goto TCP; host=bbacist2 port=49375

>/adapters/events/ql_adapter ... NOT MONITORING.
(Possibly there is no hardware to monitor.)

>/adapters/events/raid_adapter ... OK.
For /adapters/events/raid_adapter/0_2_1_0_4_0:
Events >= 1 (INFORMATION) Goto TEXTLOG; file=/var/opt/resmon/log/event.log
Events >= 3 (MAJOR WARNING) Goto SYSLOG
Events >= 3 (MAJOR WARNING) Goto EMAIL; addr=root
Events >= 4 (SERIOUS) Goto TCP; host=bbacist2 port=49375

>/storage/events/enclosures/ses_enclosure ... NOT MONITORING.
(Possibly there is no hardware to monitor.)

>/system/events/ups ... NOT MONITORING.
(Possibly there is no hardware to monitor.)

>/storage/events/disk_arrays/FC60 ... NOT MONITORING.
(Possibly there is no hardware to monitor.)

>/system/events/ipmi_fpl ... OK.
For /system/events/ipmi_fpl/ipmi_fpl:
Events >= 1 (INFORMATION) Goto TEXTLOG; file=/var/opt/resmon/log/event.log
Events >= 3 (MAJOR WARNING) Goto SYSLOG
Events >= 3 (MAJOR WARNING) Goto EMAIL; addr=root
Events >= 1 (INFORMATION) Goto TEXTLOG; file=/var/opt/resmon/log/rst.log
Client Configuration File: /var/stm/config/tools/monitor/rst_fpl_em.clcfg
Events >= 1 (INFORMATION) Goto TCP; host=localhost port=1402
Comment: RST Request
Client Configuration File: /var/stm/config/tools/monitor/rst_fpl_em.clcfg
Events >= ?? Goto TCP; host=localhost port=49163
Client Configuration File: /var/stm/config/tools/monitor/default_fpl_em.clcfg

>/storage/events/disk_arrays/FW_SCSI ... NOT MONITORING.
(Possibly there is no hardware to monitor.)

>/system/events/ia64_corehw ... OK.
For /system/events/ia64_corehw/core_hw:
Events >= 1 (INFORMATION) Goto TEXTLOG; file=/var/opt/resmon/log/event.log
Events >= 3 (MAJOR WARNING) Goto SYSLOG
Events >= 3 (MAJOR WARNING) Goto EMAIL; addr=root
Events >= 1 (INFORMATION) Goto TEXTLOG; file=/var/opt/resmon/log/rst.log
Client Configuration File: /var/stm/config/tools/monitor/rst_ia64_corehw.clcfg
Events >= 1 (INFORMATION) Goto TCP; host=localhost port=1402
Comment: RST Request
Client Configuration File: /var/stm/config/tools/monitor/rst_ia64_corehw.clcfg
Events >= ?? Goto TCP; host=localhost port=49163
Client Configuration File: /var/stm/config/tools/monitor/wbem_ia64_corehw.clcfg

>/system/events/cpu/lpmc ... OK.
For /system/events/cpu/lpmc/cache_errors:
Events >= 1 (INFORMATION) Goto TEXTLOG; file=/var/opt/resmon/log/rst.log
Client Configuration File: /var/stm/config/tools/monitor/rst_lpmc_em.clcfg
Events >= 1 (INFORMATION) Goto TCP; host=localhost port=1402
Comment: RST Request
Client Configuration File: /var/stm/config/tools/monitor/rst_lpmc_em.clcfg
Events >= 1 (INFORMATION) Goto TEXTLOG; file=/var/opt/resmon/log/event.log
Events >= 3 (MAJOR WARNING) Goto SYSLOG
Events >= 3 (MAJOR WARNING) Goto EMAIL; addr=root
Events >= ?? Goto TCP; host=localhost port=49163
Client Configuration File: /var/stm/config/tools/monitor/wbem_lpmc_em.clcfg
Events >= 4 (SERIOUS) Goto TCP; host=bbacist2 port=49375

>/system/events/system_status ... OK.
For /system/events/system_status/ui_host/bbacist2:
Events >= 1 (INFORMATION) Goto TEXTLOG; file=/var/opt/resmon/log/rst.log
Client Configuration File: /var/stm/config/tools/monitor/rst_sysstat_em.clcfg
Events >= 1 (INFORMATION) Goto TCP; host=localhost port=1402
Comment: RST Request
Client Configuration File: /var/stm/config/tools/monitor/rst_sysstat_em.clcfg
Events >= 1 (INFORMATION) Goto TEXTLOG; file=/var/opt/resmon/log/event.log
Events >= 3 (MAJOR WARNING) Goto SYSLOG
Events >= 3 (MAJOR WARNING) Goto EMAIL; addr=root

>/storage/events/disk_arrays/MSA1000 ... NOT MONITORING.
(Possibly there is no hardware to monitor.)

>/adapters/events/FC_adapter ... NOT MONITORING.
(Possibly there is no hardware to monitor.)


Regards

Roger
Keep it simple

Re: HP9000/Smart Array 6402 Event Notification

Roger,

From the info here it looks like EMS is already monitoring and sending e-mails to root for a MAJOR WARNING and anything more serious.

>/adapters/events/raid_adapter ... OK.
For /adapters/events/raid_adapter/0_2_1_0_4_0:
Events >= 1 (INFORMATION) Goto TEXTLOG; file=/var/opt/resmon/log/event.log
Events >= 3 (MAJOR WARNING) Goto SYSLOG
Events >= 3 (MAJOR WARNING) Goto EMAIL; addr=root
Events >= 4 (SERIOUS) Goto TCP; host=bbacist2 port=49375
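
To check this for any one resource without reading the whole listing, a small filter can pull the notification lines out of saved monconfig output. This is a minimal sketch, assuming the (C)heck screen was saved to a file and keeps the "For <resource>:" / "Events ..." layout shown above. The function name ems_targets and the file name in the comment are hypothetical.

```shell
# ems_targets RESOURCE
# Reads saved monconfig "(C)heck detailed monitoring status" output on stdin
# and prints the "Events ..." lines listed under "For RESOURCE:".
ems_targets() {
    awk -v res="$1" '
        # A "For ...:" line starts a block; keep it only if it names res.
        /^[ \t]*For / { inblock = (index($0, "For " res ":") > 0); next }
        # A line starting with ">" begins the next monitor class.
        /^[ \t]*>/    { inblock = 0 }
        # Inside the wanted block, print each notification line, left-trimmed.
        inblock && /^[ \t]*Events / { sub(/^[ \t]+/, ""); print }
    '
}

# Example use (hypothetical file name):
#   ems_targets /adapters/events/raid_adapter/0_2_1_0_4_0 < monconfig_check.txt
```

The resource name is matched as a literal string rather than a pattern, so the dots in disk paths such as 0_1_1_0.0.0 are not treated as wildcards.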


HTH

Duncan

I am an HPE Employee
ALCS
Regular Advisor

Re: HP9000/Smart Array 6402 Event Notification

Duncan,

Do you mean that, with the current EMS configuration, an e-mail will be sent if there is a problem with the RAID card or with any disk in the MSA30?
NB: I could monitor the internal disks through EMS, but I could not find entries corresponding to the MSA30 disks.


Regards

Roger

Keep it simple

Re: HP9000/Smart Array 6402 Event Notification

Roger,

Yes, I do mean that. The OS and EMS don't 'know' about the actual physical disks that make up the MSA30, as they don't actually see them. What EMS cares about is the status of the RAID controller. The RAID controller itself is responsible for flagging issues with the physical disks and RAID volumes hung off it; these are then passed on to EMS and raised as events, as detailed in the link I sent last week.

The real way to test this is to pull a disk and see what happens. I appreciate you might not be in a position to do that right now, but that's the only test that would ultimately satisfy me that EMS is functioning correctly.
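
After a pull test like this, one quick check is whether the RAID adapter resource shows up in the EMS text log. This is a minimal sketch, assuming the resource path is written into each logged event; the thread does not show an event.log entry, so that is an assumption. The log path is the one listed in the monconfig output, and the function name ems_log_hits is hypothetical.

```shell
# ems_log_hits LOGFILE RESOURCE
# Prints the number of lines in LOGFILE that contain RESOURCE as a literal
# string (grep -F, so "." in the resource path is not treated as a pattern).
ems_log_hits() {
    grep -F -c -e "$2" "$1"
}

# Example use (assumes logged events mention the resource path):
#   ems_log_hits /var/opt/resmon/log/event.log \
#       /adapters/events/raid_adapter/0_2_1_0_4_0
```

Running it once before and once after the test and comparing the two counts gives a rough indication of whether the failure was logged.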

HTH

Duncan

I am an HPE Employee
Roro_2
Regular Advisor

Re: HP9000/Smart Array 6402 Event Notification

Hi Duncan,

I appreciate your cooperation.
I will try to simulate the issue using another server (not the customer's server).

Regards

Roger
ALCS
Regular Advisor

Re: HP9000/Smart Array 6402 Event Notification

Hi Duncan,

Just to inform you that we did the simulation of a drive failure, and we got a notification.

Many thanks for your great help.

Roger
Keep it simple