Operating System - HP-UX

EMS HW monitoring: how to re-enable monitoring?

 
SOLVED
Mihails Nikitins
Super Advisor

Seems to be a simple question.

A failed disk has been removed from the list of devices being monitored. How to enable monitoring again after replacement? I've already tried disable/enable commands in remsmon program.

Thanks and points in advance!
KISS - Keep It Simple Stupid
20 REPLIES
Mahesh Kumar Malik
Honored Contributor

Re: EMS HW monitoring: how to re-enable monitoring?

Hi

Use the following command:

# /etc/opt/resmon/lbin/monconfig

Select the required device, then (E)nable monitoring for it.

Regards
Mahesh


Mihails Nikitins
Super Advisor

Re: EMS HW monitoring: how to re-enable monitoring?

Global enabling does not work (the new disk is still not monitored). What do you mean by 'selecting'? Thank you.
KISS - Keep It Simple Stupid
Fabio Ettore
Honored Contributor

Re: EMS HW monitoring: how to re-enable monitoring?

Hi,

When you launch the command Mahesh suggested,

/etc/opt/resmon/lbin/monconfig

you should get an interactive menu of options, along with a status line such as

EVENT MONITORING IS CURRENTLY ENABLED.

From there, select the option you need.

Post again if you run into other problems.

Best regards,
Fabio
WISH? IMPROVEMENT!
Mihails Nikitins
Super Advisor

Re: EMS HW monitoring: how to re-enable monitoring?

Hi Fabio,

I can run '(C)heck detailed monitoring status'. I see all the devices except the replaced disk. My question is how to re-enable its monitoring in the most graceful way. Thank you.
KISS - Keep It Simple Stupid
Borislav Perkov
Respected Contributor

Re: EMS HW monitoring: how to re-enable monitoring?

Hi Mihails,

Can you see the disk with the ioscan command?

Regards,
Borislav
Mihails Nikitins
Super Advisor

Re: EMS HW monitoring: how to re-enable monitoring?

Borislav,

ioscan shows the disk as 'CLAIMED', and the disk is in use at the LVM level. Thank you.
KISS - Keep It Simple Stupid
Borislav Perkov
Respected Contributor

Re: EMS HW monitoring: how to re-enable monitoring?

Can you post the output of this command?

/opt/resmon/bin/resls /storage/events/disks/default

Mihails Nikitins
Super Advisor

Re: EMS HW monitoring: how to re-enable monitoring?

All disks except the problem one are listed. The entries look like this:

/storage/events/disks/default/0_6_0_0.1.0
KISS - Keep It Simple Stupid
Borislav Perkov
Respected Contributor

Re: EMS HW monitoring: how to re-enable monitoring?

Strange.

You could try configuring a request from the GUI: in SAM, go to Resource Monitoring -> EMS, add a request under /storage/events/disks/default/ and see whether the disk shows up there.

Have you tried restarting EMS?
If not, enter K and then E in the monconfig main menu. Maybe that will help.
Mihails Nikitins
Super Advisor

Re: EMS HW monitoring: how to re-enable monitoring?

Hi,

Anyway, the case is still open.

The History

1. A disk in shared storage failed.
2. EMS HW monitoring removed the failed device from monitoring after 24 hours.
3. The failed disk was replaced and returned to normal use. ioscan and vgdisplay show it normally.
4. The disk is not included in the moncheck command output on _both_ cluster nodes.
5. I tried the following things:
- Restarting EMS from the monconfig utility (Kill/Enable)
- Upgrading to the newest available software:
OnlineDiag B.11.11.15.13 HPUX 11.11 Support Tools Bundle, Dec 2004
- Rebooting the system

Note that I don't use SAM for configuring EMS. I see an empty list in 'Current Monitoring Requests' in SAM on my HP-UX systems.

Thanks and points in advance for more relevant ideas!

BR,
Mihails
KISS - Keep It Simple Stupid
Ermin Borovac
Honored Contributor

Re: EMS HW monitoring: how to re-enable monitoring?

Does the missing disk appear in the output of set_fixed?

# /opt/resmon/bin/set_fixed -l
Emanuele_4
Regular Advisor

Re: EMS HW monitoring: how to re-enable monitoring?

I think you should enable monitoring on this new disk.

Try this:

/etc/opt/resmon/lbin/set_fixed -l

You should see that the new disk is NOT monitored.

Then run:

/etc/opt/resmon/lbin/set_fixed -n /storage/status/disks/default/6_0_1_0_0.2.0

where, obviously, you should substitute the hardware path of your own disk.
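
Judging by the resource names quoted in this thread, the last component of the resource name looks like the ioscan hardware path with each "/" replaced by "_" (for example 0/6/0/0.2.0 and 0_6_0_0.2.0). A minimal sketch of that substitution follows. The function name hwpath_to_resource is made up for illustration, and the mapping rule is an assumption inferred from the examples here, not documented behaviour.

```shell
# Build the EMS disk status resource name from an ioscan hardware path.
# ASSUMPTION: the instance name is the hardware path with "/" turned into "_",
# as the names quoted in this thread suggest.
# hwpath_to_resource is a hypothetical helper, not part of EMS.
hwpath_to_resource() {
    # $1: hardware path as printed by ioscan, e.g. 0/6/0/0.2.0
    printf '/storage/status/disks/default/%s\n' "$(printf '%s' "$1" | tr '/' '_')"
}

# Example:
#   hwpath_to_resource 0/6/0/0.2.0
#   prints /storage/status/disks/default/0_6_0_0.2.0
```

The result can then be passed to set_fixed -n in place of the example path above.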

Hope it can help!
Mihails Nikitins
Super Advisor

Re: EMS HW monitoring: how to re-enable monitoring?

Hi,

Many thanks for the 'set_fixed' command. Unfortunately, it has not helped yet.

The problem disk does not appear in the output of '/opt/resmon/bin/set_fixed -l'. All other resources are marked 'UP', like this:

...
/storage/status/disks/default/0_6_0_0.0.0 UP TRUE
/storage/status/disks/default/0_6_0_0.1.0 UP TRUE
/storage/status/disks/default/0_6_0_0.3.0 UP TRUE
...
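
A listing like the one above can be checked for one particular resource without reading it by eye. This is only a sketch: resource_listed is a made-up helper name, and it assumes the resource name is the first whitespace-separated field on each line, as in the sample lines shown.

```shell
# Succeeds (exit 0) if the given resource name appears in "set_fixed -l"
# output read from standard input, fails (exit 1) otherwise.
# resource_listed is a hypothetical helper, not part of EMS.
# ASSUMPTION: the resource name is the first field of each listing line.
resource_listed() {
    # $1: full resource name. The comparison is exact, so 0_6_0_0.1.0
    # does not accidentally match a longer name such as 0_6_0_0.1.0.1.
    awk -v res="$1" '$1 == res { found = 1 } END { exit found ? 0 : 1 }'
}

# Usage on a live system (command path as used in this post):
#   /opt/resmon/bin/set_fixed -l \
#       | resource_listed /storage/status/disks/default/0_6_0_0.2.0 \
#       || echo "disk is not in the list"
```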

'/opt/resmon/bin/set_fixed -L' replies
No resources set to DOWN state.

In this case the set_fixed -n command cannot work:

/etc/opt/resmon/lbin/set_fixed -n /storage/status/disks/default/0_6_0_0.2.0
/etc/opt/resmon/lbin/set_fixed: No matching resource names found to set to UP state.

On the other hand, the disk exists and there are several disks with the same part number.

disk 13 0/6/0/0.2.0 sdisk CLAIMED DEVICE SEAGATE ST39173WC
/dev/dsk/c5t2d0 /dev/rdsk/c5t2d0

Thanks and points for more help!

BR,
Mihails
KISS - Keep It Simple Stupid
Emanuele_4
Regular Advisor

Re: EMS HW monitoring: how to re-enable monitoring?

Can you look at

/etc/opt/resmon/log

Do you see anything unusual?

Is the disk correctly configured in LVM?

Is everything OK with pvcreate, vgdisplay, lvdisplay, ioscan, mirroring, diskinfo, dd, and so on?
Andrew Merritt_2
Honored Contributor

Re: EMS HW monitoring: how to re-enable monitoring?

Hi Mihails,
I don't have any answers yet, but several questions ;-)

If you run STM, do you see the disk listed there? If not, try running the 'map' command; does this make any difference?

If you can see the disk in STM, run the info tool and see what the firmware level is - does it begin with 'HP'?

Are the other disks with the same part number being monitored OK?

Is there anything in the /var/stm/data/tools/monitor/disabled_instances file?

When you say the disk was removed from the list of devices being monitored, what exactly do you mean? Which list? Do you mean it was removed automatically, or by hand?

Is anything logged by disk_em in the api.log or client.log files in /var/opt/resmon/log?

Is anything shown relating to this disk in the maplog in STM ('Map Log' in the System menu in mstm, or 'ml' command in 'cstm')?

Andrew
Patrick Wallek
Honored Contributor

Re: EMS HW monitoring: how to re-enable monitoring?

It sounds like you might need to rebuild the /var/stm/data/psm_data file.

1) Stop EMS monitoring -- Go into monconfig and select K to stop EMS.

2) Verify that psmctd is not running. -- 'ps -ef |grep psmctd' -- If it is still there, stop diagnostics -- /sbin/init.d/diagnostic stop -- Check psmctd again, if it is still running kill with 'kill -9'

3) Move the psm_data file somewhere else -- mv /var/stm/data/psm_data /tmp/psm_data

4) If you stopped diagnostics in step 2, restart now with '/sbin/init.d/diagnostic start', if you did not stop them go to step 5.

5) Restart EMS -- Go into monconfig and choose E to start. It may take a while to rebuild the psm_data file and 'set_fixed -l' may return an error. Once the process is finished 'set_fixed -l' should return all hardware.
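
One note on step 2: 'ps -ef |grep psmctd' can list the grep process itself, which makes it look as if psmctd were still running. A common workaround is to put one letter of the pattern in brackets, so the pattern no longer matches its own command line. The sketch below wraps that in a small function. psmctd_running is a made-up name, and it only inspects ps output; it does not stop or kill anything.

```shell
# Succeeds (exit 0) if psmctd appears in "ps -ef" output read from standard
# input, fails (exit 1) otherwise.
# psmctd_running is a hypothetical helper name.
# The regular expression [p]smctd matches the text "psmctd" but not the
# literal text "[p]smctd", so the grep process running this pattern
# does not match itself in the ps listing.
psmctd_running() {
    grep '[p]smctd' >/dev/null
}

# Usage for step 2:
#   if ps -ef | psmctd_running; then
#       echo "psmctd is still running"
#   else
#       echo "psmctd is stopped, psm_data can be moved"
#   fi
```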
Andrew Merritt_2
Honored Contributor

Re: EMS HW monitoring: how to re-enable monitoring?

Just to tidy up what Patrick has suggested (I don't know if it would work to fix this problem, but it's worth a try), you WILL need to do Step 2, since step one only stops the monitors. psmctd is started by diagmond.

Andrew

Mihails Nikitins
Super Advisor

Re: EMS HW monitoring: how to re-enable monitoring?

Patrick, thanks a lot for your procedure, however, it didn't help.

Andrew has probably asked the right question, which is also the answer. This is the only disk whose firmware revision lacks the 'HP' letters! If this is my problem, I will come back with 10 points.

STM sees the disks, and 'disabled_instances' is empty.

Logs look OK.

Monitoring of the disk was disabled automatically. From event.log:

Summary: Disk at hardware path 0/6/0/0.2.0 : Device removed from monitoring

The disk was removed from monitoring automatically. The device has been removed from the list of devices being monitored by this monitor.
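
For anyone who wants to find such entries quickly, the hardware paths can be pulled out of the log with sed. This is a sketch built only on the wording of the summary line quoted above. removed_paths is a made-up name, and the log location in the usage comment (event.log under /var/opt/resmon/log, the directory mentioned earlier in the thread) is an assumption.

```shell
# Print the hardware path from every "Device removed from monitoring"
# message found on standard input, one path per line.
# removed_paths is a hypothetical helper name.
# ASSUMPTION: the message wording matches the line quoted in this post.
removed_paths() {
    # "|" is used as the sed delimiter because hardware paths contain "/".
    sed -n 's|.*Disk at hardware path \([^ ]*\) : Device removed from monitoring.*|\1|p'
}

# Usage (log location is an assumption):
#   removed_paths < /var/opt/resmon/log/event.log | sort -u
```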
KISS - Keep It Simple Stupid
Andrew Merritt_2
Honored Contributor
Solution

Re: EMS HW monitoring: how to re-enable monitoring?

Hi Mihails,
Yes, that will be the answer; the EMS HW Monitors will only monitor disks with HP firmware ids, because those are the only versions of firmware that are tested and supported by HP.

I would also guess that the "Summary: Disk at hardware path 0/6/0/0.2.0 : Device removed from monitoring" message was logged when this was detected, i.e. after the disk was replaced.
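
Once the firmware revision has been read from the STM info tool, the "begins with 'HP'" test is easy to script when there are many disks to check. The sketch below shows only that prefix test. has_hp_firmware is a made-up name, the revision strings in the usage comment are invented placeholders, and the function does not read anything from STM itself.

```shell
# Succeeds (exit 0) if the firmware revision string begins with "HP",
# fails (exit 1) otherwise.
# has_hp_firmware is a hypothetical helper name; it only tests the string
# it is given and does not query the disk or STM.
has_hp_firmware() {
    case "$1" in
        HP*) return 0 ;;
        *)   return 1 ;;
    esac
}

# Usage (the revision strings are invented placeholders):
#   has_hp_firmware "HP01" && echo "HP firmware id"
#   has_hp_firmware "0001" || echo "no HP prefix"
```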

Andrew
Mihails Nikitins
Super Advisor

Re: EMS HW monitoring: how to re-enable monitoring?

I got another disk with the correct firmware and now EMS runs fine. I had to Kill/Enable monitoring in the monconfig tool so that EMS would pick up the change. Thank you!
KISS - Keep It Simple Stupid