
EMS error - but hardware fine

 
Omar Alvi_1
Super Advisor

Hi,

We're having some problems with EMS on a system recently upgraded.

The problem is that after adding all disk instances to be monitored, EMS shows one of the disks as down (screenshot attached). However, ioscan shows the device as CLAIMED, a dd test on the raw device file works fine, and the red LED of the disk is also functional.

What could be the cause of this error?

What should I check?

Thanks and Regards,

-Alvi
Omar Alvi_1
Super Advisor

Re: EMS error - but hardware fine

here's the attachment.

Connections to ITRC are timing out :)
Kurt Beyers.
Honored Contributor

Re: EMS error - but hardware fine

Are there any errors logged in the syslog.log?

Kurt
Steven E. Protter
Exalted Contributor
Solution

Re: EMS error - but hardware fine

Couple ideas here sir,

1) EMS has been detecting a media flaw on one of the hot-swap drives in my educational D320 for the whole six months I have had it. The drive is fine; the replacement from eBay is sitting on a shelf awaiting installation.

2) If you are connected to a SAN and have no LUNs assigned to a fiber card, for example, and LUN 0 isn't a disk, the fiber switch or disk array shows up in ioscan as a disk.

But it is not a disk, and this triggers EMS errors.

In both circumstances, if the disk is working you can probably ignore the error without impact.

I am trying to attach a script that might help you.

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
Angus Crome
Honored Contributor

Re: EMS error - but hardware fine

By recently upgraded, do you mean the version of EMS or the OS?

If the OS, then make sure the EMS version is correct for the new OS. EMS is notorious for false positives when there is a version mismatch (or when the patch levels don't match the EMS version). If EMS, make sure you apply the full set of patches for that version as well (same reason).
There are 10 types of people in the world, those who understand binary and those who don't - Author Unknown
Mic V.
Esteemed Contributor

Re: EMS error - but hardware fine

I recently had a call with the RC where EMS gave me fibrechannel adapter errors but nothing seemed to be wrong on the Sym and I saw no HP-side problems. The CSS RC engineer said that sometimes under extreme load, EMS will log errors (but nothing is really wrong).

All attempts to find issues with this card failed (running fcmsutil /dev/fcms2 stat and looking for the bad Rx count increasing, etc.), so I don't really have any reason to think the card's bad, though it was very weird.
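
Checking whether an error count is "increasing" comes down to comparing two snapshots taken some time apart. A minimal Python sketch of that comparison, assuming the statistics have already been copied into dictionaries by hand or by your own parser; the counter names below are hypothetical and are not actual fcmsutil field names:

```python
def increased_counters(before, after):
    """Return {name: delta} for every counter whose value grew between snapshots."""
    deltas = {}
    for name, new_value in after.items():
        old_value = before.get(name, 0)  # a counter missing earlier is treated as 0
        if new_value > old_value:
            deltas[name] = new_value - old_value
    return deltas


# Hypothetical snapshots taken a few minutes apart (names are illustrative only).
snapshot_1 = {"bad_rx_frames": 12, "link_failures": 0}
snapshot_2 = {"bad_rx_frames": 12, "link_failures": 0}

# An empty result means no counter grew between the two snapshots.
print(increased_counters(snapshot_1, snapshot_2))
```

A non-empty result over a quiet interval would be a reason to look harder at the card; an empty one is consistent with the "nothing is really wrong" reading.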

Good luck,
Mic
What kind of a name is 'Wolverine'?
Dave Polshaw
Frequent Advisor

Re: EMS error - but hardware fine

We have had the same problem because we were using fibre channel hubs rather than switches. Upgraded to switches and the problem went away. Are you on hubs by any chance?
Knowledge speaks. Wisdom listens...
Mic V.
Esteemed Contributor

Re: EMS error - but hardware fine

No, actually, this was not a SAN environment.
What kind of a name is 'Wolverine'?
Omar Alvi_1
Super Advisor

Re: EMS error - but hardware fine

Thanks a lot for your suggestions,

The hints I have gotten make me think that the HW enablement patch bundle and the Online Diagnostics revision (datecode) should be the same to avoid such problems.

At present we have the HW bundle from Sep 03 and the software from Dec 02, so installing the same revision of the software is the first step I'll be taking.
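
To make the size of that gap concrete, here is a small Python sketch that compares two datecodes. It assumes datecodes written as "Mon YY" (as in this post) and two-digit years in the 2000s; the function names are made up for illustration and are not part of any HP tool:

```python
MONTHS = {"Jan": 1, "Feb": 2, "Mar": 3, "Apr": 4, "May": 5, "Jun": 6,
          "Jul": 7, "Aug": 8, "Sep": 9, "Oct": 10, "Nov": 11, "Dec": 12}


def parse_datecode(text):
    """Turn a 'Mon YY' datecode such as 'Sep 03' into a (year, month) tuple."""
    month_name, year = text.split()
    # Assumption: two-digit years belong to the 2000s.
    return 2000 + int(year), MONTHS[month_name[:3].title()]


def months_apart(first, second):
    """Number of months between two datecodes; 0 means the revisions match."""
    (year_a, month_a), (year_b, month_b) = parse_datecode(first), parse_datecode(second)
    return abs((year_a - year_b) * 12 + (month_a - month_b))


print(months_apart("Sep 03", "Dec 02"))  # prints 9
```

So the two bundles here are nine months apart, which fits the version-mismatch explanation suggested earlier in the thread.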

- Is it also possible that the disk is busy swapping or doing something else, so that it can't reply to EMS?

Another, although different, issue: the customer complains that swap utilisation shows 0% in swapinfo, whereas glance shows 33%. Why?

If it actually is 33%, could this cause the disk to not reply to EMS?

Thanks, appreciate your help
regards,

-Alvi
Omar Alvi_1
Super Advisor

Re: EMS error - but hardware fine

Hi,

I've downloaded the latest Online Diagnostics bundle from the HP site. However, when I swinstall the bundle I see only the following bundles:

EMS-Config
EMS-Core

It is my understanding that Online Diagnostics includes STM and Predictive, and maybe some other stuff.

Why does this bundle only contain EMS? Should I swremove only the EMS part of Online Diagnostics and then install these?

But then I would have different modules of Online Diagnostics at different revisions.

I downloaded from the following link... for HP-UX 11.11

http://software.hp.com/portal/swdepot/displayProductInfo.do?productNumber=B6191AAE

Regards,

-Alvi

Mic V.
Esteemed Contributor

Re: EMS error - but hardware fine

I don't have a system handy to look, sorry, but I can tell you that Predictive is not supposed to be part of diagnostics any more. Predictive has been (or will be very soon) dropped from Support Plus in favor of ISEE. :(

--Mic
What kind of a name is 'Wolverine'?
Andrew Merritt_2
Honored Contributor

Re: EMS error - but hardware fine

There should be some events logged in event.log (from /var/opt/resmon/log) by disk_em that will tell you why EMS is marking the disk as DOWN.
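
If the log is large, a quick filter helps find the relevant entries. A minimal Python sketch, assuming only that the entries of interest contain the monitor name "disk_em" somewhere in their text; the exact layout of event.log is not assumed here, and the function name is made up for illustration:

```python
def lines_mentioning(lines, needle="disk_em"):
    """Return (line_number, text) pairs for every line that contains `needle`."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if needle in line:
            hits.append((number, line.rstrip("\n")))
    return hits


# Usage sketch (path taken from the post above; uncomment on the affected system):
# with open("/var/opt/resmon/log/event.log", errors="replace") as log:
#     for number, text in lines_mentioning(log):
#         print(f"{number}: {text}")
```

The line numbers make it easy to open the log at the right place and read the full event around each hit.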

It is also a good idea to make sure you have a recent version of OnlineDiags, with the latest patch for that revision applied.

Andrew
Angus Crome
Honored Contributor

Re: EMS error - but hardware fine

A note (from the release notes in my December 2003 Support Plus Bundle).

HP-UX Predictive Support will be discontinued on June 30, 2003. As a part of this discontinuance process, the Support Plus 0303 (for both v11.00 and 11.11) will be the last release in which HP-UX Predictive is distributed on physical media. After June 30, 2003, HP-UX Pred software will no longer be available on Software Depot. HP-UX Predictive end of service begins on April 1, 2004. After this date Predictive events for HP-UX will no longer be processed.

The short version: I wouldn't worry about Predictive at this point.

No points please (just a point of clarification).
There are 10 types of people in the world, those who understand binary and those who don't - Author Unknown