
STM Alternative

 
Glenn May
Occasional Contributor


Hey all,

I have been searching for something that could take the place of STM, perhaps something that offers more detailed failure information, is more configurable, etc.

Does anyone know of anything else that will work? I am also willing to investigate other OSes that will run on the HP 9000, provided I can get my hands on something more robust to check the hardware.


Thanks in advance for any help you may be able to provide..


Glenn May
7 REPLIES
Steven E. Protter
Exalted Contributor

Re: STM Alternative

You can have EMS run a shell script and email you the output when certain conditions are met.

To get more detail than STM provides, you need to identify the information you want and write a shell script that collects it for you.
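As a rough illustration of the "script that filters and emails the output" idea, here is a minimal sketch in Python rather than shell, assuming a Python 3 interpreter is available on the box. Everything in it is hypothetical: the keyword list, the addresses, and the function names are placeholders, and it does not call any EMS or STM interface. It only filters text lines you feed it and builds a mail message from them.

```python
import smtplib
from email.message import EmailMessage

# Hypothetical keywords; adjust them to whatever your monitor output contains.
KEYWORDS = ("critical", "serious", "warning")


def select_events(lines, keywords=KEYWORDS):
    """Return the lines that mention any keyword (case-insensitive)."""
    wanted = [k.lower() for k in keywords]
    return [ln.rstrip("\n") for ln in lines
            if any(k in ln.lower() for k in wanted)]


def build_report(host, events, sender, recipient):
    """Build an email message that summarizes the selected lines."""
    msg = EmailMessage()
    msg["Subject"] = f"[{host}] {len(events)} hardware event(s)"
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content("\n".join(events) if events else "No matching events.")
    return msg


def send_report(msg, smtp_host="localhost"):
    """Hand the message to a mail relay (assumed to listen on smtp_host)."""
    with smtplib.SMTP(smtp_host) as smtp:
        smtp.send_message(msg)
```

You would feed `select_events` the lines of whatever output you have captured, pass the result to `build_report`, and call `send_report` only when the list is not empty.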

cstm, mstm, and xstm are pretty darned detailed.

Linux will run on PA-RISC and Itanium HP 9000 systems, but as far as I know you are still going to have to write it yourself.

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
Sanjay_6
Honored Contributor

Re: STM Alternative

Hi,

I don't think there is an alternative to STM. We have disk monitoring software / apps that tell you that a disk has failed, but I don't think they give you a whole lot of info that helps in identifying the root cause of the problem.

Hope this helps.

Regds
Bill Hassell
Honored Contributor

Re: STM Alternative

STM is the only diagnostic tool available to customers. There are more extensive offline diagnostics, but after troubleshooting hardware problems on dozens of HP 9000s over the last 15 years, I found that most CPU board failures left the machine unbootable, meaning that getting the diags into the computer was impossible. Even when the machine could boot the diags, there were many hardware paths (very dependent on which of the hundreds of HP 9000 models you have) that could not be isolated (memory controllers, I/O adapters, backplane interfaces, etc.). In general, it was much faster to look at the failure code (LCD panel, GSP, whatever) and swap the big boards.

Knowing exactly which component failed would be great, but you couldn't afford the cost of such a machine. Memory tests are useful, but if something is wrong with the memory controller, most of the time the computer can't boot. I've run extensive offline memory tests on many platforms with no errors reported, but boot up HP-UX and start a few apps and memory problems would often start getting logged.

Many, many problems attributed to hardware are actually software and/or patch issues. For instance, "out of memory" has nothing to do with RAM hardware errors...it is a software configuration issue.

Disks are a different problem. First there's the actual disk itself, and EMS does a good job of logging the problem when it occurs, albeit in a very cryptic manner. When I see problems with a plain old SCSI disk, I get a new one and let mirroring bring it up to speed. I don't have time to run the diags to find that track 1234 and sectors 44-66 are unreadable...as if I could do anything about it. Most disks will try to fix themselves when you run mediainit, but again, it takes a lot of time.

The other side of disks is the SAN side...and fibre controllers and fabric switches are all emerging products (read: lots of firmware updates, proprietary programs to configure them, etc.). Only the manufacturer's code will help there.

As far as another operating system goes, Linux runs on a very small number of HP 9000 models, but the OS won't enable better diags. In fact, I don't know of any diag programs that run on Linux for PA-RISC. Good diagnostics are very expensive products to create and, as many service techs will tell you, they can usually out-fix the hardware strategist every time by swapping the most likely modules.


Bill Hassell, sysadmin
Glenn May
Occasional Contributor

Re: STM Alternative

Thanks for the response. I kind of expected that answer, but thought I might toss it out anyway.

We are also heavy users of Sun's VTS suite, and that one is a little more robust. Can't blame a guy for trying.

Have a great day..


Glenn
Andrew Merritt_2
Honored Contributor

Re: STM Alternative

Hi Glenn,
What are the failings, as you see it, with STM? What would a replacement for it give you?

Do you mean just STM, or are you including the EMS Hardware Monitors?

Andrew
Glenn May
Occasional Contributor

Re: STM Alternative

I am just referring to STM. Perhaps I am simply not understanding it entirely yet. I have been working with Sun VTS for many years now, and it responds with error codes, descriptions of what failed, and, if possible, exactly what failed, e.g. the memory stick in slot XXXXXXX.

I was hoping that I could get the same information out of STM, although it appears to be a bit short in that aspect.


If I am missing something, let me know.


Thanks
Glenn May
Andrew Merritt_2
Honored Contributor

Re: STM Alternative

Hi Glenn,

> Sun VTS for many years now, and it responds
> with error codes, descriptions of what
> failed, and if possible exactly what failed,
> I.E. Memory stick in slot XXXXXXX

It sounds like the model is slightly different on HP from Sun.

The EMS Hardware Monitors would provide notification of new faults, via email, SNMP and log files, and there is the peripheral status monitor, which shows up or down for a particular device. For the current status, or recent faults, there are, I believe, frameworks that will show you (based on what the hardware monitors have reported). ISEE and HAO might be worth looking at, and HP SIM for the future (http://h71028.www7.hp.com/enterprise/cache/4225-0-0-225-121.aspx, http://h18004.www1.hp.com/products/servers/management/hpsim/index.html)
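To make the "up or down for a particular device" idea concrete, here is a small Python sketch of how a current-status view can be derived from a stream of reported events. It is only an illustration: the `(device, state)` tuples, the device names, and the function names are hypothetical and do not reflect the actual record format of the EMS monitors or the peripheral status monitor.

```python
def current_status(events):
    """Return {device: state}, keeping the latest state seen per device.

    `events` is assumed to be an iterable of (device, state) pairs in
    chronological order; this format is made up for the example.
    """
    status = {}
    for device, state in events:
        status[device] = state.upper()
    return status


def devices_down(status):
    """List the devices whose most recent reported state is DOWN."""
    return sorted(dev for dev, state in status.items() if state == "DOWN")
```

For example, feeding it `[("disk1", "up"), ("disk2", "up"), ("disk1", "down")]` would leave `disk1` marked as down and `disk2` as up.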

Andrew