Operating System - HP-UX


 
Mark Vollmers
Esteemed Contributor

odd problem with STM that I'm seeing...

Hi, all. I've noticed some very odd behaviour related to stm. It really isn't causing a ton of problems, per se, and I can live with it, but it is peculiar nonetheless. Here goes:

A week or two ago, I posted a thread about stm giving me a serious warning concerning our RAID drive, saying that the driver was not supported, etc. I went in, found the event number, and changed the state from TRUE to FALSE. I used to get this warning every morning at about 9:00. Once I made the change, the warning went away.

After I did this, there were messages in root mail after backup about critical and major warnings for the RAID drive, concerning bus resets and I/O errors. There were also entries in the syslog about read/write errors and SCSI bus resets, and the backup spat back some issues as well. Hmmmm, I said. I looked at stuff, rebooted the server, ran STM and looked at the drive, and found nothing. Noting that the messages switched after I dinked with the event, I went back in and undid the changes (reset the event number to TRUE). After I did this, the next morning I got the driver serious error, but no problems with backup at all (no stm errors, no syslog entries, no backup problems).

I cannot figure out why simply disabling the warning for the driver issue would trigger bus resets. I really don't see how the two are related. If there are issues with the drive, then there should be problems all the time. After all, I just disabled the notification for the error, so it would still register it but just not tell me, right? Any thoughts about this? Everything works fine now, and I get the event warning every morning. Strange, huh? I will post a file with the three warnings for review in a few minutes. Thanks, all.

Mark
"We apologize for the inconvience" -God's last message to all creation, from Douglas Adams "So Long and Thanks for all the Fish"
9 REPLIES 9
Mark Vollmers
Esteemed Contributor

Re: odd problem with STM that I'm seeing...

Here you go. Finally got the file put together. Stupid computer...
Christopher McCray_1
Honored Contributor

Re: odd problem with STM that I'm seeing...

Without any substantial proof, I'd say that it's possible that EMS has serious issues with your array. Just a couple of questions:

How long have you had this array?

If memory serves, you had just upgraded to 11.0 and, subsequently, to the 11.0 version of the diagnostics, correct?

What brand are your internal drives and are you getting messages on them?

Finally, what release month are the diagnostics on your system?

I know I'm not much help right now, and it's frustrating, but I'd consider turning off the hardware path for the array for now.

Regards
Chris
It wasn't me!!!!
Mark Vollmers
Esteemed Contributor

Re: odd problem with STM that I'm seeing...

Chris-

To try to fill you in some:

We have had the RAID as long as the server (Dec 99). We had no problems with it under 10.20, and I just upgraded to 11.0 in July. I had installed a version of stm (don't remember the date code) with 10.20 about four months before I upgraded, and had no problems with it. The drive itself, according to the manufacturer, is not actively supported anymore, and has not been tested with 11.0. However, they believe that it shouldn't have any problems, since "It uses generic scsi HBA and os built in scsi HBA driver", whatever that means. The current version of stm was installed right after the upgrade, and it was taken off the June 2001 CD.
The internal drives are SEAGATE ST3457WC, and there have been no problems.

Now, when I logged in today, I once again got the critical errors (so maybe the two are not related, although I can't explain the previous behaviour). I have heard that a dirty tape device will cause SCSI resets during backup. Is this well founded? Or is stm right and the drive is nuts? I'm confused. Any help, wild thoughts, or ideas appreciated. Thanks.

Mark
Mark Vollmers
Esteemed Contributor

Re: odd problem with STM that I'm seeing...

Anyone have any thoughts? Or is this a question that stumps the panel? An ill omen, I reckon.
Christopher McCray_1
Honored Contributor

Re: odd problem with STM that I'm seeing...

The only other thing that I can think of right now is to check the readme file on the CD (I have to look for mine, but if you have it handy...). It usually lists prerequisite patches, so see whether you have them. I apologize for being away for so long and yes, it is mystifying, but I'll see what I can dig up, if anything.

Good Luck
Chris
Mark Vollmers
Esteemed Contributor

Re: odd problem with STM that I'm seeing...

Chris-

I see that the disc has a number of patches on it, and I went through and installed a bunch of patches (using the custom patch manager) after I was up and running, so I would like to think that all the patches are there. When I put in the CD (diagnostic CD, June 2001), I cannot find a simple readme.txt file, so I'm not sure what that means.

Mark
Christopher McCray_1
Honored Contributor

Re: odd problem with STM that I'm seeing...

What I have done is put the CD in my PC (or server) and cd into the DIAGNOSTICS directory. That's where the readme is, and I hope it helps. Let me know if you need any more assistance with this.

BTW- if you put the cd into your pc, it translates well into MS-Word and pdf format.

Chris
Christopher McCray_1
Honored Contributor

Re: odd problem with STM that I'm seeing...

Another thing to possibly check is the /etc/opt/resmon/log/api.log to see any entries from recent times. Also, I'm still not too familiar with the number schemes; what type of server is 9000/810? Still looking in the meantime.
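
For what it's worth, here is a rough sketch of how the log check could be scripted, so the same search can be repeated after each change to the event setting. It assumes a Python 3 interpreter is available somewhere (possibly on another machine that you copy the logs to). The keyword list and the function names are made up for illustration and are not part of STM or EMS; adjust the keywords to the wording of the warnings you actually see.

```python
from typing import Iterable, List, Sequence

# Hypothetical keyword list: an assumption, not taken from any STM/EMS documentation.
KEYWORDS = ("scsi", "reset", "i/o error")


def matching_lines(lines: Iterable[str], keywords: Sequence[str] = KEYWORDS) -> List[str]:
    """Return the lines that contain any keyword, compared case-insensitively."""
    lowered = [k.lower() for k in keywords]
    return [
        line.rstrip("\n")
        for line in lines
        if any(k in line.lower() for k in lowered)
    ]


def scan_file(path: str, keywords: Sequence[str] = KEYWORDS) -> List[str]:
    """Scan one log file; undecodable bytes are replaced rather than raising."""
    with open(path, errors="replace") as fh:
        return matching_lines(fh, keywords)


if __name__ == "__main__":
    import sys

    # Each command-line argument is treated as a log file to scan.
    for log_path in sys.argv[1:]:
        for hit in scan_file(log_path):
            print(f"{log_path}: {hit}")
```

Saved under a name such as scan_logs.py (the name is arbitrary), it could be pointed at /etc/opt/resmon/log/api.log and at the syslog, which on HP-UX normally lives in /var/adm/syslog/syslog.log. Running it once with the event set to TRUE and once with it set to FALSE would show whether the bus reset entries really follow the setting.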
Mark Vollmers
Esteemed Contributor

Re: odd problem with STM that I'm seeing...

Chris-

The server is a D-class. I was looking through the patch list, and I'm not sure that the patches that it wants are there (although if I do the custom patch manager, it doesn't say that I need to get them, so I'll look more at this). There are no entries in the api.log relating to the warning that I am getting. I'll dig a little deeper.

Mark