Operating System - Tru64 Unix

Nagarajan Balakrishnan_1
Frequent Advisor

System crashes whenever we cleanup unused devices...!

Hi,

We have a two-node cluster. The system crashes whenever we run the following commands to clean up unused and unconnected devices:
-------------------------------
hwmgr delete component -id 224
dsfmgr -R hwid 224
-------------------------------
We did a "hwmgr -scan scsi" and the subsequent "hwmgr -show scsi" was showing the device 224 as not having any path.

After the cleanup completed successfully, everything was OK, and then within seconds one node crashed. While we were bringing that node back up and it was booting, the other node also crashed. We used to dismiss this as "SCSI bus getting reset", but I feel something else could well be wrong. Please see the attachments for details.

Any help is appreciated.

Baalki
Ravi_8
Honored Contributor

Re: System crashes whenever we cleanup unused devices...!

Hi,

Can you disable the SNMP traps?

(set snmpEnableAuthenTraps flag to 2 in /etc/snmpd.conf file)
never give up
Srivathsan
Frequent Advisor

Re: System crashes whenever we cleanup unused devices...!

Hello Ravi,

Could you please explain that a bit further?

Is it specific to this cluster setup, or is it a bug of some sort?

Thanks in advance.

Srivathsan
Michael Schulte zur Sur
Honored Contributor

Re: System crashes whenever we cleanup unused devices...!

Hi,

Could you please give us the OS version and patch level?

panic (cpu 0): Unaligned kernel space access from kernel mode

I wonder if this is not a patch-related matter.

hth,

Michael
Nagarajan Balakrishnan_1
Frequent Advisor

Re: System crashes whenever we cleanup unused devices...!

Hi,

It is running on Tru64 5.1A with PK4.

As far as I know, this was a bug resolved in PK3.

The explanation given was that the device database change is made in the memory of only one node before it is written to the common file, so an inconsistency is detected and the system crashes.

However, we are still having this problem. Any additional help would be appreciated.

Regards
Baalki
Greg Yates
Valued Contributor

Re: System crashes whenever we cleanup unused devices...!

Hi,

There is at least one hwmgr-related panic fixed in a later patch kit (later than PK4) that you may be experiencing. I didn't see a stack trace in any of your attachments. That's always good info to have when looking at a crash. You can find this in the crash-data file in /var/adm/crash (default location).
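As a rough illustration of pulling the panic information out of that directory, here is a minimal sketch. It assumes the files are named crash-data.N and that the relevant lines contain the word "panic", as in the panic string quoted earlier in this thread; the function name show_panics is made up for this example.

```shell
#!/bin/sh
# Sketch only: print the panic-related lines from each crash-data file.
# Assumptions (not confirmed in this thread): files are named crash-data.N
# and the lines of interest contain the word "panic".

show_panics() {
    dir=${1:-/var/adm/crash}        # default crash directory mentioned above
    for f in "$dir"/crash-data.*; do
        [ -f "$f" ] || continue     # skip when the glob matches nothing
        echo "== $f"
        # -i: ignore case; fall back to a note when no line matches
        grep -i 'panic' "$f" || echo "(no panic line found)"
    done
    return 0
}
```

Running show_panics with no argument looks in /var/adm/crash; pass another directory if the crash files are saved elsewhere. The full stack trace still has to be read from the file itself.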

Greg
Michael Schulte zur Sur
Honored Contributor

Re: System crashes whenever we cleanup unused devices...!

Hi,

We also had a problem with 5.1A PK4: when a tape was not attached and we ran mt status, the machine would crash.

Michael
Orrin
Valued Contributor

Re: System crashes whenever we cleanup unused devices...!

Hi Baalki,

We had the same problem, only ours was worse: the device database was corrupted and we had to restore it. Luckily, we did not lose any data and there had been no changes to the device database.

On further investigation, the HP engineer advised that the problem is with the dsfmgr command, which causes the crash.

Since we don't have a system to play with, you can understand our reluctance to try the command again.

The only difference was that we used the delete option with the SCSI ID.

If it is a production box, it might be a good idea to raise a service call. Our call only resolved the corrupted database; we haven't made any changes to the devices since then and haven't had the chance to look at the underlying problem.

Hope you have more luck resolving the issue.
Will keep an eye on this thread, maybe it will solve our problem as well.

regards,
Orrin.
Nagarajan Balakrishnan_1
Frequent Advisor

Re: System crashes whenever we cleanup unused devices...!

Gents,

Thanks for the overwhelming responses.

Sorry, got busy with some urgent implementation requests.

I have attached the trace files. Hope that will throw some light on the problem and the cause.

Thanks.
Michael Schulte zur Sur
Honored Contributor

Re: System crashes whenever we cleanup unused devices...!

Hi,

Could you post the output of
hwmgr -show comp
and give an example of what you would delete?

thanks,

Michael
Greg Yates
Valued Contributor

Re: System crashes whenever we cleanup unused devices...!

Baalki,

I took a little time and tried to find something for you. The good news is I found a case just like yours in V5.1(no letter) that was fixed with a patch. The bad news is that the patch is already in V5.1A PK4 (your release, iirc). That means your problem must be slightly different than what was fixed at V5.1. I'm assuming here that you have all of the patch kit installed. In other words, there weren't dependencies that prevented certain patches from being installed. You can check your /var/adm/patch/log log-files to verify.

My advice is 1) update to the latest patch kit, since there are some hwmgr/dsfmgr fixes that you don't have, and/or 2) file a case with the Support Center. Just a reminder: standard support for V5.1A has ended, but Prior Version Support (PVS) does exist. The Support Center will help even if you don't have PVS, but if you need an engineering fix (new code) you'll need PVS. So give us a call.

I'm still looking for a patch for V5.1A. If I find it, I'll let you know.

Greg
Greg Yates
Valued Contributor

Re: System crashes whenever we cleanup unused devices...!

One more bit of advice. If the hardware database isn't causing a problem, leave it alone until you update to the latest patch kit. :)

Greg
Yong_7
Frequent Advisor

Re: System crashes whenever we cleanup unused devices...!

nice Greg.

baalki, it's time to log a call to HP.

that's the official way to get CSP. ( customer specific patch ).

Good Luck !

YJ
Ralf Puchner
Honored Contributor

Re: System crashes whenever we cleanup unused devices...!

I've had some crashes when I used the wrong device in the command. So, to be sure you used the correct syntax and ID, please post the output of the hardware database and the name of the device you are trying to delete (another forum member suggested this, but it was never answered).

If the command and device seem to be correct, then follow the other route to solve your problem.
Help() { FirstReadManual(urgently); Go_to_it;; }
Johan Brusche
Honored Contributor

Re: System crashes whenever we cleanup unused devices...!



Baalki,

Did you, while installing patch kit #4, ignore the warning that you had a file conflict with a security patch in PK#4, and then continue installing just the remaining patches?
If yes, then the patch that should have cured this panic is NOT installed.

To check this, examine the /var/adm/patch/log/session.log.x file with the date of PK#4 installation.
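A quick way to narrow down which session logs to read might look like the sketch below. The keyword "conflict" is only a guess at how the warning is worded in the log, and the function name logs_mentioning is made up for this example, so adjust both after looking at a real log.

```shell
#!/bin/sh
# Sketch only: list patch session logs that mention a given keyword.
# Assumption (not confirmed in this thread): a file-conflict warning would
# contain the word "conflict"; the real wording in the log may differ.

logs_mentioning() {
    keyword=${1:-conflict}
    dir=${2:-/var/adm/patch/log}    # log directory mentioned above
    for f in "$dir"/session.log*; do
        [ -f "$f" ] || continue     # skip when the glob matches nothing
        # -i: ignore case, -l: print only the file name when it matches
        grep -il "$keyword" "$f" || true
    done
    return 0
}
```

Calling logs_mentioning with no arguments searches /var/adm/patch/log for "conflict"; any file it prints is worth reading in full around the PK#4 installation date.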

Johan.

_JB_