
ats1
Frequent Advisor

error message in linux

Hi,

I have a problem between my storage box and my Linux server (version 2.6.9-55.ELsmp (brewbuilder@hs20-bc2-4.build.redhat.com) (gcc version 3.4.6 20060404 (Red Hat 3.4.6-3))).
It is a cluster.

I need assistance, please.
See the error messages below.


+++++++++++++++++++++++++++++++++++++++++++++++
406 STORAGE BOX # OVERHEATING erreur sur le SAN
Feb 3 14:49:44 dbapp-node1 clurgmgrd: [7805]: /dev/sdf1 is not mounted
Feb 3 14:49:47 dbapp-node1 clurgmgrd: [7805]: /dev/sdc1 is not mounted
Feb 3 14:49:49 dbapp-node1 clurgmgrd: [7805]: /dev/sde1 is not mounted
Feb 3 14:49:54 dbapp-node1 clurgmgrd: [7805]: /dev/sdd1 is not mounted
Feb 3 14:49:59 dbapp-node1 clurgmgrd: [7805]: /dev/sdb2 is not mounted
Feb 3 14:50:04 dbapp-node1 clurgmgrd: [7805]: /dev/sdb1 is not mounted
Feb 3 14:50:09 dbapp-node1 clurgmgrd: [7805]: /dev/sda2 is not mounted
Feb 3 14:50:14 dbapp-node1 clurgmgrd: [7805]: /dev/sda1 is not mounted


Feb 3 14:51:41 dbapp-node1 clurgmgrd[7805]: Service oracle is recovering
Feb 3 14:51:41 dbapp-node1 clurgmgrd[7805]: Recovering failed service oracle


Feb 3 18:44:29 dbapp-node1 rgmanager: clurgmgrd startup failed
Steven E. Protter
Exalted Contributor

Re: error message in linux

Shalom,

A few possible problems:

1) The server is not patched. Patching sometimes fixes issues like this.

2) Disks were withdrawn from the server. They were there and now they are gone, either through failure or through a configuration change. The OS remembers what it had and squawks about it being gone.

3) A hardware failure is affecting your cluster.

Your storage array is overheating, is not working properly, and needs a hardware repair.

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
ats1
Frequent Advisor

Re: error message in linux

Hi,
1- How can I check the patch level? What is the command to check it?
2- Is it possible to use the fsck command to repair it?

3- What information do you need in order to help me repair this error?

Best regards,
Jupinder Bedi
Respected Contributor

Re: error message in linux

In RHEL 5 you can use the following command to install updates online:
# yum update

In RHEL 4:

# up2date -u

You can check the release in /etc/redhat-release.


2) You can run fsck, then try to mount the filesystem and see what error it gives you.
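
Before running fsck it is worth confirming that the device node still exists and that the filesystem is not mounted, since fsck on a mounted filesystem can cause damage. The following is only a rough Python sketch of that check; the function names are made up for illustration, and it assumes the usual /proc/mounts layout in which the device is the first field on each line.

```python
import os


def mounted_devices(mounts_text):
    """Return the set of device names found in /proc/mounts-style text."""
    devices = set()
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            devices.add(fields[0])
    return devices


def safe_to_fsck(device, mounts_text, exists=os.path.exists):
    """True only if the device node exists and is not currently mounted."""
    return exists(device) and device not in mounted_devices(mounts_text)
```

To use it, read /proc/mounts into a string and call safe_to_fsck("/dev/sdb1", text) for each device named in the log. If the device node is missing altogether, fsck has nothing to work on and the problem is on the storage side.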

All things excellent are as difficult as they are rare
ats1
Frequent Advisor

Re: error message in linux

Hi all,

Is it possible to connect to the MSA1000 to get more information in order to resolve the problem?
How can I do this?
Jupinder Bedi
Respected Contributor

Re: error message in linux

Here is the document. Please have a look; I know it will be helpful for you.

http://h20000.www2.hp.com/bc/docs/support/SupportManual/c00800928/c00800928.pdf
All things excellent are as difficult as they are rare
Randy Jones_3
Trusted Contributor

Re: error message in linux

That "2.6.9-55" implies you are running Red Hat 4.5, which is about two years out of date. Jupinder makes a good suggestion to update your system.

Run "up2date --force -uv" to bring everything up to current releases. Even if it does not fix this specific problem you will be closing security holes and adding many patches to other parts of your system.
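
If you want to record the kernel build before and after updating, a small Python sketch like the one below may help. The helper names are hypothetical, and it assumes the release string has the form seen in this thread ("2.6.9-55.ELsmp").

```python
import re


def split_kernel_release(release):
    """Split a release such as '2.6.9-55.ELsmp' into (base version, build number)."""
    match = re.match(r"^(\d+\.\d+\.\d+)-(\d+)", release)
    if match is None:
        raise ValueError("unrecognised kernel release: %r" % release)
    return match.group(1), int(match.group(2))


def read_distro_release(path="/etc/redhat-release"):
    """Return the contents of the distribution release file as one stripped string."""
    with open(path) as f:
        return f.read().strip()
```

On the server itself, platform.release() from the standard library returns the running kernel's release string, which can be passed straight to split_kernel_release().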
ats1
Frequent Advisor

Re: error message in linux

Hi all,
My problem is that I currently cannot see the disks on the MSA. Before running the update command, what can I do to repair the disks? That is my problem!

I need your assistance, please.


Matti_Kurkela
Honored Contributor

Re: error message in linux

> 406 STORAGE BOX # OVERHEATING erreur sur le SAN

I don't know much French, but I think I understand this. This message looks very bad!

If your storage box has shut itself down because of overheating, nothing you do at the server side will restart it.

First, check the storage box physically and find out why it overheated. Has the HVAC in the server room failed, causing the entire room to become too hot? Or perhaps one of the cooling fans of the storage box has failed. Or maybe something is blocking the flow of the cooling air.

The MSA1000 has a serial port (with a special RJ45Z connector) you can use to configure and diagnose it. A serial-to-RJ45Z cable should have come with the MSA.

The instructions for basic MSA1000 diagnostics and troubleshooting are in the MSA1000 user's manual. I'm 100% sure such a manual came with the MSA, probably in CD-ROM form. If the original manual is lost, you can download a new one from the http://www.hp.com/go/support website.

After the MSA is running normally, you can begin restarting your cluster. If you cannot reboot the cluster nodes for some reason, here's the procedure for making the system re-detect the presence of disks without rebooting:

http://kbase.redhat.com/faq/docs/DOC-3942
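
The linked document is not reproduced here. As a rough sketch only, the usual way to ask a 2.6 kernel to rescan is to write "- - -" (wildcards for channel, target and LUN) to each SCSI host's scan file under /sys. I am assuming that interface is available on your kernel; the function names below are made up for illustration, and the script must run as root.

```python
import glob
import os


def scsi_scan_files(sysfs_root="/sys"):
    """Return the sorted 'scan' file paths for every SCSI host under sysfs_root."""
    pattern = os.path.join(sysfs_root, "class", "scsi_host", "host*", "scan")
    return sorted(glob.glob(pattern))


def rescan_all_hosts(sysfs_root="/sys"):
    """Write '- - -' to each host's scan file and return the paths written."""
    scanned = []
    for path in scsi_scan_files(sysfs_root):
        with open(path, "w") as f:
            f.write("- - -\n")
        scanned.append(path)
    return scanned
```

Calling rescan_all_hosts() with no arguments targets the real /sys tree. Only do this after the MSA is healthy again, then check whether the disks have reappeared before restarting the cluster services.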

MK
MK

Re: error message in linux

Are your filesystems accessible? Are the LUNs visible in the output of cat /proc/scsi/scsi?
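
If the output is long, a small Python sketch like this one can summarise it. It assumes the usual layout of /proc/scsi/scsi (a "Host: ... Channel: ... Id: ... Lun: ..." line followed by a "Vendor: ... Model: ... Rev: ..." line); the function name is hypothetical.

```python
import re

HOST_RE = re.compile(r"Host:\s*(\S+)\s+Channel:\s*(\d+)\s+Id:\s*(\d+)\s+Lun:\s*(\d+)")
VENDOR_RE = re.compile(r"Vendor:\s*(.*?)\s+Model:\s*(.*?)\s+Rev:\s*(.*)")


def parse_proc_scsi(text):
    """Return one dict per attached device found in /proc/scsi/scsi-style text."""
    devices = []
    current = None
    for line in text.splitlines():
        host = HOST_RE.search(line)
        if host:
            # A Host line starts a new device entry.
            current = {
                "host": host.group(1),
                "channel": int(host.group(2)),
                "id": int(host.group(3)),
                "lun": int(host.group(4)),
                "vendor": "",
                "model": "",
            }
            devices.append(current)
            continue
        vendor = VENDOR_RE.search(line)
        if vendor and current is not None:
            # The Vendor line belongs to the most recent Host line.
            current["vendor"] = vendor.group(1).strip()
            current["model"] = vendor.group(2).strip()
    return devices
```

Read /proc/scsi/scsi into a string and pass it to parse_proc_scsi(). If no entry shows the MSA as its model, the server is not seeing the LUNs at all.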