03-21-2004 11:45 PM
Red Hat and Proliant lock up's
We have several Proliant servers (dl-360-G2/g3 and dl-380-g2/g3) running different Red Hat versions (ES 2.1, AS 2.1, ES 3, 9) which now and then just seem to freeze. The console is black and there is no network connectivity whatsoever. There seems to be a relation between the kernel version and the firmware, because some servers that locked up after a new kernel was installed have been running OK since we installed the latest firmware (system, SCSI controller etc). These lock-ups occur after a week, a month or sometimes twice a day, and there seems to be no relation to the system load or the installed software. No hints in the log files or on the console, just dead.
Installing the latest HP management agents did not improve stability; the only difference is that ASR now automatically reboots the server when it freezes.
Anyone have any suggestions?
TIA,
Andre
03-22-2004 02:33 AM
Re: Red Hat and Proliant lock up's
We have a similar mix of hardware and OS versions (ES and AS, 2.1 and 3.0) with no lockup problems. I have had problems with reboots on version 7.0 of the HP agents, so I recommend using version 6.40.
03-22-2004 02:52 AM
Re: Red Hat and Proliant lock up's
It is standard HP hardware, no third party memory. Some servers are running Oracle, others only Apache or Postfix. We had the same problems with hpasm 6.40. Digging deeper, the problems were/are on Intel P4 Xeon servers with Hyper-Threading enabled. Maybe to rule things out we could disable HT, but are there known issues with HT enabled on HP Linux servers? Another thing the servers share is that they are running a local netfilter based firewall.
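Before touching the BIOS, it may help to confirm whether Hyper-Threading is actually active on a given box. The following is only a minimal sketch: it assumes the running kernel exposes the `siblings` and `cpu cores` fields in /proc/cpuinfo (older 2.4 kernels may not, in which case it prints "unknown"), and the function name `ht_enabled` is made up for illustration.

```shell
# Hypothetical helper: report whether Hyper-Threading looks active.
# $1: path to a cpuinfo-style file (defaults to /proc/cpuinfo)
ht_enabled() {
    awk -F': *' '
        /^siblings/  { sib = $2 }      # logical CPUs per physical package
        /^cpu cores/ { cores = $2 }    # physical cores per package
        END {
            if (sib == "" || cores == "") { print "unknown"; exit }
            if (sib + 0 > cores + 0) print "yes"; else print "no"
        }' "${1:-/proc/cpuinfo}"
}
```

Calling `ht_enabled` with no argument reads /proc/cpuinfo; more logical siblings than physical cores is taken as a sign that HT is on.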
03-22-2004 06:06 PM
Re: Red Hat and Proliant lock up's
the same problem?
We have three ML370's, running RH9 and
management software 6.40.
Two of the machines have locked up twice,
the third hasn't (yet) had any problems.
One or two days before the lockup, I can
see increased CPU load in Big Brother,
and I get the following messages in
/var/log/messages:
Mar 21 10:46:07 server1 kernel: raid5: multiple 0 requests for sector 4812176
Mar 21 13:09:39 server1 kernel: raid5: multiple 1 requests for sector 142849888
These are from the Linux software RAID driver.
I don't think the problem is in the
software raid, but these messages are
only indications that something else
is wrong...
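Since these raid5 messages seem to show up a day or two before a lockup, counting them periodically could serve as a rough early warning. This is a sketch only: the function name `raid5_warnings` is hypothetical, and the pattern is taken from the two log lines quoted above, so other kernels may word the message differently.

```shell
# Hypothetical helper: count "raid5: multiple N requests" kernel messages.
# $1: log file to scan (defaults to /var/log/messages)
raid5_warnings() {
    grep -c 'kernel: raid5: multiple [0-9]* requests for sector' \
        "${1:-/var/log/messages}"
}
```

Run from cron, a non-zero count from `raid5_warnings` could trigger a mail or a Big Brother alert; that wiring is left out here.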
The three machines are not completely identical:
server1: 4.5G RAM, 8x146G Disks
server2: 4.5G RAM, 8x146G Disks
mail: 1.5G RAM, 3x146G Disks
server2 is a mirror of server1, and not
used in daily production.
I've seen this lockup occurring on
the two production machines, server1
and mail.
Mogens
03-22-2004 07:43 PM
Re: Red Hat and Proliant lock up's
However, we did solve OUR particular problem, which was sort of a version disconnect between hardware BIOS/drivers from HP and the kernel version from RH... so you might just check it anyway.
Go into your system BIOS (you will obviously need to reboot), check the setting for "MPS Table Mode", set the value to "Full Table APIC" if it's not already, and reboot again.
Jar
03-23-2004 01:24 AM
Re: Red Hat and Proliant lock up's
The other leading cause of random lockups was SCSI termination issues. I would double check the termination on the HDs and controller to make sure everything is in order.
You mentioned that you're running netfilter on each of the boxes. I would also check that you are not blocking the loopback interface, which internal processes use to communicate. Make sure you have something like this in your script:
iptables -A INPUT -i lo -j ACCEPT
iptables -A OUTPUT -o lo -j ACCEPT
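To verify that a loaded ruleset really lets loopback traffic through, one option is a textual check against `iptables-save` output. Note that `-i` and `-o` match on an interface name, so the check looks for `lo` rather than an address. This is only a sketch: `loopback_accepted` is a made-up name, and it does not account for rules that accept loopback traffic some other way (by source address or by chain policy, for example).

```shell
# Hypothetical helper: read iptables-save style rules on stdin and
# succeed only if both INPUT and OUTPUT accept traffic on lo.
loopback_accepted() {
    rules=$(cat)
    printf '%s\n' "$rules" | grep -Eq -e '^-A INPUT .*-i lo .*-j ACCEPT' &&
    printf '%s\n' "$rules" | grep -Eq -e '^-A OUTPUT .*-o lo .*-j ACCEPT'
}
```

As root, something like `iptables-save | loopback_accepted && echo "loopback open"` would use it against the live rules.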
Good luck!!!
03-23-2004 01:42 AM
Re: Red Hat and Proliant lock up's
SCSI termination issues are not very likely, but it is indeed worth checking to rule things out.
Netfilter is configured 'loose' for the loopback interface because we had some problems before with a stricter approach.
The suggestion Jared makes about MPS table mode is also something worth checking, thanks.
Thanks for your time!
03-23-2004 02:08 AM
Re: Red Hat and Proliant lock up's
If you bought all your servers at the same time maybe you got a bad hardware batch.
It happened to us with the power supplies of dl360 g1.
03-24-2004 06:47 PM
Re: Red Hat and Proliant lock up's
01-29-2005 03:59 PM
Re: Red Hat and Proliant lock up's
Andre, what did you end up doing to fix this issue for yourself?
01-29-2005 04:55 PM
Re: Red Hat and Proliant lock up's
What runlevel do you run these servers at?
If these are real servers make sure you have
id:3:initdefault:
in /etc/inittab. This way you eliminate X altogether.
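To check the configured default runlevel across several servers without opening each file, a small parser over the `initdefault` entry can help. This is a sketch under the assumption of a classic SysV /etc/inittab as shown above; the function name `default_runlevel` is hypothetical.

```shell
# Hypothetical helper: print the default runlevel from an inittab file.
# $1: inittab-style file (defaults to /etc/inittab)
default_runlevel() {
    # Skip comment lines; fields are id:runlevels:action:process
    awk -F: '!/^[ \t]*#/ && $3 == "initdefault" { print $2; exit }' \
        "${1:-/etc/inittab}"
}
```

A result of 3 means multi-user text mode without X on Red Hat systems, 5 means the graphical login is started.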
Also make sure that you have the latest firmware and PSP.
Can you login through the iLO/RILO card when these lock-ups happen? Is there anything to indicate problems in the IML?
Regards,
Ross
01-29-2005 05:53 PM
Re: Red Hat and Proliant lock up's
01-29-2005 08:04 PM
Re: Red Hat and Proliant lock up's
Cheers,
Andre
01-31-2005 07:39 AM
Re: Red Hat and Proliant lock up's
They are being run at run level 3.
Not sure how to connect using the iLO/RILO card. There is no connection through the network or KVM, and all logs stop at the time the hard lock appears to happen.
When talking to HP they had us install the insight manager agents. This changed the behaviour from lockups to rebooting. We were able to get some dumps, but they have now come back and said it is a software issue. No problem with the hardware.
We are now talking with Red Hat and have sent them a vmcore dump, and they have requested another dump to do some comparisons but we have not got another successful dump yet.
Do your servers run with the smp kernel? Have you tried to run them with the non-smp kernel? Have you tried disabling Hyper-Threading?
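For comparing notes across machines, a quick way to tell which kernel flavour a box is running is to look at the release string from `uname -r`. The sketch below assumes the Red Hat naming convention seen in this thread, where SMP builds carry an `smp` suffix (for example 2.4.9-e.59smp); `is_smp_kernel` is a made-up name, and other distributions or newer kernels may not follow this convention.

```shell
# Hypothetical helper: succeed if a kernel release string ends in "smp".
# $1: kernel release string, e.g. the output of `uname -r`
is_smp_kernel() {
    case "$1" in
        *smp) return 0 ;;
        *)    return 1 ;;
    esac
}
```

Typical use would be `is_smp_kernel "$(uname -r)" && echo "smp kernel"`.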
Thanks.
Dan
01-31-2005 08:00 PM
Re: Red Hat and Proliant lock up's
Andre
02-01-2005 03:45 AM
Re: Red Hat and Proliant lock up's
02-01-2005 03:50 AM
Re: Red Hat and Proliant lock up's
02-01-2005 11:55 PM
Re: Red Hat and Proliant lock up's
Are all the machines that are locking up running Oracle? What version? App Server or Database?
We've seen similar issues on a number of DL380 G3s, all running RHEL 3 and Oracle Application Server 10g (9.0.4.0.0).
Every now and then the machines will just lock up - they'll normally drop off the network, but occasionally they'll stay on the network but they can't be logged into, either on the console or via SSH.
If I use iLO to look at the X console, I see the time at which the lock up happened, but there's no response from the keyboard or mouse.
It's very frustrating to say the least!
Cheers,
Rob
02-02-2005 03:11 AM
Re: Red Hat and Proliant lock up's
Some of them are running Oracle Database Server (8.1.7), but others only run Apache or Amavis/SpamAssassin. Some are connected to an MSA1000 SAN, others just use DAS, so for us there is no clear lead. When a hang-up occurs, the console is totally black and there is no network connectivity (no ssh, though sometimes a ping is possible).
Thanks and cheers,
Andre
02-02-2005 03:23 AM
Re: Red Hat and Proliant lock up's
The problem turned out to be caused by the HP Insight Manager agents. I disabled the agents and haven't had a lockup since.
I have had an open case with HP for the past 3 months, but the technician basically gave up trying to figure out the problem.
03-21-2006 01:59 AM
Re: Red Hat and Proliant lock up's
Could you please let me know if these issues have been resolved for you? I am having the same issues with an HP ProLiant DL380 G4 and a G3 server.
The G4 server had freezing issues about 6 months back. Then they disappeared for a while. Now, beginning last week, I have already had 3 freezing incidents. No response from keyboard / mouse, ping works but not SSH, etc. Through RILO I can get a console, but can't use my keyboard / mouse. I am running SuSE Pro 9.3. No non-HP parts in the system.
The G3 system used to have the freezing issue quite regularly until about 2 - 3 months back, but it has not happened since. I don't know when it will start again.
The strangest thing is that I have another G4 server that has been rock-solid (touch wood when I say this) for about the past 8 months. All the servers are running the same OS version.
TIA,
Prakash
03-21-2006 02:48 AM
Re: Red Hat and Proliant lock up's
Problems seem to be solved, but it never became clear what was causing these lock-ups. There is undoubtedly a relation between kernel version and HP firmware. On a dl-360-g3 with Red Hat EL AS 2.1, the lock-ups disappeared when we installed kernel 2.4.9-e.59smp (the firmware was already up to date). Since March 11th 2005, all systems have been running smoothly. The last lock-up was on a dl-380-g3 running Red Hat EL AS 3, with the 2.4.21-32-0.1.ELsmp kernel at that time. The SA 5i controller firmware was not up to date, and after upgrading it from 2.38 to 2.58 the lock-ups disappeared from this server as well. But with every new kernel released by Red Hat, like now with U7, I cross my fingers, because neither HP nor Red Hat explained or could explain to us what was causing the mayhem.
Cheers,
André
03-21-2006 06:23 AM
Re: Red Hat and Proliant lock up's
SCSI ID 0 and 1 have been configured as a RAID 1 mirrored volume and host the following partitions:
/
/boot
swap
/usr/local
SCSI IDs 2, 3, 4 and 5 have been configured as a RAID 0 volume (no loss of space, as we have an enterprise backup that takes care of data backup in case a disk goes bad), and this hosts the /home partition.
What I see during a freeze is that disks 0 and 1 are totally busy (the green LEDs going crazy on these 2 disks) for some reason, and my guess is that this is why the system does not respond to any other requests.
Do you see any of these?
Thanks,
Prakash
03-21-2006 11:28 PM
Re: Red Hat and Proliant lock up's
Cheers,
André