Re: Proliant 5000r

 
John Rowan_2
Regular Advisor

Proliant 5000r

I have two of these rack-mount servers. One started having problems back in August. I ran diagnostics for 5 days non-stop and nothing showed up as having a problem, so I figured the hardware was okay and the problem was somewhere in the Red Hat Linux 7.3 operating system. I booted the machine and let it run for days without problems. This was after pulling the machine apart, reseating the RAM and CPUs (they're quad-processor Pentium Pro 200s) and using compressed air to blow the dust out of everything.

Since it did not crash with the old 7.3 disks I'd put in to test (these were not the customer's disks but ones from a decommissioned machine), I decided to build a Red Hat 9.0 based machine on three new 9.1GB hot-swap drives in a RAID 5 config. All went well with the build and the machine was installed in September. It ran flawlessly until the long Thanksgiving weekend, when it crashed sometime over the 4 days. I went there Monday morning and the machine would not get through POST, resetting itself while counting RAM. I pulled the machine apart, reseated all the RAM chips, put it back together and it booted without issue, or so I thought. Later in the day it again started resetting itself.

My customer doesn't want to pay me for any of the work I've done on the system and doesn't want to do anything with the box, which is down. Fortunately I was rsync'ing files from this machine to one sitting next to it, so they didn't lose anything.

My other 5000r is doing the same thing. During POST it resets itself, so I can't even run diagnostics. I've reseated everything but made no progress. These are nice machines and I'd hate to throw them into the dumpster.

Anyone have a suggestion for keeping these out of a landfill?
Brian_Murdoch
Honored Contributor

Re: Proliant 5000r

Hi John,

This is similar to issues I have seen with faulty processor VRMs on the Proliant 5000.

You can try the system with only 1 CPU module in (using the other, without CPUs and VRMs, simply as a terminator module).

Check the VRMs using the first CPU module.

You can have up to 6 VRMs in the Proliant 5000 (1 above each CPU = 4, plus 2 redundant in the centre slots on the CPU modules).

If you need a troubleshooting guide for the PL5000, just e-mail me at brian.murdoch@hp.com


I hope this helps.

Brian
John Rowan_2
Regular Advisor

Re: Proliant 5000r

Brian, thanks for your reply. I took two of the four CPUs out of the machine along with two VRMs (it only has 4 VRMs in total; the redundant ones are missing). The machine continued to exhibit the problem. I then took the two remaining VRMs, moved one to the redundant position and placed the two I'd removed earlier back into the primary positions, and the machine now goes through the power-on self test without resetting itself. The machine I have access to here is now running diagnostics without any failures. I see the parts store has the VRMs for $43, and I've found another source selling them for $23 ($15 for the VRM and $8 for delivery).

Is there any way to test the VRM I removed other than reinstalling it in the 5000 again?
Brian_Murdoch
Honored Contributor

Re: Proliant 5000r

Hi John,

Sounds promising so far.

It's probably just components that have gone out of tolerance in the regulator circuit, so you could spend a small fortune trying to get to the bottom of it, and I'm not sure that anyone would have the test equipment to check them properly.

Save yourself the expense.
There are a few on eBay for as little as $4.99 which you can buy directly. The spare part number (service part) is 219209-001.

http://search.ebay.com/219209-001_W0QQsokeywordredirectZ1QQfromZR8QQsatitleZ219209-001

Good Luck

Brian
John Rowan_2
Regular Advisor

Re: Proliant 5000r

The VRM I removed appeared to have several of its interface pins a blackish color, as if they had arced or been exposed to moisture at some time. I used an eraser to clean the pins up, then reinstalled the other two CPUs and put the redundant VRM, along with the cleaned one, in the second CPU card. The machine booted beautifully. Unfortunately I had scavenged the Ethernet card, and the Compaq EISA Ethernet 10/100 isn't playing well with the system: it keeps saying it's forcing TLAN, and the lights on the switch go on for 1/2 second and then off. I tried a different port on the switch and a different cable, but got the same result. I'll continue to monitor the machine to see if it stays running. Unfortunately I have this in my shed due to lack of room here in the house. It's 28 degrees out there. I had a Proliant 2500 out there last winter and it continued to run uninterrupted with temps hitting a low of -10F.

Brian, thanks again for your help.
John Rowan_2
Regular Advisor

Re: Proliant 5000r

Okay, Brian's suggestion seems to have resurrected both 5000r servers. I purchased 10 VRMs off eBay and already had two CPU cards, purchased a few years ago, sitting in inventory.

I worked on the second 5000r yesterday, removing both CPU cards and the 3 VRMs from each. I noticed that one of the VRMs had black coloring on 3 of its pins, and the corresponding pins on the CPU card were also blackened, as if something had shorted. Since I had the extra CPU cards and VRMs, I replaced that CPU card and that particular VRM, and the machine fired up without incident and has been running for 16 hours. I'm currently running two instances of SETI@home to keep the machine busy.

This server is in the same 10 x 14 room as three HP Color Laserjet 8550 printers. This company prints thousands of sheets per day and the aroma of toner is heavy in the room. I noticed a build-up of "dust" around the vents and fans on the 5000r.

My question: could the toner and other dust in the room be conductive? Could a build-up of exhausted toner cause an electrical short? Will I be back there in another year replacing another CPU card / VRM set? Should I schedule quarterly maintenance to break down the machine and brush or blow out the build-up of "dust" to try to prevent another outage?