ProLiant Servers (ML,DL,SL)

Re: Brand new ML350g9 Server 2012R2 PCI Bus errors on soft reboot

 
Brian Mayo
Occasional Advisor

Brand new ML350g9 Server 2012R2 PCI Bus errors on soft reboot

Received 2x ML350G9 servers last week. Took 1, installed a Volume License copy of Windows Server 2012 R2 Essentials. 32 gigs of RAM, single Xeon, onboard RAID controller, 4x 600 gig 10k SAS drives set up as a single RAID 1+0.

Set it up with Intelligent Provisioning. Windows install went fine, carved out a 250 gig partition for the C drive. Next reboot, installed the HP SPP dated around Sept 2016. Reboots. Ran a few rounds of Microsoft updates. Reboots. Ran another round of Microsoft updates to find 1 or 2 stragglers. Prepped for delivering to the client: installed our N-Able agent and Bitdefender antivirus. A few reboots...all is well.

I go to reboot another time...and it hangs hard on shutdown. The Windows screen shows the "rebooting" circle, but frozen. Hard reboot...she comes up. I go check the management logs and I see the errors in the attached screenshot (iLO.jpg). I can replicate this on soft reboots: sometimes it reboots OK, other times it hard locks on a reboot, and the same errors will be in the iLO management logs.

Got on the horn with HPE support...they sent out a tech the next day to replace the motherboard. In the meantime, I took the 4x hard drives and the 32 gigs of RAM and moved them over to the second server that got delivered that day (it was for another client, but I'm on a tight schedule). I delivered that server to the client's network, joined it to the domain, did a few things to prep for a migration over 3-4 days, some reboots...no problems. And then on a reboot, it hangs again. Same problem. Same errors in the logs.

I take it back to my office and have HPE send a tech AGAIN to replace that system board. He replaces it. I reboot a few times...BOOM...same lockup on a soft reboot, same errors. So I nuke 'n pave: fresh clean install. About 2 reboots after installing the HP software and drivers from the HP SPP CD...reboot...same thing AGAIN!

So I get on the phone with HPE...sending a tech out again (3rd time). This time he says he will replace the RAID controller, since that is bus 0 device 0. Since I had this problem on 2x different yet identical HP servers, I doubt 2x RAID controllers were bad. And that original server is running Server 2016 Standard just fine for over a week now.

 

What are the chances it's memory?  What are the chances it's one of the hard drives?  (all brand new HP direct parts).  What are the chances that HP SPP driver CD from Sept 2016 has some driver bug for Server 2012 R2 volume license image?  Client getting very upset...schedule for migrating their club systems software is getting tight now.

 iLO.jpg

 

TTr
Honored Contributor

Re: Brand new ML350g9 Server 2012R2 PCI Bus errors on soft reboot

>  ...I take the 4x hard drives, and the 32 gigs of RAM, and move them over to the second server...

> What are the chances it's memory?

 

I'd say the chances are pretty high that it's the memory. Try powering up the server with only half of the memory installed and see what happens. Then try the other half.

Did you mention the fact about the memory to the response center or the field engineer?
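
The half-at-a-time memory test suggested above can be written down as a small procedure. This is only a sketch: the DIMM labels are made up, and `hangs_with` is a hypothetical stand-in for the manual step (install only that subset, run a batch of soft reboots, report whether the hang showed up). It also assumes a single bad stick that fails reliably within a batch, which an intermittent fault like this one may not do.

```python
from typing import Callable, List, Sequence


def isolate_dimm(
    dimms: Sequence[str],
    hangs_with: Callable[[Sequence[str]], bool],
) -> List[str]:
    """Narrow a set of DIMMs down by testing one half at a time.

    hangs_with(subset) is a hypothetical callback: it should return True
    if the reboot hang was reproduced with only `subset` installed.
    Returns the single remaining suspect, or [] if neither half of the
    current suspects reproduces the hang on its own.
    """
    suspects = list(dimms)
    while len(suspects) > 1:
        mid = len(suspects) // 2
        first, second = suspects[:mid], suspects[mid:]
        if hangs_with(first):
            suspects = first
        elif hangs_with(second):
            suspects = second
        else:
            # Neither half hangs alone: memory is probably not the cause,
            # or the reboot batch was too short to catch the fault.
            return []
    return suspects


if __name__ == "__main__":
    sticks = ["DIMM-A", "DIMM-B", "DIMM-C", "DIMM-D"]  # hypothetical labels
    pretend_bad = "DIMM-C"  # invented for the demo, not a finding
    print(isolate_dimm(sticks, lambda subset: pretend_bad in subset))
```

With 4 sticks this takes at most four test rounds. Check the server's DIMM population rules before pulling sticks, since not every half may be a supported configuration.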

Brian Mayo
Occasional Advisor

Re: Brand new ML350g9 Server 2012R2 PCI Bus errors on soft reboot

I did mention the memory all 3 times I was on the phone with HPE Server support.

I also mentioned it to the HP onsite tech who came this morning to replace the RAID controller.

 

For giggles, for what it's worth, I ran the latest "memtestx86" for about 3 hours this morning.  No problems found.  Quick HP Insight hardware diagnostics found nothing wrong either.  I have 4x 8 gig sticks of rammage in there.  

 

The onsite tech today replaced the RAID controller. He left about 12:30 (EST)...it's 2:30 now and I've been rebooting the server many times since he left.  So far it has NOT locked up...BUT...remember, the server ran fine for a few days and many reboots until it acted up.  Sometimes it will act up on each reboot, and other times...I can reboot it many times in a day for several days on end without issue.  
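
A rough way to reason about "how many clean reboots before I trust the fix": if each soft reboot hung independently with probability p, the chance of seeing at least one hang in n reboots would be 1 - (1 - p)^n. The sketch below solves that for n. The value of p is an assumption you have to supply (nothing in this thread measures it), and the independence assumption is shaky here, since the hangs seem to come in streaks, so treat the result as a ballpark only.

```python
import math


def clean_reboots_needed(hang_probability: float, confidence: float) -> int:
    """Smallest n with 1 - (1 - p)**n >= confidence.

    hang_probability: assumed per-reboot chance of a hang (hypothetical input).
    confidence: desired chance of having seen at least one hang in n
    reboots if the fault were still present.
    """
    if not 0.0 < hang_probability < 1.0:
        raise ValueError("hang_probability must be between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be between 0 and 1")
    return math.ceil(
        math.log(1.0 - confidence) / math.log(1.0 - hang_probability)
    )


if __name__ == "__main__":
    # Hypothetical example: a fault that shows up on 1 reboot in 10.
    print(clean_reboots_needed(0.10, 0.95))
```

The point is that a couple of hours of clean reboots says little if the fault only shows up on a small fraction of reboots.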

 

I know it's a long process to eliminate things and narrow down things.  Called client and cancelled again...she's quite upset now...and she currently has a Dell server and they have all Dell workstations...LOL..what an introduction to HP for her!  :)  I remember her asking me about my quote.."No Dell, it's an HP..are they good"?  Since we're mostly an HP server house for over 20 years (I go back to the old white Compaq Proliant and ProSignia servers) of course I said yes.

 

 

TTr
Honored Contributor

Re: Brand new ML350g9 Server 2012R2 PCI Bus errors on soft reboot

Hmmm... Sad story for HPE. I wish you luck and that the problems are over with this server.

Jimmy Vance
HPE Pro

Re: Brand new ML350g9 Server 2012R2 PCI Bus errors on soft reboot

So at any time did support request the AHS log, or are they just swapping parts? Details in the AHS log should help support determine the cause.

 

No support by private messages. Please ask the forum! 
Erdogan Temur
HPE Pro

Re: Brand new ML350g9 Server 2012R2 PCI Bus errors on soft reboot

Hi,

Replace the motherboard and the processor together. That resolves the problem.

Apply the following settings in RBSU:
• Power Management Options >
=> Maximum Performance
=> HP Static High Performance Mode
=> C-states -> No C-states

Kind Regards,
Erdogan.