Proliant DL380e Gen8 keeps crashing (PSOD) after upgrade to HP customized VMWare ESXi 5.5 U2

 
cykVM
Frequent Advisor


Hello HP community,

 

I already briefly discussed the issue in VMWare communities, see https://communities.vmware.com/thread/491027

But this was not really leading to a resolution.

 

I maintain a newly bought HP Proliant DL380e Gen8 server which was freshly installed in August using the HP customized VMWare vSphere 5.5 Update 1 installation ISO. After configuration the server ran fine.

Hardware data:

HP Proliant DL380e Gen8 (bought brand new in August 2014), HP SmartArray B320i storage controller, HP H222 host bus adapter (only a HP Ultrium4 tape drive connected to that), HP Intel 4port NIC 366i, 32GB RAM, 2 Quadcore Intel Xeon E5-2407

 

I'm aware that the B320i storage controller is not on the VMWare HCL, but that's why I used the customized installation ISO.

 

After HP released a new SPP and a VMWare 5.5 Update 2 ISO at the beginning of September, I first installed the SPP during maintenance, which provided several firmware updates. The iLO 4 firmware had already been updated to 2.0 some weeks before.

Afterwards I ran an upgrade installation to VMWare 5.5 U2. All went through without issues, no errors or crashes.

 

The server was running fine for some days and suddenly the first crash of VMWare happened. The PSOD displayed was similar to the one in the attachment. Error message: PCPU 0: no heartbeat (2/2 IPIs received)

 

I rebooted the server through iLo console and during the following days the server crashed multiple times with a similar PSOD, always with PCPU 0: no heartbeat (2/2 IPIs received)
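
For anyone who wants to keep track of how often this exact PSOD shows up (for example in notes or collected console text), here is a minimal Python sketch that picks the message quoted above apart. The function name and the returned tuple layout are just my own illustration, and it deliberately does not interpret what the two IPI numbers mean.

```python
import re

# Pattern for the message quoted in this thread, e.g.
# "PCPU 0: no heartbeat (2/2 IPIs received)".
NO_HEARTBEAT = re.compile(
    r"PCPU\s+(?P<pcpu>\d+):\s+no heartbeat\s+"
    r"\((?P<first>\d+)/(?P<second>\d+) IPIs received\)",
    re.IGNORECASE,
)


def parse_no_heartbeat(line):
    """Return (pcpu, first_count, second_count) or None if the line does not match.

    The two counts are returned as they appear in the message; their exact
    meaning is not interpreted here.
    """
    match = NO_HEARTBEAT.search(line)
    if match is None:
        return None
    return (
        int(match.group("pcpu")),
        int(match.group("first")),
        int(match.group("second")),
    )


if __name__ == "__main__":
    print(parse_no_heartbeat("PCPU 0: no heartbeat (2/2 IPIs received)"))
```

Feeding it every saved PSOD line makes it easy to see whether it is always the same PCPU that stops responding.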

 

At the time of the crashes the server/VMWare was mostly idle (at night or very early in the morning).

 

I reviewed the BIOS settings and set them according to the HP recommendations for VMWare, especially the power management settings.

 

But no configuration change or setting helped. VMWare kept crashing randomly, sometimes after about half a day, sometimes after 2-3 days or about a week.

 

Two days ago I started deploying a new Windows VM. The initial VM configuration was successful: the VM was created on the datastore and appeared in the inventory. Right at power-on of that VM, VMWare crashed again with the no-heartbeat PSOD.

This was reproducible after a reboot of the system. After the reboot the newly created VM had disappeared from the inventory but still existed physically on the datastore volume.

 

Since this happened during office hours, I was fed up with testing various BIOS settings and things in VMWare configuration and went back to VMWare 5.5 Update 1 (build 1746018 HP customized) by using the SHIFT+r altbootbank option on boot up.

 

The server has been running stable without issues since then (I know, only 2 days, but ...) and new VM deployment works fine with 5.5 U1.

 

I somehow suspect a kernel <-> driver error here to be the cause of the PSODs. It might be the HP 366i 4port NIC physical driver in conjunction with the virtual E1000 NICs within the VMs or even the HP hpvsa driver for the B320i Smart Array controller.

 

Does anyone here have any ideas?

 

Thanks in advance for any help provided.

 

cykVM

21 REPLIES
Suman_1978
HPE Pro

Re: Proliant DL380e Gen8 keeps crashing (PSOD) after upgrade to HP customized VMWare ESXi 5.5 U2

Hi,

 

If possible try this:

 

In BIOS, under HP Power Regulator, use HP Static High Performance Mode.
And add VMware boot flag timerEnableTSC = false
Add VMware boot flag usePCC = false

 

See if the above settings make any difference.

 

Thank You!
I am an HP employee.


cykVM
Frequent Advisor

Re: Proliant DL380e Gen8 keeps crashing (PSOD) after upgrade to HP customized VMWare ESXi 5.5 U2

Thanks for your suggestion, Suman, but I'm afraid I can't try this right now. I went back to 5.5 U1 with the SHIFT+R method on VMWare boot because it was the only quick way to get back to a stable running system.

 

I might try this during next weekend.

 

 

JaaM
Visitor

Re: Proliant DL380e Gen8 keeps crashing (PSOD) after upgrade to HP customized VMWare ESXi 5.5 U2

Thanks, same here.

 

I was using ESXi 5.1.0 with no problems. After the update to 5.5.0 U2, PSODs began to appear. Like cykVM, I am going to test this one of these nights or on the weekend, because it is a production server. I also rolled back to 5.1.0.

 

Greetings

cykVM
Frequent Advisor

Re: Proliant DL380e Gen8 keeps crashing (PSOD) after upgrade to HP customized VMWare ESXi 5.5 U2

At VMWare forums another user faced the same problem of suddenly appearing PSODs after upgrade installation to 5.5 U2. He had similar hardware to my configuration. He also upgraded from 5.1.0 to 5.5 U2 as JaaM did.

 

See user CyrilH's post here: https://communities.vmware.com/thread/491027?start=0&tstart=0

 

 

ErikV1991
Occasional Advisor

Re: Proliant DL380e Gen8 keeps crashing (PSOD) after upgrade to HP customized VMWare ESXi 5.5 U2

Hi,

 

We have the same problem on a DL360e Gen8.

We installed ESXi 5.5 U2 from scratch on an SD card and still get the same PSODs.

 

I have checked the BIOS and the settings are HP Static High Performance Mode.

I changed the setting in VMware to timerEnableTSC = false, but I could not find usePCC = false!?

 

I will let you know whether the servers get a PSOD or not.

cykVM
Frequent Advisor

Re: Proliant DL380e Gen8 keeps crashing (PSOD) after upgrade to HP customized VMWare ESXi 5.5 U2

You can set the usePCC option through the VMWare vSphere client: click on the host -> Configuration tab -> Software section / Advanced configuration -> expand VMKernel (right pane) and select Boot -> scroll down until you see VMKernel.Boot.usePCC and untick the box next to it.

 

Also see http://h20195.www2.hp.com/v2/GetPDF.aspx%2F4AA3-9258ENW.pdf for further information (pages 14 and onwards)

 

I think there is also a way through the vSphere CLI, but it should work this way.
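
As a rough idea of what the CLI route could look like, here is a small Python sketch that only builds the esxcli argument lists for setting and then reading back a VMkernel boot setting; it does not run anything. The `esxcli system settings kernel set -s ... -v ...` and `esxcli system settings kernel list -o ...` forms are the ones used in VMware KB articles for other kernel settings. That timerEnableTSC is exposed under exactly this name in that namespace on 5.5 is an assumption on my side, so check with the list command first. The helper name is hypothetical.

```python
def kernel_setting_commands(setting, value):
    """Return (set_cmd, list_cmd) as argument lists for esxcli.

    Assumption: the boot option is exposed under `setting` in the
    "system settings kernel" namespace of the installed ESXi build.
    """
    set_cmd = [
        "esxcli", "system", "settings", "kernel", "set",
        "-s", setting,
        "-v", value,
    ]
    list_cmd = [
        "esxcli", "system", "settings", "kernel", "list",
        "-o", setting,
    ]
    return set_cmd, list_cmd


if __name__ == "__main__":
    set_cmd, list_cmd = kernel_setting_commands("timerEnableTSC", "FALSE")
    # Print the commands so they can be reviewed before being run on the host.
    print(" ".join(set_cmd))
    print(" ".join(list_cmd))
```

Since these are kernel boot settings, a change made this way would only take effect after the host is rebooted.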

 

If that helps you, it would be nice to hear about any progress. I may upgrade to 5.5 U2 then.

 

 

ErikV1991
Occasional Advisor

Re: Proliant DL380e Gen8 keeps crashing (PSOD) after upgrade to HP customized VMWare ESXi 5.5 U2

I checked, but this option is not available in my VMware; other options are.

Maybe it depends on the hardware?

 

I will check in the morning whether the server is still up and let you know.

cykVM
Frequent Advisor

Re: Proliant DL380e Gen8 keeps crashing (PSOD) after upgrade to HP customized VMWare ESXi 5.5 U2

I just checked on my VMWare, also not available. Maybe it was removed in 5.5. The PDF is referring to 5.0.

cykVM
Frequent Advisor

Re: Proliant DL380e Gen8 keeps crashing (PSOD) after upgrade to HP customized VMWare ESXi 5.5 U2

Yes, it was removed in version 5.0 Update 2 and defaults to FALSE now.

See last post (correct answer) in https://communities.vmware.com/thread/465241

 

"This issue has been resolved as of ESXi 5.0 Update 2 as PCC is disabled by default ..."

 

 

ErikV1991
Occasional Advisor

Re: Proliant DL380e Gen8 keeps crashing (PSOD) after upgrade to HP customized VMWare ESXi 5.5 U2

Bad news, this morning the server crashed with the same PSOD.

 

Adding the VMware boot flag timerEnableTSC = false is not the solution.

cykVM
Frequent Advisor

Re: Proliant DL380e Gen8 keeps crashing (PSOD) after upgrade to HP customized VMWare ESXi 5.5 U2

Thanks for the feedback, ErikV.

 

Just to be sure: After setting the timerEnableTSC = false you rebooted the server for the boot flag to go into effect?

 

ErikV1991
Occasional Advisor

Re: Proliant DL380e Gen8 keeps crashing (PSOD) after upgrade to HP customized VMWare ESXi 5.5 U2

Hi,

 

No, I did not reboot, but I checked and the setting is still off.

I think the server has rebooted now after the PSOD, but I am rebooting it again now.

 

I will let you know if it stays up until tomorrow.

cykVM
Frequent Advisor

Re: Proliant DL380e Gen8 keeps crashing (PSOD) after upgrade to HP customized VMWare ESXi 5.5 U2

Changed boot flags take effect only after VMware is rebooted; they are kernel flags, which can't be changed live.

So we will see what happens now ...

 

 

ErikV1991
Occasional Advisor

Re: Proliant DL380e Gen8 keeps crashing (PSOD) after upgrade to HP customized VMWare ESXi 5.5 U2

Hi,

 

This morning we got the same PSOD again. After changing the setting I rebooted the server, but it had no effect.

Any ideas, anyone?

cykVM
Frequent Advisor

Re: Proliant DL380e Gen8 keeps crashing (PSOD) after upgrade to HP customized VMWare ESXi 5.5 U2

Thanks for the feedback. I presume the only available workaround (and this is only a workaround!) is to go back to the previous version with the SHIFT+r method.

Instead of doing this you may also use the HP customized VMWare 5.5 Update 1 version, which runs fine for me on the same hardware.  See https://my.vmware.com/de/web/vmware/details?downloadGroup=HP-ESXI-5.5.0U1-GA&productId=353 for that.

 

There is no other option, I guess, until HP and/or VMware fixes this.

 

JaaM
Visitor

Re: Proliant DL380e Gen8 keeps crashing (PSOD) after upgrade to HP customized VMWare ESXi 5.5 U2

For now, there must be some kernel/drivers issue...

 

If I want to upgrade, I'll get ESXi 5.5.0 U1...

 

BUT, I'll let you know if I see something with U2 and all the recommendations given before...

 

Thanks

cykVM
Frequent Advisor

Re: Proliant DL380e Gen8 keeps crashing (PSOD) after upgrade to HP customized VMWare ESXi 5.5 U2

Just discovered this: http://h20566.www2.hp.com/portal/site/hpsc/template.PAGE/public/psi/swdDetails/?sp4ts.oid=5269386&spf_p.tpst=swdMain&spf_p.prp_swdMain=wsrp-navigationalState%3Didx%253D0%257CswItem%253DMTX_561321363f664c7a8f75354c13%257CswEnvOID%253D4166%257CitemLocale%253D%257CswLang%253D%257Cmode%253D3%257Caction%253DdriverDocument&javax.portlet.begCacheTok=com.vignette.cachetoken&javax.portlet.endCacheTok=com.v... (new VMWare 5.5 driver for B120i / B320i Smart Array controller - hpvsa version 5.5.0-90 [2014.09.11])

and this: http://h20566.www2.hp.com/portal/site/hpsc/template.PAGE/public/psi/swdDetails/?sp4ts.oid=5269386&spf_p.tpst=swdMain&spf_p.prp_swdMain=wsrp-navigationalState%3Didx%253D0%257CswItem%253DMTX_8372c55483b9432abd53d91951%257CswEnvOID%253D4166%257CitemLocale%253D%257CswLang%253D%257Cmode%253D3%257Caction%253DdriverDocument&javax.portlet.begCacheTok=com.vignette.cachetoken&javax.portlet.endCacheTok=com.v... (new Firmware 2.02 for iLo4)

both released on 06 Oct 2014. Maybe this helps?

 

P.S. for 5.1.0 Version of VMWare the new hpvsa driver can be found here: http://h20566.www2.hp.com/portal/site/hpsc/template.PAGE/public/psi/swdDetails/?sp4ts.oid=5269386&spf_p.tpst=swdMain&spf_p.prp_swdMain=wsrp-navigationalState%3Didx%253D%257CswItem%253DMTX_e036863187434e4c9d32533b87%257CswEnvOID%253D4141%257CitemLocale%253D%257CswLang%253D%257Cmode%253D%257Caction%253DdriverDocument&javax.portlet.begCacheTok=com.vignette.cachetoken&javax.portlet.endCacheTok=com.vig...

 

Both updates are RECOMMENDED and do not directly address the issues with VMWare 5.5 U2.

 

 

 

cykVM
Frequent Advisor

Re: Proliant DL380e Gen8 keeps crashing (PSOD) after upgrade to HP customized VMWare ESXi 5.5 U2

The new hpvsa version 5.5.0-90 driver seems to be kind of buggy. I installed it yesterday morning; the install went fine and 5.5.0-86 was replaced, followed by a host reboot. After that, however, the throughput when accessing Windows shares between the VMs went down massively. I noticed it during last night's backup: the data backup, which usually takes about 2-3 hours, was still running this morning and would have taken approximately another 8-10 hours. Throughput on backup to the LTO4 tape drive is usually about 4 GB/min; with the 5.5.0-90 hpvsa driver it went down to about 0.3 GB/min, which is less than 10% of the former speed.

 

I just went back to 5.5.0-86 driver and throughput is back to normal 4 GB/min.
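
To put those two throughput figures into perspective, here is a small back-of-the-envelope Python sketch. It assumes the whole backup job runs at the quoted tape throughput from start to finish, which is a simplification, and the data volumes are only derived from the "2-3 hours at 4 GB/min" figure, not measured.

```python
NORMAL_GB_PER_MIN = 4.0  # usual throughput to the LTO4 drive quoted above
SLOW_GB_PER_MIN = 0.3    # throughput seen with the hpvsa 5.5.0-90 driver


def backup_hours(data_gb, gb_per_min):
    """Hours needed to move data_gb at a constant rate of gb_per_min."""
    return data_gb / gb_per_min / 60.0


def slowdown_ratio(slow, normal):
    """Fraction of the normal throughput that is left."""
    return slow / normal


if __name__ == "__main__":
    ratio = slowdown_ratio(SLOW_GB_PER_MIN, NORMAL_GB_PER_MIN)
    print("remaining throughput: {:.1%}".format(ratio))
    for normal_hours in (2, 3):
        # Data volume implied by a backup that normally takes this long.
        data_gb = normal_hours * 60 * NORMAL_GB_PER_MIN
        slow_hours = backup_hours(data_gb, SLOW_GB_PER_MIN)
        print("{} h normally -> {:.1f} h with the slow driver".format(
            normal_hours, slow_hours))
```

Under that assumption, a job that normally fits into a night no longer finishes within a day.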

 

So 5.5.0-90 hpvsa is not recommended on this configuration. I'm using Backup EXEC 2010 R3 on Windows 2008 R2.

 

What's going on @HP? ;-) Can't remember joining the customer-beta-test-program ...

cykVM
Frequent Advisor

Re: Proliant DL380e Gen8 keeps crashing (PSOD) after upgrade to HP customized VMWare ESXi 5.5 U2

And an additional 2 users with these issues posted in VMWare forums: https://communities.vmware.com/thread/491027?start=15&tstart=0

 

 

cykVM
Frequent Advisor

Re: Proliant DL380e Gen8 keeps crashing (PSOD) after upgrade to HP customized VMWare ESXi 5.5 U2

In the meantime, more users having the same trouble/PSODs have joined the discussion at VMWare communities. Some are testing the updated hpvsa-90 driver with HP customized VMWare 5.5 Update 2, with stable results so far (their servers have run for about 2-5 days without any trouble).

 

 

cykVM
Frequent Advisor

Re: Proliant DL380e Gen8 keeps crashing (PSOD) after upgrade to HP customized VMWare ESXi 5.5 U2

To keep this up to date: Further users with a Proliant Gen8 server and a B120i/B320i Dynamic Smart Array controller came across my discussion at VMWare forums (see: https://communities.vmware.com/thread/491027?start=60&tstart=0)

Some tried updating to hpvsa driver version -90 prior to upgrading/installing VMWare 5.5 Update 2. The PSODs stopped with the hpvsa-90 driver but disk performance went down massively. That's what I experienced with the hpvsa-90 driver and my VMWare 5.5 Update 1 installation.

For a detailed description of the tests being done, see the VMWare forum discussion.

 

I will stick with 5.5 Update 1 and the hpvsa-88/86 driver because it has been running stable for about 4 weeks now without issues.