ProLiant Servers (ML,DL,SL)


 
Peter Sokolov
Advisor

DL360Gen9 hangs on Linux boot occasionally

We have six DL360Gen9 servers, ALL of which sometimes fail to start correctly and hang during Linux boot. We also have more than 20 DL360pG8 servers on which the problem DOES NOT happen.

After 3 weeks of extensive testing, trying to find out why the problem happens only rarely rather than on every boot, we were able to narrow it down to a very small but critical issue.

The server hangs when calling BIOS Int13h to write data to disk. This only happens sometimes, and only while the operating system is loading, before the disk controller driver is loaded (the driver probably replaces the BIOS Int13h calls).

It is quite easy to reproduce this problem by booting MSDOS 6.22 or FreeDOS from hard disk or from a USB stick and running the following batch file:

:loop

copy c:\command.com c:\test.tst

goto loop

It takes from 10 seconds to 4 hours for the problem to appear (this is probably why the Linux boot does not hang every time but only occasionally). Under MSDOS either a red screen of death appears or the copy command hangs. Under FreeDOS the following error is displayed on the screen: Invalid Opcode at D057 0206 0A82 ...

We installed the SmartArray Controller firmware 3.56 and the BIOS 1.52. Attached are the screenshots of the error under MSDOS and FreeDOS.

HP support refuses to accept this as an issue. They claim that they cannot see any error in AHS, that all the hardware (exchangeable parts) is working correctly, and that they therefore cannot do anything.

Does anybody know if there is a solution or how to correctly report this issue to HPE? There must be or there will be others who will be affected by this problem, although only occasionally.

Jimmy Vance
HPE Pro

Re: DL360Gen9 hangs on Linux boot occasionally

Did you capture a screenshot of the Linux boot error? More details please: what server are you working with, what Linux OS and version, and what is the boot mode set to in RBSU?

 

No support by private messages. Please ask the forum! 
Peter Sokolov
Advisor

Re: DL360Gen9 hangs on Linux boot occasionally

All 6 servers are:

1x 774435-425 HPE DL360 G9 E5-2620v3 16GB

2x 652564-B21 300GB 6G SAS 10K rpm SFF 

1x 720478-B21 HP 500W Flex Slot Platinum Hot Plug Power Supply 

OS: Ubuntu Server 14.04.3 LTS

Legacy Boot Mode (because we use the same on DL360pG8)

I do not have a screenshot because it simply hangs. It does not happen very often, but sometimes it does.

I spent 3 weeks finding the root cause and currently cannot dedicate more of my time to this issue. I am confident that if you tell the BIOS developers that a call to BIOS Int13h sometimes hangs, and that this issue can be reproduced within minutes (even if only with the "unsupported" DOS), they will definitely want to look into it. I believe that all operating systems are initially loaded through BIOS Int13h, so the problem could cause unexpected behavior in other cases as well.

By the way: during our tests we found another bug in the DL360Gen9 BIOS. This one is less important because we believe it probably only affects DOS and some boot loaders, so maybe there is no need to fix it. Here is the description in case the HPE developers want to fix it anyway:

When BIOS Int 15h is used to query available RAM, the DL360Gen9 reports that usable RAM is available from memory address 00000000h to 00094000h-1. When BIOS Int 12h is used to ask how much memory is available at the beginning of the RAM, the DL360Gen9 returns 620KB, which is 09B000h, instead of 592KB (094000h). As a result, any operating system or boot loader that asks for available memory using BIOS Int12h may use the memory from 094000h to 09B000h-1, which could cause problems because this memory area evidently must not be used on the DL360Gen9.
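The arithmetic behind this mismatch can be checked with a short sketch. The function name `conventional_memory_gap` and its parameters are made up for illustration; the only inputs are the two values quoted above (620KB from Int 12h, and 00094000h as the end of the usable range from Int 15h). It does not query the BIOS itself, it only compares the two reported numbers.

```python
KIB = 1024  # bytes per KB as reported by Int 12h


def conventional_memory_gap(int12_kib, int15_usable_end):
    """Compare the two BIOS reports for low memory.

    int12_kib        -- size in KB returned by Int 12h (hypothetical input name)
    int15_usable_end -- first address past the usable low range from Int 15h

    Returns (start, end, size_kib) of the range that Int 12h claims is
    usable but Int 15h does not, or None if the reports are consistent.
    """
    int12_end = int12_kib * KIB
    if int12_end <= int15_usable_end:
        return None
    return int15_usable_end, int12_end, (int12_end - int15_usable_end) // KIB


# Values quoted in this post for the DL360Gen9.
start, end, size = conventional_memory_gap(620, 0x94000)
print(f"{start:06X}h to {end:06X}h-1 ({size} KB)")
```

With the values from this post the disputed range is 094000h to 09B000h-1, which is 28KB. A consistent BIOS (Int 12h returning 592KB) would yield no gap.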

Attached are the screenshots which display the return values of BIOS Int15h and Int12h on DL360Gen9 and DL360pG8. The DL360pG8 returns correct values.

Jimmy Vance
HPE Pro

Re: DL360Gen9 hangs on Linux boot occasionally

I'll pass the information along. Legacy BIOS boot does use Int13h calls; UEFI does not.

Peter Sokolov
Advisor

Re: DL360Gen9 hangs on Linux boot occasionally

After installing all April 2016 updates, this problem is still not solved. Do you require any help from our side to solve it?

As I explained, it is very easy to reproduce this problem on a DL360Gen9:

1. Make sure that the hard disk has one formatted FAT32 partition with at least one file on it (the file COMMAND.COM is used below, but it can be any other file).

2. Boot MSDOS or FreeDOS from a USB stick in Legacy Mode. After booting, drive C: will be the USB stick and drive D: will be the FAT32 partition on the hard disk.

3. Create the following batch file D:\TSTCPY.BAT, which constantly copies D:\COMMAND.COM to D:\TEST.TST:

:loop

copy d:\command.com d:\test.tst

goto loop

4. After starting D:\TSTCPY.BAT the system either hangs or displays a red screen. Sometimes it only takes a few seconds for the problem to appear and sometimes it takes an hour.

Any chance that this will be corrected in a new firmware/BIOS/EFI version?

Peter Sokolov
Advisor

Re: DL360Gen9 hangs on Linux boot occasionally

My boss told me that he is unwilling to take the risk of continuing to use HPE servers if this problem is not corrected. Therefore I had to find a different server on which this problem does not happen and everything works correctly.

I ran the same test on a Lenovo X3250M5, and on this server the problem does not happen. According to my boss, we will switch to the Lenovo X3250M5 in the next few weeks if HPE shows no progress in fixing the bug on the HPE ProLiant DL360Gen9. Unfortunately the HP DL360Gen8, where this problem does not happen, is no longer available.

Jimmy Vance
HPE Pro

Re: DL360Gen9 hangs on Linux boot occasionally

Have you opened a support case with HPE?  These forums are supported by the user community. 

Peter Sokolov
Advisor

Re: DL360Gen9 hangs on Linux boot occasionally

Yes, but because the only easy way to reproduce this problem is with DOS, they have refused to accept the issue and have already closed the case twice without even trying to reproduce it.

HPE support is simply not interested in handling this issue.

Peter Sokolov
Advisor

Re: DL360Gen9 hangs on Linux boot occasionally

I was hoping that a reasonable person from HPE would read this, someone who understands that the issue is definitely not limited to DOS but may affect anything that boots in legacy mode. HPE first-level support does not understand this and therefore does not want to handle it.

But maybe the 20 servers per year that we buy are simply not important enough for HPE, and we will then really have to switch to Lenovo servers even though we prefer HPE...

Frankly, I do not understand how a manufacturer can simply ignore a bug that has been found and can easily be reproduced on any server.

bramd
New Member

Re: DL360Gen9 hangs on Linux boot occasionally

I experienced similar behaviour with 6 new servers we just got, booting from a custom CentOS 6.7 image.

It boots the image and then the blue CentOS progress bar keeps going forever.

Sometimes the machine reboots and goes through the same cycle.

At some point it works, boots into the installer and lets me install the OS.

 

The machines are ProLiant DL380 Gen9.

We have over 1000 HP servers.

I don't know if it happened on any of those as I wasn't involved yet.