
Unable to PXE boot on HPE ProLiant DL360 Gen10 server due to network drivers

 
som2022
Member


I am working on setting up PXE Booting on a HPE ProLiant DL360 Gen10 Server. The server has a HPE Ethernet 100Gb 1-port 842QSFP28 Adapter for the network interface.

We have a custom Linux kernel that has been configured for HPC. When this custom kernel is booted from local storage, the network card does not come up, probably because the adapter requires third-party drivers. I have obtained the drivers from the HPE website (https://support.hpe.com/hpesc/public/swd/detail?swItemId=MTX_736fae748a834ba58a2e633d2e&swEnvOid=2000225).

At the moment, to keep the machine operational, it is running a standard kernel (3.10 with CentOS 7). After studying the driver documentation and the installation instructions, I found that the install script has options that make it possible to install the required kernel modules for a target kernel. Specifically, I use these options for the installation script:

 

 ./mlnxofedinstall --skip-distro-check --add-kernel-support --kernel-only --skip-unsupported-devices-check --kernel 4.4.47 -s /usr/src/kernels/linux4.4.47/ 

 

With these options, the kernel modules get installed into the directory /lib/modules/4.4.47/extra/. I then generate an initramfs file with: dracut -f /path/to/custom-initramfs.img /path/to/custom-vmlinuz --kver 4.4.47
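For reference, the dracut man page gives the positional arguments as the output image followed by a kernel version, not a path to vmlinuz, so the command above may not do what was intended. A minimal sketch of an invocation for an NFS-root PXE image follows. It assumes the dracut-network package is installed (on CentOS 7 it provides the "network" and "nfs" dracut modules) and that the adapter is driven by mlx5_core; neither is confirmed in this thread. The function name build_dracut_cmd is made up for illustration, and it only prints the command so it can be reviewed before running.

```shell
#!/bin/sh
# Print (do not run) a dracut command line for a network-boot initramfs.
# Assumptions, not confirmed by the thread:
#   - dracut-network is installed, so the "network" and "nfs" modules exist
#   - mlx5_core is the driver for the adapter
build_dracut_cmd() {
    image=$1                 # output initramfs path
    kver=$2                  # kernel version string, e.g. 4.4.47
    driver=${3:-mlx5_core}   # kernel driver to force into the image
    printf 'dracut -f --add "network nfs" --add-drivers "%s" %s %s\n' \
        "$driver" "$image" "$kver"
}
```

Calling build_dracut_cmd /path/to/custom-initramfs.img 4.4.47 prints a command that can be inspected and then run by hand as root.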

I've deployed the custom initramfs and the kernel image to the tftpboot directory and configured pxelinux.cfg accordingly. When I reboot the server, it successfully detects the kernel and the initramfs, but after some time it gets stuck trying to discover the network (see the attached screenshot). I simply get the message "Trying to detect network", which then changes to "Retrying in 110 seconds". Finally it times out and remains stuck.

So for some reason, the network drivers do not get loaded. When I use zcat to inspect the contents of the initramfs, I can see that the driver modules are included in it. I am not sure how to proceed. The driver documentation does not say much about installing for PXE booting, or I may have missed it. Can anyone advise what I can try next? Thank you.
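The check described above can be scripted. The sketch below reads an initramfs file listing on stdin and reports whether a module file for a given driver is present. The helper name initramfs_has_module is hypothetical, and mlx5_core is only an assumed driver name for this adapter. Note that a module file being present in the image does not by itself mean it gets loaded at boot; any modules it depends on (modinfo -F depends lists them) have to be in the image as well.

```shell
#!/bin/sh
# Read an initramfs file listing on stdin (one path per line, optionally
# preceded by ls-style columns) and succeed if a kernel module file for
# the given driver appears in it (.ko, .ko.xz or .ko.gz).
initramfs_has_module() {
    module=$1
    grep -Eq "/${module}\.ko(\.xz|\.gz)?\$"
}
```

A possible use, assuming lsinitrd is available on the build machine: lsinitrd initramfs-4.4.47.img | initramfs_has_module mlx5_core && echo present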

Network booting problem

shiva_jr
HPE Pro

Re: Unable to PXE boot on HPE ProLiant DL360 Gen10 server due to network drivers

Hi som2022,
I believe your network adapter's product number is 874253-B21, which HPE lists as the HPE Ethernet 100Gb 1-port QSFP28 MCX515A-CCAT Adapter.
Please update the driver and firmware for this adapter.
Let us know if you have a different product number.

Regards,
Shiva_JR

I work for HPE
[Any personal opinions expressed are mine, and not official statements on behalf of Hewlett Packard Enterprise]


som2022
Member

Re: Unable to PXE boot on HPE ProLiant DL360 Gen10 server due to network drivers

@shiva_jr 

Hi shiva_jr,

Thank you for the link to the drivers. I tried recompiling but I still get the same problem. Maybe I am missing some steps. Here is what I am doing on my build machine:

1. My build machine runs CentOS 7. I download the kernel sources for version 4.4.47 (since the configuration we currently have is optimized for this version).

2. I place the .config file inside the source root directory and run

 

make oldconfig && make prepare
make -j4

 

3. After this, I create a symlink to the source directory at /lib/src/kernels/linux-4.4.47.

4. Now, I mount the drivers ISO and transfer the contents to my home directory. I run the installation script with the following options:

 

sudo ./mlnxofedinstall --skip-distro-check --add-kernel-support --kernel-only --kernel 4.4.47 -s /lib/src/kernels/linux-4.4.47/ --with-nvmf --with-nfsrdma --without-rshim

 

5. I pass the option --without-rshim because the installation script otherwise aborts; when I checked the logs, it had a problem installing rshim.

6. After the installation, I generate the initramfs file:

 

depmod -a
dracut -f initramfs-4.4.47.img 4.4.47

 

7. I deploy the compiled kernel and the initramfs file on the tftp server's boot directory and update the pxelinux.cfg file as shown:

 

PROMPT 0
LABEL linux
default linux
  KERNEL vmlinuz-4.4.47
  APPEND initrd=initramfs-4.4.47.img root=/dev/nfs rw nfsroot=192.168.1.201:/netboot/64Bit/cfd-001 ip=dhcp splash=0 console=tty0

 

8. I reboot the machine via the iLO interface and select network booting. The kernel and the initramfs get detected and booting starts but it gets stuck indefinitely waiting for network.
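One thing that may be worth checking in a workflow like the steps above is whether the installed driver modules were actually built for the kernel that gets booted. A module's vermagic string starts with the kernel release it was built against, and it has to agree with what the booted kernel reports in uname -r (including any local version suffix set in the .config). The sketch below compares the two; vermagic_matches is a made-up helper name, and this is only one possible cause, not a diagnosis of this case.

```shell
#!/bin/sh
# Succeed if a module's vermagic string begins with the expected kernel
# release. A vermagic string looks like "4.4.47 SMP mod_unload modversions".
vermagic_matches() {
    vermagic=$1   # e.g. output of: modinfo -k 4.4.47 -F vermagic mlx5_core
    kver=$2       # kernel release the PXE-booted kernel reports
    case "$vermagic" in
        "$kver "*|"$kver") return 0 ;;   # release followed by flags, or alone
        *) return 1 ;;
    esac
}
```

For example, on the build machine: vermagic_matches "$(modinfo -k 4.4.47 -F vermagic mlx5_core)" 4.4.47 || echo "module built for a different kernel". The module name mlx5_core is an assumption about which driver this adapter uses.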

Bunsol
HPE Pro

Re: Unable to PXE boot on HPE ProLiant DL360 Gen10 server due to network drivers

The driver available from HPE is built for RHEL 7, but if the kernel matches then it should be fine. Please check the link below:

https://support.hpe.com/connect/s/softwaredetails?language=en_US&softwareId=MTX_08a0e6e8b2904c47af59bb6e67&tab=revisionHistory

The revision history shows this particular adapter listed. Test with this driver. If the issue still persists, log a support ticket, as it might need further investigation.


I am an HPE employee.
[Any personal opinions expressed are mine, and not official statements on behalf of Hewlett Packard Enterprise]
som2022
Member

Re: Unable to PXE boot on HPE ProLiant DL360 Gen10 server due to network drivers

@Bunsol 

I retried with the kernel version shown on the driver's page (3.10). I got the error shown in the attached screenshot ("A start job is running for dev-nfs.device").

linux_screenshot.png
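As far as I know, a boot that waits on dev-nfs.device with root=/dev/nfs is commonly seen when the initramfs was built without dracut's "nfs" and "network" modules (for example, when dracut-network is not installed on the build machine). That is a guess at one possible cause, not something confirmed in this thread. The sketch below takes a list of dracut module names on stdin and prints which of the required ones are missing; missing_dracut_modules is a hypothetical helper name.

```shell
#!/bin/sh
# Read dracut module names on stdin (one per line) and print every name
# from the argument list that is not among them.
# Exit status is 0 if nothing is missing, 1 otherwise.
missing_dracut_modules() {
    listing=$(cat)
    missing=""
    for want in "$@"; do
        printf '%s\n' "$listing" | grep -qx "$want" || missing="$missing $want"
    done
    [ -z "$missing" ] && return 0
    printf '%s\n' ${missing}   # unquoted on purpose: one name per line
    return 1
}
```

A possible use, assuming the installed lsinitrd supports the -m option for listing dracut modules: lsinitrd -m initramfs.img | missing_dracut_modules network nfs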

How do I create a support ticket? The form asks for a contract number.

Bunsol
HPE Pro

Re: Unable to PXE boot on HPE ProLiant DL360 Gen10 server due to network drivers

Use the link below and select "Create and Manage Cases":
https://support.hpe.com/connect/s/?language=en_US


I am an HPE employee.
[Any personal opinions expressed are mine, and not official statements on behalf of Hewlett Packard Enterprise]
som2022
Member

Re: Unable to PXE boot on HPE ProLiant DL360 Gen10 server due to network drivers

Thank you. I have created a case: Case Number: 5378363594.