HP ProLiant ML350 G6 Hangs while booting

 
nottoday
Member

I have an HP ProLiant ML350 G6 server running Debian 11 (bullseye), and it occasionally hangs on boot. It does not happen on every boot. I do not have a graphical interface installed.

I've tried to install the firmware-linux-nonfree package, but it did not help.

I compared the boot logs from a boot that hung with those from a boot that succeeded. Apart from the failed boot's log stopping at some point, they don't seem to differ.

Does anyone have any suggestions as to what might cause this, or ideas on how to diagnose the problem?

These are the logs for when it did not boot.

 

journalctl -b -1 --priority 4
-- Journal begins at Tue 2021-10-19 14:39:39 CEST, ends at Fri 2022-07-15 15:16:49 CEST. --
Jul 15 14:37:59 sever-Remi kernel: ACPI BIOS Warning (bug): Invalid length for FADT/Pm1aControlBlock: 32, using default 16 (20200925/tbfadt-669)
Jul 15 14:37:59 sever-Remi kernel: ACPI BIOS Warning (bug): Invalid length for FADT/Pm2ControlBlock: 32, using default 8 (20200925/tbfadt-669)
Jul 15 14:37:59 sever-Remi kernel: ACPI: SPCR: Unexpected SPCR Access Width.  Defaulting to byte size
Jul 15 14:37:59 sever-Remi kernel: DMAR-IR: This system BIOS has enabled interrupt remapping
                                   on a chipset that contains an erratum making that
                                   feature unstable.  To maintain system stability
                                   interrupt remapping is being disabled.  Please
                                   contact your BIOS vendor for an update
Jul 15 14:37:59 sever-Remi kernel: [Firmware Bug]: the BIOS has corrupted hw-PMU resources (MSR 38d is 330)
Jul 15 14:37:59 sever-Remi kernel: Intel PMU driver.
Jul 15 14:37:59 sever-Remi kernel: core: CPUID marked event: 'bus cycles' unavailable
Jul 15 14:37:59 sever-Remi kernel: MDS CPU bug present and SMT on, data leak possible. See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/mds.html for more details.
Jul 15 14:37:59 sever-Remi kernel:   #5  #6  #7
Jul 15 14:37:59 sever-Remi kernel: ENERGY_PERF_BIAS: Set to 'normal', was 'performance'
Jul 15 14:37:59 sever-Remi kernel: ERST: Failed to get Error Log Address Range.
Jul 15 14:37:59 sever-Remi kernel: ACPI Warning: SystemIO range 0x0000000000000928-0x000000000000092F conflicts with OpRegion 0x0000000000000920-0x000000000000092F (\SGPE) (20200925/utaddress-204)
Jul 15 14:37:59 sever-Remi kernel: lpc_ich: Resource conflict(s) found affecting gpio_ich
Jul 15 14:37:59 sever-Remi kernel: scsi 0:0:0:0: Power-on or device reset occurred
Jul 15 14:37:59 sever-Remi kernel: scsi 0:0:1:0: Power-on or device reset occurred
Jul 15 14:37:59 sever-Remi kernel: scsi 0:0:2:0: Power-on or device reset occurred
Jul 15 14:37:59 sever-Remi kernel: scsi 0:0:3:0: Power-on or device reset occurred
Jul 15 14:37:59 sever-Remi kernel: scsi 0:0:4:0: Power-on or device reset occurred
Jul 15 14:37:59 sever-Remi kernel: scsi 0:0:5:0: Power-on or device reset occurred
Jul 15 14:37:59 sever-Remi kernel: scsi 0:0:6:0: Power-on or device reset occurred
Jul 15 14:37:59 sever-Remi kernel: scsi 0:0:7:0: Power-on or device reset occurred

 


And these for when it did.

 

journalctl -b 0 --priority 4
-- Journal begins at Tue 2021-10-19 14:39:39 CEST, ends at Fri 2022-07-15 15:14:38 CEST. --
Jul 15 14:58:14 sever-Remi kernel: ACPI BIOS Warning (bug): Invalid length for FADT/Pm1aControlBlock: 32, using default 16 (20200925/tbfadt-669)
Jul 15 14:58:14 sever-Remi kernel: ACPI BIOS Warning (bug): Invalid length for FADT/Pm2ControlBlock: 32, using default 8 (20200925/tbfadt-669)
Jul 15 14:58:14 sever-Remi kernel: ACPI: SPCR: Unexpected SPCR Access Width.  Defaulting to byte size
Jul 15 14:58:14 sever-Remi kernel: DMAR-IR: This system BIOS has enabled interrupt remapping
                                   on a chipset that contains an erratum making that
                                   feature unstable.  To maintain system stability
                                   interrupt remapping is being disabled.  Please
                                   contact your BIOS vendor for an update
Jul 15 14:58:14 sever-Remi kernel: [Firmware Bug]: the BIOS has corrupted hw-PMU resources (MSR 38d is 330)
Jul 15 14:58:14 sever-Remi kernel: Intel PMU driver.
Jul 15 14:58:14 sever-Remi kernel: core: CPUID marked event: 'bus cycles' unavailable
Jul 15 14:58:14 sever-Remi kernel: MDS CPU bug present and SMT on, data leak possible. See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/mds.html for more details.
Jul 15 14:58:14 sever-Remi kernel:   #5  #6  #7
Jul 15 14:58:14 sever-Remi kernel: ENERGY_PERF_BIAS: Set to 'normal', was 'performance'
Jul 15 14:58:14 sever-Remi kernel: ERST: Failed to get Error Log Address Range.
Jul 15 14:58:14 sever-Remi kernel: ACPI Warning: SystemIO range 0x0000000000000928-0x000000000000092F conflicts with OpRegion 0x0000000000000920-0x000000000000092F (\SGPE) (20200925/utaddress-204)
Jul 15 14:58:14 sever-Remi kernel: lpc_ich: Resource conflict(s) found affecting gpio_ich
Jul 15 14:58:14 sever-Remi kernel: scsi 0:0:0:0: Power-on or device reset occurred
Jul 15 14:58:14 sever-Remi kernel: scsi 0:0:1:0: Power-on or device reset occurred
Jul 15 14:58:14 sever-Remi kernel: scsi 0:0:2:0: Power-on or device reset occurred
Jul 15 14:58:14 sever-Remi kernel: scsi 0:0:3:0: Power-on or device reset occurred
Jul 15 14:58:14 sever-Remi kernel: scsi 0:0:4:0: Power-on or device reset occurred
Jul 15 14:58:14 sever-Remi kernel: scsi 0:0:5:0: Power-on or device reset occurred
Jul 15 14:58:14 sever-Remi kernel: scsi 0:0:6:0: Power-on or device reset occurred
Jul 15 14:58:14 sever-Remi kernel: scsi 0:0:7:0: Power-on or device reset occurred
Jul 15 14:58:16 sever-Remi kernel: pcc_cpufreq_init: Too many CPUs, dynamic performance scaling disabled
Jul 15 14:58:16 sever-Remi kernel: pcc_cpufreq_init: Try to enable another scaling driver through BIOS settings
Jul 15 14:58:16 sever-Remi kernel: pcc_cpufreq_init: and complain to the system vendor
Jul 15 14:58:16 sever-Remi kernel: cpufreq: Can't use schedutil governor as dynamic switching is disallowed. Fallback to performance governor
Jul 15 14:58:16 sever-Remi kernel: cpufreq: Can't use schedutil governor as dynamic switching is disallowed. Fallback to performance governor
Jul 15 14:58:16 sever-Remi kernel: cpufreq: Can't use schedutil governor as dynamic switching is disallowed. Fallback to performance governor
Jul 15 14:58:16 sever-Remi kernel: cpufreq: Can't use schedutil governor as dynamic switching is disallowed. Fallback to performance governor
Jul 15 14:58:16 sever-Remi kernel: cpufreq: Can't use schedutil governor as dynamic switching is disallowed. Fallback to performance governor
Jul 15 14:58:16 sever-Remi kernel: cpufreq: Can't use schedutil governor as dynamic switching is disallowed. Fallback to performance governor
Jul 15 14:58:16 sever-Remi kernel: cpufreq: Can't use schedutil governor as dynamic switching is disallowed. Fallback to performance governor
Jul 15 14:58:16 sever-Remi kernel: cpufreq: Can't use schedutil governor as dynamic switching is disallowed. Fallback to performance governor
Jul 15 14:58:16 sever-Remi kernel: power_meter ACPI000D:00: Ignoring unsafe software power cap!
Jul 15 14:58:16 sever-Remi kernel: power_meter ACPI000D:00: hwmon_device_register() is deprecated. Please convert the driver to use hwmon_device_register_with_info().
Jul 15 14:58:16 sever-Remi kernel: ipmi_si 0000:01:04.6: Could not setup I/O space
Jul 15 14:58:16 sever-Remi kernel: spl: loading out-of-tree module taints kernel.
Jul 15 14:58:16 sever-Remi kernel: znvpair: module license 'CDDL' taints kernel.
Jul 15 14:58:16 sever-Remi kernel: Disabling lock debugging due to kernel taint
Jul 15 14:58:16 sever-Remi kernel: kvm: VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL does not work properly. Using workaround
Jul 15 14:58:21 sever-Remi NetworkManager[876]: <warn>  [1657889901.0283] ifupdown: interfaces file /etc/network/interfaces.d/* doesn't exist
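
To see exactly where the failed boot stops, the two logs can be compared with the timestamps stripped. Below is a minimal sketch in Python, assuming both journalctl outputs are available as text; the function names are my own and not part of any tool. It only shows where logging stopped, not why the machine hung.

```python
import re

# Matches the "Mon DD HH:MM:SS hostname " prefix of journalctl's default output.
PREFIX = re.compile(r"^[A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2} \S+ ")


def messages(log_text):
    """Return the log messages without timestamp and host name."""
    result = []
    for line in log_text.splitlines():
        if line.startswith("-- "):
            # Journal header line ("-- Journal begins at ... --"), not a message.
            continue
        match = PREFIX.match(line)
        if match:
            result.append(line[match.end():])
        elif result and line.strip():
            # Indented continuation of a multi-line message (e.g. DMAR-IR).
            result[-1] += " " + line.strip()
    return result


def first_divergence(failed_log, good_log):
    """Return (last message both boots share, first message only in the good boot)."""
    failed, good = messages(failed_log), messages(good_log)
    i = 0
    while i < len(failed) and i < len(good) and failed[i] == good[i]:
        i += 1
    last_common = good[i - 1] if i > 0 else None
    next_in_good = good[i] if i < len(good) else None
    return last_common, next_in_good
```

For example, after saving the outputs with `journalctl -b -1 > failed_boot.log` and `journalctl -b 0 > good_boot.log` (file names are hypothetical), pass the contents of both files to `first_divergence`. For the two excerpts above, the last shared message is the `scsi 0:0:7:0` reset and the first message missing from the failed boot is the `pcc_cpufreq_init` warning. Leaving out `--priority 4` includes the lower-priority messages as well, which should narrow the window further.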

 

4 REPLIES
support_s
System Recommended

Query: HP ProLiant ML350 G6 Hangs while boot

nottoday
Member

Re: Query: HP ProLiant ML350 G6 Hangs while boot

I'm sorry for the late response.

I appreciate your help, but both of these links are for different generations of the server. I have the sixth generation. The second link is even for Windows, while I use Debian.

I don't think that this really helps me.

MV3
HPE Pro

Re: Query: HP ProLiant ML350 G6 Hangs while boot

Hello

 

We suspect this to be some kind of BIOS corruption. Please open a support case with HPE for further analysis.

 

Cheers

I am an HPE employee.
[Any personal opinions expressed are mine, and not official statements on behalf of Hewlett Packard Enterprise]
nottoday
Member

Re: Query: HP ProLiant ML350 G6 Hangs while boot

@MV3 

My server's warranty has expired. Is there any way to reflash the BIOS to fix possible corruption? Or at least a way to test for BIOS corruption, so that I know what the problem is?