Server getting down

 
manues50
Occasional Contributor

Hello,

My blade server is restarting unexpectedly.

Syslog:

Jan 9 16:46:16 stblade1 kernel: [362355.616139] mce_notify_irq: 7 callbacks suppressed
Jan 9 16:46:16 stblade1 kernel: [362355.616169] mce: [Hardware Error]: Machine check events logged
Jan 9 16:46:16 stblade1 kernel: [362355.616350] mce: [Hardware Error]: Machine check events logged
Jan 9 16:46:26 stblade1 systemd[1]: session-166.scope: Succeeded.
Jan 9 16:47:00 stblade1 systemd[1]: Starting Proxmox VE replication runner...
Jan 9 16:47:02 stblade1 systemd[1]: pvesr.service: Succeeded.
Jan 9 16:47:02 stblade1 systemd[1]: Started Proxmox VE replication runner.
Jan 9 16:48:00 stblade1 systemd[1]: Starting Proxmox VE replication runner...
Jan 9 16:48:02 stblade1 systemd[1]: pvesr.service: Succeeded.
Jan 9 16:48:02 stblade1 systemd[1]: Started Proxmox VE replication runner.
Jan 9 16:49:00 stblade1 kernel: [362519.457113] mce_notify_irq: 25 callbacks suppressed
Jan 9 16:49:00 stblade1 kernel: [362519.457133] mce: [Hardware Error]: Machine check events logged
Jan 9 16:49:00 stblade1 kernel: [362519.457231] mce: [Hardware Error]: Machine check events logged
Jan 9 16:49:00 stblade1 systemd[1]: Starting Proxmox VE replication runner...
Jan 9 16:49:02 stblade1 systemd[1]: pvesr.service: Succeeded.
Jan 9 16:49:02 stblade1 systemd[1]: Started Proxmox VE replication runner.
Jan 9 16:50:00 stblade1 systemd[1]: Starting Proxmox VE replication runner...
Jan 9 16:50:02 stblade1 systemd[1]: pvesr.service: Succeeded.
Jan 9 16:50:02 stblade1 systemd[1]: Started Proxmox VE replication runner.
Jan 9 16:51:00 stblade1 systemd[1]: Starting Proxmox VE replication runner...
Jan 9 16:51:02 stblade1 systemd[1]: pvesr.service: Succeeded.
Jan 9 16:51:02 stblade1 systemd[1]: Started Proxmox VE replication runner.
Jan 9 16:51:27 stblade1 kernel: [362666.914445] mce_notify_irq: 7 callbacks suppressed
Jan 9 16:51:27 stblade1 kernel: [362666.914484] mce: [Hardware Error]: Machine check events logged
Jan 9 16:51:27 stblade1 kernel: [362666.914622] mce: [Hardware Error]: Machine check events logged
Jan 9 16:51:47 stblade1 kernel: [362686.801954] general protection fault: 0000 [#1] SMP PTI
Jan 9 16:51:47 stblade1 kernel: [362686.812358] CPU: 8 PID: 7550 Comm: sh Tainted: P O 5.0.15-1-pve #1
Jan 9 16:51:47 stblade1 kernel: [362686.815825] Hardware name: HP ProLiant BL460c Gen8, BIOS I31 03/01/2013
Jan 9 16:51:47 stblade1 kernel: [362686.818550] RIP: 0010:copy_process.part.38+0x1ac/0x1fc0
Jan 9 16:51:47 stblade1 kernel: [362686.821359] Code: d2 65 48 8b 05 45 24 b8 47 65 48 0f b1 15 3c 24 b8 47 75 f5 48 85 c0 48 89 c1 49 89 c0 4c 8b 95 60 ff ff ff 0f 84 d0 06 00 00 <49> 8b 78 08 31 f6 ba 00 40 00 00 4c 89 95 58 ff ff ff 4c 89 85 60
Jan 9 16:51:47 stblade1 kernel: [362686.825602] RSP: 0018:ffffb528478b7d90 EFLAGS: 00010286
Jan 9 16:51:47 stblade1 kernel: [362686.827500] RAX: fffd9787d80c47c0 RBX: ffff978971fadc00 RCX: fffd9787d80c47c0
Jan 9 16:51:47 stblade1 kernel: [362686.829128] RDX: 0000000000000000 RSI: 00000000006000c0 RDI: ffff9789a083a880
Jan 9 16:51:47 stblade1 kernel: [362686.830727] RBP: ffffb528478b7e80 R08: fffd9787d80c47c0 R09: 0000000000000000
Jan 9 16:51:47 stblade1 kernel: [362686.832011] R10: ffff9788dca68000 R11: 0000000000000000 R12: 0000000001200011
Jan 9 16:51:47 stblade1 kernel: [362686.833298] R13: 0000000000000000 R14: 00007fd74f4b4850 R15: 00000000ffffffff
Jan 9 16:51:47 stblade1 kernel: [362686.834685] FS: 00007fd74f4b4580(0000) GS:ffff9789a7400000(0000) knlGS:0000000000000000
Jan 9 16:51:47 stblade1 kernel: [362686.836024] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 9 16:51:47 stblade1 kernel: [362686.837355] CR2: 00007fd74f4492c0 CR3: 0000000a3f582004 CR4: 00000000000626e0
Jan 9 16:51:47 stblade1 kernel: [362686.838787] Call Trace:
Jan 9 16:51:47 stblade1 kernel: [362686.840167] ? _copy_to_user+0x2b/0x40
Jan 9 16:51:47 stblade1 kernel: [362686.841522] ? cp_new_stat+0x152/0x180
Jan 9 16:51:47 stblade1 kernel: [362686.843028] _do_fork+0xf8/0x400
Jan 9 16:51:47 stblade1 kernel: [362686.844363] __x64_sys_clone+0x27/0x30
Jan 9 16:51:47 stblade1 kernel: [362686.845777] do_syscall_64+0x5a/0x110
Jan 9 16:51:47 stblade1 kernel: [362686.847112] entry_SYSCALL_64_after_hwframe+0x44/0xa9
Jan 9 16:51:47 stblade1 kernel: [362686.848417] RIP: 0033:0x7fd74f3b87be
Jan 9 16:51:47 stblade1 kernel: [362686.849798] Code: db 0f 85 25 01 00 00 64 4c 8b 0c 25 10 00 00 00 45 31 c0 4d 8d 91 d0 02 00 00 31 d2 31 f6 bf 11 00 20 01 b8 38 00 00 00 0f 05 <48> 3d 00 f0 ff ff 0f 87 b6 00 00 00 41 89 c4 85 c0 0f 85 c3 00 00
Jan 9 16:51:47 stblade1 kernel: [362686.852522] RSP: 002b:00007fffeb22cff0 EFLAGS: 00000246 ORIG_RAX: 0000000000000038
Jan 9 16:51:47 stblade1 kernel: [362686.854041] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fd74f3b87be
Jan 9 16:51:47 stblade1 kernel: [362686.855481] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000001200011
Jan 9 16:51:47 stblade1 kernel: [362686.856928] RBP: 0000000000000000 R08: 0000000000000000 R09: 00007fd74f4b4580
Jan 9 16:51:47 stblade1 kernel: [362686.858489] R10: 00007fd74f4b4850 R11: 0000000000000246 R12: 00005596294e4b48
Jan 9 16:51:47 stblade1 kernel: [362686.859980] R13: 0000000000000000 R14: 00007fffeb22d0b0 R15: 0000000000000002
Jan 9 16:51:47 stblade1 kernel: [362686.861467] Modules linked in: tcp_diag inet_diag dm_snapshot arc4 md4 cmac nls_utf8 cifs ccm fscache veth ebtable_filter ebtables ip_set ip6table_filter ip6_tables iptable_filter bpfilter softdog nfnetlink_log nfnetlink intel_rapl sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ipmi_ssif ghash_clmulni_intel aesni_intel aes_x86_64 crypto_simd cryptd zfs(PO) glue_helper zunicode(PO) intel_cstate zlua(PO) snd_pcm mgag200 snd_timer ttm snd soundcore intel_rapl_perf drm_kms_helper serio_raw pcspkr joydev input_leds drm i2c_algo_bit fb_sys_fops syscopyarea sysfillrect sysimgblt hpilo ioatdma dca ipmi_si ipmi_devintf ipmi_msghandler mac_hid acpi_power_meter zcommon(PO) znvpair(PO) zavl(PO) icp(PO) spl(O) vhost_net vhost tap ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi sunrpc ip_tables x_tables autofs4 btrfs xor zstd_compress raid6_pq dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio
Jan 9 16:51:47 stblade1 kernel: [362686.861704] hid_generic usbmouse usbkbd usbhid hid psmouse lpc_ich bnx2x hpsa mdio scsi_transport_sas libcrc32c video
Jan 9 16:51:47 stblade1 kernel: [362686.877429] ---[ end trace 770c837b12982041 ]---
Jan 9 16:51:47 stblade1 kernel: [362686.879496] RIP: 0010:copy_process.part.38+0x1ac/0x1fc0
Jan 9 16:51:47 stblade1 kernel: [362686.881383] Code: d2 65 48 8b 05 45 24 b8 47 65 48 0f b1 15 3c 24 b8 47 75 f5 48 85 c0 48 89 c1 49 89 c0 4c 8b 95 60 ff ff ff 0f 84 d0 06 00 00 <49> 8b 78 08 31 f6 ba 00 40 00 00 4c 89 95 58 ff ff ff 4c 89 85 60
Jan 9 16:51:47 stblade1 kernel: [362686.885837] RSP: 0018:ffffb528478b7d90 EFLAGS: 00010286
Jan 9 16:51:47 stblade1 kernel: [362686.887861] RAX: fffd9787d80c47c0 RBX: ffff978971fadc00 RCX: fffd9787d80c47c0
Jan 9 16:51:47 stblade1 kernel: [362686.890127] RDX: 0000000000000000 RSI: 00000000006000c0 RDI: ffff9789a083a880
Jan 9 16:51:47 stblade1 kernel: [362686.892174] RBP: ffffb528478b7e80 R08: fffd9787d80c47c0 R09: 0000000000000000
Jan 9 16:51:47 stblade1 kernel: [362686.894354] R10: ffff9788dca68000 R11: 0000000000000000 R12: 0000000001200011
Jan 9 16:51:47 stblade1 kernel: [362686.896369] R13: 0000000000000000 R14: 00007fd74f4b4850 R15: 00000000ffffffff
Jan 9 16:51:47 stblade1 kernel: [362686.898564] FS: 00007fd74f4b4580(0000) GS:ffff9789a7400000(0000) knlGS:0000000000000000
Jan 9 16:51:47 stblade1 kernel: [362686.900580] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 9 16:51:47 stblade1 kernel: [362686.902718] CR2: 00007fd74f4492c0 CR3: 0000000a3f582004 CR4: 00000000000626e0
Jan 9 16:52:00 stblade1 systemd[1]: Starting Proxmox VE replication runner...
Jan 9 16:52:01 stblade1 systemd[1]: pvesr.service: Succeeded.
Jan 9 16:52:01 stblade1 systemd[1]: Started Proxmox VE replication runner.
Jan 9 16:52:01 stblade1 kernel: [362700.911218] general protection fault: 0000 [#2] SMP PTI
Jan 9 16:52:01 stblade1 kernel: [362700.937953] CPU: 8 PID: 7674 Comm: run-parts Tainted: P D O 5.0.15-1-pve #1
Jan 9 16:52:01 stblade1 kernel: [362700.942433] Hardware name: HP ProLiant BL460c Gen8, BIOS I31 03/01/2013
Jan 9 16:52:01 stblade1 kernel: [362700.945378] RIP: 0010:copy_process.part.38+0x1ac/0x1fc0
Jan 9 16:52:01 stblade1 kernel: [362700.948200] Code: d2 65 48 8b 05 45 24 b8 47 65 48 0f b1 15 3c 24 b8 47 75 f5 48 85 c0 48 89 c1 49 89 c0 4c 8b 95 60 ff ff ff 0f 84 d0 06 00 00 <49> 8b 78 08 31 f6 ba 00 40 00 00 4c 89 95 58 ff ff ff 4c 89 85 60
Jan 9 16:52:01 stblade1 kernel: [362700.952764] RSP: 0018:ffffb528474e7d90 EFLAGS: 00010206
Jan 9 16:52:01 stblade1 kernel: [362700.954781] RAX: 000800000003b000 RBX: ffff978972a34500 RCX: 0000000000000000
Jan 9 16:52:01 stblade1 kernel: [362700.956655] RDX: 0000000000000000 RSI: 00000000006000c0 RDI: ffff9789861d7a80
Jan 9 16:52:01 stblade1 kernel: [362700.958507] RBP: ffffb528474e7e80 R08: 000800000003b000 R09: 0000000000000000
Jan 9 16:52:01 stblade1 kernel: [362700.960206] R10: ffff97834b39c500 R11: 0000000000000000 R12: 0000000001200011
Jan 9 16:52:01 stblade1 kernel: [362700.961928] R13: 0000000000000000 R14: 00007fcb19557a10 R15: 00000000ffffffff
Jan 9 16:52:01 stblade1 kernel: [362700.963753] FS: 00007fcb19557740(0000) GS:ffff9789a7400000(0000) knlGS:0000000000000000
Jan 9 16:52:01 stblade1 kernel: [362700.965519] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 9 16:52:01 stblade1 kernel: [362700.967362] CR2: 00000000010f9118 CR3: 0000000aa2d54002 CR4: 00000000000626e0
Jan 9 16:52:01 stblade1 kernel: [362700.969165] Call Trace:
Jan 9 16:52:01 stblade1 kernel: [362700.971034] ? security_file_alloc+0x4e/0x90
Jan 9 16:52:01 stblade1 kernel: [362700.972788] _do_fork+0xf8/0x400
Jan 9 16:52:01 stblade1 kernel: [362700.974570] ? __secure_computing+0x3e/0xd0
Jan 9 16:52:01 stblade1 kernel: [362700.976261] ? syscall_trace_enter+0x196/0x2b0
Jan 9 16:52:01 stblade1 kernel: [362700.977913] __x64_sys_clone+0x27/0x30
Jan 9 16:52:01 stblade1 kernel: [362700.979550] do_syscall_64+0x5a/0x110
Jan 9 16:52:01 stblade1 kernel: [362700.981230] entry_SYSCALL_64_after_hwframe+0x44/0xa9
Jan 9 16:52:01 stblade1 kernel: [362700.982898] RIP: 0033:0x7fcb18c08922
Jan 9 16:52:01 stblade1 kernel: [362700.984547] Code: f7 d8 64 89 04 25 d4 02 00 00 64 4c 8b 04 25 10 00 00 00 31 d2 4d 8d 90 d0 02 00 00 31 f6 bf 11 00 20 01 b8 38 00 00 00 0f 05 <48> 3d 00 f0 ff ff 0f 87 5d 01 00 00 85 c0 41 89 c5 0f 85 67 01 00
Jan 9 16:52:01 stblade1 kernel: [362700.987959] RSP: 002b:00007fff191b4da0 EFLAGS: 00000246 ORIG_RAX: 0000000000000038
Jan 9 16:52:01 stblade1 kernel: [362700.989627] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fcb18c08922
Jan 9 16:52:01 stblade1 kernel: [362700.991271] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000001200011
Jan 9 16:52:01 stblade1 kernel: [362700.992880] RBP: 00007fff191b4dc0 R08: 00007fcb19557740 R09: 0000000000000000
Jan 9 16:52:01 stblade1 kernel: [362700.994444] R10: 00007fcb19557a10 R11: 0000000000000246 R12: 0000000000000000
Jan 9 16:52:01 stblade1 kernel: [362700.996028] R13: 0000000000000000 R14: 0000000000000001 R15: 00007fff191b51fc
Jan 9 16:52:01 stblade1 kernel: [362700.997535] Modules linked in: tcp_diag inet_diag dm_snapshot arc4 md4 cmac nls_utf8 cifs ccm fscache veth ebtable_filter ebtables ip_set ip6table_filter ip6_tables iptable_filter bpfilter softdog nfnetlink_log nfnetlink intel_rapl sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ipmi_ssif ghash_clmulni_intel aesni_intel aes_x86_64 crypto_simd cryptd zfs(PO) glue_helper zunicode(PO) intel_cstate zlua(PO) snd_pcm mgag200 snd_timer ttm snd soundcore intel_rapl_perf drm_kms_helper serio_raw pcspkr joydev input_leds drm i2c_algo_bit fb_sys_fops syscopyarea sysfillrect sysimgblt hpilo ioatdma dca ipmi_si ipmi_devintf ipmi_msghandler mac_hid acpi_power_meter zcommon(PO) znvpair(PO) zavl(PO) icp(PO) spl(O) vhost_net vhost tap ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi sunrpc ip_tables x_tables autofs4 btrfs xor zstd_compress raid6_pq dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio
Jan 9 16:52:01 stblade1 kernel: [362700.997955] hid_generic usbmouse usbkbd usbhid hid psmouse lpc_ich bnx2x hpsa mdio scsi_transport_sas libcrc32c video
Jan 9 16:52:01 stblade1 kernel: [362701.012086] ---[ end trace 770c837b12982042 ]---
Jan 9 16:52:01 stblade1 kernel: [362701.013810] RIP: 0010:copy_process.part.38+0x1ac/0x1fc0
Jan 9 16:52:01 stblade1 kernel: [362701.015706] Code: d2 65 48 8b 05 45 24 b8 47 65 48 0f b1 15 3c 24 b8 47 75 f5 48 85 c0 48 89 c1 49 89 c0 4c 8b 95 60 ff ff ff 0f 84 d0 06 00 00 <49> 8b 78 08 31 f6 ba 00 40 00 00 4c 89 95 58 ff ff ff 4c 89 85 60
Jan 9 16:52:01 stblade1 kernel: [362701.019657] RSP: 0018:ffffb528478b7d90 EFLAGS: 00010286
Jan 9 16:52:01 stblade1 kernel: [362701.021366] RAX: fffd9787d80c47c0 RBX: ffff978971fadc00 RCX: fffd9787d80c47c0
Jan 9 16:52:01 stblade1 kernel: [362701.023107] RDX: 0000000000000000 RSI: 00000000006000c0 RDI: ffff9789a083a880
Jan 9 16:52:02 stblade1 kernel: [362701.024806] RBP: ffffb528478b7e80 R08: fffd9787d80c47c0 R09: 0000000000000000
Jan 9 16:52:02 stblade1 kernel: [362701.026571] R10: ffff9788dca68000 R11: 0000000000000000 R12: 0000000001200011
Jan 9 16:52:02 stblade1 kernel: [362701.028236] R13: 0000000000000000 R14: 00007fd74f4b4850 R15: 00000000ffffffff
Jan 9 16:52:02 stblade1 kernel: [362701.030022] FS: 00007fcb19557740(0000) GS:ffff9789a7400000(0000) knlGS:0000000000000000
Jan 9 16:52:02 stblade1 kernel: [362701.031694] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 9 16:52:02 stblade1 kernel: [362701.033310] CR2: 00000000010f9118 CR3: 0000000aa2d54002 CR4: 00000000000626e0
Jan 9 16:52:15 stblade1 kernel: [362714.420224] general protection fault: 0000 [#3] SMP PTI
Jan 9 16:52:15 stblade1 kernel: [362714.443211] CPU: 10 PID: 1071 Comm: ksmtuned Tainted: P D O 5.0.15-1-pve #1
Jan 9 16:52:15 stblade1 kernel: [362714.446920] Hardware name: HP ProLiant BL460c Gen8, BIOS I31 03/01/2013
Jan 9 16:52:15 stblade1 kernel: [362714.449446] RIP: 0010:copy_process.part.38+0x1ac/0x1fc0
Jan 9 16:52:15 stblade1 kernel: [362714.451807] Code: d2 65 48 8b 05 45 24 b8 47 65 48 0f b1 15 3c 24 b8 47 75 f5 48 85 c0 48 89 c1 49 89 c0 4c 8b 95 60 ff ff ff 0f 84 d0 06 00 00 <49> 8b 78 08 31 f6 ba 00 40 00 00 4c 89 95 58 ff ff ff 4c 89 85 60
Jan 9 16:52:15 stblade1 kernel: [362714.456432] RSP: 0018:ffffb52847a83d90 EFLAGS: 00010286
Jan 9 16:52:15 stblade1 kernel: [362714.458757] RAX: fffd9783a041b200 RBX: ffff9788d8018000 RCX: fffd9783a041b200
Jan 9 16:52:15 stblade1 kernel: [362714.460932] RDX: 0000000000000000 RSI: 00000000006000c0 RDI: ffff9789a1890300
Jan 9 16:52:15 stblade1 kernel: [362714.463134] RBP: ffffb52847a83e80 R08: fffd9783a041b200 R09: 0000000000000000
Jan 9 16:52:15 stblade1 kernel: [362714.465257] R10: ffff9789a0f50000 R11: 0000000000000000 R12: 0000000001200011
Jan 9 16:52:15 stblade1 kernel: [362714.467346] R13: 0000000000000000 R14: 00007fdd7bc76a10 R15: 00000000ffffffff
Jan 9 16:52:15 stblade1 kernel: [362714.469438] FS: 00007fdd7bc76740(0000) GS:ffff9789a7480000(0000) knlGS:0000000000000000
Jan 9 16:52:15 stblade1 kernel: [362714.471544] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 9 16:52:15 stblade1 kernel: [362714.473662] CR2: 00007fdd7be357f0 CR3: 0000000c20ada004 CR4: 00000000000626e0
Jan 9 16:52:15 stblade1 kernel: [362714.475794] Call Trace: 

BH_S
HPE Pro

Re: Server getting down

Hello manues50,

Good day!

As I understand it, the blade restarts unexpectedly and the syslog shows the messages you reported. However, we need more details and a complete set of logs for analysis before we can work toward a solution:

1. Server OS logs (HPS reports for Windows, vm-support logs for ESXi, sosreport for Linux).

2. Details of the blade server configuration (model details).

3. Does the server restart after logging into the OS, or before the POST screen?

Please raise a support case with the basic details listed above, and we will assist further.
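Before opening the case, it may help to summarise what the syslog already shows. The following is only a minimal sketch, not an HPE tool: it assumes Python 3 is available and that the lines look like the excerpt posted above. The function name `summarize` and the sample list are illustrative. It counts the "Machine check events logged" notices and the "general protection fault" entries, and records the CPU, PID and command name that the kernel printed with each fault.

```python
import re
from collections import Counter

# Patterns taken from the syslog excerpt posted in this thread.
MCE_RE = re.compile(r"mce: \[Hardware Error\]: Machine check events logged")
GPF_RE = re.compile(r"general protection fault: \d+ \[#(\d+)\]")
TASK_RE = re.compile(r"CPU: (\d+) PID: (\d+) Comm: (\S+)")


def summarize(lines):
    """Count machine-check notices and protection faults; collect faulting tasks."""
    counts = Counter()
    tasks = []
    for line in lines:
        if MCE_RE.search(line):
            counts["mce"] += 1
        elif GPF_RE.search(line):
            counts["gpf"] += 1
        else:
            # Note: "CPU: ... Comm: ..." lines also appear in other kinds of
            # kernel oops, so this list is only a rough indication.
            match = TASK_RE.search(line)
            if match:
                cpu, pid, comm = match.groups()
                tasks.append((int(cpu), int(pid), comm))
    return counts, tasks


# Three lines copied from the excerpt above, used as sample input.
sample = [
    "Jan 9 16:51:27 stblade1 kernel: [362666.914484] mce: [Hardware Error]: Machine check events logged",
    "Jan 9 16:51:47 stblade1 kernel: [362686.801954] general protection fault: 0000 [#1] SMP PTI",
    "Jan 9 16:51:47 stblade1 kernel: [362686.812358] CPU: 8 PID: 7550 Comm: sh Tainted: P O 5.0.15-1-pve #1",
]

counts, tasks = summarize(sample)
print("machine check notices:", counts["mce"])
print("general protection faults:", counts["gpf"])
print("faulting tasks (cpu, pid, comm):", tasks)
```

To run it against a real log, pass an open file object to `summarize` instead of the sample list. The log path depends on the distribution, so it is left out here. A summary like this does not replace the full log bundle requested above.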

Regards,

BH_S

I am an HPE Employee.