Operating System - HP-UX

donna hofmeister
Trusted Contributor

rx2600 ignite failure

I have an 11.23 rx2600 server where the second (non-boot) drive in vg00 failed. Multiple attempts at replacing this drive always ended with vgcfgrestore failing because the drive was "too small", even though the model matched (probably a firmware issue). I decided to just punt and ignite the box from my Ignite server -- and ain't that been just barrels of fun (not) -- but finally got it to actually start the recovery.

here are some snippets leading up to the actual failure:

* Loading configuration utility...
* Beginning installation from source: 10.84.3.26
======= 08/21/19 15:27:18 EDT Starting system configuration...
* Configure_Disks: Begin
* Mapping LUN Instance Data
* Will install B.11.23 onto this system.
* Using LVM for disk 0/1/1/0.1.0 (c2t1d0), group: vg00
* Using LVM for disk 0/1/1/0.0.0 (c2t0d0s2), group: vg00
* Creating LVM physical volume "/dev/rdsk/c2t1d0" (0/1/1/0.1.0).
* Creating LVM physical volume "/dev/rdsk/c2t0d0s2" (0/1/1/0.0.0).
* Creating volume group "vg00".
* Creating logical volume "vg00/lvol1" (/stand).

<<snip>>

* Making VxFS filesystem for "/opt", (/dev/vg00/rlvol6).
* Making VxFS filesystem for "/usr", (/dev/vg00/rlvol7).
* Configure_Disks: Complete
* Download_mini-system: Begin
x ./usr/sbin/idisk, 85596 bytes, 168 tape blocks
x ./sbin/fs/hfs/mkfs, 586996 bytes, 1147 tape blocks
x ./sbin/fs/hfs/newfs, 461420 bytes, 902 tape blocks
x ./sbin/fs/vxfs/mkfs, 1966500 bytes, 3841 tape blocks

<<snip>>

x ./etc/inetsvcs.conf, 15 bytes, 1 tape blocks
x ./etc/nsswitch.conf, 444 bytes, 1 tape blocks
x ./configure3, 3773952 bytes, 7371 tape blocks
x ./monitor_bpr, 2704084 bytes, 5282 tape blocks
* Download_mini-system: Complete
* Loading_software: Begin
* Installing boot area on disk.
* Formatting HP Service Partition.
* Enabling swap areas.
* Number of archives to install: 1
* Processing the archive source (Recovery Archive).
* Wed Aug 21 15:28:03 EDT 2019: Starting archive load of the source (Recovery Archive).
pax_iux: etc/dt/config/Xconfig : No user information could be found in the archive
pax_iux: etc/dt/config/Xconfig : No group information could be found in the archive
* Processed 10% of archive
* Processed 20% of archive
Calling function e000000001260ca0 for Shutdown State 8 type 0x2

<<snip>>

System Panic:

panic: post_hndlr(): Unresolved kernel interruption

Stack Trace:
IP Function Name
0xe00000000061b9a0 post_hndlr+0xcc0
0xe00000000061a800 vm_hndlr+0x220
0xe000000000e7e780 bubbledown
0xe00000000194a170 kmem_gc_small+0x140
0xe000000001949aa0 kmem_gc_freelist+0xb0
0xe000000001949860 kmem_gc_arena+0x1c0
0xe000000001949570 kmem_do_gc+0xf0
0xe00000000193e5a0 foreach_arena_ingroup+0xc0
0xe0000000008efa60 kmem_garbage_collect_group+0x1c0
0xe0000000008bade0 kmem_garbage_collect+0x1c0
0xe0000000008babe0 kmem_arena_stealpages+0x40
0xe00000000089a0d0 vhand_service_mem_class+0x190
0xe0000000018cba60 vhand+0x190
0xe0000000014d40b0 im_vhand+0x190
0xe0000000018c5cd0 DoCalllist+0x3a0
End of Stack Trace

anyone have any idea what's wrong?

7 REPLIES
mbarnwal
HPE Pro

Re: rx2600 ignite failure

Does the crash occur during the restore process, or after the reboot once the archive restore is complete?

 

I am a HPE Employee
donna hofmeister
Trusted Contributor

Re: rx2600 ignite failure

During the initial restore.

mbarnwal
HPE Pro

Re: rx2600 ignite failure

The stack trace of the panic points to bug QXCR1000589514, which is fixed in PHNE_36979.

PHNE_36979:
( QX:QXCR1000589514 SR:8606492812 CR:JAGag44983 )
A client system panics when using CIFS mounts.
The panic may occur in "rfscall()" with a stack
trace similar to:

panic+0x380
post_hndlr+0xcc0
vm_hndlr+0x220
bubbleup+0x880
+------------- TRAP ----------------------------
| Data TLB Fault in KERNEL mode
| IIP=0xe00000000148d330
| IFA=0xe0000001a857ecc8
| p struct save_state 0x1c2e6231.0x9fffffff7f7e7400
+------------- TRAP ----------------------------
clnt_clts_kcallit_addr+0x1100
clnt_clts_kcallit+0x60
rfscall+0x440
rfs3call+0x80
nfs3write+0x180
nfs3_do_bio+0x210
async_daemon+0x820
coerce_scall_args+0x130
syscall+0x920
or the panic may occur outside of "rfscall()" due to memory
corruption from stale pointer references with the following
example stack trace:

panic+0x380
post_hndlr+0xc80
vm_hndlr+0x210
bubbleup+0x740
+------------- TRAP ----------------------------
| Data TLB Fault in KERNEL mode
| IIP=0xe0000000015f9010:0
| IFA=0x834f3c6e9e2be1f8
| p struct save_state 0xbde0031.0x9fffffff7f7e7c00
+------------- TRAP ----------------------------
kmem_gc_small+0x140
kmem_gc_freelist+0xb0
kmem_gc_arena+0x1c0
kmem_do_gc+0xf0
foreach_arena_ingroup+0xc0
kmem_garbage_collect_group+0x1c0
kmem_garbage_collect+0x1c0
kmem_arena_stealpages+0x40
vhand_service_mem_class+0x190
vhand+0x190
im_vhand+0x190
DoCalllist+0x3a0
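
To make the match with the second example concrete, here is a minimal Python sketch (not part of the patch text) that compares the frames from the panic in the original post with the second example trace above. The helper names `frames` and `common_tail` are made up for illustration, and the TRAP box lines are left out of the pasted data.

```python
# Frames from the panic seen during the Ignite restore (original post).
OBSERVED = """\
0xe00000000061b9a0 post_hndlr+0xcc0
0xe00000000061a800 vm_hndlr+0x220
0xe000000000e7e780 bubbledown
0xe00000000194a170 kmem_gc_small+0x140
0xe000000001949aa0 kmem_gc_freelist+0xb0
0xe000000001949860 kmem_gc_arena+0x1c0
0xe000000001949570 kmem_do_gc+0xf0
0xe00000000193e5a0 foreach_arena_ingroup+0xc0
0xe0000000008efa60 kmem_garbage_collect_group+0x1c0
0xe0000000008bade0 kmem_garbage_collect+0x1c0
0xe0000000008babe0 kmem_arena_stealpages+0x40
0xe00000000089a0d0 vhand_service_mem_class+0x190
0xe0000000018cba60 vhand+0x190
0xe0000000014d40b0 im_vhand+0x190
0xe0000000018c5cd0 DoCalllist+0x3a0
"""

# Second example trace from the PHNE_36979 text (TRAP box omitted).
PATCH_EXAMPLE = """\
panic+0x380
post_hndlr+0xc80
vm_hndlr+0x210
bubbleup+0x740
kmem_gc_small+0x140
kmem_gc_freelist+0xb0
kmem_gc_arena+0x1c0
kmem_do_gc+0xf0
foreach_arena_ingroup+0xc0
kmem_garbage_collect_group+0x1c0
kmem_garbage_collect+0x1c0
kmem_arena_stealpages+0x40
vhand_service_mem_class+0x190
vhand+0x190
im_vhand+0x190
DoCalllist+0x3a0
"""


def frames(trace):
    """Return the last token ('function+offset') of each non-empty line."""
    return [line.split()[-1] for line in trace.splitlines() if line.strip()]


def common_tail(a, b):
    """Return the run of identical frames at the end of both traces."""
    n = 0
    while n < min(len(a), len(b)) and a[-1 - n] == b[-1 - n]:
        n += 1
    return a[len(a) - n:]


if __name__ == "__main__":
    tail = common_tail(frames(OBSERVED), frames(PATCH_EXAMPLE))
    print(len(tail), "identical frames:", tail[0], "...", tail[-1])
```

With the traces as posted, the shared run is the 12 frames from kmem_gc_small+0x140 through DoCalllist+0x3a0, function names and offsets included. The frames above that point differ (bubbledown versus bubbleup+0x740, and different offsets in post_hndlr and vm_hndlr).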

In this case, however, the OS is not up yet; the crash occurs during the Ignite restore itself.

I would update the Ignite-UX software on the Ignite server first, just to make sure that it's not due to some old bug in the Ignite kernel that the client boots from during the restore:

https://h20392.www2.hpe.com/portal/swdepot/displayProductInfo.do?productNumber=IGNITEUXB

donna hofmeister
Trusted Contributor

Re: rx2600 ignite failure

Ignite C.7.22.175 is on all my servers.

This PHNE patch is/was not installed, so I resorted to Igniting from HP Media (11.23, Dec 2007).

donna hofmeister
Trusted Contributor

Re: rx2600 ignite failure

just for documentation's sake, here's the patch hierarchy:

PHNE_36979 2007-11-01 ** 11.23 Kernel RPC cumulative patch
  PHNE_37487 2008-07-01 ** 
    PHNE_40953 2010-05-28 ** 
      PHNE_42524 2011-10-26 ** 
        PHNE_43708 2013-11-26 *   --- this patch wasn't installed either
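
A rough Python sketch of how that hierarchy can be used when checking a box. It assumes each patch in the chain supersedes the one above it and carries the earlier fixes forward, as the "cumulative patch" label suggests. The function name `installed_fix` and the sample patch sets are hypothetical; how the list of installed patches is gathered is left out.

```python
# Supersession chain as posted above, oldest first.
RPC_PATCH_CHAIN = [
    "PHNE_36979",
    "PHNE_37487",
    "PHNE_40953",
    "PHNE_42524",
    "PHNE_43708",
]


def installed_fix(installed, chain=RPC_PATCH_CHAIN):
    """Return the newest patch from the chain found in `installed`, or None."""
    for patch in reversed(chain):
        if patch in installed:
            return patch
    return None


if __name__ == "__main__":
    # Hypothetical system with none of the chain installed, as on this server.
    print(installed_fix({"PHKL_00000"}))
    # Hypothetical system carrying a superseding patch.
    print(installed_fix({"PHNE_40953", "PHKL_00000"}))
```

Under that assumption, a result of None means the fix for QXCR1000589514 is missing, which is the situation described for this server.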

mbarnwal
HPE Pro

Re: rx2600 ignite failure

So the server could be recovered with a cold installation of the OS?
donna hofmeister
Trusted Contributor

Re: rx2600 ignite failure

Yes, and the June 2009 patch DVD has been applied. That just leaves a whole lot of other recovery work to do (oh boy....not).