Operating System - HP-UX

make_net_recovery freezes on list_expander

 
Johnny Damtoft
Regular Advisor


Hi all,

When I start the Ignite backup, either from cron or by hand, by running:
/opt/ignite/bin/make_net_recovery -v -s 10.255.72.23 -d clihst01 -x inc_entire=vg00

The log shows this:
* Creating NFS mount directories for configuration files.
* Recovery Archive Name = 2006-06-27,08:47

* Lanic Id = 0x00306E4B5454

* Ignite-UX Server = 10.2.7.23


======= 06/27/06 08:47:05 UTC Started /opt/ignite/bin/make_net_recovery. (Tue Jun 27 08:47:05 UTC 2006)
@(#) Ignite-UX Revision C.6.0.109
@(#) net_recovery (opt) $Revision: 10.655 $

* Testing pax for needed patch
* Passed pax tests.
* Recovery Archive Description = clihst01

* Recovery Archive Location = 10.255.72.23:/var/opt/ignite/recovery/archives/clihst01

* Number of Archives to Save = 2

* Pax type = tar

Program Terminated. SIGINT received. Exiting.

======= 06/27/06 09:03:24 UTC make_net_recovery completed unsuccessfully

----------------

I have a lot of hung list_expander processes, some from up to 5 days ago.
# ps -ef |grep -i list
root 25313 1 0 08:47:05 pts/1 0:00 sh -c /opt/ignite/lbin/list_expander -d -f /var/opt/ignite/reco
root 25314 25313 0 08:47:05 pts/1 0:00 /opt/ignite/lbin/list_expander -d -f /var/opt/ignite/recovery/c
root 18121 18120 0 Jun 22 ? 0:00 /opt/ignite/lbin/list_expander -d -f /var/opt/ignite/recovery/c
root 18120 1 0 Jun 22 ? 0:00 sh -c /opt/ignite/lbin/list_expander -d -f /var/opt/ignite/reco
root 29202 1 0 Jun 23 ? 0:00 /opt/ignite/lbin/list_expander -d -f /var/opt/ignite/recovery/c
root 22829 22828 0 Jun 24 ? 0:00 /opt/ignite/lbin/list_expander -d -f /var/opt/ignite/recovery/c
root 22828 1 0 Jun 24 ? 0:00 sh -c /opt/ignite/lbin/list_expander -d -f /var/opt/ignite/reco
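
A minimal sketch for picking the orphaned entries out of a listing like the one above, assuming the usual `ps -ef` column order (UID, PID, PPID, ...). The function name `orphaned_list_expander` is made up for illustration; it only reads text on stdin and prints the PIDs of list_expander entries whose parent is PID 1, which is what is left behind after the original make_net_recovery has gone away. It relies only on the second and third columns, because the STIME column is one word for today's processes and two words ("Jun 22") for older ones.

```shell
# Hypothetical helper: print PIDs of list_expander entries whose parent is init (PPID 1).
# Reads "ps -ef"-style lines on stdin; only columns 2 (PID) and 3 (PPID) are used.
orphaned_list_expander() {
    awk '/list_expander/ && $3 == 1 { print $2 }'
}

# Possible usage on a live system:
#   ps -ef | orphaned_list_expander
```

Fed the seven lines above, it would print 25313, 18120 and 29202, one per line.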

-----------------

After trying to clean up by killing the processes, I am stuck with a few list_expander processes that will not die, even with kill -9.

What happened here?
How can I make sure this does not happen again?
Is there any way to fix this without a reboot? (kill -9 did not work on the list_expander processes.)

Any suggestions?

Rgds,

Johnny Damtoft


Steven E. Protter
Exalted Contributor


Shalom Johnny,

My best guess is that the /var filesystem is full.

That is where Ignite stores its lists and other working files, and it needs free space there to continue.

bdf /var

Clear space and it may resume.
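
If the check is worth repeating, for example before a cron-driven backup, something like the following could flag a nearly full filesystem. This is only a sketch: the function name `over_threshold` and the 90% limit are made up for illustration, and it assumes bdf's usual layout, where the last two columns are %used and the mount point.

```shell
# Hypothetical helper: print mount points whose %used is at or above a limit.
# Reads bdf-style lines on stdin and looks only at the last two columns
# (%used and mount point); header lines and anything else are ignored.
over_threshold() {
    awk -v limit="$1" '
        NF >= 5 && $(NF-1) ~ /^[0-9]+%$/ {
            pct = $(NF-1)
            sub(/%/, "", pct)
            if (pct + 0 >= limit + 0) print $NF, pct "%"
        }'
}

# Possible usage on HP-UX:
#   bdf /var | over_threshold 90
```

No output means nothing reached the limit.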

SEP
Steven E Protter
Owner of ISN Corporation
http://isnamerica.com
http://hpuxconsulting.com
Sponsor: http://hpux.ws
Twitter: http://twitter.com/hpuxlinux
Founder http://newdatacloud.com
Johnny Damtoft
Regular Advisor


Hi Steven,

/dev/vg00/lvol8 4718592 2270576 2429000 48% /var

I have just checked HP OV: there have been no disk space problems for the last month or so.

// Johnny
Steven E. Protter
Exalted Contributor


Shalom,

Program Terminated. SIGINT received. Exiting.

Use fuser -cu /var to identify the processes using /var, then kill -9 them.

If you end up with a zombie, which is likely, you will need to reboot the system.

Good Luck,

SEP
Johnny Damtoft
Regular Advisor


Well, while trying to find the source of this problem, even my lsof started hanging. :(

So a reboot has been initiated.

Thanks for your answers anyway. :)