shutdown is taking more than about 15 minutes

Linux Node1 2.6.10-telco-1.46-mckinley-smp #1 SMP Fri May 30 18:29:43 UTC 2008 ia64 GNU/Linux

I have the Linux platform shown above. When I issue a shutdown, the system sometimes takes about 15 minutes to go down after it has broadcast the shutdown message. The command I issue is shutdown -r now.
Re: shutdown is taking more than about 15 minutes

Is there any cluster software running, or any other software that takes a long time to finish?

Re: shutdown is taking more than about 15 minutes

We have dual-node software systems. Even so, I believe shutdown should take care of it. When a shutdown is issued, SIGTERM is sent to all processes for a graceful exit; if processes are still stuck, SIGKILL is sent and the reboot is forced. Why do you think shutdown should hang for 15 minutes in any case?
Re: shutdown is taking more than about 15 minutes

Some things just won't die ;-)

Processes can be in an 'uninterruptible sleep', for example when performing I/O operations. They will then only receive the KILL when coming out of that state. Maybe you have a process blocked in such a state for a while. (Your fencing mechanism, maybe?)

Can you see exactly where it's hanging? The console should be telling you what the machine is currently trying to do. It would be useful to post the last few lines of console output you see when it hangs.

You can also try to pinpoint this by stopping services manually. Since the cluster software would be our prime suspect, see what happens when you manually shut down these services (using the init scripts in /etc/init.d) while the machine is still up.

an engineer's aim in a discussion is not to persuade, but to clarify.
Re: shutdown is taking more than about 15 minutes

Sorry, I forgot to ask an important question.

Are you using HP Serviceguard for Linux or another clustering solution?

Re: shutdown is taking more than about 15 minutes

It's a Linux system.
I did not see any console messages after the shutdown was issued; there were none for 15 minutes. Is there any debugging you suggest I add? For example:
1. Adding "top" output to see whether any process is hung.

2. Adding ps output to check the status of processes after the shutdown command is issued.

If you have a specific debug command that I can insert in my Perl code to find out which processes go into uninterruptible sleep, that would be a great help.

Another question: why would a process go into uninterruptible sleep? Can kill -9 not kill such processes either? I have seen many times that kill -9 does not help in killing processes.
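
For the debug command asked about here, a minimal sketch in shell: it keeps only the ps rows whose state column starts with D (uninterruptible sleep). The helper name only_uninterruptible is made up for illustration, and the filter assumes the state is the third column, as in the ps format shown in the comment.

```shell
#!/bin/sh
# Keep the header line plus every row whose 3rd column (process state)
# starts with "D", i.e. uninterruptible sleep.
# only_uninterruptible is a hypothetical helper name, not a standard tool.
only_uninterruptible() {
    awk 'NR == 1 || $3 ~ /^D/'
}

# Intended use (commented out so that sourcing this file has no side effects):
#   ps -eo pid,ppid,state,wchan,args | only_uninterruptible
```

Run from the same loop that logs the ps output, this would show whether anything is stuck in state D and, through the wchan column, which kernel function it is waiting in.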


Re: shutdown is taking more than about 15 minutes

I have one important question. Will shutdown hang if any process does not exit or is hung? I thought "shutdown -r now" would proceed with the reboot anyway, even if it is not able to kill some of the hung processes.

Please clarify whether my understanding is right or wrong, and describe how shutdown works, step by step.
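
The TERM-then-KILL sequence described in this thread can be sketched in shell as below. This only illustrates the idea for a single PID; it is not the code that shutdown or init actually runs. The name term_then_kill and the 5-second default grace period are assumptions made for the example.

```shell
#!/bin/sh
# Ask a process to exit with SIGTERM, give it a grace period,
# then send SIGKILL if it is still around.
# term_then_kill is a hypothetical helper; $1 = PID, $2 = grace in seconds.
term_then_kill() {
    pid=$1
    grace=${2:-5}

    # If the signal cannot be delivered, there is nothing left to do here.
    kill -TERM "$pid" 2>/dev/null || return 0

    waited=0
    while [ "$waited" -lt "$grace" ]; do
        # kill -0 delivers no signal; it only checks the PID still exists.
        kill -0 "$pid" 2>/dev/null || return 0
        sleep 1
        waited=$((waited + 1))
    done

    # Grace period is over: force the issue.
    kill -KILL "$pid" 2>/dev/null
    return 0
}
```

Note that even SIGKILL is only acted on once the process leaves an uninterruptible sleep, so a sequence like this can still stall on a process stuck in state D.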
Re: shutdown is taking more than about 15 minutes

"I did not observe any console message flashed after shutdown issued"

Are you really watching the console (not the terminal from which you issued the shutdown)? Alternatively, try watching /var/log/boot.log or /var/log/messages from another terminal in parallel.
Re: shutdown is taking more than about 15 minutes

Why would a process go into such an uninterruptible state: it mustn't get killed in the middle of an I/O operation, because that could cause corruption. However, a process is supposed to be in such a state for only a very short time, unless it's blocked for some reason. It will not ignore the kill -9; it will simply see it only when it returns from the operation.

That only illustrates why a process would not react to a kill -9 immediately, though.

The man page for shutdown is quite short and describes the shutdown process well.

Are you working on the system console, or through telnet/ssh?

We still need to know which cluster software you are using.

What's the output of these commands:

# whereis cmviewcl
# service cman status

Re: shutdown is taking more than about 15 minutes

service cman status
-su: service: command not found

service tcp status
-su: service: command not found

whereis cmviewcl

I have a console connection to the platform. It's not a telnet/ssh session, that's for sure.

I did not find a boot.log file under /var/log.

Jul 7 02:29:05 Node1 kernel: EXT2-fs warning: maximal mount count reached, running e2fsck is recommended
Jul 7 02:45:18 Node1 kernel: nfsd: last server has exited
Jul 7 02:45:18 Node1 kernel: nfsd: unexporting all filesystems

Shutdown was issued around 02:29:46

There are no messages in syslog or kern.log for about 15 minutes after shutdown was invoked.

Re: shutdown is taking more than about 15 minutes

OK, so not a recent Red Hat.

We're really going to need more info on your machine. The Linux distribution you're running, as well as the software on it, is important to know in case of problems.

Re: shutdown is taking more than about 15 minutes

What distribution are you using and how up to date is it?

Seems like you could be having a problem with NFS.

Can you replicate the same delay you have by just stopping NFS instead of running a shutdown?

Re: shutdown is taking more than about 15 minutes

I don't think I am using any NFS file systems.
1. More than the list of packages above, what I am interested in is how to debug this issue. Is there some kind of while loop I could run around the ps command to determine which processes are not killed after shutdown is executed?

2. Or I could run strace on the shutdown PID and determine how many processes shutdown is waiting on before actually bringing down the system.

Please suggest how we can debug this on the shutdown/init side, so that I can determine which process was hung for 15 minutes.

Re: shutdown is taking more than about 15 minutes

Module                  Size  Used by
bond0                 160698  0
sctp                  448112  6 [unsafe]
ipv6                  650568  17 sctp
capability             15208  0
commoncap              21336  1 capability
thermal                38836  0
fan                    16664  0
button                 21488  0
processor              44344  1 thermal
ipmi_devintf           22364  2
ipmi_si                76484  1
ipmi_msghandler       108740  2 ipmi_devintf,ipmi_si
scsi_dump              20568  0
dump_blockdev          32364  1 scsi_dump
dump_gzip              13944  0
zlib_deflate           56820  1 dump_gzip
dump                  100740  2 dump_blockdev,dump_gzip
rpqiosmp             8994240  0
e1000                 317856  0
tg3                   196480  0
e100                   92478  0
tulip                 119596  0
cmd64x                 30272  0 [permanent]
loop                   41576  0
vfat                   36910  0
msdos                  26954  0
fat                    90164  2 vfat,msdos
isofs                  65840  0
sr_mod                 48096  0
ide_cd                 90407  0
ide_core              276020  2 cmd64x,ide_cd
cdrom                  90712  2 sr_mod,ide_cd
dm_mod                143152  17
raid1                  45856  1
md                    104416  2 raid1

Above is the lsmod output; it may help you.
Re: shutdown is taking more than about 15 minutes

Aha, so you're on a Debian box. Good to know.

Can you try to stop services manually, one by one?
You could also try to change from runlevel 5 to runlevel 3, and then down to 1. Maybe you will run into the hang then.

Note that runlevel 1 is a maintenance mode; no remote access will be possible.

# init 3


# init 1

Just a thought..

Re: shutdown is taking more than about 15 minutes


What applications are you running on the box?

I've seen Oracle databases take ages (hours) to shut down cleanly, especially if they're busy.

After you've issued the shutdown -r command, take a look at what's running with top or similar. You should be able to work out what's not dying.



Re: shutdown is taking more than about 15 minutes

We finally traced the problem: wall was hung in a tty_write call. The explanation is below.
Please let me know what the root cause is for tty_write hanging for about 15 minutes. This wall was invoked from shutdown.

wall hangs in tty_write for about 15 minutes on a Debian 2.6 kernel


My system was at about 20% CPU utilization. The console was open and some other terminals were open.

0. System info: Linux Node1 2.6.10-telco-1.35.1-mckinley-smp #1 SMP Sun Nov 18 20:58:46 UTC 2007 ia64 GNU/Linux
1. The system had invoked the shutdown -r now command.
2. As part of this, it was sending a wall message to all ttys.
The wall message is "The system is going down for reboot NOW!"
3. We found that wall was hanging for 15 minutes and was not posting this message to the tty.
4. For that reason, shutdown was hanging for about 15 minutes.

system ("while ( true );do sleep 25;date >> $rnps;ps -emo pid,ppid,psr,pcpu,pmem,vsz,rss,lwp,state,time,wchan=WIDE-WIDE-WCHAN-COLUMN,args>> $rnps; echo >> $rnps;done &");
shutdown -r now

====Code Ends===

The following was seen for about 15 minutes in the captured ps output:
30133 30131 - 0.0 0.0 3632 1744 - - 00:00:00 -/usr/bin/wall
- - 3 0.0 - - - 30133 S 00:00:00 tty_write


Re: shutdown is taking more than about 15 minutes

Hi Rob,
I have ssh sessions open to this Linux box. Do you think these ssh sessions were hung for 15 minutes, and that shutdown got hung because it was trying to post wall messages to the ssh terminals?

Appreciate your quick reply
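
If a write to a stuck terminal does turn out to be the cause, one possible way to keep a single terminal from holding up everything else is to bound each write with a timeout. A rough sketch, assuming the blocked writer is in an interruptible sleep (the ps capture shows state S in tty_write) and can therefore be killed; write_with_timeout is a hypothetical helper, not part of wall or shutdown.

```shell
#!/bin/sh
# Write a message to a target (a tty device, a file, a FIFO ...) but give up
# after a time limit instead of blocking indefinitely.
# write_with_timeout is a hypothetical helper:
#   $1 = target path, $2 = message, $3 = limit in seconds.
write_with_timeout() {
    wt_target=$1
    wt_msg=$2
    wt_limit=$3

    # The writer runs in the background so a blocked open/write
    # cannot stall the caller.
    ( printf '%s\n' "$wt_msg" > "$wt_target" ) &
    wt_writer=$!

    # The watchdog kills the writer if it is still there after the limit.
    ( sleep "$wt_limit"; kill -KILL "$wt_writer" 2>/dev/null ) >/dev/null 2>&1 &
    wt_watchdog=$!

    # Exit status is 0 if the write completed, non-zero if it failed
    # or was killed by the watchdog.
    wait "$wt_writer"
    wt_status=$?

    kill "$wt_watchdog" 2>/dev/null
    wait "$wt_watchdog" 2>/dev/null
    return "$wt_status"
}
```

For example, write_with_timeout /dev/pts/3 "going down" 5 would try one terminal for at most 5 seconds (the device path here is only an example). This would not help against a writer stuck in a true uninterruptible sleep, since such a process cannot be killed until it leaves that state.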