
linux server was hung and causing a web service outage

 
Diyana
Occasional Contributor


Our Linux server hung and caused a web service outage. I have to investigate why the server hung and report back with my findings. Can anyone advise me on how to start?


2 REPLIES
Matti_Kurkela
Honored Contributor

Re: linux server was hung and causing a web service outage

Find out what exactly was done to restore the server to working condition: was only the web service restarted, or was the entire server rebooted? If the server was rebooted, was there an error message on the server's console before the reboot? What did it say?

Practically all Linux servers write their logs to the /var/log directory: read them.
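As a starting point for reading the logs, here is a minimal sketch that searches the main system log for messages that often appear before a hang or crash. The file names /var/log/messages and /var/log/syslog are assumptions (the name depends on your distribution), and both the scan_log helper and its pattern list are illustrative, not complete.

```shell
#!/bin/sh
# Hypothetical helper: print lines (with line numbers) that match messages
# commonly seen before a hang or crash. The pattern list is only a sample.
scan_log() {
    # $1 = log file to search
    grep -inE 'kernel panic|oops|out of memory|hung_task|blocked for more than|i/o error' "$1"
}

# Assumption: the main system log is one of these files on your distribution.
for f in /var/log/messages /var/log/syslog; do
    if [ -r "$f" ]; then
        echo "== $f =="
        scan_log "$f" | tail -n 20
    fi
done
```

No matches does not prove the system was healthy: a hard hang often leaves nothing in the log at all, so also compare the time of the last log entry with the time the outage was reported.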

The web service application may produce yet another set of logs in an application-specific location: those logs may be informative too.

You should familiarize yourself with normal startup/shutdown messages of your Linux distribution, and of your web service application.

If the application logs just stop without the normal shutdown messages, the application may have crashed, or someone may have forcibly stopped it using the "kill -KILL" or "kill -9" commands.

If the OS logs just end without the normal shutdown messages, and then resume with the normal startup messages, the entire system has crashed/hung: if the cause is hardware-related and your server has hardware diagnostic log functionality, the hardware diagnostics might reveal something useful.
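One quick way to check for this pattern is the login accounting data shown by "last -x", which records both boots and clean shutdowns. The sketch below assumes the usual output format (newest entry first, record type in the first column); find_unclean_boots is a hypothetical helper name, and it can only judge boots that have an older record in the same file.

```shell
#!/bin/sh
# Reads `last -x`-style output (newest entry first) on stdin and prints each
# boot record that has no shutdown record directly before it in time.
find_unclean_boots() {
    awk '
        $1 == "reboot" || $1 == "shutdown" {
            # Entries are newest first, so this record is older than prev_line.
            if (prev_type == "reboot" && $1 != "shutdown") print prev_line
            prev_type = $1; prev_line = $0
        }
    '
}

# Run against the real accounting data if the `last` command is available.
if command -v last >/dev/null 2>&1; then
    last -x 2>/dev/null | find_unclean_boots
fi
```

Each line printed is a boot that followed a crash, hang or power loss rather than an orderly shutdown; its timestamp tells you where to look in the other logs.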

If there are no clues at all, there are a number of things you can do to get more information if the crash/hang happens again.

- For application-level problems, you can enable core dumps. Add "ulimit -c unlimited" to a suitable location to enable core dumps and teach your operators to try "kill -QUIT" or "kill -ABRT" to force a hanging application to produce a core dump before restarting it. The core dump may be a very large file, but a developer armed with a debugger can use the core dump to find out exactly what the program was doing when it was hung or when it crashed.

- For system-level problems, most modern Linux distributions designed for server use include some type of crash dump functionality. It is typically not enabled by default: read the documentation of your distribution to find out how to enable it and what its requirements are. If your server is hanging, you may need a way to send a hardware-level NMI signal to force the crash dump: read your hardware documentation to see if your server has a button that can be used to send an NMI signal.

- If you suspect there's a hardware-level problem, pay attention to the boot-up messages of the server, and use the hardware diagnostic programs provided by your server manufacturer.
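To make the first two suggestions more concrete, here is a rough sketch. It assumes the application is started from a shell script where the ulimit line can be added; force_core_dump is a hypothetical helper, and the 2-second wait is an arbitrary choice. The two files read under /proc and /sys are standard kernel interfaces, but whether a crash kernel is actually configured depends on your distribution's crash dump setup.

```shell
#!/bin/sh
# Allow core files of any size for processes started from this shell.
# This can fail if the hard limit set by the administrator is lower.
ulimit -c unlimited 2>/dev/null || echo "could not raise core file limit" >&2

# Show where the kernel writes core files (a file name pattern, or a pipe
# to a helper program if the value starts with "|").
if [ -r /proc/sys/kernel/core_pattern ]; then
    cat /proc/sys/kernel/core_pattern
fi

# Show whether a crash dump kernel is loaded (1 = loaded, 0 = not loaded).
if [ -r /sys/kernel/kexec_crash_loaded ]; then
    cat /sys/kernel/kexec_crash_loaded
fi

# Hypothetical helper: ask a hung process to dump core before restarting it.
# $1 = process ID. Tries SIGQUIT first, then SIGABRT if the process is
# still there (for example because it ignores SIGQUIT).
force_core_dump() {
    kill -QUIT "$1" 2>/dev/null || return 1
    sleep 2
    if kill -0 "$1" 2>/dev/null; then
        kill -ABRT "$1"
    fi
}
```

An operator would run something like "force_core_dump 12345" with the real process ID of the hung application, then restart the service once the core file has been written.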

MK
Diyana
Occasional Contributor

Re: linux server was hung and causing a web service outage

I'm very new to Linux. According to the log that was given to me, users complained that they were unable to ping the server, and a reboot was performed by our engineer. I have checked CPU, memory and swap utilization, and everything was under the threshold.