
What can cause a simple "ls" command hung?

Alex6126
Occasional Advisor

What can cause a simple "ls" command hung?

Dear experts,

I ran into an issue recently while our programmers were doing performance testing. This is an Oracle database server, and they had hundreds of users generating reports and running queries on it. I was watching glance closely. CPU and memory usage was around 60%, there was enough swap, and the system tables weren't full. This went on for a couple of hours, and then all of a sudden the server seemed to get stuck. I typed in the ls command and hit return: no response. I tried several other commands such as kcusage: no response. ssh from other servers: no response. I could still use the existing glance session to look at CPU, memory, disk, etc., and everything seemed normal. Later, when the programmers terminated the test, the "ls" command I had typed finally returned without any warning. I checked the system log; there was no warning or error...

My question is: if kcusage shows the kernel parameters are set high enough, and the server has plenty of free CPU/memory/swap and no disk bottleneck, what could cause this kind of issue?

Thanks.

Alex

 

 

 

10 REPLIES
Ken Grabowski
Respected Contributor

Re: What can cause a simple "ls" command hung?

It sounds like the root file system hit 100%.  Make sure that nobody is writing anything to root.

Alex6126
Occasional Advisor

Re: What can cause a simple "ls" command hung?

No file system was full during that time.

Thanks.

Alex

Ken Grabowski
Respected Contributor

Re: What can cause a simple "ls" command hung?

Interestingly, when the root file system hits 100% full, most other functions stop working: logging, responding. Take a lab system, if you have one, and see what happens.

Ken Grabowski
Respected Contributor

Re: What can cause a simple "ls" command hung?

With that said... I do remember having a similar experience with a Sybase IQ server that behaved just as if the root file system were full.  I believe a database patch and some changes to database memory and kernel memory settings were recommended. Have you checked with your application vendor?

Dennis Handly
Acclaimed Contributor

Re: What can cause a simple "ls" command hung?

I assume that ls wasn't used on an NFS filesystem?

It seems like something was causing the hang because it was so busy?

Where do you keep your shell history file, locally?

 

What HP-UX version are you on?

Alex6126
Occasional Advisor

Re: What can cause a simple "ls" command hung?

No NFS. It's 11.31 running Oracle 11.2.

By extracting mwa history data, we found that during that time a lot of the processes were blocked with "stop reason" = VM.

Could anyone explain how that could happen? We had plenty of free memory (>10G), though.

 

 

Date      |Time    |Sec/Intvl|Alive Intvl|PID  |Process Name|User Name|Pri|Interest Reason|Stop Reason|
09/01/2012|20:45:00|     59.9|      59.90| 8003|postmaster  |sfmdb    |130|C              |VM         |-D
09/01/2012|20:45:00|     59.9|      59.90|25195|oracleCW    |oracle   |130|C              |VM         |(LOCAL=NO
09/01/2012|20:45:00|     59.9|      59.90| 8884|sh          |root     |130|C              |VM         |/usr/sbin
09/01/2012|20:45:00|     59.9|      59.90|10497|oracleCW    |oracle   |130|C              |VM         |(LOCAL=NO
09/01/2012|20:45:00|     59.9|      59.90|17889|oracleCW    |oracle   |130|C              |VM         |(LOCAL=NO
09/01/2012|20:45:00|     59.9|      59.90|25498|oracleCW    |oracle   |130|C              |VM         |(LOCAL=NO
09/01/2012|20:45:00|     59.9|      59.90|17991|oracleCW    |oracle   |130|C              |VM         |(LOCAL=NO
09/01/2012|20:45:00|     59.9|      59.90|11118|oracleCW    |oracle   |130|C              |VM         |(LOCAL=NO
09/01/2012|20:45:00|     59.9|      59.90|26011|oracleCW    |oracle   |130|C              |VM         |(LOCAL=NO
09/01/2012|20:45:00|     59.9|      59.90|18403|oracleCW    |oracle   |130|C              |VM         |(LOCAL=NO
09/01/2012|20:45:00|     59.9|      59.90| 9098|sh          |oracle   |130|C              |VM         |
09/01/2012|20:45:00|     59.9|      59.90|26460|oracleCW    |oracle   |130|C              |VM         |(LOCAL=NO
09/01/2012|20:45:00|     59.9|      59.90|11685|oracleCW    |oracle   |130|C              |VM         |(LOCAL=NO
09/01/2012|20:45:00|     59.9|      59.90| 4352|oracleCW    |oracle   |130|C              |VM         |(LOCAL=NO
09/01/2012|20:45:00|     59.9|      59.90|19172|oracleCW    |oracle   |130|C              |VM         |(LOCAL=NO
09/01/2012|20:46:00|     59.6|      59.60| 3434|vxpal       |root     |130|C              |VM         |-a
09/01/2012|20:46:00|     59.6|      59.60| 1726|inetd       |root     |130|C              |VM         |-l
09/01/2012|20:46:00|     59.6|      59.60| 1540|sshd        |root     |130|C              |VM         |
09/01/2012|20:46:00|     59.6|      59.60| 2260|cimserver   |root     |130|C              |VM         |
09/01/2012|20:46:00|     59.6|      59.60| 3109|cron        |root     |130|C              |VM         |
09/01/2012|20:46:00|     59.6|      59.60| 2271|cimprovagt  |root     |130|C              |VM         |0
09/01/2012|20:46:00|     59.6|      59.60|12958|sh          |oracle   |130|C              |VM         |
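
To see how widespread the VM blocking was in each interval, the exported rows above could be tallied by time and stop reason. This is only a rough sketch in Python: it assumes the export is pipe-delimited exactly as pasted here, with the stop reason in the tenth field, and the function name `tally_stop_reasons` is made up for illustration. It also assumes a Python interpreter is available wherever the export is analysed.

```python
from collections import Counter

def tally_stop_reasons(text):
    """Count process rows per (interval time, stop reason).

    Assumes pipe-delimited rows laid out like the export in this post:
    date, time, ..., stop reason as the 10th field.
    """
    counts = Counter()
    for line in text.splitlines():
        fields = [f.strip() for f in line.split("|")]
        # Data rows start with an MM/DD/YYYY date; header and blank lines do not.
        if len(fields) < 10 or fields[0].count("/") != 2:
            continue
        interval_time, stop_reason = fields[1], fields[9]
        counts[(interval_time, stop_reason)] += 1
    return counts

# Two rows copied from the export above, used as sample input.
sample = """\
09/01/2012|20:45:00|  59.9|    59.90|      8003|postmaster      |sfmdb           | 130|  C         |        VM|-D
09/01/2012|20:46:00|  59.6|    59.60|      1540|sshd            |root            | 130|  C         |        VM|
"""

for (interval_time, reason), n in sorted(tally_stop_reasons(sample).items()):
    print(interval_time, reason, n)
```

Feeding the whole export through the function gives one count per interval and stop reason, which makes it easier to see when the blocking started and whether any other stop reason shows up alongside VM.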

Alex6126
Occasional Advisor

Re: What can cause a simple "ls" command hung?

Sorry, it does have NFS.

Dennis Handly
Acclaimed Contributor

Re: What can cause a simple "ls" command hung?

>we have plenty of free memory (>10G) though.

 

Are you sure you had the memory when it hung?  Your output indicates you didn't.

Also, did you have swap in/out stats during this period?

 

It's probably too late now but what does "swapinfo -tam" show?

Dave Olker
HPE Pro

Re: What can cause a simple "ls" command hung?


Alex6126 wrote:

sorry , it has NFS..


Does this mean you were sitting in an NFS mounted filesystem when you issued the ls command that hung?  Assuming you were not, do you know if all NFS filesystems are responding to requests?  Even if you were sitting in a local directory, I've seen symbolic links, PATH variables, etc. cause an ls to hang if any of the NFS filesystems are not responding.
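
One way to check whether any PATH directory is on a filesystem that has stopped responding is to stat each directory in a child process with a time limit, so the check itself cannot hang your shell. The following is a minimal sketch in Python, assuming an interpreter is installed on the box; `probe_dirs` and the 5-second timeout are my own illustrative choices, not an HP-UX tool.

```python
import os
import subprocess
import sys
import time

def probe_dirs(dirs, timeout=5.0):
    """Return {dir: 'ok' | 'error' | 'timeout'} by stat-ing each dir in a child process."""
    results = {}
    for d in dirs:
        # The stat runs in a separate process so a hung filesystem blocks the child, not us.
        child = subprocess.Popen(
            [sys.executable, "-c", "import os, sys; os.stat(sys.argv[1])", d],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        deadline = time.monotonic() + timeout
        while child.poll() is None and time.monotonic() < deadline:
            time.sleep(0.05)
        rc = child.poll()
        if rc is None:
            # Do not wait() here: a child stuck in an uninterruptible
            # filesystem call may not exit even after being killed.
            child.kill()
            results[d] = "timeout"
        else:
            results[d] = "ok" if rc == 0 else "error"
    return results

if __name__ == "__main__":
    path_dirs = [d for d in os.environ.get("PATH", "").split(os.pathsep) if d]
    for d, status in probe_dirs(path_dirs).items():
        print(f"{status:8} {d}")
```

A "timeout" entry would point at a directory worth a closer look; "error" usually just means the directory does not exist. Note that this only probes the PATH entries themselves, not symbolic links inside the directory being listed.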

 

If you have an HP support contract you could open a case with the Response Center and get a copy of the tusc utility.  That would show you which system call the ls command hung executing.  That usually gives you a good idea where the underlying problem resides, or at least gives you a nudge in the right direction.

 

Dave

Dennis Handly
Acclaimed Contributor

Re: What can cause a simple "ls" command hung?

>Even if you were sitting in a local directory, I've seen symbolic links, PATH variables, etc. cause an ls to hang if any of the NFS filesystems are not responding.

 

And shell history logging.

>That would show you which system call the ls command hung executing.

 

I think the big clue is that the processes were stopped on VM.