Operating System - HP-UX

Ansi C read hang in _read_sys ()

 
MTSU_SAN
Regular Advisor

I have a C process, compiled with cc (patches to B.11.11.02), which is part of my mail server. It seems to hang at the same place every time I try to delete a mail user. I have attached the C code and a gdb backtrace. Since it is evidently a library call that hangs, I would like to know if it can be fixed!
Steven Gillard_2
Honored Contributor

Re: Ansi C read hang in _read_sys ()

This is the read() system call:

n = read(fd, p, left);

Depending on what type of file the fd variable refers to, it's perfectly valid for it to block waiting for more data, especially if it's reading from a socket or a terminal.

Given that it looks like some sort of database code is calling your function, it's possible that the bug is there:

#3 0x438bc in starttxn_or_refetch () at cyrusdb_flat.c:252
#4 0x439bc in myfetch () at cyrusdb_flat.c:270
#5 0x43ab0 in fetch () at cyrusdb_flat.c:291

As a first step, find out exactly what read() is trying to read from. You can do this by using glance or lsof to display the process's open files. Use the debugger to find the value of the variable "fd", then match that descriptor number against the list of open files.

If it is a normal file that read() is hanging on, then perhaps you have a hardware or file system problem, so check syslog for errors.

Regards,
Steve
MTSU_SAN
Regular Advisor

Re: Ansi C read hang in _read_sys ()

The fd is an integer file descriptor to the mailboxes.db flat file. There has been no locking of the file done external to the read.
Steven Gillard_2
Honored Contributor

Re: Ansi C read hang in _read_sys ()

Can another process read this file? What happens if you run:

$ xd mailboxes.db > /dev/null

If that hangs as well you might have a hardware or file system problem, so have a close look at syslog and the diags.

Regards,
Steve
MTSU_SAN
Regular Advisor

Re: Ansi C read hang in _read_sys ()

The xd command does not hang--I am able to do normal grep or tail commands on the file, while this process grabs more and more cpu load. Also, if I attach with gdb, I can force it to break and continue, so maybe it is not actually hung here, but I've just caught the backtrace at this point every time. Otherwise it would not let me stop the program and add a breakpoint, then go there, right?
Steven Gillard_2
Honored Contributor

Re: Ansi C read hang in _read_sys ()

If it's using CPU, it's unlikely to be 'hung' in read(). The program is more likely to be in an infinite loop, continually trying to read from this file. If you run a tusc trace you will be able to confirm this.

How big is the file? If it's a really big file it may just be taking a long time to read. Also note that "left" should be of type off_t instead of int, otherwise you will get problems if the file is larger than 2 GB.

A couple of other suggestions:

- use glance to view the process's offset and see if it is increasing over time

- Now that you've attached with gdb, have a closer look at some of the variables in the read() loop, in particular "left", "n" and "p".

- Step through the code instead of continuing and watch the variables. Also make sure that it is remaining in the map_refresh function and not looping from higher up.
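For comparison while stepping, here is a minimal sketch of a read loop that cannot spin forever. It is not the attached code: the function name read_full is hypothetical, and only the variable names "left", "n" and "p" are borrowed from the discussion. It uses size_t for "left" because that is the type read() takes for its count argument. The two cases worth checking in the real loop are a return of 0 (end of file, for example if the file is shorter than the size the caller expected) and a return of -1.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: try to read exactly len bytes from fd into buf.
 * Returns the number of bytes actually read (less than len means EOF
 * was reached first), or -1 on a read error other than EINTR. */
ssize_t read_full(int fd, void *buf, size_t len)
{
    char *p = buf;
    size_t left = len;

    while (left > 0) {
        ssize_t n = read(fd, p, left);

        if (n == 0)
            break;              /* EOF: stop instead of retrying forever */
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: retry */
            return -1;          /* real error: report it to the caller */
        }
        p += n;
        left -= (size_t)n;
    }
    return (ssize_t)(len - left);
}
```

A loop that treats n == 0 or n == -1 as "try again" without changing "left" would show exactly the reported symptoms: rising CPU use and a backtrace that always lands in read().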

Regards,
Steve
MTSU_SAN
Regular Advisor

Re: Ansi C read hang in _read_sys ()

I do not know what you mean by tusc trace--maybe this is a development tool I do not have!

The file is only 4 MB, so it is not really that big.

I have tried stepping through the program--unfortunately, with the way it is compiled, it won't let me print any of the variables I need to look at, the "optimization" level is too high. I can't stop the mail server to recompile, and since we are having bad performance as it is, I really don't want it any slower/less efficient! On my backup system, there is no load, so I don't see the same response, at all.
Steven Gillard_2
Honored Contributor

Re: Ansi C read hang in _read_sys ()

Tusc is a great tool; it displays the system calls a process is making. You can get it from http://hpux.connect.org.uk/ - just perform a package search for 'tusc'. Definitely have a look at what it has to say; it's the next best thing to attaching a debugger to the process.

Regards,
Steve