mailq

Ricardo Macedo
Occasional Advisor

mailq

Hi all!

I'm having a sendmail problem!

Can anyone help?

In the maillog file I see the message below:

"runqueue: Skipping queue run -- load average too high"

Thanks in advance

Ricardo Macedo
16 REPLIES
Fred Ruffet
Honored Contributor

Re: mailq

This is not a sendmail problem but a server load problem. Mail delivery is delayed because of the high load, but the messages will be sent once the load decreases.

Regards,

Fred
--

"Reality is just a point of view." (P. K. D.)
Marcel Boogert_1
Trusted Contributor

Re: mailq

Hi there,

Try to restart the sendmail daemon.

$ mailq
$ /sbin/init.d/sendmail stop
$ /sbin/init.d/sendmail start
$ mailq

Take a look at the difference...

MB.
Muthukumar_5
Honored Contributor

Re: mailq

It looks like a lot of mail is sitting in the queue without being sent.

Run mailq to list the mail queue.

You can also use mailq -v or sendmail -bp, or look at the messages directly in the /var/spool/mqueue directory.

Try decreasing the message resend time in the sendmail.cf file.
Easy to suggest when don't know about the problem!
Ricardo Macedo
Occasional Advisor

Re: mailq


Marcel,

I just tried that and it is still the same. Look below:

# mailq
Mail Queue (7 requests)
--Q-ID-- --Size-- -----Q-Time----- ------------Sender/Recipient------------
IAA10387X       5 Thu Sep  9 08:56 zpgt
                                   rhicardo@gmail.com
JAA11033X       5 Thu Sep  9 09:58 zpgt
                                   rhicardo@gmail.com
IAA10397X       5 Thu Sep  9 08:57 zpgt
                                   rhicardo@gmail.com
IAA10395X       6 Thu Sep  9 08:57 zpgt
                                   rhicardo@gmail.com
IAA10393X       6 Thu Sep  9 08:56 zpgt
                                   ricardor@br-petrobras.com.br
JAA10886X      29 Thu Sep  9 09:41 zpgt
                                   zpgt
IAA10390X      46 Thu Sep  9 08:56 zpgt
                                   rhicardo@gmail.com

And Fred,

My server does not seem overloaded. I ran top and saw:

"Load averages: 36.62, 36.44, 36.07"

The "Idle" value is "99%".

Thanks for your help.


Ricardo Macedo
Fred Ruffet
Honored Contributor

Re: mailq

A load of 36 and an idle percentage of 99%... let me tell you, your server is certainly a little bit overloaded :)

A high load together with a high idle percentage may mean that your server spends a lot of time scheduling processes rather than running them.

Regards,

Fred
--

"Reality is just a point of view." (P. K. D.)
Marcel Boogert_1
Trusted Contributor

Re: mailq

Ricardo,

Do you use some kind of relay server for your mail? If so, check that server for problems.

You can also try the mailx -v command and check its output.

MB.
Patrick Wallek
Honored Contributor

Re: mailq

What is your timeslice kernel parameter set to? If it is anything but 10, it should be changed.
Fred Ruffet
Honored Contributor

Re: mailq

Could you also post the output of:
sar 1 5

Regards,

Fred
--

"Reality is just a point of view." (P. K. D.)
Ricardo Macedo
Occasional Advisor

Re: mailq



Fred,

The result is below :

# sar 1 5

HP-UX sede3 B.11.00 U 9000/800    09/09/04

11:26:13    %usr    %sys    %wio   %idle
11:26:14       0       0     100       0
11:26:15       0       1      99       0
11:26:16       0       1      99       0
11:26:17       0       0     100       0
11:26:18       0       1      99       0

Average        0       1      99       0


Thanks again.
Ricardo Macedo
Fred Ruffet
Honored Contributor

Re: mailq

So: a load of 36, an idle percentage near 100% in top, and a server spending all its time waiting on I/O. Sendmail is right, your server is a little bit overloaded. The running processes seem to be waiting on I/O. Either the I/O on this box is too slow or there are too many concurrent processes. Have a look at which processes are running and what they are doing. Once you resolve this, sendmail will do its job.
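A quick way to pull the I/O wait figure out of a sar run, as a rough sketch only. The function name wio_average is made up for illustration, and it assumes the four-column layout shown in the sar output above (%usr %sys %wio %idle); other sar versions may order the columns differently.

```shell
# wio_average: read sar output on stdin and print the %wio value
# from the "Average" line (assumed to be the 4th field).
wio_average() {
    awk '$1 == "Average" { print $4 }'
}

# Example using numbers from this thread; on a live box you would
# pipe "sar 1 5" into the function instead.
printf '%s\n' \
    '11:26:13 %usr %sys %wio %idle' \
    '11:26:14 0 0 100 0' \
    'Average 0 1 99 0' | wio_average
```

A value close to 100 here, combined with a high load average, points at processes blocked on I/O rather than at a busy CPU.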

Regards,

Fred
--

"Reality is just a point of view." (P. K. D.)
Muthukumar_5
Honored Contributor

Re: mailq

What do you get with mailq -v?
It will also show load details and message failures.

Also check whether the mail domains resolve, for example:

nslookup gmail.com

etc.
Easy to suggest when don't know about the problem!
Ricardo Macedo
Occasional Advisor

Re: mailq


Hi all!

Fred is right! My server is having I/O problems and sendmail is waiting to process messages!

I will look into the potential I/O problem.

Thanks!
Ricardo Macedo
Bill Hassell
Honored Contributor

Re: mailq

A load average of 36 is extremely high unless you have a 32-processor system! The 99% I/O wait (%wio) in your sar output means processes are stuck waiting on I/O, which may be due to I/O errors or possibly memory errors. Look at the end of /var/adm/syslog/syslog.log for error messages. If you did not load the diagnostics or they are misconfigured, there may not be any error messages.

A normal load is 1x-2x the number of processors, so if you have a 2-processor system, a normal load is 2 to 4, a high load is 10, and 36 is a serious condition.
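To put a number on that rule of thumb, here is a small sketch that divides the load average by the processor count. The helper name load_per_cpu is hypothetical, and the processor count of 2 is only the example figure used above, not something reported for this server.

```shell
# load_per_cpu LOAD NCPU: print the load average per processor,
# rounded to one decimal place.
load_per_cpu() {
    awk -v load="$1" -v ncpu="$2" 'BEGIN { printf "%.1f\n", load / ncpu }'
}

# 1-minute load from the top output in this thread, with an assumed 2 CPUs.
load_per_cpu 36.62 2
```

By the 1x-2x guideline, a result between 1 and 2 would be normal; anything far above that deserves a closer look.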

sendmail is doing exactly what it is supposed to do: stop when load averages are too high. Depending on the version of sendmail you have, this limit may be 8 or 12 but can be adjusted in /etc/mail/sendmail.cf. Look for "load" in the sendmail.cf file.


Bill Hassell, sysadmin
Ricardo Macedo
Occasional Advisor

Re: mailq


Hi everyone,

My syslog.log file is logging the messages below:

Sep 9 16:52:44 sede3 vmunix: SCSI: Abort Tag -- lbolt: 9333319, dev: 1f041200, io_id: 400252c
Sep 9 16:52:44 sede3 vmunix: SCSI: Request Timeout -- lbolt: 9333319, dev: 1f041000
Sep 9 16:52:44 sede3 vmunix: lbp->state: 4060
Sep 9 16:52:44 sede3 vmunix: lbp->offset: ffffffff
Sep 9 16:52:44 sede3 vmunix:
Sep 9 16:52:44 sede3 vmunix: lbp->uPhysScript: fa7fd000
Sep 9 16:52:44 sede3 vmunix: From most recent interrupt:
Sep 9 16:52:44 sede3 vmunix: ISTAT: 22, SIST0: 04, SIST1: 00, DSTAT: 80, DSPS: 00000006
Sep 9 16:52:44 sede3 vmunix: lsp: be9900
Sep 9 16:52:44 sede3 vmunix: bp->b_dev: 1f041000
Sep 9 16:52:44 sede3 vmunix: scb->io_id: 400252a
Sep 9 16:52:44 sede3 vmunix: scb->cdb: 28 00 00 00 00 10 00 00 04 00
Sep 9 16:52:44 sede3 vmunix: lbolt_at_timeout: 9333119, lbolt_at_start: 9330119
Sep 9 16:52:44 sede3 vmunix: lsp->state: 205
Sep 9 16:52:44 sede3 vmunix: lbp->owner: 1119c00
Sep 9 16:52:44 sede3 vmunix: bp->b_dev: 1f041100
Sep 9 16:52:44 sede3 vmunix: scb->io_id: 400252b
Sep 9 16:52:44 sede3 vmunix: scb->cdb: 28 00 00 00 00 10 00 00 04 00
Sep 9 16:52:44 sede3 vmunix: lbolt_at_timeout: 9330119, lbolt_at_start: 9330119
Sep 9 16:52:44 sede3 vmunix: lsp->state: d
Sep 9 16:52:44 sede3 vmunix:
Sep 9 16:52:44 sede3 vmunix: SCSI: Abort Tag -- lbolt: 9333319, dev: 1f041100, io_id: 400252b
Sep 9 16:52:44 sede3 vmunix: SCSI: Abort Tag -- lbolt: 9333319, dev: 1f041000, io_id: 400252a

I think I´m having a disk problem.

Any sugestions?

Thanks in advanc
Ricardo Macedo
Bill Hassell
Honored Contributor

Re: mailq

Big disk problems. Use:

grep lbolt /var/adm/syslog/syslog.log

to see how many disks are affected. The "dev" value is the address of the disk(s). To fix the problem, you first need to identify where this disk is used. Start with ioscan -kfnCdisk to see all your disks. Match the hardware address from the lbolt message to a drive, then look below that line to see the device file, like /dev/dsk/c1t2d0 or similar. Now take that device file and find it in vgdisplay -v to see which volume group uses that disk. If the disk is mirrored, you can use lvreduce and vgreduce to remove the bad disk from the volume group (several steps).
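The grep output repeats the same device many times, so it can help to boil it down to the distinct "dev" values. A minimal sketch, assuming the log lines look like the ones posted in this thread; the function name affected_devs is made up for illustration.

```shell
# affected_devs FILE: list each distinct "dev:" value found on
# syslog lines that mention lbolt.
affected_devs() {
    grep lbolt "$1" |
        sed -n 's/.*dev: \([0-9a-fA-F]*\).*/\1/p' |
        sort -u
}

# Typical use on the server itself:
#   affected_devs /var/adm/syslog/syslog.log
```

Each value printed is one device address to look up in the ioscan output.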

If the disk is not mirrored, replacement is very dependent on whether the disk is the boot disk or a peripheral disk.


Bill Hassell, sysadmin
Tom Danzig
Honored Contributor

Re: mailq

Looks like a problem with disk /dev/dsk/c4t1d0. Better have it replaced soon!
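For reference, the device file name can be read off the "dev" value in the log. The sketch below assumes the legacy HP-UX disk device number layout: two hex digits of major number, two of card instance, one of target and one of LUN, followed by option bits. That layout is an assumption here and the helper name dev_to_ctd is made up, so confirm the result against ioscan before touching any hardware.

```shell
# dev_to_ctd DEV: turn an 8-hex-digit dev value such as 1f041000
# into a cXtYdZ name, assuming the layout MM II T L OO
# (major, instance, target, LUN, options).
dev_to_ctd() {
    inst=$(printf '%s' "$1" | cut -c3-4)
    tgt=$(printf '%s' "$1" | cut -c5)
    lun=$(printf '%s' "$1" | cut -c6)
    printf 'c%dt%dd%d\n' "0x$inst" "0x$tgt" "0x$lun"
}

dev_to_ctd 1f041000
```

Under the same assumption, the other two addresses in the log (1f041100 and 1f041200) would decode to d1 and d2 on the same controller and target, so it may be worth checking whether more than one LUN is affected.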

BTW, you can adjust at what load average sendmail will stop sending messages. In the sendmail.cf file:

# load average at which we just queue messages
O QueueLA=8


You could change this to a higher value if desired; however, in this case you should reduce the load on the system instead.
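To see which load limits are currently active, something like the following may help. It is only a sketch: the function name la_options is hypothetical, it assumes the long option syntax shown above ("O Name=value"), and it also looks for RefuseLA, the companion sendmail option that makes the daemon refuse incoming connections above a given load.

```shell
# la_options FILE: print the uncommented load-average options
# (QueueLA and RefuseLA) from a sendmail.cf-style file.
la_options() {
    grep -E '^O (QueueLA|RefuseLA)=' "$1"
}

# Typical use on the server itself:
#   la_options /etc/mail/sendmail.cf
```

Older configuration files may use the short single-letter option form instead, in which case this pattern will find nothing and searching the file for "load" is the safer route.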