MailScanner slow and not taking all resources available

David Vosburgh vosburgh at dalsemi.com
Fri Dec 8 18:16:27 GMT 2006


Ugo Bellavance wrote:
> Hi,
> 
> This is MailScanner version 4.54.6
> 
> It sometimes happens on some servers (not always the same ones) that the 
> INQ suddenly grows very fast.  When I look at the server, CPU usage is only 
> ~50% or less and MailScanner is not using much CPU.
> 
> - The network seems to be OK (I can easily download a file at 1 MB/s 
> on a 10 Mbps link)
> - DNS resolution seems to be quick
> - SA doesn't time out
> - System doesn't swap
> - disk i/o doesn't look saturated
> 
> Here is an excerpt of dstat output:
> ----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--
> usr sys idl wai hiq siq|_read _writ|_recv _send|__in_ _out_|_int_ _csw_
>  51   9  30   9   1   1|   0  1032k|  41k  246k|   0     0 | 998   857
>  64   6  29   0   1   1|   0   136k|  32k  286k|   0     0 | 808   550
>  53   8  39   0   0   1|   0     0 |  61k  625k|   0     0 |1179   530
>  80  13   8   0   0   1|   0     0 |  30k  319k|   0     0 | 846  1350
>  80   9  10   0   1   1|   0     0 |  41k  311k|   0     0 | 824   485
>  84   6  10   0   0   1|   0  1264k|  29k  281k|   0     0 | 913   510
>  54  13  32   0   1   1|   0     0 |  45k  327k|   0     0 | 950   579
>  74  10  16   0   0   2|   0     0 |  52k  317k|   0     0 | 953   984
>  54  16  30   0   0   1|   0     0 |  64k  323k|   0     0 | 995   676
>  18   6  76   0   0   1|   0     0 | 122k  292k|   0     0 |1152  1053
>  62  13  16   9   1   0|   0    19M|  74k  255k|   0     0 |1356  1154
>  64  13  22   0   1   1|   0     0 |  45k  317k|   0     0 | 917   937
>  68   8  25   0   0   0|   0     0 |  51k  325k|   0     0 | 923   558
>  44  21  34   0   1   2|   0   184k|  47k  299k|   0     0 | 969   647
>  47  20  20  13   0   2|   0    19M|  44k   96k|   0     0 |1002   634
> 
> top:
> 
>  10:07:18  up 40 days,  1:22,  2 users,  load average: 1.62, 1.20, 1.07
> 95 processes: 94 sleeping, 1 running, 0 zombie, 0 stopped
> CPU states:  cpu    user    nice  system    irq  softirq  iowait    idle
>            total   12.0%    0.0%    6.2%   0.0%     0.4%    0.3%   80.6%
>            cpu00   14.2%    0.0%    5.0%   0.2%     0.6%    0.4%   79.6%
>            cpu01    9.9%    0.0%    7.5%   0.0%     0.3%    0.3%   81.6%
> Mem:  3074084k av, 2809584k used,  264500k free,       0k shrd,  134216k buff
>                    2141452k actv,  470156k in_d,   25672k in_c
> Swap: 2008104k av,    7580k used, 2000524k free                 1809200k cached
> 
>   PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME CPU COMMAND
> 18116 root      18   0 78476  76M  3264 S     3.6  2.5   0:00   1 MailScanner
> 15502 root      15   0  390M 390M   784 S     3.3 13.0   1:20   1 milter-greylist
> 15970 root      25   0 13104  12M  3400 S     2.6  0.4   0:10   1 php
>  7807 root      15   0  390M 390M   784 S     1.0 13.0   0:21   1 milter-greylist
>  7843 root      16   0  390M 390M   784 S     0.9 13.0   0:28   1 milter-greylist
> 17886 root      15   0 78240  76M  3264 S     0.5  2.5   0:00   1 MailScanner
> 22380 root      15   0 76008  74M  3248 S     0.1  2.4   0:25   1 MailScanner
> 17840 root      15   0  1056 1056   812 R     0.1  0.0   0:00   0 top
> 
> 
> Anyone seeing this as well?
> 
I've seen something similar on another admin's system.  There were four 
essentially identical systems (that is, same hardware and OS, and fairly 
close in terms of revs of MS/SA etc.) filtering mail for one domain. One 
had a backlog in mqueue.in of something like 8k messages, yet the load 
average on this two processor box was something like 0.2.  I looked at 
most of the normal things at a system level as you did.  I ran test 
messages through looking for obvious bottlenecks in the system (none), 
and no SA timeouts at all. Nothing leapt out at me.  In looking at the 
maillog, despite the large inbound queue, MailScanner was not picking 
messages up very often to process.  Although there were ten MS processes 
running, it was only picking up a new batch for processing every few minutes.  I 
helped him tweak the queue run interval (lower) and the number of MS 
processes (fewer), and it seemed to help a bit, but I honestly think it 
was more just coincidence.  I need to check back in with him this 
weekend to see how it's behaving now.
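
For anyone who wants to check the same thing on their own box, here is a 
rough sketch of how the gap between batch pick-ups could be measured from 
the maillog.  It assumes syslog-style lines in which MailScanner logs 
something like "New Batch: Scanning N messages" each time a child picks up 
work; the exact wording may differ between versions, so adjust the pattern 
to match your log.  The function name and the year argument are my own 
inventions for illustration (syslog timestamps carry no year).

```python
import re
from datetime import datetime

# Assumed log format, e.g.:
#   Dec  8 10:00:00 mx1 MailScanner[18116]: New Batch: Scanning 30 messages, 123456 bytes
# Adjust the pattern if your MailScanner version words the message differently.
BATCH_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d{1,2}\s\d{2}:\d{2}:\d{2})\s"
    r".*MailScanner\[\d+\]:\s+New Batch: Scanning \d+ messages"
)


def batch_gaps(lines, year=2006):
    """Return the seconds between consecutive batch pick-ups, all children combined."""
    times = []
    for line in lines:
        match = BATCH_RE.match(line)
        if match:
            # Syslog omits the year, so one is supplied by the caller.
            stamp = f"{year} {match.group('stamp')}"
            times.append(datetime.strptime(stamp, "%Y %b %d %H:%M:%S"))
    times.sort()
    return [(later - earlier).total_seconds()
            for earlier, later in zip(times, times[1:])]


if __name__ == "__main__":
    import sys
    # Usage: python batch_gaps.py < /var/log/maillog
    gaps = batch_gaps(sys.stdin)
    if gaps:
        print(f"{len(gaps) + 1} batches, longest gap {max(gaps):.0f}s, "
              f"average gap {sum(gaps) / len(gaps):.0f}s")
    else:
        print("fewer than two batch lines found")
```

If the inbound queue is large and the gaps still come out at several 
minutes, that matches what I saw: the children are idle between batches 
rather than busy scanning.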

Dave



