MailScanner slow and not taking all resources available
David Vosburgh
vosburgh at dalsemi.com
Fri Dec 8 18:16:27 GMT 2006
Ugo Bellavance wrote:
> Hi,
>
> This is MailScanner version 4.54.6
>
> It sometimes happens on some servers (not always the same ones) that
> the INQ suddenly grows very fast. When I look at the server, CPU usage
> is only ~50% or less and MailScanner is not using much CPU.
>
> - The network seems to be OK (I can easily download a file at 1 MB/s
> on a 10 Mbps link).
> - DNS resolution seems to be quick
> - SA doesn't time out
> - System doesn't swap
> - disk i/o doesn't look saturated
>
> Here is an excerpt of dstat output:
> ----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--
> usr sys idl wai hiq siq|_read _writ|_recv _send|__in_ _out_|_int_ _csw_
> 51 9 30 9 1 1| 0 1032k| 41k 246k| 0 0 | 998 857
> 64 6 29 0 1 1| 0 136k| 32k 286k| 0 0 | 808 550
> 53 8 39 0 0 1| 0 0 | 61k 625k| 0 0 |1179 530
> 80 13 8 0 0 1| 0 0 | 30k 319k| 0 0 | 846 1350
> 80 9 10 0 1 1| 0 0 | 41k 311k| 0 0 | 824 485
> 84 6 10 0 0 1| 0 1264k| 29k 281k| 0 0 | 913 510
> 54 13 32 0 1 1| 0 0 | 45k 327k| 0 0 | 950 579
> 74 10 16 0 0 2| 0 0 | 52k 317k| 0 0 | 953 984
> 54 16 30 0 0 1| 0 0 | 64k 323k| 0 0 | 995 676
> 18 6 76 0 0 1| 0 0 | 122k 292k| 0 0 |1152 1053
> 62 13 16 9 1 0| 0 19M| 74k 255k| 0 0 |1356 1154
> 64 13 22 0 1 1| 0 0 | 45k 317k| 0 0 | 917 937
> 68 8 25 0 0 0| 0 0 | 51k 325k| 0 0 | 923 558
> 44 21 34 0 1 2| 0 184k| 47k 299k| 0 0 | 969 647
> 47 20 20 13 0 2| 0 19M| 44k 96k| 0 0 |1002 634
>
> top:
>
> 10:07:18 up 40 days, 1:22, 2 users, load average: 1.62, 1.20, 1.07
> 95 processes: 94 sleeping, 1 running, 0 zombie, 0 stopped
> CPU states: cpu user nice system irq softirq iowait idle
> total 12.0% 0.0% 6.2% 0.0% 0.4% 0.3% 80.6%
> cpu00 14.2% 0.0% 5.0% 0.2% 0.6% 0.4% 79.6%
> cpu01 9.9% 0.0% 7.5% 0.0% 0.3% 0.3% 81.6%
> Mem: 3074084k av, 2809584k used, 264500k free, 0k shrd, 134216k buff
> 2141452k actv, 470156k in_d, 25672k in_c
> Swap: 2008104k av, 7580k used, 2000524k free 1809200k cached
>
> PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME CPU COMMAND
> 18116 root 18 0 78476 76M 3264 S 3.6 2.5 0:00 1 MailScanner
> 15502 root 15 0 390M 390M 784 S 3.3 13.0 1:20 1 milter-greylist
> 15970 root 25 0 13104 12M 3400 S 2.6 0.4 0:10 1 php
> 7807 root 15 0 390M 390M 784 S 1.0 13.0 0:21 1 milter-greylist
> 7843 root 16 0 390M 390M 784 S 0.9 13.0 0:28 1 milter-greylist
> 17886 root 15 0 78240 76M 3264 S 0.5 2.5 0:00 1 MailScanner
> 22380 root 15 0 76008 74M 3248 S 0.1 2.4 0:25 1 MailScanner
> 17840 root 15 0 1056 1056 812 R 0.1 0.0 0:00 0 top
>
>
> Anyone seeing this as well?
>
I've seen something similar on another admin's system. There were four
essentially identical systems (same hardware and OS, and fairly close in
terms of MS/SA revisions) filtering mail for one domain. One had a
backlog in mqueue.in of something like 8k messages, yet the load average
on this two-processor box was something like 0.2. I looked at most of
the normal system-level things, as you did. I ran test messages through
looking for obvious bottlenecks (none), and there were no SA timeouts at
all. Nothing leapt out at me. Looking at the maillog, despite the large
inbound queue, MailScanner was not picking messages up to process very
often: there were ten MS processes running, but a new batch was only
being picked up every few minutes. I helped him tweak the queue run
interval (lower) and the number of MS processes (fewer), and it seemed
to help a bit, but I honestly think that was more coincidence than
anything. I need to check back in with him this weekend to see how it's
behaving now.
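
One way to put a number on "not picking messages up very often" is to
measure the time between batch pickups in the maillog. The following is
only a rough sketch, not anything MailScanner ships. It assumes
syslog-style lines in which MailScanner logs "New Batch: Scanning N
messages" for each pickup; the function name batch_gaps and the year
parameter are made up for illustration (syslog timestamps carry no
year), and month names are assumed to be in English.

```python
import re
from datetime import datetime

# Assumed log line shape (adjust the pattern if your syslog format differs):
# "Dec  8 10:07:18 mx1 MailScanner[18116]: New Batch: Scanning 30 messages, 187654 bytes"
BATCH_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+\S+\s+MailScanner\[\d+\]:\s+"
    r"New Batch: Scanning \d+ messages"
)


def batch_gaps(lines, year=2006):
    """Return the seconds elapsed between consecutive batch pickups.

    Pickups from all MailScanner children are pooled, so the result shows
    how often *any* child started a new batch.
    """
    stamps = []
    for line in lines:
        match = BATCH_RE.match(line)
        if match is None:
            continue  # not a batch pickup line (sendmail, milter, etc.)
        # Collapse the padding syslog uses for single-digit days ("Dec  8").
        ts = " ".join(match.group("ts").split())
        # The year is supplied by the caller because syslog omits it.
        stamps.append(datetime.strptime(f"{year} {ts}", "%Y %b %d %H:%M:%S"))
    return [(later - earlier).total_seconds()
            for earlier, later in zip(stamps, stamps[1:])]
```

Passing an open log file (for example /var/log/maillog, though the path
varies by system) to batch_gaps gives a list of gaps in seconds. With a
large inbound queue and several children running, gaps of a few minutes
would match the behaviour described above.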
Dave