MailScanner SpamAssassin Timeout cause CPU100%

Glenn Steen glenn.steen at gmail.com
Fri May 24 13:32:10 IST 2013


On 24 May 2013 11:50, 东风 <dongwind at 21cn.com> wrote:
> Hi Martin, could you tell me more, please? I've seen the URL, but I can't
> understand how to use the cron-job method instead of the Bayes expiry options.
>
(snip)
What Martin is getting at is that you can create a cron job that runs
"sa-learn --force-expire", scheduled for some "off hour" in the middle
of the night, and (in spam.assassin.prefs.conf or similar) disable
auto-expiry of the Bayes database.
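
As a rough sketch of that setup (not a tested recipe): the SpamAssassin
option that turns off automatic expiry is "bayes_auto_expire 0". The prefs
path below, the script location and the 03:30 schedule are assumptions for
illustration only, so adjust them to your install. A crontab line such as
"30 3 * * * /usr/local/sbin/force_bayes_expire.py" could then start a small
wrapper that calls run_forced_expiry() from something like this:

```python
import subprocess

# Assumption: MailScanner's SpamAssassin prefs live here; adjust to your install.
PREFS_FILE = "/etc/MailScanner/spam.assassin.prefs.conf"


def build_expire_command(prefs_file=PREFS_FILE, sa_learn="sa-learn"):
    """Return the argv list for a forced Bayes expiry using prefs_file."""
    # -p points sa-learn at the same prefs file MailScanner uses, so it
    # should pick up the same Bayes settings (verify this on your system).
    return [sa_learn, "-p", prefs_file, "--force-expire"]


def run_forced_expiry(prefs_file=PREFS_FILE):
    """Run the forced expiry and return sa-learn's exit status."""
    return subprocess.run(build_expire_command(prefs_file)).returncode
```

Passing the prefs file explicitly is a design choice here: without it,
sa-learn may work on the invoking user's own Bayes database rather than the
one MailScanner uses.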
But to see if this is really the problem you have, you can run
"sa-learn --force-expire" manually a couple of times and time each run.
If you set the SA timeout too low (which I'm almost certain you have
done!), the expiry will never finish ... which leaves even more work for
the next attempt, and so on. Increase your SA timeout to at least 5 minutes.
Also, if you have any files named like bayes_toks.expire<numbers>, you
very likely have the expiry problem. Forcing an expire may be all you
need do to alleviate the problem, in which case you needn't bother
with the cron job/disabling auto-expiry... Experimentation will be
needed to tell which is best in your particular case;-)
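
The two checks above (timing a manual expire, and looking for leftover
bayes_toks.expire<numbers> files) can be sketched like this. The function
names are made up for illustration, and you have to supply the directory
that holds your Bayes files yourself; I'm not assuming where it is.

```python
import os
import subprocess
import time

EXPIRE_PREFIX = "bayes_toks.expire"


def time_command(argv):
    """Run argv and return (exit_code, elapsed_seconds)."""
    start = time.monotonic()
    completed = subprocess.run(argv)
    return completed.returncode, time.monotonic() - start


def find_expire_leftovers(bayes_dir):
    """Return sorted names of bayes_toks.expire<numbers> files in bayes_dir."""
    return sorted(
        name for name in os.listdir(bayes_dir)
        # Keep only names whose suffix after the prefix is purely numeric.
        if name.startswith(EXPIRE_PREFIX) and name[len(EXPIRE_PREFIX):].isdigit()
    )
```

For example, time_command(["sa-learn", "--force-expire"]) gives you the
run time in seconds to compare against your SA timeout, and a non-empty
list from find_expire_leftovers() suggests earlier expiry runs were cut
short.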

Another thing to look at, which can have catastrophic ramifications if
it has happened, is whether you have a bayes_seen file that has grown ...
huge... It will grow over time and in the end, updating it will
dominate the processing time of bayes... After all, IO in *nix is
almost always CPU-bound, so having to wade through a huge file for
every message/batch/child can become the thing that brings your system
to its knees.
If you do have a very large (100 MiB+) bayes_seen file, simply remove
it. If you want to play it safe, stop MailScanner, remove it and then
restart MailScanner.
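
A quick way to check the size, as a sketch only: the 100 MiB limit is just
the rough figure from above, the function name is made up, and the
directory holding bayes_seen is something you pass in yourself.

```python
import os

# 100 MiB, the rough "very large" mark mentioned above.
LARGE_SEEN_BYTES = 100 * 1024 * 1024


def bayes_seen_too_large(bayes_dir, limit=LARGE_SEEN_BYTES):
    """Return True if bayes_dir/bayes_seen exists and is at least limit bytes."""
    path = os.path.join(bayes_dir, "bayes_seen")
    try:
        return os.path.getsize(path) >= limit
    except FileNotFoundError:
        # No bayes_seen file at all, so nothing to worry about here.
        return False
```

The function only reports; it deliberately does not delete anything, so
the removal (with MailScanner stopped, if you want to play it safe) stays
a manual step.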

Cheers!
-- 
-- Glenn
email: glenn < dot > steen < at > gmail < dot > com
work: glenn < dot > steen < at > ap1 < dot > se

