Another call for improvements

Steve Freegard smf at f2s.com
Wed May 31 10:01:31 IST 2006


On Wed, 2006-05-31 at 10:01 +0200, Holger Gebhard wrote:
> Is it possible to modify the spamassassincache feature a little bit?
> 
> Most incoming mails are spam so the caching is very useful and speeds up 
> MailScanner a lot.
> But when a spam message is wrongly detected as nonspam, the cached result 
> is always returned until the cache entry times out.
> Any custom rules added to detect the message are ignored until the cache 
> times out or the database is deleted completely.
> 
> I think it occurs very rarely that a "real" nonspam message is sent twice, 
> so nonspam caching seldom gives a great speedup.

Actually -- the non-spam caching gives a big performance boost if you do
recipient splitting in your MTA.

Imagine a non-spam e-mail with 10 recipients - this gets split into 10
separate e-mails at the MTA level prior to MailScanner.  Without
non-spam caching, you have to SpamAssassinate the same message 10 times,
compared to once with the cache.
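
To make the 10-recipient case concrete, here is a minimal sketch of a
result cache.  It is not MailScanner's actual code: the class name, the
scan callback and the choice of a SHA-1 digest of the message body as
the cache key are all assumptions made for illustration.

```python
import hashlib


class ResultCache:
    """Hypothetical result cache keyed on a digest of the message body."""

    def __init__(self, scan):
        self._scan = scan      # expensive check, e.g. a SpamAssassin run
        self._results = {}     # digest -> cached verdict
        self.scans = 0         # how many times the expensive check ran

    def check(self, body: bytes) -> str:
        # Assumption: copies produced by recipient splitting share the
        # same body, so they map to the same key.
        key = hashlib.sha1(body).hexdigest()
        if key not in self._results:
            self.scans += 1
            self._results[key] = self._scan(body)
        return self._results[key]


# One message split into 10 copies by the MTA.
cache = ResultCache(lambda body: "non-spam")
copies = [b"Same message body"] * 10
verdicts = [cache.check(copy) for copy in copies]
print(cache.scans)  # the expensive scan ran once for all 10 copies
```

With the cache disabled for non-spam, the scan callback would run once
per copy instead.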

> A useful feature would be to add some config options like:
> 
> Cache NonSpam = yes/no
> Cache LowSpam = yes/no
> Cache HighSpam = yes/no
> Cache Virus = yes/no
> 
> or simply when a cachetiming set to "0" no caching is done for the category.

With settable options such as these, this would get my vote.  Non-spam
caching is most useful with recipient splitting; otherwise it probably
gives little benefit and just increases cache contention.

Thinking about this further -- it should be possible to increase
performance and reduce cache contention on a busy system with recipient
splitting by *not* caching non-spam results if the message is only to a
single recipient. 

The config options would then be something like:

Cache Non-Spam = yes/no/multi-recipient
Cache Low-Spam = yes/no/multi-recipient
Cache High-Spam = yes/no
Cache Viruses = yes/no
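
As a rough sketch of how those settings could be evaluated (this is not
existing MailScanner code; the function name, the category keys and the
default values are hypothetical, and it assumes the original recipient
count of the message is known at the time the result is stored):

```python
# Hypothetical defaults mirroring the proposed options.
DEFAULT_POLICY = {
    "non-spam": "multi-recipient",
    "low-spam": "multi-recipient",
    "high-spam": "yes",
    "virus": "yes",
}


def should_cache(category, recipient_count, policy=DEFAULT_POLICY):
    """Decide whether a result in this category should be cached.

    recipient_count is assumed to be the number of recipients of the
    original message, before any splitting.
    """
    setting = policy[category]
    if setting == "yes":
        return True
    if setting == "no":
        return False
    if setting == "multi-recipient":
        # Only worth caching when further copies are expected.
        return recipient_count > 1
    raise ValueError(f"unknown cache setting: {setting!r}")


print(should_cache("non-spam", 1))    # single recipient: not cached
print(should_cache("non-spam", 10))   # split message: cached
print(should_cache("high-spam", 1))   # always cached
```

A single-recipient non-spam message then never touches the cache, which
is where the reduced contention would come from.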

Cheers,
Steve.


