Spamassassin cache in mysql - feature request

Ken A ka at pacific.net
Wed Mar 18 14:12:53 GMT 2009


Eduardo Casarero wrote:
> 2009/3/18 Matt <spamlists at coders.co.uk>:
>> Eduardo Casarero wrote:
>>>
>>> I did some research on one of my servers: today I've processed 8505
>>> emails, with 338 cache hits. How can we measure whether sharing caches
>>> improves cache hits (a lot, a little, not at all)? (There is another
>>> server next to it.) Ideally without much development, so we can test
>>> whether having a MySQL server improves the scenario or not.
>>>
>>>
>>
>> On each box you need to do
>>
>> sqlite3 /var/spool/MailScanner/incoming/SpamAssassin.cache.db \
>>   "select md5 from cache;" > /tmp/cachehashes.servername
>>
>> Move the files onto one box, then run
>>
>> cat /tmp/cachehashes.serv1 /tmp/cachehashes.serv2 | sort | uniq -c \
>>   | sort -n | grep -v " 1 "
>>
>> Any lines in the output are hashes that appear on more than one server.
>>
> 
> Here are the results:
> 
> (2 MS servers) 34 matches, one cache had 820 records and the other 548.
> 
> I just don't see the benefit; however, somebody with a different
> scenario may have a different view...
> 
> my 5 cents.
> 

Very similar here: 2 servers, default cache timing.
5800 records, 55 matches, so fewer than 1% are duplicates.
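
For anyone who wants the percentage directly rather than counting lines by
eye, here is a minimal sketch of the same comparison. It assumes the two hash
files were exported with the sqlite3 command quoted above, one md5 per line.
The function name "overlap" and the choice to divide by the combined record
count are my own assumptions, not anything MailScanner provides. Read that
way, 55 matches out of 5800 records works out to about 0.95%.

```shell
#!/bin/sh
# overlap FILE1 FILE2
# Prints "<total records> <shared hashes> <percent shared>" for two hash
# lists (one md5 per line). "overlap" is a hypothetical helper name.
overlap() {
    a=$(mktemp) && b=$(mktemp) || return 1
    # comm requires sorted input; -u also drops repeats within one file.
    LC_ALL=C sort -u "$1" > "$a"
    LC_ALL=C sort -u "$2" > "$b"
    # Combined record count across both caches.
    total=$(cat "$a" "$b" | wc -l)
    # comm -12 keeps only the lines present in both files.
    shared=$(LC_ALL=C comm -12 "$a" "$b" | wc -l)
    rm -f "$a" "$b"
    awk -v t="$total" -v s="$shared" \
        'BEGIN { printf "%d %d %.2f\n", t, s, (t + 0 > 0 ? 100 * s / t : 0) }'
}

# Usage (file names follow the /tmp/cachehashes.servername pattern):
# overlap /tmp/cachehashes.serv1 /tmp/cachehashes.serv2
```

A low percentage here suggests a shared cache would add few extra hits,
though it only reflects what was in the two caches at the moment of export.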

On each machine, we see about 15% cache hits on low- and high-scoring spam 
(we don't log non-spam), but much of this is due to splitting recipients 
in sendmail, which sends the same message through the scanner more than once.

If you don't use anything in your MTA to stop spam (in front of MS/SA), 
you will probably get higher numbers, since you are using SA to catch a 
more common variety of spam, which tends to be duplicated more often.

Ken



-- 
Ken Anderson
Pacific Internet - http://www.pacific.net

