Spamassassin cache in mysql - feature request
Ken A
ka at pacific.net
Wed Mar 18 14:12:53 GMT 2009
Eduardo Casarero wrote:
> 2009/3/18 Matt <spamlists at coders.co.uk>:
>> Eduardo Casarero wrote:
>>>
>>> I did some research on one of my servers: today I've processed 8505
>>> emails, with 338 cache hits. How can we measure whether sharing caches
>>> improves cache hits (a lot, a little, not at all)? (There is another
>>> server next to it.) Ideally without much development, so we can test
>>> whether having a MySQL server improves the scenario or not.
>>>
>>>
>>
>> On each box you need to run:
>>
>> sqlite3 /var/spool/MailScanner/incoming/SpamAssassin.cache.db "select md5 from cache;" > /tmp/cachehashes.servername
>>
>> Move the files onto one box, then run:
>>
>> cat /tmp/cachehashes.serv1 /tmp/cachehashes.serv2 | sort | uniq -c | sort -n | grep -v " 1 "
>>
>> Any lines output will be hashes that appear on multiple servers.
>>
>
> Here are the results:
>
> (2 MS servers) 34 matches, one cache had 820 records and the other 548.
>
> I just don't see the benefit; however, somebody with a different
> scenario may have a different view...
>
> My 5 cents.
>
Very similar here: 2 servers, default cache timing.
5800 records, 55 matches: less than 1% are duplicates.
On each machine, we see about 15% cache hits on low- and high-scoring spam
(we don't log non-spam), but much of this is due to splitting recipients
in sendmail.
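
For anyone repeating the comparison, here is a rough sketch of how the
match count and percentage could be computed from two hash dumps. The
function name cache_overlap is made up for illustration, and it assumes
each file holds one md5 per line (as produced by the sqlite3 command
quoted above) and that the percentage is taken against the combined
record count of both servers.

```shell
#!/bin/sh
# Sketch only: cache_overlap is a hypothetical helper, not part of MailScanner.
# Arguments: two files, each with one md5 hash per line.
cache_overlap() {
    a=$(mktemp) && b=$(mktemp) || return 1
    # De-duplicate within each file so a hash repeated on one server
    # is not mistaken for a hash shared between servers.
    sort -u "$1" > "$a"
    sort -u "$2" > "$b"
    # comm -12 prints only the lines common to both sorted inputs.
    shared=$(comm -12 "$a" "$b" | wc -l)
    # Assumption: "records" means the combined count from both servers.
    total=$(cat "$a" "$b" | wc -l)
    rm -f "$a" "$b"
    awk -v s="$shared" -v t="$total" 'BEGIN {
        pct = (t > 0) ? 100 * s / t : 0
        printf "%d shared of %d records (%.2f%%)\n", s, t, pct
    }'
}
```

Usage would be something like
cache_overlap /tmp/cachehashes.serv1 /tmp/cachehashes.serv2
With the figures above (55 matches in 5800 records) the ratio works out
to roughly 0.95%.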
If you don't use anything in your MTA to stop spam (in front of MS/SA),
you will probably get higher numbers, since you are using SA to catch a
more common variety of spam, which tends to be duplicated more often.
Ken
>
>
>
>> --
>> MailScanner mailing list
>> mailscanner at lists.mailscanner.info
>> http://lists.mailscanner.info/mailman/listinfo/mailscanner
>>
>> Before posting, read http://wiki.mailscanner.info/posting
>>
>> Support MailScanner development - buy the book off the website!
>>
--
Ken Anderson
Pacific Internet - http://www.pacific.net