MailScanner + Bayes on SQL

Dhawal Doshy dhawal at netmagicsolutions.com
Tue May 16 13:15:57 IST 2006


Kai Schaetzl wrote:
> Dhawal Doshy wrote on Mon, 15 May 2006 12:09:33 +0530:
> 
>> precisely.. See, 
>> http://wiki.mailscanner.info/doku.php?id=documentation:anti_spam:spamassassin:bayes:sql
> 
> Ah, thanks, seems I read the wrong wiki :-)
> 
>> mysql> SELECT id, username, spam_count, ham_count, token_count FROM 
>> bayes_vars;
> 
> Seems to be the one that's also proposed in the wiki: root.
> 
> I'm still waiting for the --restore to finish; I've got quite a few tokens. One caveat 
> I've already noticed is that storing it in MySQL takes much more space, maybe three times 
> as much as with dbm. The indexes take up a lot.

Yes, but disk is cheap. Comparing MySQL (InnoDB) with DBM: scanning and 
expiry are much faster, forgetting is slower, and learning takes about 
as long as it does with DBM.

See these for more details:
http://wiki.apache.org/spamassassin/BayesBenchmark
http://wiki.apache.org/spamassassin/BayesBenchmarkResults

Plus, SQL lets you share one Bayes database across multiple front-end MX 
servers, and permission errors become a thing of the past.
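
For the shared setup, a minimal sketch of the SpamAssassin settings each 
front-end would carry is below. The host name (db.example.com), database 
name, SQL user and password are placeholders I made up for illustration; 
adjust them to your site. The override username "root" follows what the 
wiki page proposes, so that every scanner reads and writes the same token 
set.

```
# Store Bayes data in MySQL instead of local DBM files
bayes_store_module          Mail::SpamAssassin::BayesStore::MySQL

# DSN format is DBI:driver:database:hostname (placeholder values here)
bayes_sql_dsn               DBI:mysql:bayes:db.example.com
bayes_sql_username          bayes
bayes_sql_password          CHANGE_ME

# Force one Bayes "user" so all front-ends share the same tokens
bayes_sql_override_username root
```

With the same lines on every MX, the bayes_vars query quoted above should 
show a single row for that username no matter which front-end did the 
learning.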

- dhawal

> Kai

