Bayes setup

Julian Field mailscanner at ecs.soton.ac.uk
Tue Apr 8 14:35:16 IST 2003


At 11:00 08/04/2003, you wrote:
>Greetings:
>
>I am running SpamAssassin 2.52 in MailScanner, and I've also been
>following the discussions of the SpamBayes project fairly closely for
>some months. One of the crucial elements of Bayesian detection is
>training, but I don't see any place that documents how to get ham and
>spam messages routed back to the server for training.
>
>Is there some documentation? Am I just missing it by installing
>SpamAssassin from cpan and MailScanner from RPMs?

There are two parts to the answer:

1) You can set up a "spam" and a "notspam" email address where people can
forward wrongly categorised mail. You then run sa-learn every hour (or every
day) to teach SpamAssassin about the messages it got wrong. I have already
posted a script to do this to this list, but have attached it again for you.

2) SpamAssassin is unique in being able to "auto-learn", i.e. teach itself.
It uses its other, traditional rules to produce a score for each message. If
the score is very high (i.e. definitely spam) or very low (i.e. definitely
ham), it feeds the message back into the learning code for the Bayes
engine. It only starts using the Bayes engine output as part of the overall
message score once it has auto-learned from about 600 messages (I might well
be wrong on that figure, but it's a few hundred).
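
As a rough illustration of the two parts above, here is a minimal Python
sketch. It is not the attached learn.spam script (its contents are not shown
here). The mailbox paths are hypothetical, the two score thresholds are
illustrative values rather than SpamAssassin's documented defaults, and the
600 figure is simply the uncertain one quoted above. The only external
command assumed is sa-learn with its --spam, --ham and --mbox options.

```python
import subprocess
from typing import Callable, List, Optional, Sequence

# Hypothetical mbox files that the "spam" and "notspam" addresses deliver to.
SPAM_MBOX = "/var/spool/mail/spam"
HAM_MBOX = "/var/spool/mail/notspam"

# Illustrative values only; check your SpamAssassin configuration.
LEARN_AS_SPAM_AT = 15.0   # rule score at or above this: "definitely spam"
LEARN_AS_HAM_AT = -2.0    # rule score at or below this: "definitely ham"
MIN_LEARNED = 600         # figure from the post; may well be wrong


def learn_command(kind: str, mbox: str) -> List[str]:
    """Part 1: build the sa-learn command line for one mbox file."""
    if kind not in ("spam", "ham"):
        raise ValueError("kind must be 'spam' or 'ham'")
    return ["sa-learn", "--" + kind, "--mbox", mbox]


def learn_all(
    run: Callable[[Sequence[str]], object] = subprocess.check_call,
) -> List[List[str]]:
    """Part 1: feed both mailboxes to sa-learn; returns the commands run."""
    commands = [
        learn_command("spam", SPAM_MBOX),
        learn_command("ham", HAM_MBOX),
    ]
    for cmd in commands:
        run(cmd)
    return commands


def auto_learn_decision(rule_score: float) -> Optional[str]:
    """Part 2: decide whether a message is clear-cut enough to learn from."""
    if rule_score >= LEARN_AS_SPAM_AT:
        return "spam"
    if rule_score <= LEARN_AS_HAM_AT:
        return "ham"
    return None  # score is in the uncertain middle: do not learn


def overall_score(rule_score: float, bayes_score: float, learned: int) -> float:
    """Part 2: only count the Bayes result once enough mail has been learned."""
    if learned < MIN_LEARNED:
        return rule_score
    return rule_score + bayes_score
```

Calling learn_all() from an hourly or daily cron job would cover part 1;
emptying the two mailboxes after a successful run is left out of the sketch.
The two part-2 functions only model the decision described above and do not
call into SpamAssassin itself.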
-------------- next part --------------
A non-text attachment was scrubbed...
Name: learn.spam
Type: application/octet-stream
Size: 748 bytes
Desc: not available
Url : http://lists.mailscanner.info/pipermail/mailscanner/attachments/20030408/30c90012/learn.obj
-------------- next part --------------
--
Julian Field
www.MailScanner.info
MailScanner thanks transtec Computers for their support

