SA-Learn
Matt Kettler
mkettler at evi-inc.com
Fri Apr 18 21:37:02 IST 2008
Vernon Webb wrote:
> This may be a silly question to some, but I would really like to learn
> more about what sa-learn does. I have created a folder on my server that
> I move all my SPAM mail to. Mind you this is only SPAM that is NOT
> labeled as SPAM. Should I be moving all mail, even mail that is labeled
> as such as well? And exactly what does this do? I assume that it somehow
> trains MailScanner that this is SPAM, but how? Does it tell it that the
> mail addresses and IPs that these mails come from are sending bad mail?
> Is it only local to my server? Does it report these emails as SPAM to
> some RBL? Please pardon the intrusion if taken as such, I am only trying
> to better understand how MailScanner works.
sa-learn trains the Bayes database used by SpamAssassin. It only updates that
local database; it doesn't report to RBLs, Razor, or anything else. That's what
spamassassin -r is for.
As for feeding, I would strongly suggest not making any consideration other than
"is this spam or not" when choosing whether to feed a message to sa-learn --spam.
In other words, feed it the spam that was already labeled as spam too.
If you're only feeding false negatives (spam that got through unlabeled), you're
introducing a bias into your Bayes database. That will eventually cause you to
miss some of the spam you were detecting.
I'd also suggest feeding some nonspam emails to sa-learn with the --ham
parameter, instead of the --spam parameter.
In general, it's best to give sa-learn a realistic, well-balanced diet from your
email stream. Obviously it would be difficult to hand-classify and train on every
message you receive, but that would be the theoretical ideal. Head in that
direction as far as you can without causing yourself undue stress or hassle.
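As a rough sketch of what that might look like in practice: the folder paths
below are hypothetical (they assume Maildir-style folders of hand-sorted mail),
and the run and train helper names are made up for illustration. The sa-learn
options themselves (--spam, --ham, --dump magic) are the standard ones. Set
DRY_RUN=1 to print the commands instead of running them.

```shell
#!/bin/sh
# Hypothetical folder locations; point these at your own hand-sorted mail.
SPAM_DIR="${SPAM_DIR:-$HOME/Maildir/.Spam/cur}"
HAM_DIR="${HAM_DIR:-$HOME/Maildir/.Ham/cur}"

# Run a command for real, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$@"
    else
        "$@"
    fi
}

# Train on both classes, then dump the database counters.
train() {
    run sa-learn --spam "$SPAM_DIR"   # all spam, labeled or not
    run sa-learn --ham "$HAM_DIR"     # a sample of legitimate mail
    run sa-learn --dump magic         # output includes nspam and nham counts
}
```

Calling train after sourcing this does the training. Comparing the nspam and
nham lines in the dump output is one way to see whether you are feeding it a
reasonably balanced diet.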