Training the Bayesian engine and sa-learn

Julian Field mailscanner at ecs.soton.ac.uk
Wed Sep 3 21:34:02 IST 2003


The other thing to remember is that the Bayesian engine auto-learns from
very spammy and very non-spammy messages, as judged by the scores from the
other rules. So most of the time you don't need to train it manually at
all; it will do it on its own. However, you may also want to have "spam"
and "notspam" mailboxes which get processed by sa-learn. Search the archives
for "sa-learn --mbox" or "sa-learn -mbox" and you'll find my scripts to do
it all for you.
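
As a rough sketch of what such a script does (this is not one of the scripts
from the archives), the Python below builds one sa-learn command per mailbox
using the --spam, --ham and --mbox options. The mailbox paths and the function
names are made up for illustration, and it only prints the commands unless you
pass dry_run=False.

```python
import subprocess

# Hypothetical mbox paths; substitute the mailboxes your users forward into.
MAILBOXES = {
    "spam": "/var/spool/mail/spam",
    "ham": "/var/spool/mail/notspam",
}


def sa_learn_command(kind, mbox_path):
    """Build the sa-learn argument list for one mbox file."""
    if kind not in ("spam", "ham"):
        raise ValueError("kind must be 'spam' or 'ham'")
    return ["sa-learn", "--" + kind, "--mbox", mbox_path]


def train(mailboxes, dry_run=True):
    """Return the commands; also run each one when dry_run is False."""
    commands = [sa_learn_command(kind, path) for kind, path in mailboxes.items()]
    if not dry_run:
        for cmd in commands:
            # check=True raises if sa-learn exits with a non-zero status.
            subprocess.run(cmd, check=True)
    return commands


# Dry run by default: show what would be executed without touching sa-learn.
for cmd in train(MAILBOXES):
    print(" ".join(cmd))
```

Run from cron with dry_run=False, something like this would feed both
mailboxes to the Bayesian engine on a schedule.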

At 21:03 03/09/2003, you wrote:
>You're close - but sa-learn doesn't update whitelists or blacklists - it
>just trains the Bayesian filtering engine, which identifies patterns in spam
>and uses them to recognize future spam.  SpamAssassin passes messages to the
>Bayesian engine and gets a score for each message, just as it does for its
>other rules.  This score just becomes part of the cumulative score for the
>message.
>
>There's a FAQ entry on how to set up a script to automatically run sa-learn
>- sounds like you already found that.  If you have trouble getting it to
>work, ask for help again.
>
>Besides the Bayesian filtering, you can also whitelist and blacklist senders,
>but I would hesitate to recommend automating that process. I can imagine
>users blindly forwarding spam from the Sobig virus to an address that would
>automatically blacklist the sender, which would be a bad thing since Sobig
>is likely to come "from" someone who regularly emails you.
>
>HTH,
>Trever
>
> > -----Original Message-----
> > From: Chris Lyon [mailto:cslyon at NETSVCS.COM]
> > Sent: Wednesday, September 03, 2003 2:34 PM
> > To: MAILSCANNER at JISCMAIL.AC.UK
> > Subject: Training the bayesian engine and sa-learn {Scanned by HJMS}
> >
> >
> > So, I have been reading the FAQ and also the past posts but have a little
> > confusion that I need to resolve. Just to give a little background, I have
> > a lot of users who all have issues with e-mail that is being marked as spam
> > or not being marked as spam. So, I think the answer to this is to have them
> > forward the messages to an unattended mailbox that will autowhitelist or
> > autoblacklist the sender.  Is that what sa-learn is all about?
> >
> >
> > So, if I create a spam and non-spam account on the server and use sa-learn
> > to check the messages that my users forward to these accounts, if something
> > was marked as spam and is not, further messages will not be marked again?
> > Conversely, if I have a message that is spam but not marked, I can forward
> > that to spam and the next message that comes in from that sender will be
> > marked as spam?
> >
> >
> > How does it work? Based on content, I would assume, or does it work by the
> > domain? Also, what happens with stuff being forwarded from different mail
> > clients like Outlook?
> >
> >
> > Can anybody shed some light on this one?
> >

--
Julian Field
www.MailScanner.info
Professional Support Services at www.MailScanner.biz
MailScanner thanks transtec Computers for their support


