Bayes auto learn

Matt Kettler mkettler at EVI-INC.COM
Mon Jun 21 22:13:43 IST 2004


At 05:00 PM 6/21/2004, James R. Stevens wrote:
>Hmmm... Got the info from here
>http://www.sng.ecs.soton.ac.uk/mailscanner/serve/cache/98.html
>Can you offer any way to accomplish my goal?
>
>Again, Forward missed Spam to a Linux mailbox. Have something to
>distinguish between my organization's mail headers and feed the rest into
>sa-learn.

To be honest with you, what you want to do is VERY difficult unless you
make a lot of assumptions about the MUA in use.

You're worried about fixing the headers, but what about the message body,
the encoding formats, and so on?

Even if you have a script that fixes the headers, most email clients
completely re-encode the message body when you forward it, inserting some
things and removing others, so the resulting message bears little
resemblance to the original from a Bayes perspective.

Some mail clients are good about this, but most are not.

If you've got a well-behaved mail client, you can use something like this
script, which strips the forwarding headers:

http://wiki.apache.org/spamassassin/BayesFeedbackViaForwarding?action=highlight&value=forward


Otherwise, pretty much the only way to do Bayes learning via forwarding
reliably for most mail clients is to set up a system where your users
forward the original email as an attachment. You can then write a little
script to extract the attachments and feed them to sa-learn.

(I for one can vouch that even forwarding as an attachment won't work with
Eudora. Eudora discards vital parts of multipart/alternative messages, and
they cannot be recovered, ever.)
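
For what it's worth, the "little script" for the forward-as-attachment
approach could look roughly like the sketch below. It is only an
illustration, not a tested production tool: it assumes the mail client
attaches the original as a message/rfc822 part, that the forwarded message
arrives on stdin (e.g. piped from the mailbox's delivery agent), and that
sa-learn is on the PATH and will read a single message from stdin when run
with --spam. The function names are made up for this example. Note that
re-serializing the attachment may not be byte-for-byte identical to what
was originally received.

```python
import email
import subprocess
import sys


def _collect(msg, found):
    """Gather embedded messages without descending into them."""
    if msg.get_content_type() == "message/rfc822":
        # The payload of a message/rfc822 part is a list holding the
        # embedded message object(s).
        found.extend(msg.get_payload())
    elif msg.is_multipart():
        for part in msg.get_payload():
            _collect(part, found)


def extract_forwarded_messages(raw_bytes):
    """Return each message/rfc822 attachment in raw_bytes as bytes."""
    outer = email.message_from_bytes(raw_bytes)
    found = []
    _collect(outer, found)
    return [inner.as_bytes() for inner in found]


def learn_as_spam(message_bytes):
    # Assumption: sa-learn is installed and accepts the message on stdin.
    subprocess.run(["sa-learn", "--spam"], input=message_bytes, check=True)


if __name__ == "__main__":
    for original in extract_forwarded_messages(sys.stdin.buffer.read()):
        learn_as_spam(original)
```

Messages forwarded inline (no message/rfc822 part) yield nothing and are
simply skipped, which is the safe behaviour given how badly inline
forwarding mangles the original.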



