I am looking for the ham and spam files

Ken Goods KGoods at AIAINSURANCE.COM
Fri Jul 15 17:47:17 IST 2005


Billy A. Pumphrey wrote:
>> Billy A. Pumphrey wrote:
>>>> Subject: Re: I am looking for the ham and spam files <snip>
<snip>
> 
> Yes, got the other question answered, thank you everyone :)
> 
> You don't have to, but would you mind expanding on how you are
> using your spam mailbox?  I have an Exchange server too, and
> MailScanner sits between the internet and the Exchange server, as I
> suspect yours does.  How are you taking the uncaught mail and
> putting it in the spam mailbox?
> 
> Are you taking the ID of the message (using MailWatch) and running a
> cp command to copy the message to the spam mbox?  Such as:
> cp /var/spool/MailScanner/quarantine/20050713/j6CH7SSN017953
> /home/user/mail/
>

I'm not completely sure how mbox format is put together so I couldn't tell
you if the above would work. But it seems to me if it is in quarantine it
has already been caught as spam (if that's your setup) and the only benefit
of learning it at that point would be if the bayes score was abnormally low.

I only learn from spams that were not caught by my gateway. In other words,
they are already sitting in Exchange. So there are a couple of tricks to get
them back to the Linux e-mail filter box with the headers intact.

A gentleman named Ray Gibson helped me out a ton getting my bayes learning
set up from Exchange. I'll send you an e-mail exchange we had (off list
since it's a little long) but for the list's benefit I will put a link to
his tutorial page here: 

http://www.raygibson.net/kb/amavis/

Great stuff if you use Debian/Exim/Amavisd-new or if you're just building a
machine for use as an e-mail filtering gateway.

I use MailScanner (of course ;)), sendmail, SA and ClamAV. But the concept
is the same:

1. Create a user on your NT network and a mailbox in Exchange for this user.
2. Log on to the network as that user, create two folders in Outlook named
   spam and ham, and give permissions to anyone authorized to move spam (or
   ham) into that mailbox.
3. Use fetchmail (or something similar) to fetch the mails from Exchange to
   your Linux e-mail filter box.
4. Use sa-learn on the resulting mbox.
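
As a rough sketch of step 4, assuming fetchmail has already delivered the
two Exchange folders to local mbox files: the paths and the SPAM_MBOX,
HAM_MBOX and SA_LEARN variables below are placeholders for illustration,
not anyone's actual setup. The sa-learn options used (--spam, --ham,
--mbox) are the standard SpamAssassin ones.

```shell
#!/bin/sh
# Hypothetical locations of the fetched folders; adjust to your own setup.
SPAM_MBOX="${SPAM_MBOX:-/home/user/mail/spam}"
HAM_MBOX="${HAM_MBOX:-/home/user/mail/ham}"

# Set SA_LEARN="echo sa-learn" for a dry run that only prints the commands.
SA_LEARN="${SA_LEARN:-sa-learn}"

# learn_mbox KIND FILE
#   KIND is "spam" or "ham"; FILE is an mbox file to feed to sa-learn.
learn_mbox() {
    kind="$1"
    mbox="$2"
    # Nothing to learn if the mbox does not exist or is empty.
    if [ ! -s "$mbox" ]; then
        echo "skip $kind: $mbox missing or empty"
        return 0
    fi
    # --mbox tells sa-learn the file holds many messages in mbox format.
    $SA_LEARN "--$kind" --mbox "$mbox"
}

learn_mbox spam "$SPAM_MBOX"
learn_mbox ham "$HAM_MBOX"
```

Run it by hand first with SA_LEARN="echo sa-learn" to see what would be
learned, then from cron once the output looks right.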

I don't normally learn hams any more as I've not had any false positives.
The main thing to be concerned about when learning from mails after they've
hit an Exchange server/Outlook is to drag and drop the uncaught spam into a
folder for later fetching; otherwise the headers get mangled to the point of
being useless for Bayes learning.

My intention is to put together a wiki page outlining the whole process but
I just haven't had the time. You'd be surprised how often this subject comes
up here on the list.

Take a look at the above page and expect another e-mail off list.
Hopefully I'll get to the wiki in the next couple of weeks.

Kind regards,
Ken

Ken Goods
Network Administrator
AIA Insurance, Inc.
