how to pre-process forwarded mail for sa-learn

Matt Kettler mkettler at EVI-INC.COM
Tue Aug 16 18:52:30 IST 2005



Michael Shiloh wrote:
> 
> So my question is this: How can I best collect the missed spam from my
> Windows users, and train spamassassin on it? Is there a formal
> procedure, or is the state of the art really the hacked scripts that I
> find on the web? At the very least, can someone point me to a good
> example of such a script?
> 

A message that has been forwarded in the normal (inline) way cannot be
recovered. The client re-encodes the whole message, adds body text, strips
headers, and so on. As far as sa-learn is concerned, half the message has been
completely replaced by new content.

For example, a spammer sends you a multipart/alternative message with a
text/plain part and a text/html part. When you forward that message, only the
text/html part is likely to be used. Although the client might generate a new
text/plain part based on the HTML, this isn't always the same as the original.
In particular, spammers often insert "book quotes" in the text/plain section
and put a porn-site ad in the HTML section.

Also, all the original headers from the message will be gone and will be
replaced by new headers. The subject gets copied, but the Received: headers, the
Message-ID, Return-Path and other important headers are toast.



The only thing you can do is use the "redirect", "bounce", or "forward as
attachment" feature of the mail client. Those all require some pre-processing,
but at least all the information you need is present.

Generally speaking, forward as attachment is the easiest to deal with: you just
need a script that rips off the attachment and feeds it to sa-learn.
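
As a rough illustration of that "rip off the attachment" step, here is a
minimal Python sketch. It assumes the mail client attaches the original as a
message/rfc822 part (not every client does), that sa-learn is on the PATH, and
that passing a file name to "sa-learn --spam" is how you want to train. The
function names extract_attached_messages and learn_as_spam are made up for this
example. The attached message is re-serialised by Python's email module, so it
may not be byte-for-byte identical to what the spammer sent.

```python
import email
import subprocess
import tempfile


def _collect(part, found):
    """Gather message/rfc822 attachments without descending into them."""
    if part.get_content_type() == "message/rfc822":
        # The parser stores the attached message as a one-element payload list.
        found.append(part.get_payload(0).as_bytes())
        return
    if part.is_multipart():
        for sub in part.get_payload():
            _collect(sub, found)


def extract_attached_messages(wrapper_bytes):
    """Return each message attached to the forwarding mail, as raw bytes."""
    wrapper = email.message_from_bytes(wrapper_bytes)
    found = []
    _collect(wrapper, found)
    return found


def learn_as_spam(wrapper_bytes, sa_learn="sa-learn"):
    """Feed every attached message to sa-learn; returns how many were fed.

    Assumption: sa-learn is installed and is run as the user whose Bayes
    database should be trained.
    """
    count = 0
    for original in extract_attached_messages(wrapper_bytes):
        with tempfile.NamedTemporaryFile(suffix=".eml") as tmp:
            tmp.write(original)
            tmp.flush()
            subprocess.run([sa_learn, "--spam", tmp.name], check=True)
        count += 1
    return count
```

A delivery rule for a dedicated "missed spam" mailbox could read the incoming
mail from standard input and hand the bytes to learn_as_spam. Anything that was
forwarded inline has no message/rfc822 part, so the function simply feeds
nothing for it and returns 0.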
