Anti-Phishing Update -- New data feed

Steve Freegard steve.freegard at
Mon Jun 15 21:35:08 IST 2009

Julian Field wrote:
> On 15/06/2009 21:02, Steve Freegard wrote:
>> Alex Broens wrote:
>>>> I need to apply the rules to the entire message body and headers, as
>>>> they frequently put the email address just in the body of the message
>>>> inside some link or other. So how would creating separate header and
>>>> body rules be any better?
>>> I'm not savvy enough in Perl & SA to give you the scientific reason, but
>>> it's been common practice to avoid full rules if possible.
>>> You'd have to ask one of the core SA devs...  maybe Matt Kettler can
>>> jump in and tell me I'm totally off and that my understanding is wrong.
>> 'full' rules are simply inefficient as IIRC the regexps have to be run
>> multiple times across each block of text (IIRC: SA splits into paragraph
>> style chunks) to prevent excessive memory use.  They also evaluate all
>> other MIME structures e.g. attachments, images etc. as per the docs.

> I don't think they include binary attachments, I had to add that
> specifically for the MCP stuff with a patch to the SA code.

From 'man Mail::SpamAssassin::Conf':

       full SYMBOLIC_TEST_NAME /pattern/modifiers
           Define a full message pattern test.  "pattern" is a Perl regular
           expression.  Note: as per the header tests, "#" must be escaped
           ("\#") or else it is considered the beginning of a comment.

           The full message is the pristine message headers plus the
           message body, including all MIME data such as images, other
           attachments, MIME boundaries, etc.

The reason it wouldn't work for MCP is that a 'full' rule is not going
to decode base64/QP parts before evaluating the regexp (I think!).
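
To illustrate the point about encoded parts, here is a minimal Python sketch (not SpamAssassin code; the addresses and the message are made up for the example). It assumes that a 'full'-style match sees the raw, still-encoded message text, and compares that with matching after the transfer encoding has been decoded:

```python
import base64
import re
from email import message_from_string

# Hypothetical message whose only body part is base64-encoded.
RAW = (
    "From: sender@example.com\n"
    "To: rcpt@example.com\n"
    "Subject: test\n"
    "MIME-Version: 1.0\n"
    "Content-Type: text/plain; charset=us-ascii\n"
    "Content-Transfer-Encoding: base64\n"
    "\n"
    + base64.b64encode(b"Reply to phisher@example.net today\n").decode("ascii")
    + "\n"
)

PATTERN = re.compile(r"phisher@example\.net")

# Matching against the raw text: the address is hidden inside the base64 data.
raw_hit = PATTERN.search(RAW) is not None

# Matching after decoding the Content-Transfer-Encoding.
msg = message_from_string(RAW)
decoded = msg.get_payload(decode=True).decode("ascii")
decoded_hit = PATTERN.search(decoded) is not None

print(raw_hit, decoded_hit)
```

The raw search misses the address because base64 output never contains "@" or ".", while the search over the decoded payload finds it.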

>> If you are simply looking to get any e-mail addresses out of the message
>> body; then a 'uri' rule is far more appropriate e.g.
>> uri BLAH  /^mailto:email\@domain\.com$/
>> (SA converts all e-mail URIs into mailto: types even those with no
>> scheme).
> But surely that wouldn't work when email addresses just appear in the
> text in text/plain bodies, would it?

Sure does:

[root at mail ~]# cat test.eml
Return-path: <testfrom at>
To: test <test at>
From: test <testfrom at>
Subject: test
Content-type: text/plain

Test body

bodytest at this is a test bodytest2 at

[root at mail ~]# /mnt/jungledisk/smf/scripts/ test.eml
URI:mailto:bodytest2 at
URI:mailto:bodytest at

( uses SA to extract URIs in the same way the eval()
rules do; I use this for testing amongst other things).
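
For anyone without SA to hand, the idea can be approximated in Python. This is only a rough sketch under stated assumptions: the regular expression is a deliberately simple stand-in, not the parser SA uses, and the function name and example.com addresses are hypothetical.

```python
import re

# Deliberately simple address pattern (an assumption for this sketch,
# not SpamAssassin's own URI detection).
ADDR = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")


def extract_mailto_uris(text):
    """Return bare e-mail addresses found in text as sorted mailto: URIs."""
    uris = set()
    for match in ADDR.finditer(text):
        uris.add("mailto:" + match.group(0))
    return sorted(uris)


body = (
    "Test body\n"
    "\n"
    "bodytest@example.com this is a test bodytest2@example.com\n"
)

uris = extract_mailto_uris(body)
for uri in uris:
    print("URI:" + uri)

# An anchored pattern, in the spirit of the 'uri' rule quoted earlier,
# can then be tested against each extracted URI.
rule = re.compile(r"^mailto:bodytest@example\.com$")
rule_hit = any(rule.search(u) for u in uris)
```

Both addresses are picked up even though neither carries a scheme in the body text, and the anchored pattern matches one of the resulting mailto: URIs.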

