Anti-Phishing Update -- New data feed

David Lee david at bass.net.au
Wed Jun 17 06:05:13 IST 2009


Julian Field wrote:
>
>
> On 16/06/2009 08:42, Julian Field wrote:
>>
>>
>> On 15/06/2009 21:35, Steve Freegard wrote:
>>> Julian Field wrote:
>>>>
>>>> On 15/06/2009 21:02, Steve Freegard wrote:
>>>>> Alex Broens wrote:
>>>>>
>>>>>>> I need to apply the rules to the entire message body and headers, as
>>>>>>> they frequently put the email address just in the body of the message
>>>>>>> inside some link or other. So how would creating separate header and
>>>>>>> body rules be any better?
>>>>>>>
>>>>>> I'm not savvy enough in Perl & SA to give you the scientific reason, but
>>>>>> it's been common practice to avoid full rules if possible.
>>>>>>
>>>>>> You'd have to ask one of the core SA devs... maybe Matt Kettler can
>>>>>> jump in and tell me I'm totally off and that my understanding is wrong.
>>>>>>
>>>>> 'full' rules are simply inefficient because, IIRC, the regexps have to
>>>>> be run multiple times across each block of text (IIRC: SA splits into
>>>>> paragraph-style chunks) to prevent excessive memory use.  They also
>>>>> evaluate all other MIME structures, e.g. attachments, images etc., as
>>>>> per the docs.
>>>>>
>>>> I don't think they include binary attachments, I had to add that
>>>> specifically for the MCP stuff with a patch to the SA code.
>>> From 'man Mail::SpamAssassin::Conf':
>>>
>>>         full SYMBOLIC_TEST_NAME /pattern/modifiers
>>>             Define a full message pattern test.  "pattern" is a Perl regular
>>>             expression.  Note: as per the header tests, "#" must be escaped
>>>             ("\#") or else it is considered the beginning of a comment.
>>>
>>>             The full message is the pristine message headers plus the pristine
>>>             message body, including all MIME data such as images, other
>>>             attachments, MIME boundaries, etc.
>>>
>>> The reason it wouldn't work for MCP is that a 'full' rule is not going
>>> to decode base64/QP parts before evaluating the regexp (I think!).
>>>
>>>>> If you are simply looking to get any e-mail addresses out of the message
>>>>> body, then a 'uri' rule is far more appropriate, e.g.
>>>>>
>>>>> uri BLAH  /^mailto:email\@domain\.com$/
>>>>>
>>>>> (SA converts all e-mail URIs into mailto: types even those with no
>>>>> scheme).
>>>>>
>>>> But surely that wouldn't work when email addresses just appear in the
>>>> text in text/plain bodies, would it?
>>> Sure does:
>>>
>>> [root@mail ~]# cat test.eml
>>> Return-path:<testfrom@example.com>
>>> To: test<test@example.com>
>>> From: test<testfrom@example.com>
>>> Subject: test
>>> Content-type: text/plain
>>>
>>> Test body
>>>
>>> bodytest@example.com this is a test bodytest2@example.com
>>>
>>> [root@mail ~]# /mnt/jungledisk/smf/scripts/uri-extractor.pl test.eml
>>> URI-Domain:example.com
>>> URI:mailto:bodytest2@example.com
>>> URI:mailto:bodytest@example.com
>>>
>>> (uri-extractor.pl uses SA to extract URIs in the same way the eval()
>>> rules do; I use this for testing amongst other things).
>> Thanks for that lot, I stand corrected!
>>
>> So I want to do
>> header PHISH_1H ALL =~ /huge|regexp|here/i
>> uri PHISH_1B /mailto:(huge|regexp|here)/i
>> And then do the meta rule to join them all together.
>>
>> Does that sound better to you?
> I have published an improved, much faster version 2.01, which is
> available from
>
>     http://www.jules.fm/Logbook/files/anti-phishing-v2.html
>
> You might well want to upgrade...
>
> Jules
>
I assume the spamassassin rules generated by your improved script are 
different to those obtained via the 'spear.bastionmail.com' channel 
using sa-update.
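
For comparison, the layout described in the quoted text above (a header
rule, a uri rule, and a meta rule joining the two) would look roughly like
the following. This is only a sketch under assumptions: the two addresses,
the double-underscore sub-rule names, the description and the score are
placeholders, not output from the script or from the sa-update channel.

```
# Placeholder addresses; "@" is escaped as in the uri example quoted above.
# Sub-rules: the double-underscore prefix keeps them from scoring on their own.
header __PHISH_1H  ALL =~ /\b(?:user1\@example\.com|user2\@example\.net)\b/i
uri    __PHISH_1B  /^mailto:(?:user1\@example\.com|user2\@example\.net)$/i

# Meta rule that fires when either sub-rule matches.
meta     PHISH_1   (__PHISH_1H || __PHISH_1B)
describe PHISH_1   Message contains a known phishing reply address
score    PHISH_1   4.0
```

Whether the generated rules anchor the uri pattern with ^ and $ as here, or
leave it unanchored as in the quoted PHISH_1B line, is something I have not
checked.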


David

-- 
-----------------------------------------------------------------------
David Lee                                         
Systems Administrator                             Tel: +61-8-8205-2467
BASS South Australia                              Fax: +61-8-8205-0550
GPO Box 1269, Adelaide 5000                    http://www.bass.net.au/
-----------------------------------------------------------------------



