Anti-Phishing Update -- New data feed
Julian Field
MailScanner at ecs.soton.ac.uk
Tue Jun 16 08:42:18 IST 2009
On 15/06/2009 21:35, Steve Freegard wrote:
> Julian Field wrote:
>
>>
>> On 15/06/2009 21:02, Steve Freegard wrote:
>>
>>> Alex Broens wrote:
>>>
>>>
>>>>> I need to apply the rules to the entire message body and headers, as
>>>>> they frequently put the email address just in the body of the message
>>>>> inside some link or other. So how would creating separate header and
>>>>> body rules be any better?
>>>>>
>>>>>
>>>> I'm not savvy enough in Perl & SA to give you the scientific reason, but
>>>> it's been common practice to avoid full rules if possible.
>>>>
>>>> You'd have to ask one of the core SA devs... maybe Matt Kettler can
>>>> jump in and tell me I'm totally off and that my understanding is wrong.
>>>>
>>>>
>>> 'full' rules are simply inefficient as IIRC the regexps have to be run
>>> multiple times across each block of text (IIRC: SA splits into paragraph
>>> style chunks) to prevent excessive memory use. They also evaluate all
>>> other MIME structures e.g. attachments, images etc. as per the docs.
>>>
>>>
>
>> I don't think they include binary attachments, I had to add that
>> specifically for the MCP stuff with a patch to the SA code.
>>
> From 'man Mail::SpamAssassin::Conf':
>
> full SYMBOLIC_TEST_NAME /pattern/modifiers
> Define a full message pattern test. "pattern" is a Perl regular
> expression. Note: as per the header tests, "#" must be escaped
> ("\#") or else it is considered the beginning of a comment.
>
> The full message is the pristine message headers plus the pristine
> message body, including all MIME data such as images, other
> attachments, MIME boundaries, etc.
>
> The reason it wouldn't work for MCP is that a 'full' rule is not going
> to decode base64/QP parts before evaluating the regexp (I think!).
>
>
>>> If you are simply looking to get any e-mail addresses out of the message
>>> body; then a 'uri' rule is far more appropriate e.g.
>>>
>>> uri BLAH /^mailto:email\@domain\.com$/
>>>
>>> (SA converts all e-mail URIs into mailto: types even those with no
>>> scheme).
>>>
>>>
>> But surely that wouldn't work when email addresses just appear in the
>> text in text/plain bodies, would it?
>>
> Sure does:
>
> [root@mail ~]# cat test.eml
> Return-path:<testfrom@example.com>
> To: test<test@example.com>
> From: test<testfrom@example.com>
> Subject: test
> Content-type: text/plain
>
> Test body
>
> bodytest@example.com this is a test bodytest2@example.com
>
> [root@mail ~]# /mnt/jungledisk/smf/scripts/uri-extractor.pl test.eml
> URI-Domain:example.com
> URI:mailto:bodytest2@example.com
> URI:mailto:bodytest@example.com
>
> (uri-extractor.pl uses SA to extract URIs in the same way the eval()
> rules do; I use this for testing amongst other things).
>
Thanks for that lot, I stand corrected!
So I want to do
header PHISH_1H ALL =~ /huge|regexp|here/i
uri PHISH_1B /mailto:(huge|regexp|here)/i
And then add a meta rule to join them all together.
Does that sound better to you?
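
As a rough sketch of how the three rules could be generated from a list of
addresses (Python here; the function names, the PHISH_1 meta rule name and the
sample addresses are all made up for illustration, and this is not
necessarily how the real data feed gets turned into rules):

```python
import re


def perl_quote(text):
    """Backslash-escape every non-word character, like Perl's quotemeta.

    This also escapes '@', '#' and '/', which need care in SpamAssassin rules.
    """
    return re.sub(r"[^A-Za-z0-9_]", lambda m: "\\" + m.group(0), text)


def build_rules(addresses, name="PHISH_1"):
    """Return header, uri and meta rule lines matching any given address."""
    alternation = "|".join(perl_quote(a) for a in addresses)
    return [
        # Header rule: look for the addresses anywhere in the headers.
        f"header {name}H ALL =~ /(?:{alternation})/i",
        # URI rule: SA presents e-mail addresses in the body as mailto: URIs.
        f"uri {name}B /mailto:(?:{alternation})/i",
        # Meta rule: fires if either of the two rules above hit.
        f"meta {name} ({name}H || {name}B)",
    ]


if __name__ == "__main__":
    # Hypothetical sample addresses, not taken from any real feed.
    for line in build_rules(["phisher@example.com", "reply-to@example.net"]):
        print(line)
```

If only the meta rule should contribute to the score, the two sub-rules could
be given names starting with a double underscore, which SpamAssassin treats as
unscored.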
Jules
--
Julian Field MEng CITP CEng
www.MailScanner.info
Buy the MailScanner book at www.MailScanner.info/store
Follow me at twitter.com/JulesFM
MailScanner customisation, or any advanced system administration help?
Contact me at Jules at Jules.FM
PGP footprint: EE81 D763 3DB0 0BFD E1DC 7222 11F6 5947 1415 B654
PGP public key: http://www.jules.fm/julesfm.asc