Anti-Phishing Update -- New data feed
MailScanner at ecs.soton.ac.uk
Tue Jun 16 08:42:18 IST 2009
On 15/06/2009 21:35, Steve Freegard wrote:
> Julian Field wrote:
>> On 15/06/2009 21:02, Steve Freegard wrote:
>>> Alex Broens wrote:
>>>>> I need to apply the rules to the entire message body and headers, as
>>>>> they frequently put the email address just in the body of the message
>>>>> inside some link or other. So how would creating separate header and
>>>>> body rules be any better?
>>>> I'm not savvy enough in Perl & SA to give you the scientific reason, but
>>>> it's been common practice to avoid full rules if possible.
>>>> You'd have to ask one of the core SA devs... maybe Matt Kettler can
>>>> jump in and tell me I'm totally off and that my understanding is wrong.
>>> 'full' rules are simply inefficient as IIRC the regexps have to be run
>>> multiple times across each block of text (IIRC: SA splits into paragraph
>>> style chunks) to prevent excessive memory use. They also evaluate all
>>> other MIME structures e.g. attachments, images etc. as per the docs.
>> I don't think they include binary attachments, I had to add that
>> specifically for the MCP stuff with a patch to the SA code.
> From 'man Mail::SpamAssassin::Conf':
> full SYMBOLIC_TEST_NAME /pattern/modifiers
> Define a full message pattern test. "pattern" is a Perl regular
> expression. Note: as per the header tests, "#" must be escaped
> ("\#") or else it is considered the beginning of a comment.
> The full message is the pristine message headers plus the
> message body, including all MIME data such as images, other
> attachments, MIME boundaries, etc.
> The reason it wouldn't work for MCP is that a 'full' rule is not going
> to decode base64/QP parts before evaluating the regexp (I think!).
>>> If you are simply looking to get any e-mail addresses out of the message
>>> body; then a 'uri' rule is far more appropriate e.g.
>>> uri BLAH /^mailto:email\@domain\.com$/
>>> (SA converts all e-mail URIs into mailto: types even those with no
>> But surely that wouldn't work when email addresses just appear in the
>> text in text/plain bodies, would it?
> Sure does:
> [root@mail ~]# cat test.eml
> Return-path: <testfrom@example.com>
> To: test <test@example.com>
> From: test <testfrom@example.com>
> Subject: test
> Content-type: text/plain
>
> Test body
> bodytest@example.com this is a test bodytest2@example.com
> [root@mail ~]# /mnt/jungledisk/smf/scripts/uri-extractor.pl test.eml
> URI:mailto:bodytest2@example.com
> URI:mailto:bodytest@example.com
> (uri-extractor.pl uses SA to extract URIs in the same way the eval()
> rules do; I use this for testing amongst other things).
Thanks for that lot, I stand corrected!
So I want to do
header PHISH_1H ALL =~ /huge|regexp|here/i
uri PHISH_1B /mailto:(huge|regexp|here)/i
And then do the meta rule to join them all together.
Does that sound better to you?
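
For illustration, a minimal sketch of how the header rule, the uri rule and the joining meta rule might fit together. The "__" prefix on the two sub-rules (which stops them scoring on their own), the combined rule name PHISH_1, the description text and the score are all assumptions made for this example; the patterns are the same placeholders as above.

```
# Sub-rules: a leading double underscore means they carry no score themselves
header   __PHISH_1H  ALL =~ /huge|regexp|here/i
uri      __PHISH_1B  /^mailto:(?:huge|regexp|here)/i

# Fire when either the headers or an extracted mailto: URI matched
meta     PHISH_1     (__PHISH_1H || __PHISH_1B)
describe PHISH_1     Known phishing address found in headers or body
# Placeholder score, to be tuned for the local setup
score    PHISH_1     5.0
```

Anchoring the uri pattern with ^mailto: follows Steve's example above and keeps the address from matching in the middle of some unrelated URL. Remember that any "@" or "#" in the real patterns needs a backslash in front of it.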
Julian Field MEng CITP CEng
Buy the MailScanner book at www.MailScanner.info/store
Follow me at twitter.com/JulesFM
MailScanner customisation, or any advanced system administration help?
Contact me at Jules at Jules.FM
PGP footprint: EE81 D763 3DB0 0BFD E1DC 7222 11F6 5947 1415 B654
PGP public key: http://www.jules.fm/julesfm.asc