Anti-Phishing Update -- New data feed
steve.freegard at fsl.com
Mon Jun 15 21:02:45 IST 2009
Alex Broens wrote:
>> I need to apply the rules to the entire message body and headers, as
>> they frequently put the email address just in the body of the message
>> inside some link or other. So how would creating separate header and
>> body rules be any better?
> I'm not savvy enough in Perl & SA to give you the scientific reason, but
> it's been common practice to avoid full rules if possible.
> You'd have to ask one of the core SA devs... maybe Matt Kettler can
> jump in and tell me I'm totally off and that my understanding is wrong.
'full' rules are simply inefficient: IIRC the regexps have to be run
separately across each block of text (SA splits the message into
paragraph-style chunks, to prevent excessive memory use). They also evaluate
all other MIME structures, e.g. attachments, images etc., as per the docs.
If you are simply looking to get any e-mail addresses out of the message
body, then a 'uri' rule is far more appropriate, e.g.
uri BLAH /^mailto:email\@domain\.com$/
(SA converts all e-mail URIs into mailto: types, even those with no scheme).
Then use header rules for the To/Cc/Bcc/Sender headers.
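As a rough illustration of the uri-plus-header approach, here is a minimal Python sketch that prints one 'uri' rule and one 'header' rule per header for a single address. The rule names (PHISH_ADDR_*), the /i modifier and the plain substring match in the header rules are my own assumptions for the example, not anything taken from the data feed; adjust to taste.

```python
# Sketch only: rule names, the /i flag and the header pattern are assumptions.

HEADERS = ("To", "Cc", "Bcc", "Sender")


def sa_escape(text):
    """Backslash-escape every character that is not a letter, digit or '_'.

    In a Perl regexp a backslash before a non-alphanumeric character makes
    it literal, so this also covers '@', '.' and the '/' delimiter.
    """
    return "".join(c if (c.isalnum() or c == "_") else "\\" + c for c in text)


def rules_for(address, name="PHISH_ADDR"):
    """Return SpamAssassin rule lines for one address (hypothetical names)."""
    pat = sa_escape(address.lower())
    # Anchored uri rule, same shape as the BLAH example above.
    lines = ["uri %s_URI /^mailto:%s$/i" % (name, pat)]
    # One unanchored header rule per header of interest.
    for hdr in HEADERS:
        lines.append("header %s_%s %s =~ /%s/i" % (name, hdr.upper(), hdr, pat))
    return lines


if __name__ == "__main__":
    print("\n".join(rules_for("email@domain.com")))
```

Each rule would still need a score and a description before it is useful in a real ruleset.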
Might also be worth using Regexp::Assemble to generate the initial
regexps if you aren't already.
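To show the general idea behind assembling many addresses into one regexp, here is a small Python sketch that merges shared prefixes through a trie. To be clear, this is not Regexp::Assemble and does not reproduce its output or its optimisations; it is a simplified stand-in, and the function names are made up for the example.

```python
# Sketch only: a simplified prefix-merging assembler, not Regexp::Assemble.


def sa_escape(text):
    """Backslash-escape every character that is not a letter, digit or '_'."""
    return "".join(c if (c.isalnum() or c == "_") else "\\" + c for c in text)


def build_trie(words):
    """Build a character trie; the empty-string key marks end of a word."""
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node[""] = {}
    return root


def trie_to_regex(node):
    """Turn a trie node into a regexp fragment with shared prefixes merged."""
    ends_here = "" in node
    branches = [sa_escape(ch) + trie_to_regex(child)
                for ch, child in sorted(node.items()) if ch != ""]
    if not branches:
        return ""
    if len(branches) == 1 and not ends_here:
        return branches[0]
    group = "(?:" + "|".join(branches) + ")"
    # If a shorter address ends at this node, the remainder is optional.
    return group + "?" if ends_here else group


def assemble(addresses):
    """Return one regexp fragment matching exactly the given addresses."""
    return trie_to_regex(build_trie(a.lower() for a in addresses))


if __name__ == "__main__":
    print("uri BLAH /^mailto:%s$/" % assemble(
        ["bob@example.com", "bob@example.org"]))
```

The fragment still needs anchoring (as in the uri example) so that it matches whole addresses rather than substrings.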
Once lists like these grow beyond a certain size, regexps are going to be
memory hungry and far less efficient, at which point the EmailBL-style
DNS lists are more appropriate and scalable as the addresses are exact