Phishing net getting confused

John Wilcock john at tradoc.fr
Wed Jul 18 08:19:16 IST 2007


Scott Silva wrote:
> John Wilcock spake the following on 7/17/2007 8:28 AM:
>> I think I've uncovered a buglet in the phishing net code (MailScanner
>> version 4.61.7).
...
>> Mailscanner's phishing net detected this as follows:
>>
>>> MailScanner[12590]: Found phishing fraud from promos.hotbar.com
>>> claiming to be
>>> www.<imgmoz-do-not-send="true"title=""alt="upgradeyouremail-clickhere!"src="http
>>>
>>> in 6F13B8053.635D4
>> Clearly the moz-do-not-send is causing a problem, since the original
>> message without those tags correctly passed through the net undetected.
>>
>> John.
>>
> Did sending user tell Thunderbird it was not junk "before" forwarding? I think
> that is how it disables stuff it thinks is bad.

Quite possibly, but that's not the point here. MailScanner is getting
confused by the hyphens in the HTML attribute name.

Looking at the code, the tag-detection regex matches tag names and
attribute names with \w+, whereas the HTML spec (or rather the
underlying SGML spec) also allows names to contain, but not start
with, the characters - _ . and :. An attribute such as
moz-do-not-send therefore stops the match at the first hyphen, and the
whole tag is left in the text that gets compared. I've attached a
patch that seems to do the trick, for Julian's perusal.
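
To see the difference outside MailScanner, here is a minimal sketch in
Python rather than Perl. It is an illustrative port of the two regexes
from the attached patch, not MailScanner code: the names VALUE,
tag_regex, OLD_TAG, NEW_TAG and squash are made up for this example, the
sample string is modelled on the log line above, and its URL is a
placeholder because the real one is truncated in the log. The outer
(...)* of the Perl original is dropped, which should make no difference
when every match is replaced with nothing.

```python
import re

# Attribute value: double-quoted, single-quoted, or a bare token,
# copied from the value part of the regex in Message.pm.
VALUE = r"""(?:".*?"|'.*?'|[^'">\s]+)"""


def tag_regex(name):
    """Build a tag-matching pattern for a given name sub-pattern (hypothetical helper)."""
    return re.compile(
        r"</?" + name
        + r"(?:(?:\s+" + name + r"(?:\s*=\s*" + VALUE + r")?)+\s*|\s*)"
        + r"/?>",
        re.IGNORECASE,
    )


# Old behaviour: names are \w+ only, so a hyphen ends the name.
OLD_TAG = tag_regex(r"\w+")
# Patched behaviour: a letter, then letters, digits, ':', '.', '_' or '-'.
NEW_TAG = tag_regex(r"[a-z][a-z0-9:._-]*")


def squash(text, tag_re):
    """Mimic the two relevant steps: remove tags, then remove whitespace."""
    text = tag_re.sub("", text)
    return re.sub(r"\s+", "", text)


# Sample link text; example.invalid is a placeholder host.
sample = (
    'www.<img moz-do-not-send="true" title="" '
    'alt="upgrade your email - click here!" '
    'src="http://example.invalid/x.gif">'
)

old_result = squash(sample, OLD_TAG)
new_result = squash(sample, NEW_TAG)
print("old:", old_result)
print("new:", new_result)
```

With the \w+ version the img tag survives and only the whitespace is
squeezed out, which matches the shape of the logged "claiming to be"
string. With the patched name pattern the tag is removed and only the
leading "www." remains.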

John.

-- 
-- Over 3000 webcams from ski resorts around the world - www.snoweye.com
-- Translate your technical documents and web pages    - www.tradoc.fr
-------------- next part --------------
--- ./Message.pm.orig	2007-07-18 08:44:27.000000000 +0200
+++ /usr/lib/MailScanner/MailScanner/Message.pm	2007-07-18 09:04:08.000000000 +0200
@@ -5832,7 +5832,7 @@
     $squashedtext =~ s/\\/\//g; # Change \ to / as many browsers do this
     $squashedtext =~ s/^\[\d*\]//; # Removing leading [numbers]
     #$squashedtext =~ s/(\<\/?[^>]*\>)*//ig; # Remove tags
-    $squashedtext =~ s/(\<\/?\w+((\s+\w+(\s*=\s*(?:\".*?\"|\'.*?\'|[^\'\">\s]+))?)+\s*|\s*)\/?\>)*//ig; # Remove tags, better re from snifer_ at hotmail.com
+    $squashedtext =~ s/(\<\/?[a-z][a-z0-9:._-]*((\s+[a-z][a-z0-9:._-]*(\s*=\s*(?:\".*?\"|\'.*?\'|[^\'\">\s]+))?)+\s*|\s*)\/?\>)*//ig; # Remove tags, re from snifer_ at hotmail.com adapted by JW 
     $squashedtext =~ s/\s+//g; # Remove any whitespace
     $squashedtext =~ s/^[^\/:]+\@//; # Remove username of email addresses
     #$squashedtext =~ s/\&\w*\;//g; # Remove things like &lt; and &gt;

