Phishing detection gets confused by malformed HTML

John Wilcock john at TRADOC.FR
Fri Feb 18 07:41:16 GMT 2005

John Wilcock wrote:
> <a href=""/><span style=invisible>
></span>yourbank.<span style=tiny> </span>com</a>
> I can't see any way you could detect things like that and worse, yet not
> trigger on my example above. Forget I asked - you're one step ahead of
> most of us as usual.

One false positive yesterday where I think your phishing net *has* been

> <a href= "">
> <font color="red"><b>MailScanner has detected a possible fraud attempt
> from "" claiming to be</b></font> To start up
> here, companies hire over there</a>

Presumably this is due to the comma in the link text, which could
conceivably be used to disguise a URL.

Idea: would it be possible to be more lenient with what you allow if
there's no markup within the <a> tag, and only apply your more
aggressive net if it looks like the sender is using markup (small size,
white on white or whatever) to disguise ordinary text as a URL, as in my
example quoted at the top of this message?
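
To make the first half of that idea concrete, here is a rough sketch in
Python (not MailScanner's own code, which I haven't looked at) of how one
might decide whether a link contains any markup at all. The class and
function names are made up for illustration, and treating <a .../> as an
ordinary opening tag is my assumption about how mail clients render it.

```python
from html.parser import HTMLParser


class InnerTagCounter(HTMLParser):
    """Counts tags that appear while an <a> element is open (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.depth = 0        # > 0 while we are inside an <a> element
        self.inner_tags = 0   # number of tags seen inside <a>...</a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.depth += 1
        elif self.depth > 0:
            self.inner_tags += 1

    def handle_startendtag(self, tag, attrs):
        # Assumption: renderers ignore the trailing slash on <a .../>,
        # so treat it as a normal opening tag rather than an empty element.
        self.handle_starttag(tag, attrs)

    def handle_endtag(self, tag):
        if tag == "a" and self.depth > 0:
            self.depth -= 1


def link_has_inner_markup(html):
    """Return True if any tag is nested inside an <a> element in html."""
    parser = InnerTagCounter()
    parser.feed(html)
    parser.close()
    return parser.inner_tags > 0
```

A link for which this returns False would get the lenient check; anything
with spans, fonts and so on inside the <a> would still get the full
aggressive treatment.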

By "more lenient" I'm thinking along the lines of only triggering if
there's something that looks more like a URL, e.g. require that there be
a dot followed by an actual TLD (either a known gTLD or any two-letter
ccTLD) in there.
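
Something like the following is what I have in mind for the lenient test,
again as a Python sketch only. The gTLD set is a small illustrative subset
that I typed in by hand, and looks_like_url is a hypothetical name; a real
implementation would want the full list and probably tighter matching.

```python
import re

# Illustrative subset of gTLDs only (assumption); not a complete list.
KNOWN_GTLDS = {"com", "net", "org", "info", "biz", "edu", "gov", "int", "mil"}

# A dot that follows a letter or digit, then a run of letters that is not
# followed by further hostname characters.
DOT_LABEL = re.compile(r"(?<=[a-z0-9])\.([a-z]{2,})(?![a-z0-9-])", re.IGNORECASE)


def looks_like_url(text):
    """Return True if text contains a dot followed by a plausible TLD."""
    for match in DOT_LABEL.finditer(text):
        label = match.group(1).lower()
        # Accept any two-letter label as a ccTLD, or a known gTLD.
        if len(label) == 2 or label in KNOWN_GTLDS:
            return True
    return False
```

With this, link text such as "To start up here, companies hire over there"
would not trigger, since it contains no dot at all, whereas
"www.yourbank.com" would. Accepting any two-letter label will of course
still catch some ordinary text, so this is a trade-off rather than a fix.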


