Phishing detection gets confused by malformed HTML

John Wilcock john at TRADOC.FR
Thu Feb 17 08:14:46 GMT 2005

Given the following input (admittedly malformed, but it occurred in a
genuine newsletter received by one of my users):

> <a href="
> <a
> href=""></a>

MailScanner flags this as a phish, but the warning it inserts is garbled:

> <a href="
> <a
> href=""><font color="red"><b>MailScanner has detected a possible fraud attempt from "<a
> href=" claiming to be</b></font></a>

The log entry is garbled too, and is split over two lines:

> Feb 17 08:53:27 gate MailScanner[4662]: Found phishing fraud from <a
> Feb 17 08:53:27 gate MailScanner[4662]: href= claiming to be in 676A6E100C.5A3EA
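
The split log entry suggests that the extracted link text still contained
a raw newline when it was written to syslog. A minimal sketch of flattening
such a string before logging, in Python for illustration only (MailScanner
itself is written in Perl, and flatten_for_log is a hypothetical helper, not
part of MailScanner):

```python
import re

def flatten_for_log(text: str) -> str:
    """Collapse whitespace runs (newlines included) and drop control characters."""
    # Replace every run of whitespace, newlines included, with one space.
    flat = re.sub(r"\s+", " ", text).strip()
    # Drop any remaining non-printable characters.
    return "".join(ch for ch in flat if ch.isprintable())

link = "<a\nhref="  # the link text as it appears in the log above
print("Found phishing fraud from %s" % flatten_for_log(link))
```

With this kind of normalisation the message would stay on a single syslog
line, whatever the link text contains.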

Also, in trying to reproduce this I noticed that the same input but
without the quote on the malformed leading <a> tag is detected as being
IP-based phishing.

> <a href=
> <a
> href=""></a>

Logged as:

> Feb 17 08:46:39 gate MailScanner[4662]: Found ip-based phishing fraud from <a in 3F5CEE100C.57EBC
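
Why "<a" is classed as IP-based is not clear from the log alone. One
possible guard, sketched in Python under the assumption that the check
receives the candidate host as a plain string (is_ip_literal is a
hypothetical name, not a MailScanner function), is to treat a link as
IP-based only when the host actually parses as an IP address:

```python
import ipaddress

def is_ip_literal(host: str) -> bool:
    """Return True only if host parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(host.strip())
    except ValueError:
        # Not a valid address, e.g. leftover tag text such as "<a".
        return False
    return True

for candidate in ("<a", "192.0.2.10", "www.example.com"):
    print(candidate, is_ip_literal(candidate))
```

Only the second candidate is reported as an IP address here; the stray tag
text and the ordinary hostname are both rejected.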

The HTML is completely malformed and doesn't result in a working link,
but you might like to take a fresh look at the code in case there are
other ways to craft a malformed link that does work yet still gets
past MailScanner.

