HTML Conditionals inside <a> tag mangled by phishing filter

Eric Wirt ewr at erols.com
Wed Sep 21 20:12:10 UTC 2016


I have some MS Outlook users who have been complaining to me about emails arriving mangled, and I finally took the time to dive into what was going on.  I am using MailScanner version 5.0.3.

What I found is that if an email contains HTML conditional comments inside an <a> tag, and any part of the email triggers the phishing filter, those conditionals can get mangled. The problem is mostly visible only to Outlook users, since Outlook is the mail client that is typically targeted with conditionals.  I came up with a small piece of stripped-down HTML that demonstrates the problem.

Unfortunately my Perl skills are severely lacking. I did peek around Message.pm and saw that it uses HTML::Parser to evaluate each tag, make changes (if necessary), and write the tags back out, but I didn't dig in far enough to determine whether this is an issue with HTML::Parser itself, with the way MailScanner rebuilds the email, or with something else.  I did check the HTML::Parser version on the server and upgraded it from 3.7.1 to 3.7.2, but that didn’t make a difference.
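
To illustrate the "parse each tag, then write it back out" pattern described above, here is a minimal sketch. It is an assumption-laden stand-in: it uses Python's standard html.parser module, not the Perl HTML::Parser module that MailScanner uses, and the names PassThrough, roundtrip and SAMPLE are hypothetical. It only shows that an event-driven rewriter keeps comments in place as long as every event, comments included, is emitted in the handler where it fires.

```python
from html.parser import HTMLParser


class PassThrough(HTMLParser):
    """Re-emit every parse event verbatim, in the order it fires."""

    def __init__(self):
        # convert_charrefs=False keeps entities such as &amp; as separate events.
        super().__init__(convert_charrefs=False)
        self.out = []

    def handle_starttag(self, tag, attrs):
        # get_starttag_text() returns the start tag exactly as it was written.
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags are written once, without a synthetic end tag.
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        self.out.append(f"</{tag}>")

    def handle_comment(self, data):
        # Conditional comments (<!--[if gte mso 9]> ... <![endif]-->) arrive
        # here as ordinary comments and are written out immediately.
        self.out.append(f"<!--{data}-->")

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")


def roundtrip(html):
    parser = PassThrough()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


SAMPLE = """<a href="http://google.com">
  <!--[if gte mso 9]>
    <img src="http://placehold.it/350x150">
    <div style="mso-hide:all;">
  <![endif]-->
  <img src="http://placehold.it/350x150">
  <!--[if gte mso 9]>
    </div>
  <![endif]-->
</a>
"""

# Compare the rebuilt markup with the input.
print(roundtrip(SAMPLE) == SAMPLE)
```

If a rewriter instead buffers some events (for example, everything inside an <a> tag) and flushes comments separately from tags, the order can change. Whether that is what happens in Message.pm is not something this sketch can confirm.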

Here is the "original" HTML body of an email.  I stripped it down to be easily readable, but in real life the point is to give Outlook different styling from other email clients in order to deal with its eccentricities.

ORIGINAL HTML:
<html>
<body>

<a href="http://google.com">
  <!--[if gte mso 9]>
    <img src="http://placehold.it/350x150">
    <div style="mso-hide:all;">
  <![endif]-->
  <img src="http://placehold.it/350x150">
  <!--[if gte mso 9]>
    </div>
  <![endif]-->
</a>

<a href="http://google.com">abc.com</a>

</body>
</html>

----------

If you send the above as an email through MailScanner (and leave in the second <a> tag, which triggers the phishing filter), the resulting output for the first <a> tag is below.  As you can see, the second conditional inside the <a> tag has been moved above the <img> tag, when it should still be below it.

<a href="http://google.com">
  <!--[if gte mso 9]>
    <img src="http://placehold.it/350x150">
    <div style="mso-hide:all;">
  <![endif]-->
  <!--[if gte mso 9]>
    </div>
  <![endif]-->
  <img src="http://placehold.it/350x150">
</a>

----------

In practice, it doesn’t matter where these structures appear in the HTML or how many of them there are: they all end up mangled, even though they are not the part of the email that actually triggers the phishing filter.

If I remove the bottom <a href="http://google.com">abc.com</a> so that the phishing filters don't trigger on the email, the email comes through correctly.

Also, if I remove the <a> tag from around the set of two conditionals, the email makes it through with the correct order intact (while still triggering the phishing filter), so the problem only seems to happen when the conditionals are embedded in an <a> tag.  It wouldn’t surprise me if other enclosing tags could trigger the same situation, but I haven’t done any testing on that yet.
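
For testing other enclosing tags, a small checker along these lines could compare a fragment before and after it passes through the filter. This is a sketch under stated assumptions: it uses Python's standard html.parser module, the names EventRecorder, events and first_divergence are hypothetical, and ORIGINAL and MANGLED are the two fragments quoted above. MailScanner's own phishing warning text would also show up as a difference, so it is best run on the fragment of interest and not on the whole message.

```python
from html.parser import HTMLParser


class EventRecorder(HTMLParser):
    """Record the order of tags, comments and text, ignoring whitespace."""

    def __init__(self):
        super().__init__(convert_charrefs=False)
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_comment(self, data):
        # Conditional comments are plain comments to the parser;
        # collapse internal whitespace so indentation changes are ignored.
        self.events.append(("comment", " ".join(data.split())))

    def handle_data(self, data):
        if data.strip():
            self.events.append(("text", data.strip()))


def events(html):
    recorder = EventRecorder()
    recorder.feed(html)
    recorder.close()
    return recorder.events


def first_divergence(before, after):
    """Return (index, event_before, event_after) at the first mismatch, or None."""
    a, b = events(before), events(after)
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i, x, y
    if len(a) != len(b):
        i = min(len(a), len(b))
        return i, (a[i] if i < len(a) else None), (b[i] if i < len(b) else None)
    return None


ORIGINAL = """<a href="http://google.com">
  <!--[if gte mso 9]>
    <img src="http://placehold.it/350x150">
    <div style="mso-hide:all;">
  <![endif]-->
  <img src="http://placehold.it/350x150">
  <!--[if gte mso 9]>
    </div>
  <![endif]-->
</a>"""

MANGLED = """<a href="http://google.com">
  <!--[if gte mso 9]>
    <img src="http://placehold.it/350x150">
    <div style="mso-hide:all;">
  <![endif]-->
  <!--[if gte mso 9]>
    </div>
  <![endif]-->
  <img src="http://placehold.it/350x150">
</a>"""

# Reports where the delivered fragment first departs from the original order.
print(first_divergence(ORIGINAL, MANGLED))
```

Swapping the <a> wrapper in ORIGINAL for another element (a <td> or <span>, say), sending it through, and pasting the delivered fragment into MANGLED would show whether the reordering is specific to <a>.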

Any suggestions on how to resolve this would be greatly appreciated.  Thanks!

Eric


