A difficult example of a false positive is:

Nov 19 04:15:18 cheviot5 MailScanner[14191]: Found phishing fraud from
support at for j.bloggs at
claiming to be emailsupport at

Is it impossible to parse this safely before comparing the strings?

A more common type of false positive is:

Nov 19 05:51:14 cheviot5 MailScanner[14163]: Found phishing fraud from claiming to be

I can see why you might be unwilling to remove the "www." from the actual
link before doing the comparison, but is it really that unsafe?
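
To make the suggestion concrete, here is a minimal sketch of the kind of
comparison I have in mind. It is not MailScanner's actual code (which is
Perl); the function names normalise_host and looks_like_phish and the
example.com hostnames are made up purely for illustration. It assumes the
two hostnames have already been extracted from the link and from the
displayed text.

```python
def normalise_host(host: str) -> str:
    """Lower-case a hostname, drop a trailing dot and one leading 'www.' label."""
    host = host.strip().lower().rstrip(".")
    # Only strip "www." when at least two labels remain afterwards,
    # so that something like "www.com" is left untouched.
    if host.startswith("www.") and host.count(".") >= 2:
        host = host[4:]
    return host


def looks_like_phish(link_host: str, claimed_host: str) -> bool:
    """Flag the link only if the hosts still differ after normalisation."""
    return normalise_host(link_host) != normalise_host(claimed_host)


# Hypothetical hostnames, for illustration only.
print(looks_like_phish("www.example.com", "example.com"))       # same site
print(looks_like_phish("www.example.com", "www.evil.example"))  # different site
```

The risk is that "www.example.com" and "example.com" could in principle be
served by different parties, so this trades a small rise in false negatives
for fewer false alarms.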

This is a good and useful feature, but its false positive rate is still
unacceptably high.

Could your editing of the strings in the hypertext link be done more
aggressively before the comparison? I know this risks raising the false
negative rate, but there are other detectors in MailScanner which you
acknowledge have a non-zero false negative rate.

I would be willing to see the false negative rate increase slightly in
order to reduce the number of times we cry "wolf!".

