Running html2text, but the e-mails are still not completely clean?

Remco Barendse mailscanner at BARENDSE.TO
Fri Jan 10 16:26:17 GMT 2003


I am trying out the html2text feature.

When I look through a mailbox I can see that not all of the HTML crap is
removed. The filtered e-mails are about half the size they were before
they went through the html2text filter, but there is still loads of crap
visible when looking at these mails in pine.

This problem mostly seems to occur when the sender is using M$ Word as
their e-mail editor for Outlook; everything else is filtered out pretty
nicely.

In pine loads of this chatter is visible:
@font-face { font-family: MS Mincho; } @font-face { font-family: @MS Mincho; } @page Section1
{size: 595.35pt 842.0pt; margin: 26.95pt 70.9pt 1.0in 70.9pt; mso-header-margin: .5in;
mso-footer-margin: .5in; mso-paper-source: 0; } P.MsoNormal { FONT-SIZE: 12pt; MARGIN: 0in 0in 0pt;
FONT-FAMILY: Arial; mso-style-parent: ""; mso-pagination: widow-orphan; mso-fareast-font-family:
"MS Mincho"; mso-bidi-font-family: "Times New Roman"; mso-ansi-language: NL; mso-fareast-language:
JA; mso-bidi-font-weight: bold } LI.MsoNormal { FONT-SIZE: 12pt; MARGIN: 0in 0in 0pt; FONT-FAMILY:
Arial; mso-style-parent: ""; mso-pagination: widow-orphan; mso-fareast-font-family: "MS Mincho";
mso-bidi-font-family: "Times New Roman"; mso-ansi-language: NL; mso-fareast-language: JA;

Is this a bug in the filter?
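
As a rough illustration of the behaviour I would expect, here is a minimal
Python sketch that drops everything inside <style> and <script> elements
while converting HTML to text. It is not the code MailScanner or html2text
actually uses. It also assumes that the Word stylesheet arrives inside a
<style> element, which I have not verified for these mails. The names
TextOnly and html_to_text are made up for the example.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collect visible text, skipping the contents of <style> and <script>."""

    SKIP = {"style", "script"}  # elements whose contents are never shown

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.skip_depth = 0  # > 0 while inside a skipped element

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are not inside <style> or <script>.
        if self.skip_depth == 0:
            self.parts.append(data)


def html_to_text(html):
    """Return the visible text of an HTML fragment with whitespace collapsed."""
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    # Join the text chunks with spaces, then collapse runs of whitespace.
    return " ".join(" ".join(parser.parts).split())


if __name__ == "__main__":
    sample = (
        "<html><head><style><!-- P.MsoNormal { FONT-SIZE: 12pt; "
        "mso-pagination: widow-orphan } --></style></head>"
        '<body><p class="MsoNormal">Hello <b>world</b></p></body></html>'
    )
    print(html_to_text(sample))
```

Joining the chunks with a space keeps adjacent paragraphs from running
together, at the cost of sometimes adding a space inside a word that has
markup in the middle of it.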




