Lynx for HTML -> text rendering
David H.
dh at UPTIME.AT
Sat Sep 27 17:40:15 IST 2003
Joe Baker wrote:
> I've noticed spam that inserts tags in the middle
> of words to defeat phrase detection.
>
> There's a great text only web browser I've used
> in the past called Lynx which might be used
> to weed out the html tags for phrase analysis.
>
> I'm not able to make this modification myself, but would only
> offer this up as a suggestion.
>
> Perhaps there could be an interactive lynx session/daemon
> which could be directed to parse text blocks rather than
> forcing Lynx to start up over and over again.
>
> -Joe Baker
Joe, there are HTML::Parser and HTML::TokeParser, among many others. I am
sure Julian will come up with an ingenious way of using them ;)
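
To illustrate the idea of stripping tags in-process before phrase analysis,
here is a rough sketch. It is not MailScanner code and it does not use the Perl
modules above: it uses Python's standard html.parser as a stand-in for the
same event-driven approach. The names TagStripper and strip_tags are made up
for this example.

```python
from html.parser import HTMLParser


class TagStripper(HTMLParser):
    """Collect only the text of an HTML fragment (hypothetical helper)."""

    def __init__(self):
        # convert_charrefs=True decodes entities such as &amp; into text.
        super().__init__(convert_charrefs=True)
        self._chunks = []

    def handle_data(self, data):
        # Called for text between tags. Tags and comments go to other
        # handlers, which do nothing by default, so they are dropped.
        self._chunks.append(data)

    def text(self):
        return "".join(self._chunks)


def strip_tags(html):
    """Return html with tags and comments removed (hypothetical helper)."""
    parser = TagStripper()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return parser.text()


if __name__ == "__main__":
    # Tags and comments inserted mid-word, as in the spam Joe describes.
    sample = "fr<b></b>ee mon<!-- x -->ey"
    print(strip_tags(sample))
```

Because the parser runs inside the scanning process, nothing has to be
started per message, which is the overhead Joe was worried about with Lynx.
Note that this sketch also keeps the contents of script and style elements as
text; a real filter would probably want to skip those.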
-d