Lynx for html -> text rendering

David H. dh at UPTIME.AT
Sat Sep 27 17:40:15 IST 2003


Joe Baker wrote:
> I've noticed spam that inserts tags in the middle
> of words to defeat phrase detection.
> 
> There's a great text-only web browser I've used
> in the past called Lynx which might be used
> to weed out the html tags for phrase analysis.
> 
> I'm not able to make this modification myself, but would only
> offer this up as a suggestion.
> 
> Perhaps there could be an interactive lynx session/daemon
> which could be directed to parse text blocks rather than
> forcing Lynx to start up over and over again.
> 
> -Joe Baker
Joe, there are HTML::Parser and HTML::TokeParser, among many others. I am 
sure Julian will come up with an ingenious way of using them ;)
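
To illustrate the idea of stripping tags in-process instead of starting
Lynx for every message, here is a minimal sketch. It is written in Python
with the standard library's html.parser, as an analogue of the Perl
modules named above, not their API. The names TextExtractor and
strip_tags are made up for this example, and skipping the contents of
script and style elements is an assumption about what phrase analysis
would want. It is not how MailScanner does it.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Hypothetical helper: collect text content and drop all tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are not inside a skipped element.
        if self._skip_depth == 0:
            self.parts.append(data)


def strip_tags(html):
    """Return the text of an HTML fragment with tags and comments removed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    # Join with no separator so words split by inserted tags are rejoined.
    return "".join(parser.parts)


print(strip_tags("Buy V<b></b>ia<i></i>gra n<!-- x -->ow"))
# prints: Buy Viagra now
```

Because the pieces are joined without a separator, a word broken up by
empty tags or comments comes back as one word, which is what phrase
detection needs.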

-d
