HTML image only spam and OCR

Ian cobalt-users1 at fishnet.co.uk
Thu Mar 9 17:14:28 GMT 2006


On 8 Mar 2006 at 11:14, Taso Chatziantoniou wrote:

<snip>

> Also, one other question:
> Does anyone know of a good site or forum where we can submit sample spams to help us figure
> out a way to block them? We keep getting these stock HTML image-only messages with Bayes poisoning
> at the bottom, and we cannot seem to find a pattern to block them on.

Hi,

After reading this bit, I thought about maybe using OCR when these types of messages are
found.

A (not-so) quick experiment using netpbm and gocr on a Linux machine here produces some
ASCII output from one of these GIF images.

The question is:  how can I get MailScanner / SpamAssassin to use this method?

The command line I am using is:


giftopnm test.gif | gocr -


which then produces the text on stdout.
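
As a rough sketch of how that pipeline might be wrapped for scripting (this is not a MailScanner or SpamAssassin integration, just the same two commands inside shell functions), something like the following could work. The function names and the TO_PNM / OCR override variables are made up for illustration and are not part of netpbm or gocr; the pattern to match is left to the caller.

```shell
#!/bin/sh
# Sketch only: wraps the "giftopnm | gocr -" pipeline from this post.
# TO_PNM and OCR are hypothetical override variables (handy for testing);
# by default the real giftopnm and gocr commands are used.

# Print the OCR text of one GIF file on stdout.
ocr_gif() {
    "${TO_PNM:-giftopnm}" < "$1" | "${OCR:-gocr}" -
}

# Exit 0 if the OCR text matches a case-insensitive extended regex, else non-zero.
# $1 is the GIF file, $2 is the pattern supplied by the caller.
gif_text_matches() {
    ocr_gif "$1" | grep -Eiq -- "$2"
}
```

For example, `gif_text_matches test.gif 'stock'` would succeed if the word "stock" shows up in the recognised text (the pattern here is only an illustration). Whatever hooks into the mail scanner would still have to pull the GIF out of the message first and decide what to do with the result.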

Thoughts, anyone?


Ian
-- 


