FuzzyOcr customisations

Gareth list-mailscanner at linguaphone.com
Wed May 9 10:49:45 IST 2007


Version 0.2 is now available, and the wrapper script takes the
threshold as a parameter. A third example image is included (and on
my website); it needs a slightly lower threshold to give the best
results.

Unfortunately, even with the cleaned images, ocrad and gocr still don't
appear to recognise the text very well. I think the problem with the
first two is that the letters touch each other, but I don't know why
the third doesn't work. Perhaps it's an issue with the font.
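
For anyone who wants to experiment with the idea without the tool itself,
the approach described in the quoted post below (a grayscale image of each
pixel's difference from the whole-image average colour, with R, G and B
handled separately) can be sketched roughly in Python. This is not the
gbpgmdiff code. The function names are made up for illustration, and two
things are assumptions: that the three channel differences are averaged
into one grey value, and that the threshold turns pixels at or above it
into black text on a white background.

```python
def channel_means(pixels):
    """Average R, G and B over the whole image, each channel separately."""
    count = sum(len(row) for row in pixels)
    totals = [0, 0, 0]
    for row in pixels:
        for r, g, b in row:
            totals[0] += r
            totals[1] += g
            totals[2] += b
    return tuple(t / count for t in totals)


def diff_image(pixels):
    """Grayscale image of each pixel's distance from the average colour.

    Assumption: the per-channel absolute differences are averaged into a
    single 0-255 grey value; the real tool may combine them differently.
    """
    mr, mg, mb = channel_means(pixels)
    return [
        [round((abs(r - mr) + abs(g - mg) + abs(b - mb)) / 3) for r, g, b in row]
        for row in pixels
    ]


def apply_threshold(gray, threshold):
    """Assumption: values at or above the threshold become black (0),
    everything else white (255)."""
    return [[0 if v >= threshold else 255 for v in row] for row in gray]


# Tiny 2x2 example: three light background pixels and one dark "text" pixel.
image = [
    [(200, 200, 200), (200, 200, 200)],
    [(200, 200, 200), (0, 0, 0)],
]
gray = diff_image(image)
cleaned = apply_threshold(gray, 100)
```

Because the background dominates the average, background pixels end up
with small differences and the rarer text pixels with large ones, which
is why the best threshold can vary from image to image.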


On Wed, 2007-05-09 at 09:16, --[ UxBoD ]-- wrote:
> Very clever indeed, Sir. That's really cool, got me thinking now ;)
> 
> On Tue, 8 May 2007 18:33:06 +0100, "Gareth" <list-mailscanner at linguaphone.com> wrote:
> > Thought people might be interested in the forum at
> > http://www.freespamfilter.org/forum/viewforum.php?f=25 where there are
> > some good tips for customising FuzzyOcr.
> > 
> > Today I have also had a stab at creating an image utility which can be
> > added to a scanset to hopefully improve its detection. Basically it
> > works by producing a grayscale image which contains the differences
> > between each pixel and the average colour over the whole image (R, G
> > and B calculated separately).
> > 
> > You can see a couple of examples and download and have a play with it
> > yourself on my webpage at
> > http://www.gbnetwork.co.uk/mailscanner/gbpgmdiff/
> > It is still very much a work in progress and I haven't even got round to
> > putting it into one of my scansets yet.
> > 
> > --
> > MailScanner mailing list
> > mailscanner at lists.mailscanner.info
> > http://lists.mailscanner.info/mailman/listinfo/mailscanner
> > 
> > Before posting, read http://wiki.mailscanner.info/posting
> > 
> > Support MailScanner development - buy the book off the website!
> > 
> > --
> > This message has been scanned for viruses and dangerous content by
> > MailScanner, and is
> > believed to be clean.
> -- 
> --[ UxBoD ]--
> // PGP Key: "curl -s http://www.splatnix.net/uxbod.asc | gpg --import"
> // Fingerprint: 543A E778 7F2D 98F1 3E50 9C1F F190 93E0 E8E8 0CF8
> // Keyserver: www.keyserver.net Key-ID: 0xE8E80CF8
> // Phone: +44 (0) 845 869 2749  SIP: uxbod at sip.splatnix.net
> 
> 


