Whitelists and Fuzzy

René Berber r.berber at computer.org
Thu Aug 23 19:08:57 IST 2007


Denis Beauchemin wrote:

> René Berber wrote:
>>
>> That's easy to fix, in FuzzyOcr.words change the line to:
>>
>> cialis::0.1
>>
>> Just to make the answer complete (most people probably already know this),
>> the above tells FuzzyOcr to match with high certainty; the example you name
>> does have 5 of the 6 letters of the word in order, so it is no wonder it
>> matches -- you may even have to lower the 0.1 to 0.01.
>>
>> In general I use a changed word list with short words and a few other words
>> that are (mis)matched easily using the lower fuzzy factor (higher
>> certainty).  A few other words I use with a high fuzzy factor.
>>   
> René,
> 
> Could you tell me where you found this information please?  I cannot
> find it anywhere...

In the configuration file:

# Default detection treshold (see manual)
# Default value: 0.25 (Can be changed on a per word basis in the wordlist).

There's no manual yet, but for more detail on how this "threshold" is used you
can read `perldoc String::Approx`.
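
To give a rough feel for why lowering the factor helps, here is a small
Python sketch.  It is only an illustration, not FuzzyOcr's actual code: I am
assuming the threshold acts as an upper bound on the edit distance relative
to the length of the listed word, and that whole words are compared.  The
function names and the sample word "cials" are made up for the example; only
the "word::threshold" entry format and the 0.25 default come from the
word list and the configuration file.

```python
DEFAULT_THRESHOLD = 0.25  # default from the configuration file comment


def parse_entry(line: str) -> tuple[str, float]:
    """Split a word-list entry like 'cialis::0.1' into (word, threshold)."""
    word, sep, factor = line.strip().partition("::")
    return word, float(factor) if sep else DEFAULT_THRESHOLD


def edit_distance(a: str, b: str) -> int:
    """Plain Levenshtein distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                # delete from a
                           cur[j - 1] + 1,             # insert into a
                           prev[j - 1] + (ca != cb)))  # substitute
        prev = cur
    return prev[-1]


def fuzzy_match(word: str, candidate: str, threshold: float) -> bool:
    """Assumed rule: match when the relative distance is below the threshold."""
    return edit_distance(word, candidate) / len(word) < threshold


word, strict = parse_entry("cialis::0.1")
loose_hit = fuzzy_match(word, "cials", DEFAULT_THRESHOLD)  # 1/6 < 0.25
strict_hit = fuzzy_match(word, "cials", strict)            # 1/6 >= 0.1
```

Under that assumption a 6-letter word with the default 0.25 tolerates one
wrong letter, while 0.1 tolerates none, which is why short words are the
ones worth listing with a lower factor.
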
-- 
René Berber


