Whitelists and Fuzzy
r.berber at computer.org
Thu Aug 23 19:08:57 IST 2007
Denis Beauchemin wrote:
> René Berber wrote:
>> That's easy to fix, in FuzzyOcr.words change the line to:
>> Just to make the answer complete (most people probably already know
>> this): the above tells FuzzyOcr to match with high certainty. The example
>> you name does have 5 of the 6 letters of the word in order, so it is no
>> wonder it matches; you may even have to lower the 0.1 to 0.01.
>> In general I use a modified word list in which short words, and a few
>> other words that are easily (mis)matched, get the lower fuzzy factor
>> (higher certainty). A few other words I use with a high fuzzy factor.
> Could you tell me where you found this information please? I cannot
> find it anywhere...
In the configuration file:
# Default detection treshold (see manual)
# Default value: 0.25 (Can be changed on a per word basis in the wordlist).
There's no manual yet, but for more detail on how this "threshold" is used you
can read `perldoc String::Approx`.
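
To give a rough idea of what such a threshold means, here is a small Python sketch (not FuzzyOcr's or String::Approx's actual code). It assumes the threshold is compared against the edit distance between the word and the best-matching piece of text, divided by the length of the word; the function names and the sample word are made up for illustration.

```python
def approx_distance(pattern: str, text: str) -> int:
    """Smallest edit distance between pattern and any substring of text."""
    # Row 0: an empty pattern matches at every position with cost 0.
    prev = [0] * (len(text) + 1)
    for i, pc in enumerate(pattern, start=1):
        # Column 0: i pattern characters against empty text cost i deletions.
        curr = [i] + [0] * len(text)
        for j, tc in enumerate(text, start=1):
            cost = 0 if pc == tc else 1
            curr[j] = min(prev[j] + 1,         # pattern character missing in text
                          curr[j - 1] + 1,     # extra character in text
                          prev[j - 1] + cost)  # match or substitution
        prev = curr
    # The match may end anywhere in the text.
    return min(prev)


def fuzzy_match(word: str, text: str, threshold: float = 0.25) -> bool:
    """Assumed rule: match when relative distance is below the threshold."""
    if not word:
        return False
    distance = approx_distance(word.lower(), text.lower())
    return distance / len(word) < threshold


if __name__ == "__main__":
    # Hypothetical case: 5 of the 6 letters of "banana" appear in order.
    sample = "xxbananxx"
    print(fuzzy_match("banana", sample))        # default threshold 0.25
    print(fuzzy_match("banana", sample, 0.1))   # stricter threshold
```

Under that assumption a 6-letter word with one letter missing has a relative distance of 1/6 (about 0.17), which is below the default 0.25 but above 0.1. Short words reach the default limit with very few differing letters, which is why I give them a lower factor.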