identical messages -- some get bayes score, some don't

Cannon Watts cwatts at
Mon Jan 12 17:33:52 GMT 2009

On Sun, January 11, 2009 12:31 pm, Kai Schaetzl wrote:
> Cannon Watts wrote on Sat, 10 Jan 2009 11:28:13 -0600 (CST):
>> Thanks, that certainly cuts down on the timeouts.  The URIBL tests are
>> still generating 281 timeouts on those 28 messages, but that's a minor
>> concern now that the bayes issues seem to be sorted out (see below).
> As said earlier, there is surely something wrong either with your dns
> setup or with your software (e.g. Net::DNS too old or so). Have you set
> dns_available yes or do you let SA check that? If set to yes set it to
> no and let SA show you the outcome.

I'll look into the Net::DNS module.  I have not tried setting dns_available
to 'no', but I did set it to 'test' and the debugging messages showed it
successfully contacting both DNS servers in my /etc/resolv.conf (the first
of those being the localhost).

>> I guess my database was either corrupt, or just too big.
> For being "too big" it should have had at least 5 million tokens (I
> haven't ever seen a database over that size, but I can say that databases
> in this range are still fine performance-wise).
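
For anyone who wants to compare a database against that 5 million token
figure, `sa-learn --dump magic` reports the token count. The sketch below
parses that output. The line layout in the regular expression is an
assumption based on the usual "non-token data" lines, the helper names are
hypothetical, and the sample numbers are made up rather than taken from the
database discussed in this thread.

```python
import re

# Threshold taken from the quoted remark above (about 5 million tokens).
LARGE_DB_TOKENS = 5_000_000


def parse_dump_magic(text):
    """Collect the 'non-token data' counters from `sa-learn --dump magic` output."""
    counters = {}
    for line in text.splitlines():
        # Assumed line shape: "0.000  0  <value>  0  non-token data: <name>"
        match = re.match(
            r"\s*\S+\s+\d+\s+(\d+)\s+\d+\s+non-token data: (.+?)\s*$", line
        )
        if match:
            counters[match.group(2)] = int(match.group(1))
    return counters


def is_large(counters):
    """True if the token count reaches the size range mentioned in the quote."""
    return counters.get("ntokens", 0) >= LARGE_DB_TOKENS


# Made-up sample for illustration only, not output from a real database.
SAMPLE = """\
0.000          0          3          0  non-token data: bayes db version
0.000          0      41250          0  non-token data: nspam
0.000          0      18730          0  non-token data: nham
0.000          0     612004          0  non-token data: ntokens
"""

if __name__ == "__main__":
    stats = parse_dump_magic(SAMPLE)
    print(stats["ntokens"], is_large(stats))
```

On a live system the text would come from running `sa-learn --dump magic`
as the user that owns the bayes database, instead of from SAMPLE.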

I'm not sure it's worth the time and effort to figure out _why_ the old
database was performing so poorly.  After removing it, and starting fresh,
every incoming mail appears to get a BAYES score, and where some users were
getting as many as 20 spams per day slipping through the filter, those same
users have not had one since rebuilding the database.

I am seeing a few false positives, but I think a little bayes re-training
will sort that out in short order.

Thanks again for your help.

More information about the MailScanner mailing list