identical messages -- some get bayes score, some don't

Kai Schaetzl maillists at conactive.com
Sun Jan 11 18:31:35 GMT 2009


Cannon Watts wrote on Sat, 10 Jan 2009 11:28:13 -0600 (CST):

> Thanks, that certainly cuts down on the timeouts.  The URIBL tests are
> still generating 281 timeouts on those 28 messages, but that's a minor
> concern now that the bayes issues seem to be sorted out (see below).

As said earlier, there is surely something wrong either with your DNS setup or 
with your software (e.g. Net::DNS too old). Have you set dns_available 
yes, or do you let SA check that? If it is set to yes, set it to no and let SA 
show you the outcome.
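
Both points can be checked quickly from a shell. This is only a minimal sketch: it assumes Net::DNS is installed for the perl that SA runs under, and the helper names and the config path in the usage line are illustrative, not something taken from your setup.

```shell
# Print the version of the installed Net::DNS perl module.
net_dns_version() {
    perl -MNet::DNS -e 'print "$Net::DNS::VERSION\n"'
}

# Print the last dns_available value found in a config file,
# or "unset" if the option does not appear there at all.
dns_available_setting() {
    value=$(awk '$1 == "dns_available" { v = $2 } END { if (v != "") print v }' "$1")
    echo "${value:-unset}"
}
```

For example, run net_dns_version, then dns_available_setting /etc/MailScanner/spam.assassin.prefs.conf (an assumed path; use whichever file holds your SA settings).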

> It probably averages around 6000 per day.

That's not much and should be ok even for the old server, given enough RAM.

> 'time spamassassin --lint'
> returns
>      real    0m2.450s
>      user    0m2.309s
>      sys     0m0.141s

Hm, I'm not sure whether timeouts would be counted in these figures at all. The 
figures look OK.

> I ran spamassassin --lint -D, and did find something peculiar in the output.
> 
>   dbg: bayes: tie-ing to DB file R/O /etc/MailScanner/bayes/bayes_toks
>   dbg: bayes: tie-ing to DB file R/O /etc/MailScanner/bayes/bayes_seen
>   .....
>   dbg: bayes: not available for scanning, only 0 spam(s) in bayes DB < 200
> 
> /etc/MailScanner/bayes is the correct location for those files, and sa-learn
> has been updating them without any errors, but something is obviously not
> right.

You may have trained the wrong files (ones belonging to a different user). With 
MS you have to set up a site-wide Bayes database.

> I moved the old bayes_toks and bayes_seen files, then fed bayes
> around 500 spams and hams via sa-learn to create a new database.
> 
> Now, running spamassassin on those 28 messages generates a BAYES_99 score
> for each one with no bayes timeouts.

Good.

> 
> I guess my database was either corrupt, or just too big.

To count as "too big" it would need at least 5 million tokens (I have never 
seen a database larger than that, but I can say that databases in this 
range still perform fine).


Kai

-- 
Kai Schätzl, Berlin, Germany
Get your web at Conactive Internet Services: http://www.conactive.com




