Bayes problem with MailScanner

Julian Field mailscanner at ecs.soton.ac.uk
Mon Mar 10 17:03:31 GMT 2003


Does it change if you do a
         sa-learn --rebuild
?

At 16:13 10/03/2003, you wrote:
>Hi,
>
>I am still trying to figure out why MailScanner is not using Bayes at
>the moment. Therefore I hacked the code a bit to write all SA debug
>output to a log file even when being called by MailScanner. Here is an
>interesting part:
>
>using "/usr/local/share/spamassassin" for default rules dir
>using "/usr/local/etc/mail/spamassassin" for site rules dir
>using "/var/spool/exim.in/.spamassassin" for user state dir
>using "/usr/local/MailScanner/etc/spam.assassin.prefs.conf" for user prefs file
>bayes: tie-ing to DB file R/O /var/spool/spamassassin/bayes_toks
>bayes: tie-ing to DB file R/O /var/spool/spamassassin/bayes_seen
>debug: Only 53 spam(s) in Bayes DB < 200
>
>So the MailScanner/SA combination thinks it only has 53 spams. But now
>have a look at this:
>
>root@proxy:/usr/ports/mail/p5-Mail-SpamAssassin/work/Mail-SpamAssassin-2.50/tools # ./check_bayes_db -db /var/spool/spamassassin/bayes
>0.000        0        0        0  non-token data: db format = on-the-fly probs, expiry, scan-counting
>0.000        0      269        0  non-token data: nspam
>0.000        0     2320        0  non-token data: nham
>0.000        0        0        0  non-token data: ntokens
>0.000        0        0        0  non-token data: oldest age
>0.000        0      270        0  non-token data: current scan-count
>0.000        0        0        0  non-token data: last expiry scan-count
>
>Or this:
>
>root@proxy:/tmp # spamassassin -t < 1047306210_0.78770.proxy.intern.akctech.de
>debug: using "/usr/local/etc/mail/spamassassin" for site rules dir
>debug: using "/root/.spamassassin" for user state dir
>debug: using "/root/.spamassassin/user_prefs" for user prefs file
>debug: bayes: tie-ing to DB file R/O /var/spool/spamassassin/bayes_toks
>debug: bayes: tie-ing to DB file R/O /var/spool/spamassassin/bayes_seen
>debug: Score set 3 chosen.
>--- snipp ---
>debug: bayes corpus size: nspam = 269, nham = 2320
>
>And of course Bayes is used by spamassassin -t.
>
>I simply do not see the difference. Both ways obviously use the same
>database. Why does the SA/MS combination say 53 spams in the DB?
>
>BTW: /var/spool/exim.in/.spamassassin and /root/.spamassassin are
>identical, so that should not be it.
>
>Regards,
>    JP
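
To compare the two debug outputs quoted above before and after a rebuild, the counts can be pulled out mechanically. This is only a rough sketch: the function name bayes_counts and the two sample strings are made up for illustration, and the patterns assume the debug lines look exactly like the ones quoted in this message.

```python
import re

# Patterns for the two debug line shapes quoted in this thread (assumed stable).
ONLY_RE = re.compile(r"Only (\d+) spam\(s\) in Bayes DB < (\d+)")
CORPUS_RE = re.compile(r"bayes corpus size: nspam = (\d+), nham = (\d+)")


def bayes_counts(debug_text):
    """Return the Bayes counts found in a block of SpamAssassin debug output.

    Hypothetical helper, not part of SpamAssassin or MailScanner.
    """
    counts = {}
    for line in debug_text.splitlines():
        m = ONLY_RE.search(line)
        if m:
            # "Only N spam(s) in Bayes DB < MIN": too few spams to use Bayes.
            counts["nspam"] = int(m.group(1))
            counts["min_spam"] = int(m.group(2))
        m = CORPUS_RE.search(line)
        if m:
            counts["nspam"] = int(m.group(1))
            counts["nham"] = int(m.group(2))
    return counts


# Sample input copied from the quoted output above.
mailscanner_log = """\
bayes: tie-ing to DB file R/O /var/spool/spamassassin/bayes_toks
debug: Only 53 spam(s) in Bayes DB < 200
"""
cli_log = """\
debug: bayes: tie-ing to DB file R/O /var/spool/spamassassin/bayes_toks
debug: bayes corpus size: nspam = 269, nham = 2320
"""

ms = bayes_counts(mailscanner_log)
cli = bayes_counts(cli_log)
print("MailScanner run:", ms)
print("spamassassin -t run:", cli)
print("spam counts differ:", ms["nspam"] != cli["nspam"])
```

Running the same check on fresh debug output after the rebuild would show whether the two code paths then agree on the spam count.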

--
Julian Field
www.MailScanner.info
MailScanner thanks transtec Computers for their support


