Bayesian shenanigans (i.e. problems)

Peter Bates Peter.Bates at LSHTM.AC.UK
Thu Jan 15 09:50:09 GMT 2004

Hello all...

Just before Christmas (a great time for things to happen!), I had
general problems with SpamAssassin timing out constantly.
This was naturally leading to a lot of unwanted material sneaking
through.

I upgraded to MS 4.25 (RPM version), and SA 2.61, but eventually
shifted to disabling Bayes with 'use_bayes 0'. I'm also using DCC and
Razor, so thought my 'hit-rate' would still be reasonable...

I'm running with Postfix, hence all the files being owned by postfix.

I have in spam.assassin.prefs.conf:

bayes_path                 /var/spool/MailScanner/spamassassin/bayes
bayes_file_mode            0600

Here's an 'ls -lh':

-rw-------    1 postfix  postfix       661 Dec 27 23:06 bayes_journal
-rw-r--r--    1 postfix  postfix       40M Dec 27 23:06 bayes_seen
-rw-------    1 postfix  postfix      265M Dec 27 23:06 bayes_toks
-rw-------    1 postfix  postfix      2.7G Dec 27 23:01
-rw-r--r--    1 postfix  postfix      4.8M Oct 15 09:22 old_bayes_seen
-rw-r--r--    1 postfix  postfix       22M Oct 15 09:22 old_bayes_toks

This system has only been auto-learning, and I've also tried sa-learn

Are these unreasonable sizes? Should I be setting some other
configuration parameter to ensure smaller sizes? Which of these files
(presumably not the 2.7G one!) is actually being used anyway?
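
To make that concrete, these are the sort of settings I have in mind.
The option names are as I understand them from the SA 2.6x
Mail::SpamAssassin::Conf documentation, and the value shown is only what
I believe to be the default, so please treat this as an untested guess
rather than something I am running:

# assumption: upper limit on the number of tokens kept in bayes_toks
bayes_expiry_max_db_size   150000
# assumption: let SA expire old tokens by itself when it gets the chance
bayes_auto_expire          1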

I still have Bayes off for now, but would like to reinstate it. At the
moment I'm almost tempted to have each message run once through the
Bayes stuff, and then be run again through SA even if that times
out... with Bayes on, the timeouts were very clear to see.

... any advice would be most appreciated!

Peter Bates, Systems Support Officer, Network Support Team.
London School of Hygiene & Tropical Medicine.
Telephone: 0207-958 8353 / Fax: 0207-636 9838

