Bayesian shenanigans (i.e. problems)

Martin Hepworth martinh at SOLID-STATE-LOGIC.COM
Tue Jan 20 10:05:29 GMT 2004

David Lee wrote:
> On Fri, 16 Jan 2004, Peter Bates wrote:
>>>mkettler at EVI-INC.COM 15/01/04 15:56:50 >>>
>>At 05:38 AM 1/15/2004, David Lee wrote:
>>>"Me, too!" (bayes_toks ~ 50MB, ~ 1.4GB).  Glad I'm not
>>>Yes those sizes are unreasonable... It sounds like expiry is never run
>>>on your system.
>>>Try running expiry manually using sa-learn --force-expire and see if it
>>>clears things up.
>>Well, I've done a --force-expire, and got:
>>-rw-r--r--    1 postfix  postfix       40M Jan 16 14:51 bayes_seen
>>-rw-------    1 postfix  postfix      123k Jan 16 14:51 bayes_journal
>>-rw-------    1 postfix  postfix      265M Jan 16 14:51 bayes_toks
>>-rw-------    1 postfix  postfix      2.7G Jan 16 13:08
>>-rw-r--r--    1 postfix  postfix      4.8M Oct 15 09:22 old_bayes_seen
>>-rw-r--r--    1 postfix  postfix       22M Oct 15 09:22 old_bayes_toks
>>now... and my SA/MS is timing out once again, now I've re-enabled Bayes
>>with use_bayes...
>>I'm almost tempted to have a normal SA run without Bayes, and then use
>>MCP to reprocess the message again with Bayes (or vice versa)... the
>>fact that the Bayes is making it time out, and then effectively timing
>>out the rest of the stuff despite it probably being 'positive' in a lot
>>of cases is proving far from jolly...
> Hmmm... "sa-learn --force-expire --rebuild", for SA 2.61, seems to help
> sometimes.  But that is soon to be history, replaced by another problem!
> Executive warning:  If you were suffering from this problem, and are
> thinking of moving to 2.62, then check the following beforehand.
> At 2.62, the SA folk seem to have recognised the 2.61 "bayes_toks"
> problem, and instead of "" are now using filename patterns
> "bayes_toks.expire$$" (where $$ is the process id).  (Do a diff of the
> 2.61 and 2.62 versions of "lib/Mail/SpamAssassin/".)
> BUT... the result is that instead of one huge "" file, there
> now seem to be an increasing number of orphaned "bayes_toks.expire$$"
> files.  (Given that $$ could typically span all integers up to 30,000, the
> accumulating disk usage results could become 'interesting'...)
> I realise such SA details take us somewhat off-topic from strict
> MailScanner.  But has anyone here got any experience of this with SA 2.62,
> or monitoring it on SA lists?  (Perhaps I need to rejoin an SA list or at
> least ferret through their recent archives...)

Can't say that I've seen this (1) on my server or (2) on the sa-talk list.

Perhaps you need to get back on the sa-talk list and ask them??
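
In the meantime, if you want to see how many of those "bayes_toks.expire$$"
files have piled up and how much disk they are taking, something like the
rough Python sketch below might do. It is only an illustration, not anything
shipped with SA: the function names are made up, the filename pattern is
taken purely from your description above, and you pass the Bayes directory
yourself. It only lists files and never deletes anything, since it cannot
tell an orphaned file from one belonging to an expiry run still in progress.

```python
import os
import re
import sys

# Filename pattern as described in the quoted message: "bayes_toks.expire$$",
# where $$ is a process id. This is an assumption based on that description.
EXPIRE_RE = re.compile(r"^bayes_toks\.expire(\d+)$")


def find_expire_files(bayes_dir):
    """Return a sorted list of (filename, pid, size_in_bytes) tuples for
    every regular file in bayes_dir that matches the expire pattern."""
    found = []
    for name in os.listdir(bayes_dir):
        match = EXPIRE_RE.match(name)
        path = os.path.join(bayes_dir, name)
        if match and os.path.isfile(path):
            found.append((name, int(match.group(1)), os.path.getsize(path)))
    return sorted(found)


def total_size(entries):
    """Sum the sizes (in bytes) of the entries from find_expire_files()."""
    return sum(size for _, _, size in entries)


if __name__ == "__main__":
    # The directory comes from the command line; default to the current one.
    bayes_dir = sys.argv[1] if len(sys.argv) > 1 else "."
    entries = find_expire_files(bayes_dir)
    for name, pid, size in entries:
        print(f"{name}\tpid={pid}\t{size} bytes")
    print(f"{len(entries)} file(s), {total_size(entries)} bytes in total")
```

Run it with the directory holding your bayes_* files as the only argument.
Whether a listed file is really orphaned still needs checking by hand, for
example by seeing whether a process with that pid is still running.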

Martin Hepworth
Snr Systems Administrator
Solid State Logic
Tel: +44 (0)1865 842300



