Bayesian shenanigans (i.e. problems)
t.d.lee at DURHAM.AC.UK
Thu Jan 22 09:58:52 GMT 2004
On Tue, 20 Jan 2004, Martin Hepworth wrote:
> David Lee wrote:
> > [...]
> > At 2.62, the SA folk seem to have recognised the 2.61 "bayes_toks"
> > problem, and instead of "bayes_toks.new" are now using filename patterns
> > "bayes_toks.expire$$" (where $$ is the process id). (Do a diff of the
> > 2.61 and 2.62 versions of "lib/Mail/SpamAssassin/BayesStore.pm".)
> > BUT... the result is that instead of one huge "bayes_toks.new" file, there
> > now seem to be an increasing number of orphaned "bayes_toks.expire$$"
> > files. (Given that $$ could typically span all integers up to 30,000, the
> > accumulating disk usage results could become 'interesting'...)
> > I realise such SA details take us somewhat off-topic from strict
> > MailScanner. But has anyone here got any experience of this with SA 2.62,
> > or monitoring it on SA lists? (Perhaps I need to rejoin an SA list or at
> > least ferret through their recent archives...)
> Can't say that (1) I've seen this on my server or (2) on the sa-talk list.
> Perhaps you need to get back on the sa-talk list and ask them??
Thanks, Martin. I posted a note on sa-talk a couple of days ago, but had
not one reply.
But I think we need to come back to MS despite my earlier thought that
this SA/bayes thing might be taking us somewhat off-topic.
Meanwhile, looking deeper locally, I had seen some things which suggest
that the problem may actually be MS's, or at least its use of SA. We
(durham.ac.uk) have 3 MX records: two of equal low-value (preferred), and
one of higher value (i.e. quasi-backup, our production-test). As far as
we know, all are identically configured.
But we only see the problem on the two main, busy servers, not on the
lightly-loaded background one. In addition (and here's the clincher which
pulls us back to MS, or at least MS-triggering):
1. The busy servers, which suffer from this problem, have many "maillog"
entries of the form "MailScanner[...]: Delete bayes lockfile for $$"
(where "$$" looks like a process number), and have these orphaned files
called "bayes_toks.expire$$" (same value "$$").
2. The backup, quiet server has no such maillog messages, and no such
orphaned files.
So there is clearly something in MS's use of SA on busy machines (in a
timeout/locking-like area) that is causing these orphaned files (SA 2.62),
and presumably the equivalent huge "bayes_toks.new" file (SA 2.61).
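As a first step in tracing, the correlation in points 1 and 2 could be
checked mechanically rather than by eye. The following is only a rough
sketch in Python, under assumptions: it assumes the "[...]" in the maillog
entry is a numeric process id, that the orphaned files sit directly in one
bayes directory, and the function names (correlate, report) are made up
for illustration. Neither the log path nor the bayes directory is known
here, so both are left as parameters.

```python
import re
from pathlib import Path

# Assumed log line shape, based on the maillog entries described above:
#   MailScanner[<pid>]: Delete bayes lockfile for <pid>
LOCK_RE = re.compile(r"MailScanner\[\d+\]: Delete bayes lockfile for (\d+)")

# Orphaned file names of the form bayes_toks.expire<pid>
EXPIRE_RE = re.compile(r"^bayes_toks\.expire(\d+)$")


def pids_from_maillog(lines):
    """Collect the process ids named in 'Delete bayes lockfile' entries."""
    pids = set()
    for line in lines:
        match = LOCK_RE.search(line)
        if match:
            pids.add(int(match.group(1)))
    return pids


def pids_from_filenames(names):
    """Collect the process ids embedded in bayes_toks.expire<pid> names."""
    pids = set()
    for name in names:
        match = EXPIRE_RE.match(name)
        if match:
            pids.add(int(match.group(1)))
    return pids


def correlate(log_lines, file_names):
    """Split process ids into: in both places, log only, file only."""
    logged = pids_from_maillog(log_lines)
    orphaned = pids_from_filenames(file_names)
    return {
        "both": sorted(logged & orphaned),
        "log_only": sorted(logged - orphaned),
        "file_only": sorted(orphaned - logged),
    }


def report(log_path, bayes_dir):
    """Hypothetical helper: read a maillog and a bayes directory from disk."""
    names = [p.name for p in Path(bayes_dir).iterdir()]
    with open(log_path, errors="replace") as handle:
        return correlate(handle, names)
```

If "file_only" comes back empty on the busy servers, every orphaned file
has a matching lockfile-deletion entry, which would support the idea that
the lockfile deletion interrupts an expiry run. A non-empty "log_only"
would not contradict that, since maillog rotation and process id reuse
could both account for it.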
Thoughts, anyone? How to begin to try to trace this??
: David Lee I.T. Service :
: Systems Programmer Computer Centre :
: University of Durham :
: http://www.dur.ac.uk/t.d.lee/ South Road :
: Durham :
: Phone: +44 191 334 2752 U.K. :