bayes expire tokens
Chris Conn
cconn at ABACOM.COM
Wed Mar 9 15:45:31 GMT 2005
Kai Schaetzl wrote:
> Chris, simply run a "sa-learn --force-expire" and see what you get as an
> output. If it has a problem rebuilding it may hang for quite a while, just
> let it run, it will eventually finish. When it says it could not rebuild
> then you have a (known) problem. You can then either throw the db away or
> dig (long) in the spamassassin mailing list archives for help. From your
> explanation it looks like your db is quite old and you probably added much
> more spam early on than later. Then it's likely that SA cannot
> create a good delta for expiring (because it would expire many more tokens
> than it is supposed to) and goes into a trial loop to find one. However,
> in most cases it won't find a good one. Those expire files are from such
> attempts. They took so long that the MS timeout was reached and the
> process was called off. This is not a MailScanner problem, it's due to the
> expiry algorithm used in SA.
>
> Kai
>
sa-learn works fine:
synced Bayes databases from journal in 1 seconds: 2720 unique entries (4235 total entries)
................................................................................
................................................................................
................................................................................
................................................................................
................................................................................
................................................................................
................................................................................
...........................................
expired old Bayes database entries in 114 seconds
228547 entries kept, 375363 deleted
token frequency: 1-occurence tokens: 63.13%
token frequency: less than 8 occurrences: 20.49%
That is 114 seconds, and it ran while MailScanner was running: no
timeouts and no dead expire files.
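To put the numbers above in perspective against Kai's point about expiring
too many tokens at once, here is a minimal sketch that pulls the kept and
deleted counts out of the summary line and works out what share of the
database was expired. The function name expired_fraction is made up for
illustration; only the summary line itself comes from the sa-learn output
above.

```python
import re

def expired_fraction(summary):
    """Return deleted / (kept + deleted) from an sa-learn expiry summary line."""
    m = re.search(r"(\d+) entries kept, (\d+) deleted", summary)
    if m is None:
        raise ValueError("no 'entries kept, ... deleted' line found")
    kept, deleted = int(m.group(1)), int(m.group(2))
    return deleted / (kept + deleted)

# Summary line copied from the sa-learn run above.
frac = expired_fraction("228547 entries kept, 375363 deleted")
print(f"{frac:.1%} of tokens expired in this run")
```

By this count, roughly 62% of the tokens were removed in a single forced
expiry.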
============
What I find curious as well is that in the logs, MailScanner claims to
have finished the rebuild:
Mar 8 21:59:59 mx2 MailScanner[23057]: SpamAssassin Bayes database rebuild completed
However, all of these expire toks files are created AFTER the expiry has
supposedly finished, and each file is created _after_ a logged timeout:
Mar 8 22:10:57 mx2 MailScanner[23057]: SpamAssassin timed out and was killed, failure 0 of 20
Mar 8 22:16:14 mx2 MailScanner[23057]: SpamAssassin timed out and was killed, failure 0 of 20
Mar 8 22:21:38 mx2 MailScanner[23057]: SpamAssassin timed out and was killed, failure 0 of 20
Mar 8 22:26:38 mx2 MailScanner[23057]: SpamAssassin timed out and was killed, failure 0 of 20
Mar 8 22:32:08 mx2 MailScanner[23057]: SpamAssassin timed out and was killed, failure 0 of 20
Mar 8 22:53:34 mx2 MailScanner[23057]: SpamAssassin timed out and was killed, failure 0 of 20
Mar 8 22:59:11 mx2 MailScanner[23057]: SpamAssassin timed out and was killed, failure 0 of 20
Mar 8 23:04:11 mx2 MailScanner[23057]: SpamAssassin timed out and was killed, failure 0 of 20
Mar 8 23:09:47 mx2 MailScanner[23057]: SpamAssassin timed out and was killed, failure 0 of 20
Mar 8 23:19:49 mx2 MailScanner[23057]: SpamAssassin timed out and was killed, failure 0 of 20
Mar 8 23:25:32 mx2 MailScanner[23057]: SpamAssassin timed out and was killed, failure 0 of 20
Mar 8 23:30:39 mx2 MailScanner[23057]: SpamAssassin timed out and was killed, failure 0 of 20
Mar 8 23:41:18 mx2 MailScanner[23057]: SpamAssassin timed out and was killed, failure 0 of 20
Mar 8 23:46:28 mx2 MailScanner[23057]: SpamAssassin timed out and was killed, failure 0 of 20
Mar 9 00:07:16 mx2 MailScanner[23057]: SpamAssassin timed out and was killed, failure 0 of 20
Mar 9 00:12:26 mx2 MailScanner[23057]: SpamAssassin timed out and was killed, failure 0 of 20
Mar 9 00:17:31 mx2 MailScanner[23057]: SpamAssassin timed out and was killed, failure 0 of 20
Mar 9 00:22:36 mx2 MailScanner[23057]: SpamAssassin timed out and was killed, failure 0 of 20
Mar 9 00:28:11 mx2 MailScanner[23057]: SpamAssassin timed out and was killed, failure 0 of 20
Mar 9 00:33:42 mx2 MailScanner[23057]: SpamAssassin timed out and was killed, failure 0 of 20
Mar 9 00:39:04 mx2 MailScanner[23057]: SpamAssassin timed out and was killed, failure 0 of 20
Mar 9 00:45:03 mx2 MailScanner[23057]: SpamAssassin timed out and was killed, failure 0 of 20
Mar 9 00:50:10 mx2 MailScanner[23057]: SpamAssassin timed out and was killed, failure 0 of 20
Mar 9 00:55:31 mx2 MailScanner[23057]: SpamAssassin timed out and was killed, failure 0 of 20
Mar 9 01:01:05 mx2 MailScanner[23057]: SpamAssassin timed out and was killed, failure 0 of 20
Mar 9 01:06:36 mx2 MailScanner[23057]: SpamAssassin timed out and was killed, failure 0 of 20
Mar 9 01:11:45 mx2 MailScanner[23057]: SpamAssassin timed out and was killed, failure 0 of 20
Mar 9 01:17:26 mx2 MailScanner[23057]: SpamAssassin timed out and was killed, failure 0 of 20
Mar 9 01:37:39 mx2 MailScanner[23057]: SpamAssassin timed out and was killed, failure 0 of 20
Mar 9 01:42:51 mx2 MailScanner[23057]: SpamAssassin timed out and was killed, failure 0 of 20
Mar 9 01:48:37 mx2 MailScanner[23057]: SpamAssassin timed out and was killed, failure 0 of 20
Mar 9 01:53:44 mx2 MailScanner[23057]: SpamAssassin timed out and was killed, failure 0 of 20
Mar 9 01:58:50 mx2 MailScanner[23057]: SpamAssassin timed out and was killed, failure 0 of 20
My MailScanner-induced expiry ran for a total of 1.5 minutes and ENDED
AT 10pm, yet these files are still being created 3 hours later?
Then it eventually stops, until the next day.
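To see the pattern in those timeouts more clearly, here is a rough sketch
that picks the "timed out and was killed" lines out of a log and prints the
gap in seconds between consecutive ones. It assumes each log entry is on a
single line (unwrapped), and since syslog timestamps carry no year, it
assumes 2005, the year of this post. The names TIMEOUT_RE and timeout_gaps
are made up for illustration; the sample lines are the first few entries
from the excerpt above.

```python
import re
from datetime import datetime

# Matches e.g. "Mar 8 22:10:57 mx2 MailScanner[23057]: SpamAssassin timed out and was killed, ..."
TIMEOUT_RE = re.compile(
    r"^(\w{3}\s+\d+ \d\d:\d\d:\d\d) \S+ MailScanner\[\d+\]: "
    r"SpamAssassin timed out and was killed"
)

def timeout_gaps(lines, year=2005):
    """Return the seconds between consecutive SpamAssassin timeout entries."""
    stamps = []
    for line in lines:
        m = TIMEOUT_RE.match(line)
        if m:
            # Collapse "Mar  8" style padding, then add the assumed year.
            stamp = " ".join(m.group(1).split())
            stamps.append(datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S"))
    return [(b - a).total_seconds() for a, b in zip(stamps, stamps[1:])]

sample = [
    "Mar 8 21:59:59 mx2 MailScanner[23057]: SpamAssassin Bayes database rebuild completed",
    "Mar 8 22:10:57 mx2 MailScanner[23057]: SpamAssassin timed out and was killed, failure 0 of 20",
    "Mar 8 22:16:14 mx2 MailScanner[23057]: SpamAssassin timed out and was killed, failure 0 of 20",
    "Mar 8 22:21:38 mx2 MailScanner[23057]: SpamAssassin timed out and was killed, failure 0 of 20",
]
gaps = timeout_gaps(sample)
print(gaps)
```

Most of the gaps in the excerpt above come out at between five and six
minutes, with a few longer pauses of ten or twenty minutes.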
Chris