bayes expire tokens

Chris Conn cconn at ABACOM.COM
Wed Mar 9 15:45:31 GMT 2005


Kai Schaetzl wrote:
> Chris, simply run a "sa-learn --force-expire" and see what you get as an
> output. If it has a problem rebuilding it may hang for quite a while, just
> let it run, it will eventually finish. When it says it could not rebuild
> then you have a (known) problem. You can then either throw the db away or
> dig (long) in the spamassassin mailing list archives for help. From your
> explanation it looks like your db is quite old and you probably added much
> more spam in its early time than later on. Then it's likely that SA cannot
> create a good delta for expiring (because it would expire many more tokens
> than it is supposed to) and goes into a trial loop to find one. However,
> in most cases it won't find a good one. Those expire files are from such
> attempts. They took so long that the MS timeout was reached and the
> process called off. This is not a MailScanner problem, it's due to the
> expiry algorithm used in SA.
>
> Kai
>

sa-learn works fine:

synced Bayes databases from journal in 1 seconds: 2720 unique entries
(4235 total entries)

................................................................................
................................................................................
................................................................................
................................................................................
................................................................................
................................................................................
................................................................................
...........................................expired old Bayes database
entries in 114 seconds
228547 entries kept, 375363 deleted
token frequency: 1-occurence tokens: 63.13%
token frequency: less than 8 occurrences: 20.49%

114 seconds.

Even while MailScanner was running, no timeouts, no expire dead files.

============

What I find curious as well is that in the logs, MailScanner claims to
have finished the rebuild:

Mar  8 21:59:59 mx2 MailScanner[23057]: SpamAssassin Bayes database
rebuild completed

HOWEVER

all of these expire toks files are created AFTER the expiry is
supposedly finished. And each file is created _after_ one of these
logged timeouts:

Mar  8 22:10:57 mx2 MailScanner[23057]: SpamAssassin timed out and was
killed, failure 0 of 20
Mar  8 22:16:14 mx2 MailScanner[23057]: SpamAssassin timed out and was
killed, failure 0 of 20
Mar  8 22:21:38 mx2 MailScanner[23057]: SpamAssassin timed out and was
killed, failure 0 of 20
Mar  8 22:26:38 mx2 MailScanner[23057]: SpamAssassin timed out and was
killed, failure 0 of 20
Mar  8 22:32:08 mx2 MailScanner[23057]: SpamAssassin timed out and was
killed, failure 0 of 20
Mar  8 22:53:34 mx2 MailScanner[23057]: SpamAssassin timed out and was
killed, failure 0 of 20
Mar  8 22:59:11 mx2 MailScanner[23057]: SpamAssassin timed out and was
killed, failure 0 of 20
Mar  8 23:04:11 mx2 MailScanner[23057]: SpamAssassin timed out and was
killed, failure 0 of 20
Mar  8 23:09:47 mx2 MailScanner[23057]: SpamAssassin timed out and was
killed, failure 0 of 20
Mar  8 23:19:49 mx2 MailScanner[23057]: SpamAssassin timed out and was
killed, failure 0 of 20
Mar  8 23:25:32 mx2 MailScanner[23057]: SpamAssassin timed out and was
killed, failure 0 of 20
Mar  8 23:30:39 mx2 MailScanner[23057]: SpamAssassin timed out and was
killed, failure 0 of 20
Mar  8 23:41:18 mx2 MailScanner[23057]: SpamAssassin timed out and was
killed, failure 0 of 20
Mar  8 23:46:28 mx2 MailScanner[23057]: SpamAssassin timed out and was
killed, failure 0 of 20
Mar  9 00:07:16 mx2 MailScanner[23057]: SpamAssassin timed out and was
killed, failure 0 of 20
Mar  9 00:12:26 mx2 MailScanner[23057]: SpamAssassin timed out and was
killed, failure 0 of 20
Mar  9 00:17:31 mx2 MailScanner[23057]: SpamAssassin timed out and was
killed, failure 0 of 20
Mar  9 00:22:36 mx2 MailScanner[23057]: SpamAssassin timed out and was
killed, failure 0 of 20
Mar  9 00:28:11 mx2 MailScanner[23057]: SpamAssassin timed out and was
killed, failure 0 of 20
Mar  9 00:33:42 mx2 MailScanner[23057]: SpamAssassin timed out and was
killed, failure 0 of 20
Mar  9 00:39:04 mx2 MailScanner[23057]: SpamAssassin timed out and was
killed, failure 0 of 20
Mar  9 00:45:03 mx2 MailScanner[23057]: SpamAssassin timed out and was
killed, failure 0 of 20
Mar  9 00:50:10 mx2 MailScanner[23057]: SpamAssassin timed out and was
killed, failure 0 of 20
Mar  9 00:55:31 mx2 MailScanner[23057]: SpamAssassin timed out and was
killed, failure 0 of 20
Mar  9 01:01:05 mx2 MailScanner[23057]: SpamAssassin timed out and was
killed, failure 0 of 20
Mar  9 01:06:36 mx2 MailScanner[23057]: SpamAssassin timed out and was
killed, failure 0 of 20
Mar  9 01:11:45 mx2 MailScanner[23057]: SpamAssassin timed out and was
killed, failure 0 of 20
Mar  9 01:17:26 mx2 MailScanner[23057]: SpamAssassin timed out and was
killed, failure 0 of 20
Mar  9 01:37:39 mx2 MailScanner[23057]: SpamAssassin timed out and was
killed, failure 0 of 20
Mar  9 01:42:51 mx2 MailScanner[23057]: SpamAssassin timed out and was
killed, failure 0 of 20
Mar  9 01:48:37 mx2 MailScanner[23057]: SpamAssassin timed out and was
killed, failure 0 of 20
Mar  9 01:53:44 mx2 MailScanner[23057]: SpamAssassin timed out and was
killed, failure 0 of 20
Mar  9 01:58:50 mx2 MailScanner[23057]: SpamAssassin timed out and was
killed, failure 0 of 20


My MailScanner-induced expiry ran for a total of 1.5 minutes and ENDED
AT 10pm, yet these files were still being created almost 4 hours later
(the last kill above is logged at 01:58:50)?

And then it eventually stops.  Until the next day.
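
To line the kills up against the logged completion time, something like the
following could be run over the mail log. This is only a rough sketch: the
function names are made up for illustration, the year 2005 is assumed because
syslog timestamps do not carry one, and each log entry is assumed to be on a
single line (they are only wrapped here by the mail client).

```python
import re
from datetime import datetime

# Assumption: syslog lines carry no year, so 2005 is taken from this post's date.
ASSUMED_YEAR = 2005

# Matches the leading "Mar  8 21:59:59 " part of a syslog line.
STAMP = re.compile(r"^([A-Z][a-z]{2})\s+(\d{1,2}) (\d\d:\d\d:\d\d) ")


def parse_stamp(line):
    """Return the datetime at the start of a syslog line, or None."""
    m = STAMP.match(line)
    if not m:
        return None
    month, day, clock = m.groups()
    return datetime.strptime(
        f"{ASSUMED_YEAR} {month} {day} {clock}", "%Y %b %d %H:%M:%S"
    )


def timeouts_after_rebuild(lines):
    """Return (rebuild completion time, timeouts logged after it)."""
    rebuild_done = None
    timeouts = []
    for line in lines:
        when = parse_stamp(line)
        if when is None:
            continue
        if "Bayes database rebuild completed" in line:
            rebuild_done = when
        elif "SpamAssassin timed out and was killed" in line:
            timeouts.append(when)
    if rebuild_done is not None:
        timeouts = [t for t in timeouts if t > rebuild_done]
    return rebuild_done, timeouts


# Three of the entries quoted above, unwrapped to one line each.
sample = [
    "Mar  8 21:59:59 mx2 MailScanner[23057]: SpamAssassin Bayes database rebuild completed",
    "Mar  8 22:10:57 mx2 MailScanner[23057]: SpamAssassin timed out and was killed, failure 0 of 20",
    "Mar  9 01:58:50 mx2 MailScanner[23057]: SpamAssassin timed out and was killed, failure 0 of 20",
]

done, kills = timeouts_after_rebuild(sample)
for t in kills:
    print(t.strftime("%b %d %H:%M:%S"), "->", t - done, "after rebuild")
```

Feeding it the whole log instead of the three sample lines would show how
long after the "rebuild completed" entry each kill was logged.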

Chris
