RV: RV: Problem with BayesDB?

David Lee t.d.lee at DURHAM.AC.UK
Tue Jan 18 09:01:47 GMT 2005



On Mon, 17 Jan 2005, Ricardo Luis Cañavate wrote:

> I still have two bayes_toks files where I would expect only one:
> bayes_toks, dated 13 Jan, is 675MB and bayes_toks.new is 278MB.
>
> [root@servnozar bayes]# ls -ls
> total 652384
>     4 -rw-rw-rw-    1 root     apache         46 ene 17 18:59 bayes.lock
>     4 -rw-------    1 root     apache       1356 nov 23 08:55 bayes.mutex
>  1148 -rw-rw----    1 root     apache    1335296 ene 13 12:28 bayes_seen
> 461244 -rw-rw----    1 root     apache   677400576 ene 13 12:28 bayes_toks
> 189984 -rw-rw----    1 root     apache   278929408 ene 17 19:00 bayes_toks.new

Your 677MB "bayes_toks" looks very large.  (On our own main machines it is 
around 10-12MB, on a lighter machine around 6MB.)  My guess is that it is 
somehow bad.  Certainly, any rebuild will take a long time, simply 
shuffling that vast quantity of data on the disk.
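
As a rough illustration of that size check, here is a minimal shell sketch
that lists any bayes_* file above a threshold.  The function name and the
50MB default are assumptions chosen for illustration; they are not part of
SA or MailScanner.

```shell
# Print the name and size (in whole MB) of each bayes_* file in a
# directory that is larger than a limit.
# Usage: flag_large_bayes_files DIR [LIMIT_MB]
# The 50 MB default is an arbitrary, assumed threshold.
flag_large_bayes_files() {
    dir=$1
    limit_mb=${2:-50}
    limit_bytes=$((limit_mb * 1024 * 1024))
    for f in "$dir"/bayes_*; do
        # Skip the unexpanded pattern when nothing matches.
        [ -f "$f" ] || continue
        # Arithmetic expansion strips any padding that wc adds.
        size=$(($(wc -c < "$f")))
        if [ "$size" -gt "$limit_bytes" ]; then
            echo "$(basename "$f") $((size / 1024 / 1024))MB"
        fi
    done
}
```

Called as "flag_large_bayes_files /path/to/bayes" (path hypothetical), it
prints one line per oversized file and nothing if all are below the limit.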

Also the ".new" extension looks odd.  From memory that corresponds to 
around version 2.50 of SA.  I had thought (but I may be wrong) that around 
SA 2.62, this filename pattern was changed to the ".expire*" with which 
this email discussion started.  I don't recall seeing both ".expire*" and 
".new" extensions in a single SA installation.

It might be worth starting again, either simply clearing out those files 
(and, if reasonably possible, feeding it some known ham and spam), or
applying one of the "Bayes Starter DB" kindly offered by Steve Swaney of 
FSL:
    http://www.fsl.com/support/
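
The first option might be sketched in shell as below.  This is only an
illustration under stated assumptions: the function name and all paths are
hypothetical, and it moves the old files into a dated subdirectory rather
than deleting them, so that they can be restored if need be.  The retraining
step uses sa-learn's --ham and --spam options on hand-sorted mail.

```shell
# Move the bayes_* database files in DIR into a timestamped
# subdirectory and print that subdirectory's path.
# Lock files such as bayes.lock do not match bayes_* and stay put.
# Usage: set_aside_bayes_db DIR
set_aside_bayes_db() {
    dir=$1
    backup="$dir/old-$(date +%Y%m%d%H%M%S)"
    mkdir -p "$backup" || return 1
    for f in "$dir"/bayes_*; do
        # Skip the unexpanded pattern when nothing matches.
        [ -f "$f" ] || continue
        mv "$f" "$backup/" || return 1
    done
    echo "$backup"
}

# Possible use (all paths are placeholders for the local installation):
#   set_aside_bayes_db /path/to/bayes
#   sa-learn --ham  /path/to/known-ham/
#   sa-learn --spam /path/to/known-spam/
```

It is probably safest to run this while nothing else is writing to the
database.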

Hope that helps.

-- 

:  David Lee                                I.T. Service          :
:  Senior Systems Programmer                Computer Centre       :
:                                           University of Durham  :
:  http://www.dur.ac.uk/t.d.lee/            South Road            :
:                                           Durham                :
:  Phone: +44 191 334 2752                  U.K.                  :
