Quick 'bayes' question

Jason Williams jwilliams at COURTESYMORTGAGE.COM
Wed Jun 29 17:06:41 IST 2005


Scott Silva wrote:

>Jason Williams spake the following on 6/28/2005 3:57 PM:
>  
>
>>Well, I'm back. This time, a question on bayes.
>>
>>I've been working to get bayes set up and running properly (and I don't
>>think bayes has ever been set up to work properly, to be honest).
>>
>>First, in my spam.assassin.prefs.conf file, I have
>>
>>use_bayes 1
>>bayes_path        /usr/local/etc/MailScanner/bayes/
>>bayes_file_mode 0660
>>
>># Bump up SpamAssassin scores on the high and low end
>># score BAYES_00 -15.0
>># score BAYES_05 -5.0
>># score BAYES_95 5.0
>># score BAYES_99 15.0
>>
>># To disable bayes autolearn
>># bayes_auto_learn 0
>>
>>Just trying to make sure I have the basics set up.
>>
>>I ran --lint, it found the bayes DB no problem. However, when I look in
>>the bayes directory, I see a bunch of files that look like this:
>>
>>_toks.expire98xxx  different numbers at the end.
>>
>>As I was reading over the site, it recommended doing a dump and looking at
>>the magic. Well, here it is:
>>
>>0.000          0          3          0  non-token data: bayes db version
>>0.000          0          0          0  non-token data: nspam
>>0.000          0          2          0  non-token data: nham
>>0.000          0         43          0  non-token data: ntokens
>>0.000          0 1083442244          0  non-token data: oldest atime
>>0.000          0 1083446498          0  non-token data: newest atime
>>0.000          0          0          0  non-token data: last journal sync atime
>>0.000          0          0          0  non-token data: last expiry atime
>>0.000          0          0          0  non-token data: last expire atime delta
>>0.000          0          0          0  non-token data: last expire reduction count
>>
>>Reading over the wiki site, there are a lot of things going on with the
>>bayes system.
>>First question I have is that if I want to train the bayesian learning
>>system (or even to rebuild it) would I just point it to the quarantine
>>directory? Seems logical.
>>
>>I'm sure I'm missing something. It's been a rather long, mind-numbing day.
>>
>>I appreciate any feedback.
>>
>>Jason
>>
>>    
>>
>Either you dumped the wrong database, or this one has very little in it.
>Try sa-learn --dump magic --dbpath /path/to/bayes/bayes
>The path should be the bayes_path value from spam.assassin.prefs.conf.
>
>Mine has much more data:
>
>0.000          0          3          0  non-token data: bayes db version
>0.000          0      29146          0  non-token data: nspam
>0.000          0      81693          0  non-token data: nham
>0.000          0     124702          0  non-token data: ntokens
>0.000          0 1119230907          0  non-token data: oldest atime
>0.000          0 1120001312          0  non-token data: newest atime
>0.000          0 1119999608          0  non-token data: last journal sync atime
>0.000          0 1119929516          0  non-token data: last expiry atime
>0.000          0     691200          0  non-token data: last expire atime delta
>0.000          0      28272          0  non-token data: last expire reduction count
>
>  
>
OK. I came in this morning and tried a few things. I tried using -p to
specify my .conf file. That seemed to work.

Using preference:
sa-learn --dump magic -p /usr/local/etc/MailScanner/spam.assassin.prefs.conf

0.000          0          3          0  non-token data: bayes db version
0.000          0       1708          0  non-token data: nspam
0.000          0       4297          0  non-token data: nham
0.000          0     116128          0  non-token data: ntokens
0.000          0 1117950182          0  non-token data: oldest atime
0.000          0 1120060416          0  non-token data: newest atime
0.000          0 1120059537          0  non-token data: last journal sync atime
0.000          0 1120056393          0  non-token data: last expiry atime
0.000          0    2096156          0  non-token data: last expire atime delta
0.000          0      38206          0  non-token data: last expire reduction count

Looks much better. Like it should.
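
A note on why -p likely made the difference: bayes_path is a filename
prefix rather than a directory, so with the trailing slash in the prefs
file the database files end up named _toks, _seen and _journal inside the
bayes directory. That matches the _toks.expire files noticed earlier.
Without -p, sa-learn presumably looked at a different database (the
default prefix is ~/.spamassassin/bayes). Here is a minimal Python sketch
of that naming, assuming the default DBM storage. The helper names are
hypothetical, and the prefs text is copied from the settings quoted above
with the directive spelled bayes_path:

```python
import re

# Settings as quoted earlier in the thread (directive spelled bayes_path).
PREFS = """\
use_bayes 1
bayes_path       /usr/local/etc/MailScanner/bayes/
bayes_file_mode 0660
# bayes_auto_learn 0
"""

def read_bayes_path(prefs_text):
    """Return the value of the last uncommented bayes_path line, or None."""
    path = None
    for line in prefs_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and padding
        match = re.match(r"bayes_path\s+(\S+)$", line)
        if match:
            path = match.group(1)
    return path

def expected_db_files(bayes_path):
    """bayes_path is a prefix, so the suffixes are appended to it directly."""
    return [bayes_path + suffix for suffix in ("_toks", "_seen", "_journal")]

prefix = read_bayes_path(PREFS)
for name in expected_db_files(prefix):
    print(name)
```

Changing the value to end in .../bayes/bayes would give the more familiar
bayes_toks and bayes_seen names, but the existing files would then need
renaming to match.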

It appears to have some good data in it. Now I just need to get bayes
working better so it can start killing spam.
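
To put a number on "good data": SpamAssassin only applies Bayes scores
once the database holds at least 200 learned spam and 200 learned ham
(the defaults for bayes_min_spam_num and bayes_min_ham_num). The first
dump, with nspam 0 and nham 2 and atimes from May 2004, was far below
that. A rough Python sketch of the check, run against the output from the
-p run; parse_magic is a hypothetical helper, not part of sa-learn:

```python
from datetime import datetime, timezone

# Output from the -p run above, with the wrapped lines rejoined.
DUMP = """\
0.000          0          3          0  non-token data: bayes db version
0.000          0       1708          0  non-token data: nspam
0.000          0       4297          0  non-token data: nham
0.000          0     116128          0  non-token data: ntokens
0.000          0 1117950182          0  non-token data: oldest atime
0.000          0 1120060416          0  non-token data: newest atime
0.000          0 1120059537          0  non-token data: last journal sync atime
0.000          0 1120056393          0  non-token data: last expiry atime
0.000          0    2096156          0  non-token data: last expire atime delta
0.000          0      38206          0  non-token data: last expire reduction count
"""

def parse_magic(text):
    """Map each 'non-token data' label to its integer value (third column)."""
    values = {}
    for line in text.splitlines():
        head, sep, label = line.partition("non-token data:")
        if not sep:
            continue  # skip anything that is not a magic line
        values[label.strip()] = int(head.split()[2])
    return values

MIN_LEARNED = 200  # default for bayes_min_spam_num and bayes_min_ham_num

magic = parse_magic(DUMP)
ready = magic["nspam"] >= MIN_LEARNED and magic["nham"] >= MIN_LEARNED
newest = datetime.fromtimestamp(magic["newest atime"], tz=timezone.utc)

print("nspam:", magic["nspam"], "nham:", magic["nham"], "usable:", ready)
print("newest atime (UTC):", newest.date())
```

The atime values are Unix timestamps, so a newest atime close to the
current date is a quick sign that the live database is the one being
dumped.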

Jason




