BAYES_00 is killing me

Glenn Steen glenn.steen at gmail.com
Tue Jun 17 15:39:07 IST 2008


2008/6/17 Devon Harding <devonharding at gmail.com>:
>
>
> On Tue, Jun 17, 2008 at 9:51 AM, Glenn Steen <glenn.steen at gmail.com> wrote:
>>
>> 2008/6/17 Devon Harding <devonharding at gmail.com>:
>> >
>> >
>> > On Tue, Jun 17, 2008 at 5:11 AM, Glenn Steen <glenn.steen at gmail.com>
>> > wrote:
>> >>
>> >> 2008/6/17 Glenn Steen <glenn.steen at gmail.com>:
>> >> > 2008/6/16 Devon Harding <devonharding at gmail.com>:
>> >> >>
>> >> >>
>> >> >> On Mon, Jun 16, 2008 at 4:12 PM, Glenn Steen <glenn.steen at gmail.com>
>> >> >> wrote:
>> >> >>>
>> >> >>> 2008/6/16 Devon Harding <devonharding at gmail.com>:
>> >> >>> >
>> >> >>> >
>> >> >>> > On Mon, Jun 16, 2008 at 2:46 PM, Glenn Steen
>> >> >>> > <glenn.steen at gmail.com>
>> >> >>> > wrote:
>> >> >>> >>
>> >> >>> >> 2008/6/16 Devon Harding <devonharding at gmail.com>:
>> >> >>> >> >>>>
>> >> >>> >> >>>>
>> >> >>> >> >>>>
>> >> >>> >> >>>>        Devon Harding wrote:
>> >> >>> >> >>>>        | I'm getting a lot of spam coming through, and it
>> >> >>> >> >>>>        | seems the cause is BAYES_00 scoring messages with
>> >> >>> >> >>>>        | -2.60.  I'm running MS 4.68.8 with SA 3.2.4.  I've
>> >> >>> >> >>>>        | already trained hundreds of messages like these as
>> >> >>> >> >>>>        | spam and it doesn't seem to work.  What else can
>> >> >>> >> >>>>        | I do?
>> >> >>> >> >>>>
>> >> >>> >> >>>>        My guess is that you are training the wrong database.
>> >> >>> >> >>>>        You are training another database, not the one you
>> >> >>> >> >>>>        are using with MailScanner.
>> >> >>> >> >>>>
>> >> >>> >> >>>>        Hugo.
>> >> >>> >> >>>>
>> >> >>> >> >>>>
>> >> >>> >> >>>>
>> >> >>> >> >>>>    For MS, where is the Bayes DB path specified?  My DB is
>> >> >>> >> >>>>    located here:
>> >> >>> >> >>>>
>> >> >>> >> >>>>    /etc/MailScanner/.spamassassin
>> >> >>> >> >>>>
>> >> >>> >> >>>>
>> >> >>> >> >>>> I think my BAYES is all messed up.  How do I rebuild it from
>> >> >>> >> >>>> scratch?
>> >> >>> >> >>>>
>> >> >>> >> >>> Devon,
>> >> >>> >> >>>
>> >> >>> >> >>> Look here for a starter kit:
>> >> >>> >> >>> http://www.fsl.com/resources.html
>> >> >>> >> >>>
>> >> >>> >> >>> Denis
>> >> >>> >> >>>
>> >> >>> >> >>> --
>> >> >>> >> >>
>> >> >>> >> >> I've restored the starter DB and I do see the new files in
>> >> >>> >> >> /etc/MailScanner/.spamassassin (I stopped MailScanner and
>> >> >>> >> >> removed the old ones first), but SA Bayes DB Info from
>> >> >>> >> >> MailWatch shows nothing.  When I do a lint from the Tools
>> >> >>> >> >> tab, I get the following:
>> >> >>> >> >>
>> >> >>> >> >> [5637] dbg: bayes: no dbs present, cannot tie DB R/O: //.spamassassin/bayes_toks
>> >> >>> >> >
>> >> >>> >> >
>> >> >>> >> > Hmm... I think Bayes IS working.  I just ran
>> >> >>> >> > MailScanner --debug --debug-sa after the restore and did see:
>> >> >>> >> >
>> >> >>> >> > 11:52:13 [5879] dbg: bayes: tie-ing to DB file R/W /root/.spamassassin/bayes_toks
>> >> >>> >> > 11:52:13 [5879] dbg: bayes: tie-ing to DB file R/W /root/.spamassassin/bayes_seen
>> >> >>> >> > 11:52:13 [5879] dbg: bayes: found bayes db version 3
>> >> >>> >> > 11:52:13 [5879] dbg: bayes: learned '88a47a16459989c19d47893de31fec608aa8f41e@sa_generated', atime: 1213631520
>> >> >>> >> > 11:52:13 [5879] dbg: bayes: untie-ing
>> >> >>> >> > 11:52:13 [5879] dbg: bayes: files locked, now unlocking lock
>> >> >>> >> >
>> >> >>> >> > It seems that MailWatch is the one that's not working right.
>> >> >>> >> > Any way to relink this?
>> >> >>> >> >
>> >> >>> >> > -Devon
>> >> >>> >> >
>> >> >>> >> Make sure your apache user (the one running your httpd
>> >> >>> >> processes... hence the one running MailWatch:-) can actually
>> >> >>> >> read the bayes files... "su" is your friend here... and if you
>> >> >>> >> want to be able to learn via MailWatch, make sure the same user
>> >> >>> >> can write them too.
>> >> >>> >>
>> >> >>> >> Cheers
>> >> >>> >> --
>> >> >>> >
>> >> >>> > I have the right permissions set; the thing is, MailWatch is not
>> >> >>> > showing any data for 'Bayes Database Information'.  What is the
>> >> >>> > tie-in for MailWatch?
>> >> >>> >
>> >> >>> > -rw-rw---- 1 root apache  78K Jun 16 15:17 bayes_journal
>> >> >>> > -rw-rw---- 1 root apache  895 Jun 16 15:17 bayes.mutex
>> >> >>> > -rw-rw---- 1 root apache 172K Jun 16 15:17 bayes_seen
>> >> >>> > -rw-rw---- 1 root apache 5.1M Jun 16 15:17 bayes_toks
>> >> >>> >
>> >> >>> > -Devon
>> >> >>> >
>> >> >>> But can the apache user access the directory?
>> >> >>> MailWatch isn't particularly "magical" here, it uses the same info
>> >> >>> as everything else...
>> >> >>>
>> >> >>> Try something like "su - apache -s /bin/bash" and then "cd
>> >> >>> /path/to/where/you/have/the/bayes/files"... Might give a clue:-)
>> >> >>>
>> >> >>> Cheers
>> >> >>> --
>> >> >>> -- Glenn
>> >> >>
>> >> >> User apache can access this fine.  I didn't see anything in the
>> >> >> MailWatch .conf file on Bayes.
>> >> >>
>> >> > That's because there is nothing there....:-).
>> >> > It uses the same info everything else does (through the normal SA
>> >> > method... the .cf files).
>> >> >
>> >> > Unless this is something hardcoded into the scriptlet handling the SA
>> >> > db dump... Haven't checked that (and will not be anywhere I can check
>> >> > it until tomorrow... You have a look:-).
>> >> >
>> >> > Cheers
>> >>
>> >> Nope, nothing strange here, the call is to
>> >> sa-learn -p /path/to/MailScanner/spam.assassin.prefs.conf --dump magic
>> >> in bayes_info.php ... where /path/to/MailScanner likely expands to
>> >> /etc/MailScanner or similar (this is from the SA_PREFS setting in
>> >> conf.php).
>> >>
>> >> As the apache user, can you run the above command? What do you get?
>> >>
>> >> Cheers
>> >> --
>> >> -- Glenn
>> >
>> > This was run as apache:
>> >
>> > bash-3.1$ sa-learn -p /etc/MailScanner/spam.assassin.prefs.conf --dump magic
>> > 0.000          0          3          0  non-token data: bayes db version
>> > 0.000          0        448          0  non-token data: nspam
>> > 0.000          0       1287          0  non-token data: nham
>> > 0.000          0     170860          0  non-token data: ntokens
>> > 0.000          0 1171294928          0  non-token data: oldest atime
>> > 0.000          0 1213703845          0  non-token data: newest atime
>> > 0.000          0 1213700281          0  non-token data: last journal sync atime
>> > 0.000          0 1213671060          0  non-token data: last expiry atime
>> > 0.000          0   11059200          0  non-token data: last expire atime delta
>> > 0.000          0      24264          0  non-token data: last expire reduction count
>> > bash-3.1$
>> >
>> Ok, and if you do (as the apache user)
>> spamassassin --lint -D -p /etc/MailScanner/spam.assassin.prefs.conf
>> do you get the db error then? (In reality, one should change MW to not
>> use the -p preference file, since this is included as a .cf already...
>> It doesn't do much harm, though :-)
>>
>> Cheers
>> --
>> -- Glenn
>
> No error and it even finds bayes installed.  I think it's something with MW.
>
> [26297] dbg: replacetags: done replacing tags
> [26297] dbg: bayes: tie-ing to DB file R/O /var/www/.spamassassin/bayes_toks
> [26297] dbg: bayes: tie-ing to DB file R/O /var/www/.spamassassin/bayes_seen
> [26297] dbg: bayes: found bayes db version 3
> [26297] dbg: bayes: DB journal sync: last sync: 1213700281
> [26297] dbg: config: score set 2 chosen.
>
Ok, what is your MS_CONFIG setting and your SA_PREFS in conf.php
(sorry all you others, this should be on the MW list, I know)?
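
A rough sketch of two checks that may help here, written as small shell
functions. The function names are made up for illustration, and the path to
conf.php is site-specific, so it is passed in as an argument. The first lists
the Bayes DB files named in SpamAssassin debug output; the second shows the
lines in conf.php that mention MS_CONFIG or SA_PREFS.

```shell
# Hypothetical helper: list the distinct Bayes DB files named in
# SpamAssassin debug output read from stdin. It matches the
# "tie-ing to DB file R/O|R/W <path>" and "cannot tie DB R/O: <path>"
# lines, and assumes the path is the last field on the same line.
bayes_paths() {
    grep -E 'bayes: .*(tie-ing to DB file|cannot tie DB)' |
        awk '{ print $NF }' |
        sort -u
}

# Hypothetical helper: print (with line numbers) the conf.php lines that
# mention MS_CONFIG or SA_PREFS.
# Usage: mw_sa_settings /path/to/conf.php
mw_sa_settings() {
    grep -nE 'MS_CONFIG|SA_PREFS' "$1"
}
```

For example, run this once as root and once as the apache user and compare:

    spamassassin --lint -D -p /etc/MailScanner/spam.assassin.prefs.conf 2>&1 | bayes_paths

So far this thread has shown /root/.spamassassin, /var/www/.spamassassin and
//.spamassassin, which is what one would expect if no bayes_path is set and
each user falls back to its own home directory. That is a guess about this
setup, not something confirmed yet.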
-- 
-- Glenn
email: glenn < dot > steen < at > gmail < dot > com
work: glenn < dot > steen < at > ap1 < dot > se

