BAYES_00 is killing me

Devon Harding devonharding at gmail.com
Tue Jun 17 13:19:03 IST 2008


On Tue, Jun 17, 2008 at 5:11 AM, Glenn Steen <glenn.steen at gmail.com> wrote:

> 2008/6/17 Glenn Steen <glenn.steen at gmail.com>:
> > 2008/6/16 Devon Harding <devonharding at gmail.com>:
> >>
> >>
> >> On Mon, Jun 16, 2008 at 4:12 PM, Glenn Steen <glenn.steen at gmail.com> wrote:
> >>>
> >>> 2008/6/16 Devon Harding <devonharding at gmail.com>:
> >>> >
> >>> >
> >>> > On Mon, Jun 16, 2008 at 2:46 PM, Glenn Steen <glenn.steen at gmail.com>
> >>> > wrote:
> >>> >>
> >>> >> 2008/6/16 Devon Harding <devonharding at gmail.com>:
> >>> >> >>>>
> >>> >> >>>>
> >>> >> >>>>
> >>> >> >>>>        Devon Harding wrote:
> >>> >> >>>>        | I'm getting a lot of spam coming through and it seems like
> >>> >> >>>>        | the cause of this is BAYES_00 scoring messages with -2.60.
> >>> >> >>>>        | I'm running MS 4.68.8 with SA 3.2.4.  I've already trained
> >>> >> >>>>        | hundreds of messages like these as spam and it doesn't
> >>> >> >>>>        | seem to work.  What else can I do?
> >>> >> >>>>
> >>> >> >>>>        My guess is that you are training the wrong database.  You
> >>> >> >>>>        train another database and not the one you are using with
> >>> >> >>>>        MailScanner.
> >>> >> >>>>
> >>> >> >>>>        Hugo.
> >>> >> >>>>
> >>> >> >>>>
> >>> >> >>>>
> >>> >> >>>>    For MS, where is the Bayes DB path specified?  My DB is
> >>> >> >>>>    located here:
> >>> >> >>>>
> >>> >> >>>>    /etc/MailScanner/.spamassassin
> >>> >> >>>>
> >>> >> >>>>
> >>> >> >>>> I think my BAYES is  all messed up.  How do I rebuild it from
> >>> >> >>>> scratch?
> >>> >> >>>>
> >>> >> >>> Devon,
> >>> >> >>>
> >>> >> >>> Look here for a starter kit: http://www.fsl.com/resources.html
> >>> >> >>>
> >>> >> >>> Denis
> >>> >> >>>
> >>> >> >>> --
> >>> >> >>
> >>> >> >> I've restored the starter DB and I do see the new files in
> >>> >> >> /etc/MailScanner/.spamassassin (I stopped MailScanner and removed
> >>> >> >> the old ones first), but SA Bayes DB Info from MailWatch shows
> >>> >> >> nothing.  When I do a lint from the Tools tab, I get the following:
> >>> >> >>
> >>> >> >> [5637] dbg: bayes: no dbs present, cannot tie DB R/O:
> >>> >> >> //.spamassassin/bayes_toks
> >>> >> >
> >>> >> >
> >>> >> > Hmm... I think Bayes IS working.  I just ran MailScanner --debug
> >>> >> > --debug-sa after the restore and did see:
> >>> >> >
> >>> >> > 11:52:13 [5879] dbg: bayes: tie-ing to DB file R/W
> >>> >> > /root/.spamassassin/bayes_toks
> >>> >> > 11:52:13 [5879] dbg: bayes: tie-ing to DB file R/W
> >>> >> > /root/.spamassassin/bayes_seen
> >>> >> > 11:52:13 [5879] dbg: bayes: found bayes db version 3
> >>> >> > 11:52:13 [5879] dbg: bayes: learned
> >>> >> > '88a47a16459989c19d47893de31fec608aa8f41e at sa_generated', atime:
> >>> >> > 1213631520
> >>> >> > 11:52:13 [5879] dbg: bayes: untie-ing
> >>> >> > 11:52:13 [5879] dbg: bayes: files locked, now unlocking lock
> >>> >> >
> >>> >> > It seems that MailWatch is the one that's not working right.  Any
> >>> >> > way to relink this?
> >>> >> >
> >>> >> > -Devon
> >>> >> >
> >>> >> Make sure your apache user (the one running your httpd processes...
> >>> >> hence the one running MailWatch:-) can actually read the bayes
> >>> >> files... "su" is your friend here... and if you want to be able to
> >>> >> learn via MailWatch, make sure the same user can write them too.
> >>> >>
> >>> >> Cheers
> >>> >> --
> >>> >
> >>> > I have the right permissions set; the thing is, MailWatch is not
> >>> > showing any data for 'Bayes Database Information'.  What is the
> >>> > tie-in for MailWatch?
> >>> >
> >>> > -rw-rw---- 1 root apache  78K Jun 16 15:17 bayes_journal
> >>> > -rw-rw---- 1 root apache  895 Jun 16 15:17 bayes.mutex
> >>> > -rw-rw---- 1 root apache 172K Jun 16 15:17 bayes_seen
> >>> > -rw-rw---- 1 root apache 5.1M Jun 16 15:17 bayes_toks
> >>> >
> >>> > -Devon
> >>> >
> >>> But can the apache user access the directory?
> >>> MailWatch isn't particularly "magical" here, it uses the same info as
> >>> all else...
> >>>
> >>> Try something like "su - apache -s /bin/bash" and then "cd
> >>> /path/to/where/you/have/the/bayes/files"... Might give a clue:-)
> >>>
> >>> Cheers
> >>> --
> >>> -- Glenn
> >>
> >> User apache can access this fine.  I didn't see anything in the
> >> MailWatch .conf file on Bayes.
> >>
> > That's because there is nothing there....:-).
> > It uses the same info everything else does (through the normal SA
> > method... the .cf files).
> >
> > Unless this is something hardcoded into the scriptlet handling the SA
> > db dump... Haven't checked that (and will not be anywhere I can check
> > it until tomorrow... You have a look:-).
> >
> > Cheers
>
> Nope, nothing strange here, the call is to
> sa-learn -p /path/to/MailScanner/spam.assassin.prefs.conf --dump magic
> in bayes_info.php ... Where /path/to/MailScanner likely expands as
> /etc/MailScanner or similar (this is from the SA_PREFS setting in
> conf.php).
>
> As the apache user, can you run the above command? What do you get?
>
> Cheers
> --
> -- Glenn
>

This was run as apache:

bash-3.1$ sa-learn -p /etc/MailScanner/spam.assassin.prefs.conf --dump magic

0.000          0          3          0  non-token data: bayes db version
0.000          0        448          0  non-token data: nspam
0.000          0       1287          0  non-token data: nham
0.000          0     170860          0  non-token data: ntokens
0.000          0 1171294928          0  non-token data: oldest atime
0.000          0 1213703845          0  non-token data: newest atime
0.000          0 1213700281          0  non-token data: last journal sync atime
0.000          0 1213671060          0  non-token data: last expiry atime
0.000          0   11059200          0  non-token data: last expire atime delta
0.000          0      24264          0  non-token data: last expire reduction count
bash-3.1$
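
For anyone reading a dump like the one above: the third column of each "non-token data" row carries the value, so the spam/ham/token counts can be pulled out with a little awk. This is only a rough sketch; `bayes_summary` is a made-up helper name (not part of SpamAssassin or MailWatch), and the sample rows are copied from the output above. It is worth comparing nspam and nham against SpamAssassin's default minimum of 200 learned messages each (bayes_min_spam_num / bayes_min_ham_num), below which the Bayes rules are not applied at all.

```shell
#!/bin/sh
# Hypothetical helper: summarise the counters printed by `sa-learn --dump magic`.
# Reads the dump on stdin; the value of each counter is the third field.
bayes_summary() {
    awk '
        /non-token data: nspam$/   { nspam = $3 }
        /non-token data: nham$/    { nham = $3 }
        /non-token data: ntokens$/ { ntokens = $3 }
        END { printf "nspam=%d nham=%d ntokens=%d\n", nspam, nham, ntokens }
    '
}

# Sample rows copied from the dump above. On a live system you would instead
# pipe the real command into the helper, e.g.:
#   sa-learn -p /etc/MailScanner/spam.assassin.prefs.conf --dump magic | bayes_summary
bayes_summary <<'EOF'
0.000          0          3          0  non-token data: bayes db version
0.000          0        448          0  non-token data: nspam
0.000          0       1287          0  non-token data: nham
0.000          0     170860          0  non-token data: ntokens
EOF
# prints: nspam=448 nham=1287 ntokens=170860
```

With these numbers both counters are above 200, so Bayes is active; the database simply holds roughly three times as much ham as spam.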