BAYES_00 is killing me

Devon Harding devonharding at gmail.com
Tue Jun 17 17:09:05 IST 2008


On Tue, Jun 17, 2008 at 10:39 AM, Glenn Steen <glenn.steen at gmail.com> wrote:

> 2008/6/17 Devon Harding <devonharding at gmail.com>:
> >
> >
> > On Tue, Jun 17, 2008 at 9:51 AM, Glenn Steen <glenn.steen at gmail.com> wrote:
> >>
> >> 2008/6/17 Devon Harding <devonharding at gmail.com>:
> >> >
> >> >
> >> > On Tue, Jun 17, 2008 at 5:11 AM, Glenn Steen <glenn.steen at gmail.com> wrote:
> >> >>
> >> >> 2008/6/17 Glenn Steen <glenn.steen at gmail.com>:
> >> >> > 2008/6/16 Devon Harding <devonharding at gmail.com>:
> >> >> >>
> >> >> >>
> >> >> >> On Mon, Jun 16, 2008 at 4:12 PM, Glenn Steen <glenn.steen at gmail.com> wrote:
> >> >> >>>
> >> >> >>> 2008/6/16 Devon Harding <devonharding at gmail.com>:
> >> >> >>> >
> >> >> >>> >
> >> >> >>> > On Mon, Jun 16, 2008 at 2:46 PM, Glenn Steen <glenn.steen at gmail.com> wrote:
> >> >> >>> >>
> >> >> >>> >> 2008/6/16 Devon Harding <devonharding at gmail.com>:
> >> >> >>> >> >>>>
> >> >> >>> >> >>>>
> >> >> >>> >> >>>>
> >> >> >>> >> >>>>        Devon Harding wrote:
> >> >> >>> >> >>>>        | I'm getting a lot of spam coming through, and it seems like
> >> >> >>> >> >>>>        | the cause of this is BAYES_00 scoring messages with -2.60.
> >> >> >>> >> >>>>        | I'm running MS 4.68.8 with SA 3.2.4.  I've already trained
> >> >> >>> >> >>>>        | hundreds of messages like these as spam and it doesn't
> >> >> >>> >> >>>>        | seem to work.  What else can I do?
> >> >> >>> >> >>>>
> >> >> >>> >> >>>>        My guess is that you are training the wrong database.  You
> >> >> >>> >> >>>>        are training another database, not the one you are using
> >> >> >>> >> >>>>        with MailScanner.
> >> >> >>> >> >>>>
> >> >> >>> >> >>>>        Hugo.
> >> >> >>> >> >>>>
> >> >> >>> >> >>>>
> >> >> >>> >> >>>>
> >> >> >>> >> >>>>    For MS, where is the Bayes DB path specified?  My DB is located here:
> >> >> >>> >> >>>>
> >> >> >>> >> >>>>    /etc/MailScanner/.spamassassin
> >> >> >>> >> >>>>
> >> >> >>> >> >>>>
> >> >> >>> >> >>>> I think my Bayes DB is all messed up.  How do I rebuild it from scratch?
> >> >> >>> >> >>>>
> >> >> >>> >> >>> Devon,
> >> >> >>> >> >>>
> >> >> >>> >> >>> Look here for a starter kit:
> >> >> >>> >> >>> http://www.fsl.com/resources.html
> >> >> >>> >> >>>
> >> >> >>> >> >>> Denis
> >> >> >>> >> >>>
> >> >> >>> >> >>> --
> >> >> >>> >> >>
> >> >> >>> >> >> I've restored the starter DB and I do see the new files in
> >> >> >>> >> >> /etc/MailScanner/.spamassassin (I stopped MailScanner and removed the
> >> >> >>> >> >> old ones first), but SA Bayes DB Info from MailWatch shows nothing.
> >> >> >>> >> >> When I do a lint from the Tools tab, I get the following:
> >> >> >>> >> >>
> >> >> >>> >> >> [5637] dbg: bayes: no dbs present, cannot tie DB R/O: //.spamassassin/bayes_toks
> >> >> >>> >> >
> >> >> >>> >> >
> >> >> >>> >> > Hmm... I think Bayes IS working.  I just ran MailScanner --debug --debug-sa
> >> >> >>> >> > after the restore and did see:
> >> >> >>> >> >
> >> >> >>> >> > 11:52:13 [5879] dbg: bayes: tie-ing to DB file R/W /root/.spamassassin/bayes_toks
> >> >> >>> >> > 11:52:13 [5879] dbg: bayes: tie-ing to DB file R/W /root/.spamassassin/bayes_seen
> >> >> >>> >> > 11:52:13 [5879] dbg: bayes: found bayes db version 3
> >> >> >>> >> > 11:52:13 [5879] dbg: bayes: learned '88a47a16459989c19d47893de31fec608aa8f41e@sa_generated', atime: 1213631520
> >> >> >>> >> > 11:52:13 [5879] dbg: bayes: untie-ing
> >> >> >>> >> > 11:52:13 [5879] dbg: bayes: files locked, now unlocking lock
> >> >> >>> >> >
> >> >> >>> >> > It seems that MailWatch is the one that's not working right.  Any way to
> >> >> >>> >> > relink this?
> >> >> >>> >> >
> >> >> >>> >> > -Devon
> >> >> >>> >> >
> >> >> >>> >> Make sure your apache user (the one running your httpd processes...
> >> >> >>> >> hence the one running MailWatch:-) can actually read the bayes
> >> >> >>> >> files... "su" is your friend here... and if you want to be able to
> >> >> >>> >> learn via MailWatch, make sure the same user can write them too.
> >> >> >>> >>
> >> >> >>> >> Cheers
> >> >> >>> >> --
> >> >> >>> >
> >> >> >>> > I have the right permissions set; the thing is, MailWatch is not showing
> >> >> >>> > any data for 'Bayes Database Information'.  What is the tie-in for
> >> >> >>> > MailWatch?
> >> >> >>> >
> >> >> >>> > -rw-rw---- 1 root apache  78K Jun 16 15:17 bayes_journal
> >> >> >>> > -rw-rw---- 1 root apache  895 Jun 16 15:17 bayes.mutex
> >> >> >>> > -rw-rw---- 1 root apache 172K Jun 16 15:17 bayes_seen
> >> >> >>> > -rw-rw---- 1 root apache 5.1M Jun 16 15:17 bayes_toks
> >> >> >>> >
> >> >> >>> > -Devon
> >> >> >>> >
> >> >> >>> But can the apache user access the directory?
> >> >> >>> MailWatch isn't particularly "magical" here, it uses the same info as
> >> >> >>> all else...
> >> >> >>>
> >> >> >>> Try something like "su - apache -s /bin/bash" and then "cd
> >> >> >>> /path/to/where/you/have/the/bayes/files"... Might give a clue:-)
> >> >> >>>
> >> >> >>> Cheers
> >> >> >>> --
> >> >> >>> -- Glenn
> >> >> >>
> >> >> >> User apache can access this fine.  I didn't see anything in the
> >> >> >> MailWatch .conf file on Bayes.
> >> >> >>
> >> >> > That's because there is nothing there....:-).
> >> >> > It uses the same info everything else does (through the normal SA method...
> >> >> > the .cf files).
> >> >> >
> >> >> > Unless this is something hardcoded into the scriptlet handling the SA
> >> >> > db dump... Haven't checked that (and will not be anywhere I can check
> >> >> > it until tomorrow... You have a look:-).
> >> >> >
> >> >> > Cheers
> >> >>
> >> >> Nope, nothing strange here, the call is to
> >> >> sa-learn -p /path/to/MailScanner/spam.assassin.prefs.conf --dump magic
> >> >> in bayes_info.php ... Where /path/to/MailScanner likely expands as
> >> >> /etc/MailScanner or similar (this is from the SA_PREFS setting in
> >> >> conf.php).
> >> >>
> >> >> As the apache user, can you run the above command? What do you get?
> >> >>
> >> >> Cheers
> >> >> --
> >> >> -- Glenn
> >> >
> >> > This was run as apache:
> >> >
> >> > bash-3.1$ sa-learn -p /etc/MailScanner/spam.assassin.prefs.conf --dump magic
> >> > 0.000          0          3          0  non-token data: bayes db version
> >> > 0.000          0        448          0  non-token data: nspam
> >> > 0.000          0       1287          0  non-token data: nham
> >> > 0.000          0     170860          0  non-token data: ntokens
> >> > 0.000          0 1171294928          0  non-token data: oldest atime
> >> > 0.000          0 1213703845          0  non-token data: newest atime
> >> > 0.000          0 1213700281          0  non-token data: last journal sync atime
> >> > 0.000          0 1213671060          0  non-token data: last expiry atime
> >> > 0.000          0   11059200          0  non-token data: last expire atime delta
> >> > 0.000          0      24264          0  non-token data: last expire reduction count
> >> > bash-3.1$
> >> >
> >> Ok, and if you do (as the apache user)
> >> spamassassin --lint -D -p /etc/MailScanner/spam.assassin.prefs.conf
> >> do you get the db error then? (In reality, one should change MW to not
> >> use the -p preference file, since this is included as a .cf already...
> >> It doesn't do much harm, though:-)
> >>
> >> Cheers
> >> --
> >> -- Glenn
> >
> > No error and it even finds bayes installed.  I think it's something with MW.
> >
> > [26297] dbg: replacetags: done replacing tags
> > [26297] dbg: bayes: tie-ing to DB file R/O /var/www/.spamassassin/bayes_toks
> > [26297] dbg: bayes: tie-ing to DB file R/O /var/www/.spamassassin/bayes_seen
> > [26297] dbg: bayes: found bayes db version 3
> > [26297] dbg: bayes: DB journal sync: last sync: 1213700281
> > [26297] dbg: config: score set 2 chosen.
> >
> Ok, what is your MS_CONFIG setting and your SA_PREFS in conf.php
> (sorry all you others, this should be on the MW list, I know)?
> --
> -- Glenn
>

Here are the paths from conf.php:

// Paths
define(MAILWATCH_HOME, '/var/www/html/mailscanner');
define(MS_CONFIG_DIR, '/etc/MailScanner/');
define(MS_LIB_DIR, '/usr/lib/MailScanner/');
define(CACHE_DIR, './images/cache/'); // JpGraph cache
define(TTF_DIR,'./jpgraph/fonts/'); // JpGraph fonts
define(SA_DIR,'/usr/bin/');
define(SA_RULES_DIR, '/usr/share/spamassassin/');
define(SA_PREFS, MS_CONFIG_DIR.'spam.assassin.prefs.conf');
define(FPDF_FONTPATH,'./fpdf/font/');
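
One thing that stands out in the debug output quoted above: the Bayes files
are opened from a different place depending on who runs the check
(//.spamassassin from the MailWatch lint, /root/.spamassassin from
MailScanner --debug, /var/www/.spamassassin as apache). Below is a minimal
shell sketch for pulling those paths out of SpamAssassin debug output so they
can be compared side by side. The function name bayes_db_files is made up for
illustration, and it assumes each "tie-ing to DB file" message arrives on a
single line (unlike the wrapped quotes in this thread).

```shell
# Hypothetical helper (the name is illustrative, not part of SA or MailWatch):
# read SpamAssassin debug output on stdin and print each distinct Bayes DB
# file that it reports tie-ing to, one path per line.
bayes_db_files() {
    sed -n 's|.*bayes: tie-ing to DB file R/[OW] \(.*\)$|\1|p' | sort -u
}

# Example: run once as root and once as the apache user, then compare.
#   spamassassin --lint -D -p /etc/MailScanner/spam.assassin.prefs.conf 2>&1 | bayes_db_files
```

If the two users report different directories, they are not sharing one
database. SpamAssassin's bayes_path setting in spam.assassin.prefs.conf is the
usual way to pin a single location, though I have not confirmed that this is
the cause here.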