BAYES_00 is killing me

Glenn Steen glenn.steen at gmail.com
Tue Jun 17 14:51:25 IST 2008


2008/6/17 Devon Harding <devonharding at gmail.com>:
>
>
> On Tue, Jun 17, 2008 at 5:11 AM, Glenn Steen <glenn.steen at gmail.com> wrote:
>>
>> 2008/6/17 Glenn Steen <glenn.steen at gmail.com>:
>> > 2008/6/16 Devon Harding <devonharding at gmail.com>:
>> >>
>> >>
>> >> On Mon, Jun 16, 2008 at 4:12 PM, Glenn Steen <glenn.steen at gmail.com>
>> >> wrote:
>> >>>
>> >>> 2008/6/16 Devon Harding <devonharding at gmail.com>:
>> >>> >
>> >>> >
>> >>> > On Mon, Jun 16, 2008 at 2:46 PM, Glenn Steen <glenn.steen at gmail.com>
>> >>> > wrote:
>> >>> >>
>> >>> >> 2008/6/16 Devon Harding <devonharding at gmail.com>:
>> >>> >> >>>>
>> >>> >> >>>>
>> >>> >> >>>>
>> >>> >> >>>>        Devon Harding wrote:
>> >>> >> >>>>        | I'm getting a lot of spam coming through, and it
>> >>> >> >>>>        | seems like the cause of this is BAYES_00 scoring
>> >>> >> >>>>        | messages with -2.60. I'm running MS 4.68.8 with
>> >>> >> >>>>        | SA 3.2.4. I've already trained hundreds of messages
>> >>> >> >>>>        | like these as spam and it doesn't seem to work.
>> >>> >> >>>>        | What else can I do?
>> >>> >> >>>>
>> >>> >> >>>>        My guess is that you are training the wrong database.
>> >>> >> >>>>        You train another database and not the one you are
>> >>> >> >>>>        using with MailScanner.
>> >>> >> >>>>
>> >>> >> >>>>        Hugo.
>> >>> >> >>>>
>> >>> >> >>>>
>> >>> >> >>>>
>> >>> >> >>>>    For MS, where is the Bayes DB path specified? My DB is
>> >>> >> >>>>    located here:
>> >>> >> >>>>
>> >>> >> >>>>    /etc/MailScanner/.spamassassin
>> >>> >> >>>>
>> >>> >> >>>>
>> >>> >> >>>> I think my BAYES is all messed up. How do I rebuild it from
>> >>> >> >>>> scratch?
>> >>> >> >>>>
>> >>> >> >>> Devon,
>> >>> >> >>>
>> >>> >> >>> Look here for a starter kit: http://www.fsl.com/resources.html
>> >>> >> >>>
>> >>> >> >>> Denis
>> >>> >> >>>
>> >>> >> >>> --
>> >>> >> >>
>> >>> >> >> I've restored the starter DB and I do see the new files in
>> >>> >> >> /etc/MailScanner/.spamassassin (I stopped MailScanner and
>> >>> >> >> removed the old ones first), but SA Bayes DB Info from
>> >>> >> >> MailWatch shows nothing. When I do a lint from the Tools tab,
>> >>> >> >> I get the following:
>> >>> >> >>
>> >>> >> >> [5637] dbg: bayes: no dbs present, cannot tie DB R/O: //.spamassassin/bayes_toks
>> >>> >> >
>> >>> >> >
>> >>> >> > Hmm... I think Bayes IS working. I just ran
>> >>> >> > MailScanner --debug --debug-sa after the restore and did see:
>> >>> >> >
>> >>> >> > 11:52:13 [5879] dbg: bayes: tie-ing to DB file R/W /root/.spamassassin/bayes_toks
>> >>> >> > 11:52:13 [5879] dbg: bayes: tie-ing to DB file R/W /root/.spamassassin/bayes_seen
>> >>> >> > 11:52:13 [5879] dbg: bayes: found bayes db version 3
>> >>> >> > 11:52:13 [5879] dbg: bayes: learned '88a47a16459989c19d47893de31fec608aa8f41e@sa_generated', atime: 1213631520
>> >>> >> > 11:52:13 [5879] dbg: bayes: untie-ing
>> >>> >> > 11:52:13 [5879] dbg: bayes: files locked, now unlocking lock
>> >>> >> >
>> >>> >> > It seems that MailWatch is the one that's not working right.
>> >>> >> > Any way to relink this?
>> >>> >> >
>> >>> >> > -Devon
>> >>> >> >
>> >>> >> Make sure your apache user (the one running your httpd processes...
>> >>> >> hence the one running MailWatch:-) can actually read the bayes
>> >>> >> files... "su" is your friend here... and if you want to be able to
>> >>> >> learn via MailWatch, make sure the same user can write them too.
>> >>> >>
>> >>> >> Cheers
>> >>> >> --
>> >>> >
>> >>> > I have the right permissions set; the thing is, MailWatch is not
>> >>> > showing any data for 'Bayes Database Information'. What is the
>> >>> > tie-in for MailWatch?
>> >>> >
>> >>> > -rw-rw---- 1 root apache  78K Jun 16 15:17 bayes_journal
>> >>> > -rw-rw---- 1 root apache  895 Jun 16 15:17 bayes.mutex
>> >>> > -rw-rw---- 1 root apache 172K Jun 16 15:17 bayes_seen
>> >>> > -rw-rw---- 1 root apache 5.1M Jun 16 15:17 bayes_toks
>> >>> >
>> >>> > -Devon
>> >>> >
>> >>> But can the apache user access the directory?
>> >>> MailWatch isn't particularly "magical" here, it uses the same info as
>> >>> all else...
>> >>>
>> >>> Try something like "su - apache -s /bin/bash" and then "cd
>> >>> /path/to/where/you/have/the/bayes/files"... Might give a clue:-)
>> >>>
>> >>> Cheers
>> >>> --
>> >>> -- Glenn
>> >>
>> >> User apache can access this fine. I didn't see anything in the
>> >> MailWatch .conf file on Bayes.
>> >>
>> > That's because there is nothing there... :-).
>> > It uses the same info as everything else does (through the normal SA
>> > method... the .cf files).
>> >
>> > Unless this is something hardcoded into the scriptlet handling the SA
>> > db dump... Haven't checked that (and will not be anywhere I can check
>> > it until tomorrow... You have a look :-).
>> >
>> > Cheers
>>
>> Nope, nothing strange here, the call is to
>> sa-learn -p /path/to/MailScanner/spam.assassin.prefs.conf --dump magic
>> in bayes_info.php ... where /path/to/MailScanner likely expands to
>> /etc/MailScanner or similar (this is from the SA_PREFS setting in
>> conf.php).
>>
>> As the apache user, can you run the above command? What do you get?
>>
>> Cheers
>> --
>> -- Glenn
>
> This was run as apache:
>
> bash-3.1$ sa-learn -p /etc/MailScanner/spam.assassin.prefs.conf --dump magic
> 0.000          0          3          0  non-token data: bayes db version
> 0.000          0        448          0  non-token data: nspam
> 0.000          0       1287          0  non-token data: nham
> 0.000          0     170860          0  non-token data: ntokens
> 0.000          0 1171294928          0  non-token data: oldest atime
> 0.000          0 1213703845          0  non-token data: newest atime
> 0.000          0 1213700281          0  non-token data: last journal sync atime
> 0.000          0 1213671060          0  non-token data: last expiry atime
> 0.000          0   11059200          0  non-token data: last expire atime delta
> 0.000          0      24264          0  non-token data: last expire reduction count
> bash-3.1$
>
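The counters in that dump can be pulled out with a little awk. This is only a minimal sketch, assuming the column layout shown above (value in the third field, label as the last field); bayes_counts is a made-up helper name, not something shipped with SA or MailWatch:

```shell
#!/bin/sh
# bayes_counts: hypothetical helper, not part of SpamAssassin or MailWatch.
# Reads "sa-learn --dump magic" output on stdin and prints the spam, ham
# and token counters on one line. Assumes the layout shown above: the
# value is the third field and the label is the last field.
bayes_counts() {
    awk '
        $NF == "nspam"   { nspam = $3 }
        $NF == "nham"    { nham = $3 }
        $NF == "ntokens" { ntokens = $3 }
        END { printf "nspam=%s nham=%s ntokens=%s\n", nspam, nham, ntokens }
    '
}

# Usage (as the apache user):
#   sa-learn -p /etc/MailScanner/spam.assassin.prefs.conf --dump magic | bayes_counts
```

With the numbers above this should print nspam=448 nham=1287 ntokens=170860, so the DB that sa-learn finds through the prefs file is clearly populated.
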
OK, and if you run this (as the apache user):
spamassassin --lint -D -p /etc/MailScanner/spam.assassin.prefs.conf
do you get the db error then?
(In reality, one should change MW to not use the -p preference file,
since it is already included as a .cf... It doesn't do much harm,
though :-)
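
To pick the Bayes lines out of all that debug noise, something like the following might help. It is a minimal sketch that assumes the message wording quoted earlier in this thread ("no dbs present, cannot tie DB" and "tie-ing to DB file"); bayes_db_status is a made-up helper name:

```shell
#!/bin/sh
# bayes_db_status: hypothetical helper, not part of SpamAssassin or MailWatch.
# Reads "spamassassin --lint -D" debug output on stdin and reports, for each
# Bayes DB line, whether the file was tied or reported as not present.
# Assumes the debug message wording quoted earlier in this thread, with the
# DB path as the last field on the line.
bayes_db_status() {
    awk '
        /bayes: no dbs present, cannot tie DB/ { print "missing: " $NF; found = 1 }
        /bayes: tie-ing to DB file/            { print "tied: " $NF; found = 1 }
        END { if (!found) print "no bayes DB lines seen" }
    '
}

# Usage (as the apache user; -D writes its output to stderr, hence 2>&1):
#   spamassassin --lint -D -p /etc/MailScanner/spam.assassin.prefs.conf 2>&1 | bayes_db_status
```

The path it prints is the interesting part: compare it with where the bayes files actually live.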

Cheers
-- 
-- Glenn
email: glenn < dot > steen < at > gmail < dot > com
work: glenn < dot > steen < at > ap1 < dot > se

