MailWatch Spam Learn / Bayes DB

Brett Carruthers bcarruthers at iii.net.au
Tue May 13 00:14:29 IST 2008



-----Original Message-----
From: Glenn Steen [mailto:glenn.steen at gmail.com] 
Sent: Monday, 12 May 2008 5:50 PM
To: MailScanner discussion
Subject: Re: MailWatch Spam Learn / Bayes DB

2008/5/12 Glenn Steen <glenn.steen at gmail.com>:
> 2008/5/12 Mohammed Alli <malli at mcrirents.com>:
>
>
> >
>  >  ________________________________
>  >
>  >  From: mailscanner-bounces at lists.mailscanner.info on behalf of Brett Carruthers
>  >  Sent: Sun 5/11/2008 8:25 PM
>  >  To: mailscanner at lists.mailscanner.info
>  >  Subject: MailWatch Spam Learn / Bayes DB
>  >
>  >
>  >
>  >
>  >
>  >  Hi,
>  >
>  >  I have a few-week-old install of MailScanner / MailWatch / Scalix
>  >  on CentOS 5.1. My Bayes DB is working OK, but it still gives some
>  >  spam mail a 0% chance of being spam.
>  >
>  >  So I want to train it a bit more so the bayes works even better.
>  >
>  >  What do I have to do to get MailWatch to be able to manually
>  >  spam/ham learn on its 'Message Operations' report?
>  >
>  >  Currently, if I try to learn spam it gives me an error about the
>  >  message not being in the quarantine, e.g.
>  >
>  >  Message m4BK4Wg4005306 not found in quarantine
>  >
>  >  Some settings from MailScanner.conf
>  >
>  >  --
>  >
>  >  Quarantine dir = /var/spool/MailScanner/quarantine
>  >
>  >  Quarantine Infections = yes
>  >
>  >  Quarantine Whole Messages As Queue Files = yes
>  >
>  >  Quarantine Whole Message = yes
>  >
>  >  Spam Actions = store-spam
>  >
>  >  High Scoring Spam Actions = delete (don't need to worry about
>  >  these as it's already learnt these are spam!)
>  >
>  >  Non Spam Actions = deliver store header "X-Spam-Status: No"
>  >
>  >  If anyone could point me in the right direction I would appreciate
>  >  it very much!
>  >
>  >  Thanks,
>  >
>  >  Brett
>  >
>  >
>  >
>  >
>  >
>  >  Hi Brett,
>  >
>  >
>  >
>  >  Try the following to fix your message operation error:
>  >
>  >  9.23 Fix for the Reporting Function in Message Operations
>  >
>  >  Change the following in the /var/www/mailscanner/do_message_ops.php file:
>  >
>  >  $id = $Regs[1];
>  >
>  >  to
>  >
>  >  $id = str_replace("_", ".",$Regs[1]);
>  >
>  >  Good Luck
>  >
>  >  Mohammed
>  >
>  That IS a good fix, provided you use Postfix, which Brett doesn't
>  seem to do.
>  Much more likely, there simply is nothing in the quarantine to
>  learn from. There is a simple one-line fix to message_ops.php that
>  "enhances" the SQL to actually check that the message is quarantined
>  (fix by Dhawal Doshy, go search the MailWatch list archives...). If
>  one wants to be able to do this at all, one needs to include "store" in
>  the Non Spam Actions...
>  Brett, if you look at the details page for the message, do you have
>  the "learn/release" block at the bottom? I'm pretty certain you
>  don't.
>
>  Oh and BTW, this should've gone to the MailWatch list rather than the
>  MailScanner one... Slightly OT here:-).
>
>  Cheers

Oh, sorry... you do have "store" set. Hm. This training, is it from
the MessageOps page, or the details? What happens if you try to train
manually? Is the message content "there"? Do you have a
clean_quarantine script in cron, and what have you set the cleaning
period to be?

Cheers
-- 
-- Glenn
email: glenn < dot > steen < at > gmail < dot > com
work: glenn < dot > steen < at > ap1 < dot > se
-- 
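
A minimal sketch of why the str_replace change quoted above would help with Postfix but make no difference with sendmail. It assumes that PHP turns dots in submitted form field names into underscores, and that Postfix message IDs under MailScanner contain a dot while sendmail IDs do not. The function name and the Postfix-style ID are made up for illustration, and Python stands in for the PHP here:

```python
def restore_message_id(field_id: str) -> str:
    """Mirror the PHP str_replace("_", ".", ...) fix: turn underscores back into dots."""
    return field_id.replace("_", ".")


# Sendmail-style ID from the error message above: no underscore, so nothing changes.
sendmail_id = restore_message_id("m4BK4Wg4005306")

# Hypothetical Postfix-style ID as it might arrive with its dot turned into "_".
postfix_id = restore_message_id("1A2B3C4D5E_6F7A8")

print(sendmail_id)
print(postfix_id)
```

If that assumption holds, a sendmail ID passes through untouched, so a "not found in quarantine" error on a sendmail system would have to come from somewhere else.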

Hi Glenn,

Sorry for the topic being OT. I took the problem to be 
quarantine-related and so started at the MailScanner list... I'll go 
away and ask the MailWatch list if I can confirm my quarantine settings!

I don't seem to see the quarantined messages anymore since I turned off 
'Quarantine Whole Messages As Queue Files' yesterday. Before that I 
would get the files in the quarantine, albeit with a different queue 
identifier than the one logged in SQL, e.g. qfm4C5WuYl031783 for the 
queue file, whereas SQL wants a format like m4CKXUKw031678.
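
As a rough sketch of that naming mismatch, assuming sendmail's usual convention of prefixing queue control files with qf and data files with df (the helper name is hypothetical, and this is not how MailWatch itself does the lookup):

```python
def queue_file_to_message_id(filename: str) -> str:
    """Strip a leading sendmail queue-file prefix (qf or df), if present."""
    for prefix in ("qf", "df"):
        if filename.startswith(prefix):
            return filename[len(prefix):]
    return filename  # no known prefix, treat it as a bare message ID


print(queue_file_to_message_id("qfm4C5WuYl031783"))
```

The two identifiers quoted above would still differ after stripping the prefix, so they appear to belong to different messages.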

I have been using the 'MessageOps' page in reports, and my individual 
messages don't show learn spam/ham. I can't seem to find my messages in 
quarantine unless I have the full queue file logged. I think I need to 
use just the message content (not the full queue file) if I am to train 
the Bayes engine; can you please confirm whether I am right in this 
assumption? I would like to try training manually, but I haven't been 
able to store all my messages...

I have the clean_quarantine script ready to go but don't have it 
currently running...

Thanks for your help,
Brett


