Logging scores of non-spam (patch)

John Ireland J.Ireland at HGU.MRC.AC.UK
Fri Jan 23 14:54:57 GMT 2004


I am interested in the "Log Non Spam" option.  FWIW, I patch SA.pm to include
SpamAssassin's 'autolearn=' field in the SpamCheck header, so that I can check
what the Bayesian auto-learning is doing.  The header changes from,

    X-MailScanner-SpamCheck: spam, SpamAssassin (score=37.097,
        required 4, BAYES_99 5.40, BIZ_TLD 0.10,
        DATE_IN_FUTURE_12_24 3.33, DCC_CHECK 2.70, FORGED_OUTLOOK_TAGS ...
to,

    X-MailScanner-SpamCheck: spam, SpamAssassin (score=37.097,
        required 4, autolearn=spam, BAYES_99 5.40, BIZ_TLD 0.10,
        DATE_IN_FUTURE_12_24 3.33, DCC_CHECK 2.70, FORGED_OUTLOOK_TAGS ...
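
As an illustration only (this is not part of the patch), a minimal Perl sketch
of how the new field could be pulled back out of a header for checking.  The
helper name autolearn_from_header is hypothetical, and the sketch assumes the
header is passed in as a single string with its folded continuation lines.

```perl
use strict;
use warnings;

# Hypothetical helper (an assumption, not MailScanner code): returns the
# value of the autolearn= field, or undef when the header has none.
sub autolearn_from_header {
    my ($header) = @_;
    # Unfold continuation lines so the field can be matched on one line.
    (my $unfolded = $header) =~ s/\r?\n[ \t]+/ /g;
    return $unfolded =~ /\bautolearn=(\w+)/ ? $1 : undef;
}

my $header = "X-MailScanner-SpamCheck: spam, SpamAssassin (score=37.097,\n"
           . "        required 4, autolearn=spam, BAYES_99 5.40, BIZ_TLD 0.10,";

my $value = autolearn_from_header($header);
print defined $value ? $value : 'absent', "\n";
```

An unpatched header simply yields undef, so the same check works on mail
processed before and after the change.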

Would this be of general interest?

I do this now, without a config option, with the following changes to MailScanner-4.25-14,

*** SA.pm.orig  Fri Nov  7 12:41:41 2003
--- SA.pm.new   Mon Dec 15 14:56:54 2003
***************
*** 262,264 ****
     my($pipe);
!   my($SAHitList, $SAHits, $SAReqHits, $IsItSpam, $IsItHighScore);
     my($HighScoreVal, $pid2delete, $IncludeScores);
--- 262,264 ----
     my($pipe);
!   my($SAHitList, $SAHits, $SAReqHits, $IsItSpam, $IsItHighScore, $AutoLearn);
     my($HighScoreVal, $pid2delete, $IncludeScores);
***************
*** 299,300 ****
--- 299,309 ----
       }
+    # Get the autolearn status
+    if (!defined $spamness->{auto_learn_status}) {
+         $AutoLearn = "no";
+    } elsif ($spamness->{auto_learn_status}) {
+          $AutoLearn = "spam";
+    } else {
+          $AutoLearn = "ham";
+    }
+    print $pipe $AutoLearn . "\n";
       $spamness->finish();
***************
*** 312,313 ****
--- 321,323 ----
       #print STDERR "Read SAHits = $SAHits " . scalar(localtime) . "\n";
+     $AutoLearn = <$pipe>;
       $SAHitList = <$pipe>;
***************
*** 321,322 ****
--- 331,333 ----
       chomp $SAHits;
+     chomp $AutoLearn;
       chomp $SAHitList;
***************
*** 339,341 ****
                  MailScanner::Config::LanguageValue($Message, 'required') .' ' .
!                $SAReqHits . ($SAHitList?", $SAHitList":'');

--- 350,352 ----
                  MailScanner::Config::LanguageValue($Message, 'required') .' ' .
!                $SAReqHits . ', ' . 'autolearn' . '=' . $AutoLearn .  ($SAHitList?", $SAHitList":'');
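
The logic of the patch, pulled out of the diff as a standalone sketch for
readability.  The function names autolearn_label and spamcheck_report are
hypothetical and exist only for this illustration; the sketch assumes, as the
patch does, that the status is undefined when nothing was learned, true for
spam and false for ham.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the three-way test in the patch:
# undef -> "no", true -> "spam", defined but false -> "ham".
sub autolearn_label {
    my ($status) = @_;
    return 'no' unless defined $status;
    return $status ? 'spam' : 'ham';
}

# Hypothetical helper that assembles the text inside the parentheses of the
# SpamCheck header, with autolearn= placed right after the required score.
sub spamcheck_report {
    my ($score, $required, $status, $hitlist) = @_;
    my $report = "score=$score, required $required, autolearn="
               . autolearn_label($status);
    # Append the rule hits only when there are any, as the patched line does.
    $report .= ", $hitlist" if $hitlist;
    return $report;
}

print spamcheck_report('37.097', '4', 1, 'BAYES_99 5.40, BIZ_TLD 0.10'), "\n";
```

Testing for undef first matters here: a ham result is defined but false, so a
plain truth test would not tell "ham" apart from "not learned".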

Walker Aumann wrote:
> For my site, I thought it could be interesting to know the scores of mail
> getting through MailScanner/SpamAssassin without having to archive all
> the messages, to get an idea of how close messages were getting to the
> threshold.  The following two patches (against MailScanner 4.25-14) add
> a "Log Non Spam" option that works just like the "Log Spam" option.
> Hopefully someone else will also find this data useful.
>
> Walker


--
John Ireland - Systems Manager    Email: mailto:J.Ireland at hgu.mrc.ac.uk
MRC Human Genetics Unit           Tel. : +44-31-332-2471
Western General Hospital          Fax. : +44-31-343-2620
Edinburgh, EH4 2XU, UK            WWW  : http://www.hgu.mrc.ac.uk


