Check which rules hit

Denis Beauchemin Denis.Beauchemin at usherbrooke.ca
Fri Jun 19 12:26:30 UTC 2015


I created this script a while back just to do that:
#!/usr/bin/perl -w
#
# Script that looks through maillog to find all messages tagged as spam
# by MailScanner.  It then tallies the different SpamAssassin rules that
# fired.
# Denis Beauchemin, 20050516

use Getopt::Long;

# Where some commands reside:
my $GREP   = "/bin/grep";
my $GUNZIP = "/bin/gunzip";

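# The strings below come from a French-language install; adapt them to
# the values in your own languages.conf.
# Value of "Spam =" in %report-dir%/languages.conf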
my $isSpamString  = "est un polluriel, SpamAssassin";
my $isHamString   = "est pas un polluriel, SpamAssassin";
my $allString     = " un polluriel, SpamAssassin";
# Value of "score =" in %report-dir%/languages.conf
my $scoreString   = "score=";
# Value of "required =" in %report-dir%/languages.conf
my $reqdString    = "requis ";
my $autoString    = "autolearn=spam";
my $cachedString  = "cached, ";
my $nCachedString = "not cached, ";

my $maillog = "/var/log/maillog";
my @maillogs = ();   # log files to process
my %hit;             # rule name => number of times it fired

my $sortByName = 0;
my $sortByHits = 0;
my $getHam = 0;
my $getAll = 0;
my $help = 0;

GetOptions(
    'sortbyname|byname' => \$sortByName,
    'sortbyhits|byhits' => \$sortByHits,
    'log=s' => \@maillogs,
    'ham'   => \$getHam,
    'all'   => \$getAll,
    'help'  => \$help,
);

if ( $help ) {
    print '
This program tallies SpamAssassin\'s rules that were triggered when
an email was detected as spam by MailScanner.

You can search for ham with the --ham option. 

You can search for all SpamAssassin results with the --all option.

By default it sorts the results by rule name. It can also sort them
by number of hits if called with --sortbyhits (or --byhits).

The option --sortbyname (or --byname) is the default one.

If you don\'t want to use the current maillog, specify a different
one with --log new-maillog.

All unknown command line parameters will be treated as additional
file names to process.

It is OK for a log file to be gzipped.
';
    exit;
}

push @maillogs, @ARGV;
@maillogs = ( $maillog ) if ( @maillogs  == 0 );
#print "Maillogs: @maillogs\n";
#my $searchString = $getHam ? $isHamString : $isSpamString;
my $searchString;
if ( $getAll ) {
    $searchString = "$allString";
} elsif ( $getHam ) {
    $searchString = "$isHamString";
} else {
    $searchString = "$isSpamString";
}

foreach my $maillog ( @maillogs ) {
    print "Processing $maillog...\n";

    $sortByName++ if ( ( $sortByName == 0 ) && ( $sortByHits == 0 ) );

    my $openCmd = "LANG=C $GREP \"$searchString\" $maillog |";
    if ( $maillog =~ /\.gz$/ ) {
        $openCmd = "$GUNZIP -c $maillog | LANG=C $GREP \"$searchString\" |";
    }
    # Use "or" rather than "||": "||" binds to the string, so die never ran.
    open( LOG, $openCmd ) or die "Cannot open $maillog: $!";

    while ( <LOG> ) {
        next unless /$searchString \((?:$cachedString|$nCachedString)$scoreString[-\d.]+, $reqdString[-\d.]+,(?: $autoString,)?(.*)$/;
        my $hits = $1;
        foreach my $hit ( $hits =~ / ([^\s]+) -?[\d.]+(?:,|\))/g ) {
            $hit{$hit}++;
        }
    }

    close LOG;
}

if ( $sortByName ) {
    foreach my $hit ( sort keys %hit ) {
        printf "%27s %5d\n", $hit, $hit{$hit};
    }
} elsif ( $sortByHits ) {
    foreach my $hit ( sort {$hit{$b}<=>$hit{$a}} keys %hit ) {
        printf "%27s %5d\n", $hit, $hit{$hit};
    }
}
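
To see what the two regular expressions in the loop actually pull out, here is a minimal standalone sketch. It assumes the English strings ("is spam", "score=", "required ") instead of the French ones, and the log line is a made-up example shaped after what the pattern expects, not a line copied from a real maillog. Check your own languages.conf and logs before relying on it.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical log line (English strings assumed, invented for illustration).
my $line = 'MailScanner[1234]: Message ABC123 from 192.0.2.1 (a@example.com) '
         . 'to example.org is spam, SpamAssassin (not cached, score=12.3, '
         . 'required 5, autolearn=spam, BAYES_99 3.50, HTML_MESSAGE 0.00)';

my %hit;    # rule name => number of hits

# First pattern: skip the header fields and capture the list of rules.
if ( $line =~ /is spam, SpamAssassin \((?:cached, |not cached, )score=[-\d.]+, required [-\d.]+,(?: autolearn=spam,)?(.*)$/ ) {
    my $hits = $1;

    # Second pattern: each entry looks like " RULE_NAME 1.23," or " RULE_NAME 1.23)".
    foreach my $rule ( $hits =~ / ([^\s]+) -?[\d.]+(?:,|\))/g ) {
        $hit{$rule}++;
    }
}

# Same output format as the script: rule name, then hit count.
printf "%27s %5d\n", $_, $hit{$_} for sort keys %hit;
```

On this sample line it should tally one hit each for BAYES_99 and HTML_MESSAGE.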

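
The script shells out to grep and gunzip, which breaks on file names containing spaces or shell metacharacters. As an alternative (not what the script above does), gzipped logs can be read directly with the core IO::Uncompress::Gunzip module. This is only a sketch: `open_log` and `count_matching_lines` are hypothetical helper names, and the demo file holds two invented lines.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp             qw(tempfile);
use IO::Compress::Gzip     qw(gzip $GzipError);
use IO::Uncompress::Gunzip qw($GunzipError);

# Return a read handle for a plain or gzipped log, without invoking a shell.
sub open_log {
    my ($path) = @_;
    my $fh;
    if ( $path =~ /\.gz$/ ) {
        $fh = IO::Uncompress::Gunzip->new($path)
            or die "Cannot gunzip $path: $GunzipError";
    } else {
        open( $fh, '<', $path ) or die "Cannot open $path: $!";
    }
    return $fh;
}

# Count the lines that contain a fixed string (similar to grep -F -c).
sub count_matching_lines {
    my ( $path, $needle ) = @_;
    my $fh    = open_log($path);
    my $count = 0;
    while ( my $line = <$fh> ) {
        $count++ if index( $line, $needle ) >= 0;
    }
    close $fh;
    return $count;
}

# Demo on a throwaway gzipped file holding two made-up lines.
my ( $tmpfh, $tmpname ) = tempfile( SUFFIX => '.gz', UNLINK => 1 );
close $tmpfh;
my $sample = "msg one is spam, SpamAssassin\nmsg two is not spam, SpamAssassin\n";
gzip \$sample => $tmpname or die "gzip failed: $GzipError";

print count_matching_lines( $tmpname, 'is spam, SpamAssassin' ), "\n";
```

In the real script the `while ( <LOG> )` loop could read from the handle returned by `open_log` instead, at the cost of doing the filtering in Perl rather than in grep.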

-----Original Message-----
From: MailScanner [mailto:mailscanner-bounces at lists.mailscanner.info] On Behalf Of Peter Nitschke
Sent: June 19, 2015 02:21
To: mailscanner at lists.mailscanner.info
Subject: Check which rules hit

I have built up a large number of rules for SA to use with MS and many are probably now obsolete.

How can I monitor which rules are getting hits?

Thanks.

Peter




--
MailScanner mailing list
mailscanner at lists.mailscanner.info
http://lists.mailscanner.info/listinfo/mailscanner


