postfix-specific method to feed spam/ham to sa-learn

Eric Dantan Rzewnicki rzewnickie at RFA.ORG
Thu Jan 22 22:04:16 GMT 2004


I worked out one (naive?) way to get user feedback on false positives
and false negatives using Postfix and MailScanner when users only have
POP and Outlook. I'm not going to use this method myself, but I wanted to
share it in case someone else needs to do it this way (and also to get
feedback on whether it's even a valid approach). Someone better than me
at shell scripting and regular expressions could probably do this more
succinctly.

I set this in MailScanner.conf:
Archive Mail = /var/spool/MailScanner/archive/

This creates a directory for each day, containing a copy of the original
queue file for each message. I took this approach because it seemed it
would be very easy to set up a simple cron job that deletes directories
older than x days.
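
A minimal sketch of such a cleanup, assuming GNU find and that the
per-day directories sit directly under the archive directory. The
function name prune_archive and the 14-day figure are only illustrative;
they are not part of MailScanner.

```shell
#!/bin/bash
# Sketch: remove per-day archive directories older than a number of days.
# prune_archive is a hypothetical helper name, not a MailScanner feature.
prune_archive() {
    local archive_dir=$1 max_age_days=$2
    # Look only at the immediate subdirectories (one per day) and remove
    # those whose modification time is more than max_age_days days old.
    find "$archive_dir" -mindepth 1 -maxdepth 1 -type d \
        -mtime +"$max_age_days" -exec rm -rf {} +
}

# Example call, e.g. from a script run by a daily cron job:
# prune_archive /var/spool/MailScanner/archive 14
```

This goes by each directory's modification time rather than by its name,
which should be close enough since a day's directory stops changing once
that day is over.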

The only feedback required from the user is to send the headers to
either spam at dom.tld or notspam at dom.tld.

To get to the full headers in Outlook:
1) open the message in its own window
2) select View -> Options
3) the dialog box contains a scroll window at the bottom labeled
   "Internet Headers". Cut and paste the text from there into a new
   message.

Getting the headers this way has been deemed too much work for the
users, so I'm working out another feedback method. Anyway, with the
headers in the bodies of messages in an mbox, I was able to get the date
and queue file ID with these two command lines (broken with \):

# queue file id
cat /var/mail/spam | \
formail -I "" -s | \
grep -A2 "^Received:" | \
grep "by host.dom.tld (Postfix) with .*SMTP" | \
cut -d" " -f8

# date directory
cat /var/mail/spam | \
formail -I "" -s | \
grep -A3 "^Received:" | \
grep "for <.*@dom.tld>; ..., .. ... ...." | \
cut -d";" -f 2 | \
xargs -I {} date -d {} +%Y%m%d
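
For what it's worth, the two extractions could probably be folded into
one step. The following is only a sketch, assuming GNU sed and date and
a pasted Received: block in the usual Postfix layout; extract_pair is a
made-up name, and host.dom.tld / dom.tld are the same placeholders as
above. It picks the queue file ID by the word "id" rather than by field
position.

```shell
#!/bin/bash
# Sketch: read one pasted header block on stdin and print
# "<date_dir> <queue_id>". extract_pair is a hypothetical helper name.
extract_pair() {
    local headers queue_id stamp
    headers=$(cat)
    # Queue file ID: the token after "id" on the local Postfix "by" line.
    queue_id=$(printf '%s\n' "$headers" |
        sed -n 's/.*by host\.dom\.tld (Postfix) with [A-Z]*SMTP id \([0-9A-Za-z]*\).*/\1/p' |
        head -n 1)
    # Timestamp: everything after the ";" on the "for <...@dom.tld>" line.
    stamp=$(printf '%s\n' "$headers" |
        sed -n 's/.*for <[^>]*@dom\.tld>; *\(.*\)/\1/p' |
        head -n 1)
    # Give up if either piece is missing.
    [ -n "$queue_id" ] && [ -n "$stamp" ] || return 1
    # Convert the timestamp to the YYYYMMDD directory name (local time).
    printf '%s %s\n' "$(date -d "$stamp" +%Y%m%d)" "$queue_id"
}
```

It would still need formail (or similar) in front of it to hand over one
message at a time from the mbox.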

Thus far, I've just been feeding that output in pairs into the simple
script below. If I were going further with this approach I'd have added
the commands above to the script.

#!/bin/bash
# Arguments: date directory (YYYYMMDD), queue file ID, "spam" or "ham"

archive_dir=/var/spool/MailScanner/archive
sa_prefs=/opt/MailScanner/etc/spam.assassin.prefs.conf
date_dir=$1
queue_file=$2
spam_or_ham=$3
queue_file_path="$archive_dir/$date_dir/$queue_file"
line_count=$(postcat "$queue_file_path" | wc -l)
# Drop the first 6 and the last 4 lines of the postcat output and
# feed the rest to sa-learn.
postcat "$queue_file_path" | \
tail -n $((line_count-6)) | \
head -n $((line_count-10)) | \
sa-learn --"$spam_or_ham" -p "$sa_prefs"
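
The tail/head arithmetic in the script amounts to dropping the first 6
and the last 4 lines of the postcat output. A possible way to say the
same thing without counting lines first, assuming GNU tail and head
(head -n with a negative count is a GNU extension); trim_lines is just
an illustrative name.

```shell
#!/bin/bash
# Sketch: drop the first N and last M lines of stdin.
# trim_lines is a hypothetical helper name.
trim_lines() {
    local skip_head=$1 skip_tail=$2
    # tail -n +K starts output at line K; head -n -M omits the last M lines.
    tail -n +"$((skip_head + 1))" | head -n -"$skip_tail"
}

# Example, matching the counts used in the script:
# postcat "$queue_file_path" | trim_lines 6 4 | sa-learn --spam -p "$sa_prefs"
```

The counts 6 and 4 are simply the ones from the script; they depend on
what postcat prints around the message.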


Well, that's it. Hopefully it's useful to someone. Apologies if this is
silly or useless.

-Eric Rz.
(now to figure this out with forwarded fp/fn and archive as an mbox ...
should be doable.)


