Python Script help (Harvesting Spam from Exchange)

Peter Russell pete at enitech.com.au
Fri Nov 3 03:28:14 GMT 2006


Someone else on this list (I am sorry, I don't recall who) let me use the 
attached Python script to learn from spam in an Exchange public folder 
and then delete it.

I was going to add it all to the wiki, but after some more thorough 
testing I noticed that the script doesn't always learn and delete all of 
the spam in the public folder in a single run: it must be rerun several 
times before all of the spam is learned and deleted.

Is anyone here proficient enough in Python to have a look and see if 
there is a way of getting it to run more reliably?

Once this is worked out I will write a wiki doc on setting up Exchange 
and the script.

Many thanks in advance if anyone is able to help.
Pete
-------------- next part --------------
#!/usr/bin/env python
import commands, os, time
import imaplib
import sys, re
import string, random
import StringIO, rfc822

# Set required variables
PREFS = "/etc/MailScanner/spam.assassin.prefs.conf"
TMPFILE = "/var/tmp/salearn.tmp"
SALEARN = "/usr/bin/sa-learn"
SERVER = "x.x.x.x"
USER  = "someuserwithaccesstopublicfolder"
PASSWORD = "somepassword"
LOGFILE = "/var/log/learn.spam.log"
log = file(LOGFILE, 'a+')
log.write("\n\nTraining SpamAssassin on %s at %s\n" % (time.strftime("%Y-%m-%d"), time.strftime("%H:%M:%S")))

# connect to server
server = imaplib.IMAP4(SERVER)

# login
server.login(USER, PASSWORD)
server.select("Public Folders/Spam")

# Get messages
typ, data = server.search(None, 'ALL')
for num in data[0].split():
        typ, data = server.fetch(num, '(RFC822)')
        tmp = file(TMPFILE, 'w+')
        tmp.write(data[0][1])
        tmp.close()
        log.write(commands.getoutput("%s --prefs-file=%s --spam %s" % \
                (SALEARN, PREFS, TMPFILE)))
        log.write("\n")
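        # Mark learned spam as "Deleted"
        server.store(num, '+FLAGS', '\\Deleted')
# Delete messages marked as "Deleted" from server. Expunge once, after the
# loop: expunging inside the loop renumbers the remaining messages, so the
# sequence numbers returned by the earlier search would skip messages.
server.expunge()
server.logout()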
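
A note on the symptom described in the message (several runs needed before the folder is empty): it is consistent with expunging inside the fetch loop. IMAP sequence numbers are positions in the folder, so each expunge renumbers the remaining messages while the loop carries on with the numbers from the original search. The sketch below simulates that with a plain Python list. It does not talk to an IMAP server, and the function name `learn_and_delete` and its parameters are made up for illustration.

```python
def learn_and_delete(messages, expunge_inside_loop):
    """Simulate the learn/delete loop against a list standing in for a folder.

    Hypothetical helper for illustration only. `messages` must contain
    unique items. Returns (learned, remaining).
    """
    mailbox = list(messages)
    flagged = set()   # messages marked \Deleted but not yet expunged
    learned = []

    # Sequence numbers are captured once, like server.search(None, 'ALL').
    seq_numbers = range(1, len(mailbox) + 1)

    for num in seq_numbers:
        if num > len(mailbox):
            # The number no longer refers to any message; a real server
            # would return an error or an empty result at this point.
            break
        msg = mailbox[num - 1]       # sequence numbers are 1-based positions
        learned.append(msg)
        flagged.add(msg)
        if expunge_inside_loop:
            # Expunge removes flagged messages and renumbers the rest.
            mailbox = [m for m in mailbox if m not in flagged]

    if not expunge_inside_loop:
        # Single expunge after every message has been processed.
        mailbox = [m for m in mailbox if m not in flagged]

    return learned, mailbox


if __name__ == "__main__":
    print(learn_and_delete("abcdef", expunge_inside_loop=True))
    print(learn_and_delete("abcdef", expunge_inside_loop=False))
```

In this simulation, expunging inside the loop processes only every other message per run, which would match having to rerun the script several times. Expunging once after the loop processes all of them. Iterating over UIDs instead of sequence numbers would be another way to avoid the renumbering, but that is not shown here.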

