Python Script help (Harvesting Spam from Exchange)
Glenn Steen
glenn.steen at gmail.com
Fri Nov 3 08:50:48 GMT 2006
On 03/11/06, Peter Russell <pete at enitech.com.au> wrote:
> Someone else on this list (I am sorry, I don't recall who) let me use the
> attached Python script to learn from spam (then delete it) from an
> Exchange public folder.
>
> I was going to add it all to the wiki, but after some more thorough
> testing I noticed the script doesn't always learn and delete all of the
> spam in the public folder on a single run; the script must be rerun
> several times before all of the spam is learned and deleted.
>
> Is anyone here proficient enough in Python to have a look and see if there
> is a way of getting it to run a little more reliably?
>
> Once this is worked out I will write a wiki doc on setting up Exchange and
> the script.
>
> Many thanks in advance if anyone is able to help.
> Pete
>
>
> #!/usr/bin/env python
> import commands, os, time
> import imaplib
> import sys, re
> import string, random
> import StringIO, rfc822
>
> # Set required variables
> PREFS = "/etc/MailScanner/spam.assassin.prefs.conf"
> TMPFILE = "/var/tmp/salearn.tmp"
> SALEARN = "/usr/bin/sa-learn"
> SERVER = "x.x.x.x"
> USER = "someuserwithaccesstopublicfolder"
> PASSWORD = "somepassword"
> LOGFILE = "/var/log/learn.spam.log"
> log = file(LOGFILE, 'a+')
> log.write("\n\nTraining SpamAssassin on %s at %s\n" % (time.strftime("%Y-%m-%d"), time.strftime("%H:%M:%S")))
>
> # connect to server
> server = imaplib.IMAP4(SERVER)
>
> # login
> server.login(USER, PASSWORD)
> server.select("Public Folders/Spam")
>
> # Get messages
> typ, data = server.search(None, 'ALL')
> for num in data[0].split():
>     typ, data = server.fetch(num, '(RFC822)')
>     tmp = file(TMPFILE, 'w+')
>     tmp.write(data[0][1])
>     tmp.close()
>     log.write(commands.getoutput("%s --prefs-file=%s --spam %s" % \
>         (SALEARN, PREFS, TMPFILE)))
>     log.write("\n")
>     # Mark learned spam as "Deleted"
>     server.store(num, '+FLAGS', '\\Deleted')
>     # Delete messages marked as "Deleted" from server
>     server.expunge()
> server.logout()
>
I'm not sure about any of this (I'm not really proficient in Python :-), but
try moving the expunge out of the for loop and see if that helps. An
expunge renumbers the messages that remain, so the numbers returned by
the search stop matching the messages still in the folder. Doing one big
expunge after you're done preserves the "order" for the for loop. I
haven't tested anything either :-). Another thought is that the MS
Exchange IMAP service might have some foolery going on, like
"pagination": not returning more than X headers for you to operate on...
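
The suggestion above could look roughly like the sketch below. It is a Python 3 port, not the original Python 2 script, and it has not been run against Exchange. The function names (sa_learn_spam, learn_and_delete, run) are made up for illustration; the paths, folder name and sa-learn options are the ones from the quoted script. The folder name is wrapped in double quotes on the assumption that Python 3's imaplib does not quote names containing spaces for you.

```python
import imaplib
import subprocess
import tempfile

# Values taken from the quoted script; adjust for your site.
PREFS = "/etc/MailScanner/spam.assassin.prefs.conf"
SALEARN = "/usr/bin/sa-learn"
SERVER = "x.x.x.x"
USER = "someuserwithaccesstopublicfolder"
PASSWORD = "somepassword"


def sa_learn_spam(raw_message):
    """Hypothetical helper: feed one raw message (bytes) to sa-learn."""
    with tempfile.NamedTemporaryFile(suffix=".eml") as tmp:
        tmp.write(raw_message)
        tmp.flush()
        result = subprocess.run(
            [SALEARN, "--prefs-file=" + PREFS, "--spam", tmp.name],
            capture_output=True, text=True,
        )
    return result.stdout + result.stderr


def learn_and_delete(server, learn=sa_learn_spam):
    """Learn every message in the selected folder, then expunge once.

    Returns the number of messages flagged for deletion.
    """
    typ, data = server.search(None, "ALL")
    nums = data[0].split()
    for num in nums:
        typ, msg = server.fetch(num, "(RFC822)")
        learn(msg[0][1])
        # Only flag the message here. Nothing is removed yet, so the
        # numbers returned by the search above stay valid.
        server.store(num, "+FLAGS", "\\Deleted")
    # One expunge after the loop removes everything flagged above.
    server.expunge()
    return len(nums)


def run():
    """Connect, process the spam folder, and log out."""
    server = imaplib.IMAP4(SERVER)
    server.login(USER, PASSWORD)
    # Quoted by hand because the folder name contains a space.
    server.select('"Public Folders/Spam"')
    count = learn_and_delete(server)
    server.logout()
    return count
```

Calling run() would do one full pass. If some spam is still left over after a single pass with this version, that would point towards the server-side "pagination" idea rather than the loop.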
--
-- Glenn
email: glenn < dot > steen < at > gmail < dot > com
work: glenn < dot > steen < at > ap1 < dot > se