create stat report(s) on spam score stats

Kearney, Rob RKearney at AZERTY.COM
Mon Jul 28 20:56:33 IST 2003


Yep,
As Mariano stated, use the SQLLogging function. (We actually modified it to
log in real time instead of in "batch", with no real degradation of
performance while delivering 48k messages, of which 28k were spam.)

Use something like this to get the average:


mysql> select avg(sascore) from maillog_mail where time > '20030724' and
time < '20030725' and isspam = 1;
+--------------+
| avg(sascore) |
+--------------+
|    11.845470 |
+--------------+
1 row in set (0.38 sec)
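
Since the original question also asked for standard deviations, here is a
rough sketch of the same query with a standard deviation added. It is only
an illustration under assumptions: it uses Python's built-in sqlite3 as a
stand-in for MySQL, the sample rows are made up, and the time column is
assumed to hold sortable timestamp strings. Only the table and column names
(maillog_mail, sascore, time, isspam) come from the query above. MySQL also
has an STD() aggregate that may let you do this in the query itself.

```python
import sqlite3
import statistics

# In-memory SQLite database standing in for the MySQL one (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE maillog_mail (time TEXT, sascore REAL, isspam INTEGER)"
)

# Made-up sample rows, for illustration only.
rows = [
    ("20030724093000", 12.5, 1),
    ("20030724101500", 9.5, 1),
    ("20030724113000", 14.0, 1),
    ("20030724120000", 1.2, 0),   # not spam, excluded by the query
    ("20030725080000", 20.0, 1),  # outside the time window
]
conn.executemany("INSERT INTO maillog_mail VALUES (?, ?, ?)", rows)


def spam_score_stats(conn, start, end):
    """Return (count, mean, population std dev) of spam scores in a window."""
    cursor = conn.execute(
        "SELECT sascore FROM maillog_mail "
        "WHERE time > ? AND time < ? AND isspam = 1",
        (start, end),
    )
    scores = [row[0] for row in cursor]
    if not scores:
        return 0, None, None
    return len(scores), statistics.mean(scores), statistics.pstdev(scores)


count, mean, stdev = spam_score_stats(conn, "20030724", "20030725")
print(f"{count} spam messages, avg {mean:.2f}, std dev {stdev:.2f}")
```

The same function can be run over different time windows to see how the
numbers drift from day to day.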

-rob


-----Original Message-----
From: Mariano Absatz [mailto:mailscanner at LISTS.COM.AR]
Sent: Monday, July 28, 2003 3:43 PM
To: MAILSCANNER at JISCMAIL.AC.UK
Subject: Re: create stat report(s) on spam score stats


Take a look at the SQLLogging functions in CustomConfig.pm, it does what you
want (log messages in a SQL database).

It inserts one record per message in one table, plus one record per
recipient in another table, which you should be able to join to get the
results you want... if you're using Sendmail, that is. The ids used by
Sendmail are, I think, expected to be unique (they are timestamp-based),
whereas the ids in ZMailer (the ones I use) are only unique for a brief
period, since they are based only on the inode number of the file, and
these get reused quite frequently (at least in Linux).
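
To make the join concrete, here is a minimal sketch of a per-recipient
average. Everything in it is an assumption made for illustration: the table
layout (maillog_mail and maillog_recipient sharing an id column), the
sample ids and the example.com addresses are hypothetical, and Python's
built-in sqlite3 stands in for the real database.

```python
import sqlite3

# In-memory SQLite database standing in for the real one (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE maillog_mail (id TEXT, sascore REAL, isspam INTEGER);
CREATE TABLE maillog_recipient (id TEXT, recipient TEXT);
""")

# One row per message (hypothetical ids and scores).
conn.executemany("INSERT INTO maillog_mail VALUES (?, ?, ?)", [
    ("h6SJa1", 11.0, 1),
    ("h6SJa2", 5.0, 1),
    ("h6SJa3", 0.5, 0),
])

# One row per recipient of each message.
conn.executemany("INSERT INTO maillog_recipient VALUES (?, ?)", [
    ("h6SJa1", "boss@example.com"),
    ("h6SJa1", "chris@example.com"),
    ("h6SJa2", "boss@example.com"),
    ("h6SJa3", "chris@example.com"),
])


def avg_spam_score_by_recipient(conn):
    """Join messages to recipients and average spam scores per recipient."""
    query = """
        SELECT r.recipient, COUNT(*), AVG(m.sascore)
        FROM maillog_mail AS m
        JOIN maillog_recipient AS r ON r.id = m.id
        WHERE m.isspam = 1
        GROUP BY r.recipient
        ORDER BY r.recipient
    """
    return conn.execute(query).fetchall()


per_recipient = avg_spam_score_by_recipient(conn)
for recipient, n, avg in per_recipient:
    print(f"{recipient}: {n} spam, avg score {avg:.2f}")
```

Joining on the id alone only works while ids are unique. With reused ids
you would probably need to match on something extra as well, such as the
date the record was written.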

Anyway, you should be able to modify the code to suit you.

On 28 Jul 2003 at 12:23, Chris W. Parker wrote:

> Hello,
> 
> I've just had a good idea (well, *I* think it's a good idea) and I'd
> like to survey the list to find out the feasibility of it and
> usefulness.
> 
> One thing that I've just recently started doing was actually putting the
> high-spam score (and its related action) to use. Right now I have it
> set so that 9 points and greater get deleted. 4 to <9 get the regular
> {Spam?} in the subject.
> 
> What I want to do is better decide what the best scores for our business
> are. My idea is this. Each time a mail comes through, its spam score is
> written to a database (or flat file, or whatever) along with the
> recipient (each recipient above one will be treated as a separate email
> and therefore get a record of its own) and the sender.
> 
> Then a php page (or your script of choice) could be created to determine
> the low spam score and the high spam score. This page would basically
> just pull out the records, do some calculations based on the score and
> find the average and standard deviations.
> 
> Here's why I'm thinking this is a good idea: the only way for me to get
> the scores of the emails is by right-clicking them and looking at the
> headers by choosing properties (I'm using Outlook 2000). Then I have to
> scroll down to the spam score section and read the score.
> 
> After doing about 10 emails I can kind of get an idea of what the
> average spam score is. Right now, although my high-spam score is at 9, I
> think I may be able to move it down to 8. But these numbers don't take
> into account other users' spam mail. For example, my boss gets a lot of
> spam.
> 
> I'm thinking that a legitimate email that gets marked as spam
> (false-positive) is most likely NOT going to get into the high-spam
> score range.
> 
> What would it take to do this? Or more specifically, what would it take
> to have a record written to a mysql db for each and every mail?
> 
> 
> I will appreciate any/all feedback,
> Chris.
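
Once the scores are logged, comparing a high-spam threshold of 9 against 8
comes down to counting how many messages land in each band. A small sketch,
assuming the scores have already been pulled out of the database. The
function name and the sample scores are made up for illustration, and only
the 4 and 9 point cut-offs come from the message above.

```python
def band_counts(scores, spam_threshold=4.0, high_threshold=9.0):
    """Count messages per band: below spam, tagged {Spam?}, or deleted."""
    counts = {"not spam": 0, "tagged": 0, "deleted": 0}
    for score in scores:
        if score >= high_threshold:
            counts["deleted"] += 1
        elif score >= spam_threshold:
            counts["tagged"] += 1
        else:
            counts["not spam"] += 1
    return counts


# Made-up scores standing in for values read from the log table.
sample_scores = [0.3, 2.1, 4.0, 5.5, 8.2, 8.9, 9.0, 11.8, 15.4]

at_nine = band_counts(sample_scores, high_threshold=9.0)
at_eight = band_counts(sample_scores, high_threshold=8.0)
print("high-spam at 9:", at_nine)
print("high-spam at 8:", at_eight)
```

The messages that move from "tagged" to "deleted" when the threshold drops
are the ones worth checking by hand for false positives.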


--
Mariano Absatz
El Baby
----------------------------------------------------------
Make yourself at home! Clean my kitchen.



