idea for next version

Steve Campbell campbell at cnpapers.com
Wed Oct 11 03:17:52 IST 2006


Quoting Scott Silva <ssilva at sgvwater.com>:

> Logan Shaw spake the following on 10/10/2006 3:12 PM:
> > Roger wrote:
> >>> So I was checking mailwatch this evening and I found out that the
> >>> spam / ham percentage is 60% / 40% at daytime and 95% / 5% at night.
> >>> This is quite logical because in the daytime everybody is working and at
> >>> night (well, here in Europe) only spammers are working. This could be
> >>> used for spam filtering. I think if it were possible to do, e.g.,
> >>> "spamscore * 1.2" between 11:00 pm and 7:00 am, it would hit more
> >>> high-scoring spam at night. Of course it would also hit ham, but as
> >>> there is much less ham at night, the chance of that is lower.
> > 
> > On Tue, 10 Oct 2006, Steve Campbell wrote:
> >> I tend to look at this in a different light. Spam is spam, and should
> >> be caught by rules, etc., regardless of the time it arrives. Ham is the
> >> same, also regardless of its arrival time. A good set of rules should
> >> work fine any time of the day. The percentages only indicate when
> >> people are sending mail, so this is a useless figure for comparing
> >> day/night averages.
> > 
>

My point here was that the percentages depend only on the relative volumes of
spam and ham received. If you receive no spam, you're going to see 100% good
mail. If you receive floods of spam, your ratio changes. One volume or the
other has to change for the ratio to change. A good rule that blocks spam will
block it at either noon or 3:00 a.m.
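
As a rough illustration with made-up numbers (the counts below are hypothetical,
chosen only to reproduce the 60% / 40% and 95% / 5% figures quoted above), this
small Python sketch shows the night percentage jumping while the amount of spam
barely moves:

```python
def spam_percentage(spam_count, ham_count):
    """Return spam as a percentage of all mail seen (0.0 if no mail)."""
    total = spam_count + ham_count
    if total == 0:
        return 0.0
    return 100.0 * spam_count / total

# Hypothetical volumes: spam is nearly constant, only ham drops at night.
day = {"spam": 600, "ham": 400}
night = {"spam": 570, "ham": 30}

day_pct = spam_percentage(day["spam"], day["ham"])
night_pct = spam_percentage(night["spam"], night["ham"])

print(f"day:   {day_pct:.0f}% spam")
print(f"night: {night_pct:.0f}% spam")
```

With these assumed counts the spam volume is almost the same in both periods.
The percentage changes because the ham dropped off, not because the spam is any
different.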

My reported ratio changed drastically when I installed MIMEDefang. My MTA still
received the spam, but a lot of it was blocked before it reached MS/SA. The
amount of mail reaching the MTA did not change.
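
The effect of a pre-filter on the reported ratio can be sketched the same way.
The function name, the volumes and the 75% block rate below are all assumptions
for illustration, not measurements from any real system:

```python
def reported_spam_pct(spam_at_mta, ham_at_mta, prefilter_block_rate):
    """Spam percentage as seen by the scanner sitting behind a pre-filter.

    prefilter_block_rate is the assumed fraction of spam rejected before
    the scanner sees it; ham is assumed to pass through untouched.
    """
    spam_seen = spam_at_mta * (1.0 - prefilter_block_rate)
    total_seen = spam_seen + ham_at_mta
    if total_seen == 0:
        return 0.0
    return 100.0 * spam_seen / total_seen

# Same mail arriving at the MTA in both cases (hypothetical counts).
before = reported_spam_pct(900, 100, 0.0)   # no pre-filter
after = reported_spam_pct(900, 100, 0.75)   # pre-filter rejects 75% of spam

print(f"before: {before:.1f}% spam reported")
print(f"after:  {after:.1f}% spam reported")
```

The mail hitting the MTA is identical in both runs. Only the point at which it
is counted has moved, and the reported percentage moves with it.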

Percentages have always been a poor indicator of anything (except for 100% or
0%). Anything in between is relative. Would you rather receive 80% of $1.00 or
20% of $1,000.00? You have to apply the percentages in the proper context.


> >> For instance, if the same message that came in at night were resent
> >> during the day, how should the mail be treated? Different score and
> >> action?
> > 
> > While I share the feeling that it is a little bit odd that the
> > time a message arrives could sway its score, this is already
> > true to some extent:  real-time blacklists change over time
> > (otherwise they wouldn't be real-time), and the score a message
> > gets can be different one hour from what it is at the next hour.

But these lists are changing due to actual mail and the content of that mail,
not because of the current time of day. If I were a spammer, and I
discovered that you are basing your score on the time of day (or
night), I would just change the time I send out my spam. This would work
against your system. As a matter of fact, I am seeing more and
more spam showing up during daytime hours. Nighttime spam is still the
norm, though.
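
A minimal sketch of the proposed night-time multiplier makes the problem
concrete. This is not an existing MailScanner or SpamAssassin feature. The
function names are hypothetical, the 1.2 factor and the 11:00 pm to 7:00 am
window come from the proposal quoted above, and the threshold of 5.0 is an
assumption for illustration:

```python
NIGHT_START = 23    # 11:00 pm, from the proposal
NIGHT_END = 7       # 7:00 am, from the proposal
NIGHT_FACTOR = 1.2  # "spamscore * 1.2", from the proposal
THRESHOLD = 5.0     # assumed spam threshold, for illustration only

def is_night(hour):
    """True if the hour (0-23) falls inside the night window."""
    return hour >= NIGHT_START or hour < NIGHT_END

def adjusted_score(score, hour):
    """Scale the score at night, leave it unchanged during the day."""
    return score * NIGHT_FACTOR if is_night(hour) else score

def is_spam(score, hour):
    """Classify a message by its time-adjusted score."""
    return adjusted_score(score, hour) >= THRESHOLD

base_score = 4.5  # the same message, scored by the same rules
print("sent at 02:00 ->", is_spam(base_score, 2))
print("sent at 14:00 ->", is_spam(base_score, 14))
```

The identical message is tagged at 02:00 and passed at 14:00, so a sender who
learns about the window only has to wait until morning.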
 
I don't mind seeing that my ratio of spam to ham is high, because it means I am
stopping it. On the other hand, if total messages are low, the reverse ratio is
OK. I'm just using CPU cycles to block all of that junk. If the total message
count is high and the spam-to-ham ratio is low, then I have to assume I can do
better with some rules. But then, what will the ratio be once I have the
perfect system using perfect rules? 0% spam to 100% ham! But that won't
happen, so the best I can do is try for something in between.

Ultimately, you have to stop spam before it gets to MS/SA for the percentages
to mean anything, or else accept a high spam ratio.

I think that is what I mean.

Steve



