Kai Schaetzl maillists at CONACTIVE.COM
Wed Feb 25 15:31:43 GMT 2004

Rob Vicchiullo wrote on Tue, 24 Feb 2004 14:26:08 -0500:


DSpam has already been discussed here. Let's have a look at the article:

> Nuclear Elephant writes "The authors of two spam filters, CRM114 and

"Nuclear elephant" *is* the author of DSpam if I may conclude that from
the domain name. And the author of CRM114 is apparently a co-author of
some portions of it.

> announced recently that their filters have achieved accuracy
> rates ten times better than a human is capable of.

This is simply impossible, see below. And it's a misleading use of
statistics. If we were to use the figures anyway, it means A correctly
classifies 6240 out of 6250 messages while B correctly classifies 6249 out
of 6250. So the increase in detection is about 0.14 percentage points
(9 messages out of 6250). It's ten times more "accurate"? And if we go here:
we see these figures are based on a "real mailbox", the author's own.
It's no problem to train and code a filter for such a result. See
below, we have 100% accuracy for our mailboxes.
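
To make the arithmetic explicit, here is a minimal Python sketch that uses only the figures quoted from the article (99.84% versus 1 misclassification in 6250). The variable and function names are my own, chosen for illustration. It shows that the "ten times" claim is the ratio of the error counts, while the gain in accuracy itself is a fraction of a percentage point.

```python
# Figures from the quoted article: 99.84% ("human") vs. 1 error in 6250 ("filter").
TOTAL = 6250

def accuracy(correct, total=TOTAL):
    """Return accuracy as a percentage of correctly classified messages."""
    return 100.0 * correct / total

human_correct = 6240   # 10 misclassifications in 6250 -> 99.84%
filter_correct = 6249  # 1 misclassification in 6250 -> 99.984%

human_acc = accuracy(human_correct)
filter_acc = accuracy(filter_correct)

# Absolute gain in accuracy, in percentage points.
gain_points = filter_acc - human_acc

# Ratio of error counts: this is where "ten times" comes from.
error_ratio = (TOTAL - human_correct) / (TOTAL - filter_correct)

print(f"human:       {human_acc:.3f}%")
print(f"filter:      {filter_acc:.3f}%")
print(f"gain:        {gain_points:.3f} percentage points")
print(f"error ratio: {error_ratio:.0f}x")
```

Both numbers describe the same 9 messages out of 6250; which one gets quoted is a matter of presentation.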

> Based on a study by
> Bill Yerazunis of CRM114, the average human is only 99.84% accurate.

Nonsense. It's only the recipient who can classify something as spam.

> Both filters are reporting to have reached accuracy levels between
> 99.983% and 99.984% (1 misclassification in 6250 messages) using
> completely different approaches

Well, I see that *all* of our spam is getting a BAYES_99 from SA and the
low-scoring spam is identified *only* by BAYES_99. So, what does this tell
me? That SA is 100% accurate? Maybe. That we have a bayes database which
is very well trained for our needs? Most certainly yes.

Both tools are probably quite good at detecting spam, but the article(s)
is/are just a marketing blurb.



Kai Schätzl, Berlin, Germany
