Why doesn't DCC help against image spam?

Pete Russell pete at enitech.com.au
Wed Dec 27 03:17:04 CET 2006

Glenn Steen wrote:
> On 26/12/06, Ken A <ka at pacific.net> wrote:
>> Glenn Steen wrote:
>> > On 26/12/06, Scott Silva <ssilva at sgvwater.com> wrote:
>> >> Remco Barendse spake the following on 12/24/2006 7:43 AM:
>> >> > Now that ORDB is down my mailscanner is not filtering any spam anymore,
>> >> > i might as well disable it.
>> >> >
>> >> > But out of curiosity, why doesn't DCC work for the image spam?
>> >> >
>> >> > A checksum should be reasonably effective against the image spam i
>> >> > think? Assuming that they are not dynamically building each picture a
>> >> > bit differently for each e-mail that is sent?
>> >> But that could be what they are doing. Spammers are like cockroaches.
>> >> They adapt very quickly, and after they mass-fire their crap, they
>> >> change up a bit, and reload for the next salvo.
>> >>
>> >> It's war, and we are always on the defense.
>> > Depressing but true... I think I'll have another Julsnaps... To
>> > enliven my defenses... (If the snaps fails to do that.... well, at
>> > least I'll be having more fun...:-)
>> >
>> > Seriously though, I think the only really effective defenses (on my
>> > systems at least) against image-based spam have been a combination of
>> > the digests (yes, they do take _some_ of it), RFC "strictness" checks
>> > (in PF) and ImageInfo (and some TVD rules picked up by an sa-update).
>> > When these fail I'll be going for FuzzyOcr (have just tested this so
>> > far, but ... it really needs muscle that the production boxes lack).
>> > Or someone really clever will have found another method:-).
>> FuzzyOCR runs by default with a low priority (runs as last SA test), and
>> it only runs when the SA score (so far) is < $X, so set that to your low
>> threshold, and FuzzyOCR only runs on spam that hasn't been tagged yet.
>> Works quite well, and doesn't take all that much cpu, since > 70% of the
>> image spam is caught by the other methods.
> True enough... When I've been testing I haven't been taking that into
> consideration (looking at "synthetic" situations can blind one to
> things:-). Will likely implement it in production some time early next
> year then. Thanks Ken.

Should one use ImageInfo or FuzzyOCR, or both together?
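
On the original question of why a checksum struggles with image spam: a rough sketch of the effect Scott describes is below. It assumes a plain SHA-256 digest as a stand-in for "a checksum" (this is not DCC's actual algorithm), and the byte strings are made-up placeholders rather than real image data. Changing a single bit in the payload is enough to produce a completely different exact digest, which is presumably why per-message randomised images slip past exact-match digests.

```python
import hashlib


def digest(data: bytes) -> str:
    """Exact-match digest; SHA-256 is only a stand-in, not what DCC uses."""
    return hashlib.sha256(data).hexdigest()


def differing_bytes(a: bytes, b: bytes) -> int:
    """Count positions where two equal-length payloads differ."""
    return sum(x != y for x, y in zip(a, b))


# Placeholder payload standing in for the bytes of a spam image.
original = bytes(range(256)) * 4

# Flip the lowest bit of a single byte, as a spammer's generator might
# when it adds a speck of noise to each copy of the picture.
tmp = bytearray(original)
tmp[100] ^= 0x01
perturbed = bytes(tmp)

print("bytes changed:", differing_bytes(original, perturbed))
print("digests match:", digest(original) == digest(perturbed))
```

Tools such as FuzzyOCR take a different approach (reading the text out of the image), so they do not depend on the bytes being identical from one message to the next.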
