Max SpamAssassin Size problems -- round 2

Scott Silva ssilva at
Mon Aug 28 19:12:52 IST 2006

DAve spake the following on 8/28/2006 10:21 AM:
> Julian Field wrote:
>> Kash, Howard (Civ, ARL/CISD) wrote:
>>>> Why not just set the Max SpamAssassin Size to 50k
>>> You'll still truncate images.  I currently have it at 150k and it
>>> still truncates images (either large ones or messages with lots of
>>> attached images).
>>>> or the partial-image-detection rules to 0?
>>> This is an option, but you give up some spam detection capability.
>>> The plugin doesn't test specifically for partial images, but for corrupt
>>> images in general, of which truncated images are a subset.  Some
>>> image spammers have intentionally corrupted the image in such a way
>>> that many email clients will still render them readable, but image
>>> analysis utilities balk on them.  So messages with corrupt images are
>>> given a higher score.
>>> And this isn't just about images; supposedly someone is working on a
>>> plugin to analyze Word documents for spam content.  It may have the
>>> same problem with truncated Word attachments.
>> All fair points. Which brings us back to the beginning.
>> The option which got the biggest number of votes was along the lines
>> of this:
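>> for ($lines=$size=0; $lines<100 && $size<20_000; $lines++)
>> {
>>    $line = getnextline();
>>    last unless defined $line;   # end of message (assumes undef at EOF)
>>    $size += length($line);
>>    last if $size>20_000;        # never copy past 20k, even for one huge line
>>    push @SAinput, $line;
>>    last if $line =~ /^\s*$/;    # stop at a blank or whitespace-only line
>> }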
>> It should keep copying lines until we hit a line that is only
>> whitespace (or blank) or until we have copied 20k of extra data,
>> whichever comes first. And it won't be confused by nearly 20k of extra
>> data followed by 1 huge line lasting for mbytes.
>> Is that a reasonable compromise?
> That is still work for you, and wouldn't a 20k chunk of a 20.1k image
> still cause the plugin to fail to properly inspect the image?
> DAve
But you also have to take into account the original 4K (or whatever
MailScanner is set to) added to that extra 20K. No solution is going to be
perfect, short of fixing the image plugins or sending the entire message to
SpamAssassin if the admin so desires and is willing to take the chance of
being DoSed. Julian, you are probably not going to be able to make everybody
happy, so go with whatever is easiest for you to maintain or has the least
chance of breaking something. Then you will have to decide how to set the
defaults for newbie installations.
If you get too many complaints about the 20K limit, those few admins can go
into the code and change it if they feel so inclined. Or the limit can be a
variable set in MailScanner.conf, with a suitable default and a warning about
the implications of raising it. You have to decide how much you wish to
complicate the code.
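
For what it's worth, here is a rough sketch of how the "make it a variable"
idea might look if the 20K and 100-line limits from Julian's loop were passed
in rather than hard-coded. This is not MailScanner code: the function name
extra_lines_for_sa and the max_bytes / max_lines options are made up for
illustration, and how the values would be read from MailScanner.conf is left
out entirely. It assumes the caller supplies a code ref that returns the next
line of the message, or undef at the end.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of MailScanner): collect the extra lines to
# hand to SpamAssassin once the normal size limit has already been reached.
sub extra_lines_for_sa {
    my ($next_line, %opt) = @_;
    my $max_bytes = $opt{max_bytes} // 20_000;   # default taken from the thread
    my $max_lines = $opt{max_lines} // 100;      # default taken from the thread

    my @extra;
    my $size = 0;
    while (@extra < $max_lines) {
        my $line = $next_line->();
        last unless defined $line;                    # end of message
        last if $size + length($line) > $max_bytes;   # one huge line cannot exceed the cap
        $size += length $line;
        push @extra, $line;
        last if $line =~ /^\s*$/;                     # stop at a blank or whitespace-only line
    }
    return @extra;
}

# Usage: copying stops after the first blank line.
my @rest  = ("abc\n", "def\n", "\n", "ghi\n");
my @extra = extra_lines_for_sa(sub { shift @rest });
print scalar(@extra), "\n";   # prints 3
```

An admin who wanted a bigger window would then only change the two numbers,
with the understanding that a larger max_bytes means more data handed to
SpamAssassin per message.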


MailScanner is like deodorant...
You hope everybody uses it, and
you notice quickly if they don't!!!!
