Max SpamAssassin Size problems -- round 2
DAve
dave.list at pixelhammer.com
Mon Aug 28 18:21:50 IST 2006
Julian Field wrote:
> Kash, Howard (Civ, ARL/CISD) wrote:
>>> Why not just set the Max SpamAssassin Size to 50k
>>
>> You'll still truncate images. I currently have it at 150k and it still truncates images (either large ones or messages with lots of attached images).
>>
>>> or the partial-image-detection rules to 0?
>>
>> This is an option, but you give up some spam detection capability. The plugin doesn't specifically test for partial images, but for corrupt images in general, of which truncated images are a subset. Some image spammers have intentionally corrupted the image in such a way that many email clients will still render it readably, but image analysis utilities balk at it. So messages with corrupt images are given a higher score.
>>
>> And this isn't just about images. Supposedly someone is working on a plugin to analyze Word documents for spam content, and it may have the same problem with truncated Word attachments.
>
> All fair points. Which brings us back to the beginning.
> The option which got the biggest number of votes was along the lines of
> this:
>
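> # Copy up to 100 more lines or 20k more data, stopping at the first blank line.
> for ($lines = $size = 0; $lines < 100 && $size < 20_000; $lines++)
> {
>     $line = getnextline();
>     last unless defined $line;    # assumes getnextline() returns undef at end of message
>     $size += length($line);
>     last if $size > 20_000;       # don't copy a line that would push us past 20k
>     push @SAinput, $line;
>     last if $line =~ /^\s*$/;     # blank or whitespace-only line: stop here
> }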
>
> It should keep copying lines until we hit a line that is only whitespace
> (or blank), or until we have copied 100 lines or 20k of extra data,
> whichever comes first. And it won't be confused by nearly 20k of extra
> data followed by 1 huge line lasting for megabytes.
>
> Is that a reasonable compromise?
That is still work for you, and wouldn't a 20k chunk of a 20.1k image
still cause the plugin to fail to properly inspect the image?
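
To make the concern concrete, here is a rough Python sketch of the loop proposed above. It is not MailScanner code: the function name extend_to_boundary, its parameters, and the fake 76-character base64 lines are all made up for illustration, and it assumes the lines passed in are whatever follows the Max SpamAssassin Size cutoff.

```python
def extend_to_boundary(lines, max_lines=100, max_extra=20_000):
    """Copy lines found after the size cutoff until a blank line is hit,
    or max_lines lines / max_extra bytes have been copied.
    (Hypothetical helper that mirrors the quoted loop; not a MailScanner API.)"""
    copied = []
    size = 0
    for line in lines[:max_lines]:
        size += len(line)
        if size > max_extra:
            break                  # this line would exceed the byte budget
        copied.append(line)
        if not line.strip():
            break                  # blank line: stop here
        if size >= max_extra:
            break                  # byte budget used up exactly
    return copied


# Fake attachment tail: 262 base64-style lines of 77 bytes (about 20.1k),
# followed by the blank line that ends it.
image_tail = ["A" * 76 + "\n"] * 262 + ["\n"]

default_copy = extend_to_boundary(image_tail)
no_line_cap = extend_to_boundary(image_tail, max_lines=1000)

# True only if the copy reached the terminating blank line.
print(len(default_copy), default_copy[-1].strip() == "")
print(len(no_line_cap), no_line_cap[-1].strip() == "")
```

With this made-up input the copy stops before the blank line in both cases, so the attachment is still handed over truncated. With 77-byte lines the 100-line cap is actually reached first, at under 8k, well before the 20k byte limit.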
DAve
--
Three years now I've asked Google why they don't have a
logo change for Memorial Day. Why do they choose to do logos
for other non-international holidays, but nothing for
Veterans?
Maybe they forgot who made that choice possible.