Max SpamAssassin Size problems -- round 2
Anthony Peacock
a.peacock at chime.ucl.ac.uk
Tue Aug 29 09:19:43 IST 2006
Logan Shaw wrote:
> On Mon, 28 Aug 2006, Julian Field wrote:
>> All fair points. Which brings us back to the beginning.
>> The option which got the biggest number of votes was along the lines of
>> this:
>>
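>> for ($lines=$size=0; $lines<100 && $size<20_000; $lines++)
>> {
>>   $line = getnextline();
>>   last unless defined $line;   # assumes getnextline() returns undef at end of message
>>   $size += length($line);
>>   last if $size>20_000;        # do not copy the line that would pass the 20k cap
>>   push @SAinput, $line;
>>   last if $line =~ /^\s*$/;    # stop after the first blank or whitespace-only line
>> }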
>>
>> It should keep copying lines until we hit a line that is only whitespace
>> (or blank) or until we have copied 20k of extra data, whichever comes
>> first. And it won't be confused by nearly 20k of extra data followed by
>> 1 huge line lasting for mbytes.
>>
>> Is that a reasonable compromise?
>
> I think the idea of trying to be a little intelligent and
> flexible about where you chop the message is a good one.
> That seems to me to have value. If you can chop at an
> attachment boundary, that's good, so chopping at the first
> boundary within a window (of bytes and/or lines) is a good
> thing. It will work some of the time.
If we agree that MS should be as friendly to SA as possible, and Julian
is happy to make some changes, then I think this is the best option.
I do not like the idea of just ignoring messages over the "Max SA Size"
and not passing them to SA at all. That would lower the overall
effectiveness of scanning. I think that having a flexible window around
the "Max SA Size" to try to find the end of an image is a good idea.
> However, I still think there needs to be an answer to the
> question of what to do when the window method fails to solve
> the problem. I think that will happen frequently enough that
> it's important to be intentional about it.
Agreed!
> So, if the boundary does not lie in the window, what is the best
> thing to do? It seems to me you have three reasonable options:
> (1) chop somewhere inside the window anyway,
> (2) keep going to the end of the current attachment and
> chop after it's over,
> (3) roll back to the beginning of the current attachment,
> and chop before it begins.
I would vote for option 3, as long as this does not make the code changes too
complicated. I think that this has the advantage of passing something
to SA to scan (headers, leading text, etc), without risking sending a
broken image to SA.
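To make option 3 concrete, here is a rough sketch of how the window plus
roll-back could fit together. It is not MailScanner code: choose_cut() and
its arguments are names I made up for illustration, it only looks at body
lines (the headers would be passed to SA separately), and it assumes that
any line starting with "--" is a MIME boundary. Real code would take the
boundary string from the Content-Type header instead.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of MailScanner): return how many body lines
# to pass to SpamAssassin. $max plays the role of "Max SA Size" in bytes and
# $window is the extra allowance beyond it.
sub choose_cut {
    my ($lines, $max, $window) = @_;
    my ($size, $part_start) = (0, 0);
    for my $i (0 .. $#$lines) {
        my $len = length $lines->[$i];
        # Window used up without finding a boundary: option 3, roll back
        # to the start of the current attachment.
        return $part_start if $size + $len > $max + $window;
        $size += $len;
        if ($lines->[$i] =~ /^--/) {    # ASSUMPTION: marks a MIME boundary
            # Boundary found inside the window: chop just before it, so
            # the attachment that ends here is passed complete.
            return $i if $size > $max;
            $part_start = $i;           # remember where this part begins
        }
    }
    return scalar @$lines;              # whole body fits, no chop needed
}

my @body = (
    "--b\n", "hello\n",                 # short text part
    "--b\n", ("x" x 9 . "\n") x 5,      # 50-byte "image" part
    "--b--\n",
);
print choose_cut(\@body, 30, 10), "\n"; # prints 2: image dropped entirely
print choose_cut(\@body, 30, 40), "\n"; # prints 8: image passed complete
```

With a small window the image part is left out completely and SA sees only
the leading text. With a window large enough to reach the closing boundary,
the whole image goes through. Either way SA never gets half an image.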
<SNIP good summary of effects of above options>
--
Anthony Peacock
CHIME, Royal Free & University College Medical School
WWW: http://www.chime.ucl.ac.uk/~rmhiajp/
"If you have an apple and I have an apple and we exchange apples
then you and I will still each have one apple. But if you have an
idea and I have an idea and we exchange these ideas, then each of us
will have two ideas." -- George Bernard Shaw