Max SpamAssassin Size problems

DAve dave.list at
Sun Aug 27 23:15:16 IST 2006

Julian Field wrote:
> Anthony Peacock wrote:
>> Ken A wrote:
>>> Logan Shaw wrote:
>>>> On Thu, 24 Aug 2006, Julian Field wrote:
>>>>> Anthony Peacock wrote:
>>>>>> Julian Field wrote:
>>>>>>> Sounds survivable. After the limit I will keep going until I hit the
>>>>>>> first line that only contains white space.
>>>>>> I have been watching this discussion with a growing uneasiness.  I
>>>>>> could be wrong but doesn't this behaviour open up the system to
>>>>>> problems with huge image files...
>>>>> Yes, you are absolutely correct. Non-spam may well include huge images.
>>>>> The problem with rewinding to the previous boundary is that you may end
>>>>> up not giving SpamAssassin _anything_ to work with.
>>>>> So it's up for a vote:
>>>>> do I chop half way through an image?
>>>>> do I chop at the end of an image?
>>>>> do I carry on for a max of 100 lines of Base64 data or until the end of
>>>>> an image, whichever is earlier?
>>>> I don't like the last option at all.  It still easily allows
>>>> a situation where a valid message with a valid image in it
>>>> gets detected as a corrupt image and hits a rule that scores
>>>> it as spam.
>>>> If we assume there are 80 columns of base64 data per line, then
>>>> we get 60 bytes per line (since each base64 character carries
>>>> 6 bits of data).  That means 100 lines only hold 6K, maximum.
>>>> So this option only works if the chop-off point randomly
>>>> happens to fall within the last 6K (or less) of the image.
>>>> If the max message size causes the initial chop-off point to
>>>> fall any earlier, it still creates an invalid image.  If you
>>>> have a 50K max message size and someone sends a 75K image
>>>> (which is not out of the ordinary at all), this method will
>>>> keep going up to 56K and then quit.
>>>> Basically, adding the 100 extra lines is really not much better
>>>> than chopping right at the max message size barrier, unless
>>>> you assume that most images aren't much larger than 6K, which
>>>> I don't think is a valid assumption at all.  So, this option
>>>> adds extra complexity and doesn't really give much benefit.
>>>>   - Logan
>>> I'm all for #3, and just set "score FUZZY_OCR_CORRUPT_IMG 0" if you
>>> are worried about false positives. FuzzyOcr will get better at sorting
>>> this out. And of course in the meantime, don't use Outlook, since it
>>> will probably render corrupt images just fine. (It's a feature.)
>> This could be controversial here...
>> <Evil Grin>
>> I have another suggestion: why don't we agree to leave the MailScanner
>> code alone? Those people who are experiencing problems with broken
>> images can raise the value of "Max SpamAssassin Size" in *THEIR*
>> configurations; the rest of us can carry on as normal.
>> There is already a way for people to adjust how much information SA gets
>> from MailScanner; people who need more information can use that on
>> their systems.
>> </Evil Grin>
>> <Ducks and Runs>
> Quack, quack, scamper, scamper....
> In my book, that is a remarkably good idea. It would be much simpler for 
> me to implement than any of the other, increasingly complicated versions.
> What objections do people have to simply letting you set this yourself?
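
The arithmetic in Logan's quoted message can be checked with a short Python sketch. The function names here are hypothetical and exist only for illustration. It follows Logan's simplification of treating the size limit, the image size, and the extra lines all in the same units, and it assumes the image starts at the beginning of the message. Real MIME base64 lines are usually 76 characters rather than 80, which makes the extra allowance slightly smaller still.

```python
def base64_payload_bytes(lines: int, columns: int = 80) -> int:
    """Decoded bytes carried by `lines` full lines of base64 text."""
    # Each base64 character encodes 6 bits, so 4 characters decode to 3 bytes.
    return lines * columns * 6 // 8


def image_is_complete(max_size: int, image_size: int, extra: int) -> bool:
    """True if an image of `image_size` bytes fits within the limit plus
    the extra allowance (assumption: the image starts at offset 0)."""
    return image_size <= max_size + extra


extra = base64_payload_bytes(100)        # 100 lines of 80 columns
print(extra)                             # 6000
print(base64_payload_bytes(100, 76))     # 5700 with 76-column lines
print(image_is_complete(50_000, 75_000, extra))  # False: stops at 56K
```

Under these assumptions the 100 extra lines rescue only an image whose end falls within about 6K of the cut-off, which is the point made in the quoted text.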

I've resisted this thread on another list. It seems to me that there is
nothing wrong with MailScanner. I believe the only way the users of
these plugins will be happy (considering the possible upcoming Word
plugin) will be if MailScanner could selectively send either a partial
message or a whole message to SpamAssassin. Determined by... dunno.

I for one want no part of a plugin that requires I send every single
message in its entirety to SA every time. I'd be DoS'ed within a month.
I also think this issue is not a MailScanner issue, as spamc/spamd have
message size limitations by default. In fact, if the message exceeds the
size limit, I don't believe it is even sent to spamd by spamc, is it? (Can't

In any event, I vote no to sending every message and its attachments to
SA. Please let me decide how much of a message is sent to SA.
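
For reference, the behaviour Julian describes earlier in the thread (cut at the configured size, then keep going until the first line that contains only white space) can be sketched roughly as below. This is an illustrative Python sketch with a hypothetical function name, not MailScanner's actual code (MailScanner is written in Perl), and `max_size` stands in for whatever value the admin chooses.

```python
def truncate_for_spamassassin(message: bytes, max_size: int) -> bytes:
    """Return the leading part of `message`: everything up to `max_size`
    bytes, extended to (but not including) the next whitespace-only line."""
    if len(message) <= max_size:
        return message
    kept = []
    total = 0
    for line in message.splitlines(keepends=True):
        # Once past the limit, stop at the first whitespace-only line.
        if total >= max_size and not line.strip():
            break
        kept.append(line)
        total += len(line)
    return b"".join(kept)


msg = b"aaaa\nbbbb\n\ncccc\ndddd\n\neeee\n"
print(truncate_for_spamassassin(msg, 12))
# b'aaaa\nbbbb\n\ncccc\ndddd\n'
```

Note that if no whitespace-only line follows the limit, as with one long run of Base64 data, the sketch returns the whole message, which is exactly the large-image concern raised in the quoted discussion.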


Three years now I've asked Google why they don't have a
logo change for Memorial Day. Why do they choose to do logos
for other non-international holidays, but nothing for

Maybe they forgot who made that choice possible.
