Mishandling multi-line ISO encoded subject headers

Julian Field MailScanner at ecs.soton.ac.uk
Thu May 11 08:34:06 IST 2006


The patch looks okay. I have re-phrased it a tiny bit, and it could  
still be optimised a bit, but it looks good otherwise.
It will be in the next release.
Thanks!

On 10 May 2006, at 22:16, Nick Smith wrote:

> OK - so it's rather sad to reply to your own post, but I've written a
> patch against Postfix.pm which seems to fix this issue (in the limited
> amount of testing I've done)
>
> What it does is to extend the Subject: header extraction handling in
> the ReadQf subroutine I referred to previously:
>
> if ($recdata =~ /^Subject:\s*(\S.*)?$/i) {
>   $message->{subject} = $1;
>   next;
> }
>
> ...so that it now handles a Subject: header folded across multiple
> lines. It sets a flag when it hits a Subject: header and treats each
> following line as a continuation of the subject until it finds a line
> that does not begin with whitespace.
>
> For each continuation line it finds, it strips a single leading
> whitespace character (because that whitespace marks the fold and is
> not part of the subject) and appends the rest to the Subject:
> collected so far.
>
> I think it's safe, because come what may we should always find at
> least a Message-ID: header after the Subject: with Postfix and this
> should be sufficient to turn off the flag
>
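
A minimal sketch of the flag-based approach described above, written as a
standalone Perl script. It is not the attached patch, whose contents do not
appear in this message: the sub name extract_subject and the plain list of
header lines standing in for the queue-file records are assumptions made for
illustration, and the encoded-words in the usage example are placeholders.

```perl
use strict;
use warnings;

# Hypothetical helper (not MailScanner code): takes header lines in order
# and returns the unfolded Subject, or undef if no Subject: header is seen.
sub extract_subject {
    my @lines = @_;
    my $subject;
    my $in_subject = 0;    # true while reading Subject: continuation lines

    for my $recdata (@lines) {
        if ($recdata =~ /^Subject:\s*(\S.*)?$/i) {
            $subject    = defined $1 ? $1 : '';
            $in_subject = 1;
            next;
        }
        if ($in_subject && $recdata =~ /^\s/) {
            # The first whitespace character marks the fold: drop it and
            # append the rest of the line to what has been collected.
            (my $rest = $recdata) =~ s/^\s//;
            $subject .= $rest;
            next;
        }
        # Any line that does not start with whitespace (for example
        # Message-ID:) ends the Subject: header and clears the flag.
        $in_subject = 0;
    }
    return $subject;
}

# Usage with placeholder encoded-words (AAAA and BBBB are not real subjects).
print extract_subject(
    'Subject: =?iso-2022-jp?B?AAAA?=',
    ' =?iso-2022-jp?B?BBBB?=',
    'Message-ID: <example@example.invalid>',
), "\n";
```

Because one whitespace character is dropped at each fold, the two
encoded-words end up adjacent in the result. RFC 822 unfolding itself removes
only the line break and keeps the whitespace, so code that needs the literal
header text would append the continuation line unchanged instead.
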
> As I said, I have done limited testing with this patch and all my
> scenarios are working now including both those which were working
> previously and those that were broken
>
> The thing which still confuses me is how any of it worked before if
> this really was the issue
>
> In any case, I'd appreciate it if the MS developers would review this
> patch and either apply it as-is or modify as they see fit - if of
> course it is agreed that this is/was a problem to begin with
>
> Thanks
>
> Nick
>
> On 5/10/06, Nick Smith <nick.smith67 at googlemail.com> wrote:
>> Hi,
>>
>> MS 4.53.8, Postfix 2.2.10
>>
>> I have a big problem right now with MailScanner apparently mangling
>> some (but not all) multi-line folded Subject headers - typically those
>> containing ISO encoded subjects for multi-byte character sets. Consider
>> these two examples:
>>
>> Subject: =?iso-2022-jp?B?GyRCJCIkIiQiJCIkIiQiJCIkIiQiJCIkIiQiJCIbKEIgYWFhYWFhYQ==?=
>>  =?iso-2022-jp?B?YSAbJEIkIiQiJCIkIiQiJCIkIiQiJCIkIhsoQg==?=
>>
>> Subject: =?iso-2022-jp?B?GyRCJCIkIiQiJCIkIiQiJCIkIiQiJCIkIiQiGyhCIGFhYWFhYWFhIA==?=
>>  =?iso-2022-jp?B?GyRCJCIkIiQiJCIkIiQiJCIkIiQiJCIbKEI=?=
>>
>> The first one is an ISO-2022-JP encoded representation of 13 Japanese
>> double-byte "a" followed by a single space, 8 lower case ASCII "a",
>> another single space and finally 10 more Japanese double-byte "a".
>>
>> The second is identical, except one of the 13 double-byte "a" has been
>> removed so there are only 12.
>>
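
The counts described above can be checked by decoding the base64 payload of
each encoded-word. A small sketch, assuming the two subjects have already been
unfolded into single strings; the sub name pair_counts is hypothetical, and
the byte pair 0x24 0x22 is what each double-byte character decodes to in
these examples (it is the $" visible in the raw output quoted further down).

```perl
use strict;
use warnings;
use MIME::Base64 qw(decode_base64);

# The two example subjects, each unfolded into a single string.
my $first  = '=?iso-2022-jp?B?GyRCJCIkIiQiJCIkIiQiJCIkIiQiJCIkIiQiJCIbKEIgYWFhYWFhYQ==?='
           . ' =?iso-2022-jp?B?YSAbJEIkIiQiJCIkIiQiJCIkIiQiJCIkIhsoQg==?=';
my $second = '=?iso-2022-jp?B?GyRCJCIkIiQiJCIkIiQiJCIkIiQiJCIkIiQiGyhCIGFhYWFhYWFhIA==?='
           . ' =?iso-2022-jp?B?GyRCJCIkIiQiJCIkIiQiJCIkIiQiJCIbKEI=?=';

# Hypothetical helper: for each base64 encoded-word in $subject, return how
# many times the byte pair 0x24 0x22 occurs in the decoded payload.
sub pair_counts {
    my ($subject) = @_;
    my @counts;
    while ($subject =~ /=\?[^?]+\?B\?([^?]+)\?=/gi) {
        my $bytes = decode_base64($1);
        my $n = () = $bytes =~ /\x24\x22/g;    # count non-overlapping matches
        push @counts, $n;
    }
    return @counts;
}

print "first:  @{[ pair_counts($first) ]}\n";
print "second: @{[ pair_counts($second) ]}\n";
```

The first number on each line is the count for the first encoded-word, which
is where the two examples are said to differ.
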
>> Given that the first character of the second line in each example is a
>> space, they ought to be treated as a single header per RFC822's folded
>> header specification
>>
>> The weird part is that the first example works, shows up unchanged in
>> the user's mailbox, and the Subject: header looks exactly as it did when
>> it was sent while the shorter second one gets broken. What shows up in
>> the user's mailbox (and the headers) is a decoded version of just the
>> first line - which looks like this:
>>
>> Subject: ^[$B$"$"$"$"$"$"$"$"$"$"$"$"^[(B aaaaaaaa
>>
>> MailScanner running with Sendmail does not seem to experience this
>> problem, and neither does Postfix running without MailScanner so it
>> looks to be something to do with MailScanner's Postfix-specific code
>>
>> I did notice whilst digging in MailScanner's Postfix.pm that maybe the
>> complete handling of folded Subject: headers is not implemented - for
>> example lines 449-452:
>>
>> if ($recdata =~ /^Subject:\s*(\S.*)?$/i) {
>>   $message->{subject} = $1;
>>   next;
>> }
>>
>> This doesn't seem to handle the case where Subject: is on more than one
>> line, and will result in $message->{subject} containing only the first
>> line of a folded Subject:
>>
>> Clearly this is not the whole story and an incomplete
>> $message->{subject} is not enough to kill it every time because
>> otherwise it would never work with a folded subject at all - as I said
>> previously, the longer example above does work OK as do many other
>> folded subject headers.
>>
>> Unfortunately, this is causing quite a big problem - it would be great
>> if somebody could suggest a fix. Failing a proper fix, is there any way
>> to modify the existing header_check:
>>
>> /^Received:/                    HOLD
>>
>> ...so that it will exclude messages with (for example)
>> /^Subject: =*iso-2022-jp/ - that way I could maybe have these messages
>> bypass MS for the time being
>>
>> Thanks
>>
>> Nick
>>
>> <ms-4.58.8-pfsubject.patch>

-- 
Julian Field
www.MailScanner.info
Buy the MailScanner book at www.MailScanner.info/store
PGP footprint: EE81 D763 3DB0 0BFD E1DC 7222 11F6 5947 1415 B654

