Mishandling multi-line ISO encoded subject headers

Wed May 10 22:16:36 IST 2006

OK - so it's rather sad to reply to your own post, but I've written a
patch against Postfix.pm which seems to fix this issue (in the limited
amount of testing I've done).

What it does is to extend the Subject: header extraction handling in
the ReadQf subroutine I referred to previously:

if ($recdata =~ /^Subject:\s*(\S.*)?$/i) {
  $message->{subject} = $1;
  next;
}

...so that it now handles a Subject: header folded across multiple
lines. It sets a flag when it hits a Subject: header and treats each
following line as a continuation of the subject until it finds a line
that does not begin with whitespace.

For each continuation line it finds this way, it strips off a single
leading whitespace character (because the leading whitespace denotes
the fold and is not part of the subject) and appends the rest to the
previous contents of the Subject: header.

I think it's safe because, come what may, we should always find at
least a Message-ID: header after the Subject: with Postfix, and this
should be sufficient to turn off the flag.
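
For illustration only (this is not the attached patch itself), the
approach described above could be sketched roughly as follows. The
helper name extract_subject and the idea of passing the header records
in as a list are assumptions made so the example runs on its own; in
Postfix.pm the equivalent logic would sit inside the existing ReadQf
loop, and the sample header values are made up.

```perl
use strict;
use warnings;

# Hypothetical stand-alone helper, NOT MailScanner's ReadQf: collect a
# possibly folded Subject: from a list of header records, one record
# per physical line, with no trailing newlines.
sub extract_subject {
    my @records = @_;
    my $subject;
    my $in_subject = 0;    # flag: the previous record belonged to Subject:

    foreach my $recdata (@records) {
        if ($recdata =~ /^Subject:\s*(\S.*)?$/i) {
            # First line of the header; $1 is undef for an empty subject.
            $subject    = defined $1 ? $1 : '';
            $in_subject = 1;
            next;
        }
        if ($in_subject && $recdata =~ /^\s(.*)$/) {
            # Continuation line: drop exactly one leading whitespace
            # character and append the rest to the subject so far.
            $subject .= $1;
            next;
        }
        # Any record that does not start with whitespace ends the subject.
        $in_subject = 0;
    }
    return $subject;
}

# Made-up sample records; the encoded words are placeholders.
my @headers = (
    'Received: from example.invalid',
    'Subject: =?iso-2022-jp?B?AAAA?=',
    ' =?iso-2022-jp?B?BBBB?=',
    'Message-ID: <1@example.invalid>',
);
print extract_subject(@headers), "\n";
```

The flag is cleared by the first record that does not start with
whitespace, so a Message-ID: (or any other header) after the Subject:
stops the accumulation, and an indented line further down is not
mistaken for part of the subject.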

As I said, I have done only limited testing with this patch, but all
my scenarios now work, including both those which worked previously
and those that were broken.

The thing which still confuses me is how any of it worked before, if
this really was the issue.

In any case, I'd appreciate it if the MS developers would review this
patch and either apply it as-is or modify it as they see fit - if, of
course, it is agreed that this is/was a problem to begin with.

Thanks

Nick

On 5/10/06, Nick Smith <nick.smith67 at googlemail.com> wrote:
> Hi,
>
> MS 4.53.8, Postfix 2.2.10
>
> I have a big problem right now with MailScanner apparently mangling some
> (but not all) multi-line folded Subject headers - typically those containing
> ISO encoded subjects for multi-byte character sets. Consider these two
> examples:
>
> Subject: =?iso-2022-jp?B?GyRCJCIkIiQiJCIkIiQiJCIkIiQiJCIkIiQiJCIbKEIgYWFhYWFhYQ==?=
>  =?iso-2022-jp?B?YSAbJEIkIiQiJCIkIiQiJCIkIiQiJCIkIhsoQg==?=
>
> Subject: =?iso-2022-jp?B?GyRCJCIkIiQiJCIkIiQiJCIkIiQiJCIkIiQiGyhCIGFhYWFhYWFhIA==?=
>  =?iso-2022-jp?B?GyRCJCIkIiQiJCIkIiQiJCIkIiQiJCIbKEI=?=
>
> The first one is an ISO-2022-JP encoded representation of 13 Japanese
> double-byte "a" followed by a single space, 8 lower case ASCII "a", another
> single space and finally 10 more Japanese double-byte "a".
>
> The second is identical, except one of the 13 double-byte "a" has been
> removed so there are only 12
>
> Given that the first character of the second line in each example is a
> space, they ought to be treated as a single header per RFC822's folded
> header specification
>
> The weird part is that the first example works, shows up unchanged in the
> user's mailbox, and the Subject: header looks exactly as it did when it was
> sent while the shorter second one gets broken. What shows up in the user's
> mailbox (and the headers) is a decoded version of just the first line -
> which looks like this:
>
> Subject: ^[$B$"$"$"$"$"$"$"$"$"$"$"$"^[(B aaaaaaaa
>
> MailScanner running with Sendmail does not seem to experience this problem,
> and neither does Postfix running without MailScanner so it looks to be
> something to do with MailScanner's Postfix-specific code
>
> I did notice whilst digging in MailScanner's Postfix.pm that maybe the
> complete handling of folded Subject: headers is not implemented - for
> example lines 449-452:
>
> if ($recdata =~ /^Subject:\s*(\S.*)?$/i) {
>   $message->{subject} = $1;
>   next;
> }
>
> This doesn't seem to handle the case where Subject: is on more than one
> line, and will result in $message->{subject} containing only the first line
> of a folded Subject:
>
> Clearly this is not the whole story and an incomplete $message->{subject} is
> not enough to kill it every time because otherwise it would never work with
> a folded subject at all - as I said previously, the longer example above
> does work OK as do many other folded subject headers.
>
> Unfortunately, this is causing quite a big problem - it would be great if
> somebody could suggest a fix. Failing a proper fix, is there any way to
> modify the existing header_checks rule:
>
> /^Received:/                    HOLD
>
> ...so that it will exclude messages with (for example) /^Subject:
> =*iso-2022-jp/ - that way I could maybe have these messages bypass MS for
> the time being
>
> Thanks
>
> Nick
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ms-4.58.8-pfsubject.patch
Type: application/octet-stream
Size: 1316 bytes
Desc: not available
Url : http://lists.mailscanner.info/pipermail/mailscanner/attachments/20060510/89c31e00/ms-4.58.8-pfsubject.obj