Corrupt pdf files, any advice.

Julian Field mailscanner at ecs.soton.ac.uk
Mon Jul 28 21:10:18 IST 2003


I have found a serious bug in the QuotedPrint.pm distributed with Perl 5.8,
among other versions. It corrupts any file that was sent using
quoted-printable encoding when it should really have been encoded as Base64.

I found the file in
         /usr/lib/perl5/5.8.0/i386-linux-thread-multi/MIME
but you may find it in another location. On a Linux box do a
         locate QuotedPrint.pm
and see what it says.

The patch to fix (or at least work around) the bug is here:

--- QuotedPrint.pm.original     2003-07-28 21:03:37.000000000 +0100
+++ QuotedPrint.pm      2003-07-28 21:03:24.000000000 +0100
@@ -133,6 +133,7 @@
  {
      my $res = shift;
      $res =~ s/[ \t]+?(\r?\n)/$1/g;  # rule #3 (trailing space must be deleted)
+    $res =~ s/=0A=\n/\n\r/g; # JKF handle PC encoding of CRLF
      $res =~ s/=\r?\n//g;            # rule #5 (soft line breaks)
      if (ord('A') == 193) { # EBCDIC style machine
          if (ord('[') == 173) {

This simply spots the end-of-line sequence that PCs use when Outlook
encodes files. The problem lies in the definition of \n on different OSes.
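
As a rough cross-check that does not depend on the Perl module at all, the
following Python sketch decodes a quoted-printable fragment of the kind
described above: an encoded CRLF (=0D=0A) followed by a soft line break. The
sample bytes and the helper names decode_qp and survives_round_trip are made
up for illustration and are not taken from a real Outlook message. It only
shows what a decoder would be expected to return; it is not a statement of
how QuotedPrint.pm works internally.

```python
import binascii
import quopri


def decode_qp(data: bytes) -> bytes:
    """Decode quoted-printable bytes with Python's standard library."""
    return quopri.decodestring(data)


def survives_round_trip(raw: bytes) -> bool:
    """Encode raw bytes as binary QP (CR and LF escaped), then decode again."""
    encoded = binascii.b2a_qp(raw, istext=False)
    return decode_qp(encoded) == raw


# Hypothetical fragment: an encoded CRLF followed by a soft line break ("=" at
# the end of the line), as a PC mail client might emit for binary data.
sample = b"%PDF-1.4=0D=0A=\r\nbinary data"

# The decoded bytes should contain a real CR LF pair and no trace of the
# soft line break.
print(decode_qp(sample))

# Assumed stand-in for the start of a PDF file, including CRLF and raw bytes.
print(survives_round_trip(b"%PDF-1.4\r\n\x00\xff binary\r\n"))
```

If an attachment decoded on the MailScanner box differs from what a check
like this produces for the same encoded input, that points at the decoder
rather than at the sending client.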

Please try this out and let me know how you get on. If this fixes the
problem, I will incorporate the fix into the MailScanner distribution (via
function-overloading) and produce an August release, as this needs fixing
as soon as possible.

I don't think this problem was necessarily introduced in MailScanner 4.22.
It looks like a fault in the QuotedPrint.pm decoder shipped with recent
versions of Perl: because the decoder is broken, any attempt to rebuild the
original message from the attachment files will produce broken attachments.

At 11:50 28/07/2003, you wrote:
>The only relevant change to Message.pm was this:
>
>@@ -1432,6 +1432,8 @@
>    if (MailScanner::Config::Value('signalreadyscanned', $this) ||
>        !$entity->head->count($scannerheader)) {
>      $this->SignCleanEntity($entity);
>+    $entity->head->add('MIME-Version', '1.0')
>+      unless $entity->head->get('mime-version');
>      $this->{bodymodified} = 1;
>    }
>  }
>
>Fancy undoing that change to see if it makes any difference?
>Also, if you could send me the badly-behaved PDF files, I could try them
>out here. Hopefully I will be able to reproduce the problem...
>
>At 11:34 28/07/2003, you wrote:
>>Julian
>>
>>It appears that the "corrupted PDF file" problem is more extensive and
>>serious than described below. I said in a message to the list a few days
>>ago that:
>>
>>"We are running MS 4.22-5 with SA 2.55 and users are suffering some
>>serious problems sending/receiving PDF files. The problem seems to have
>>started with MS 4.22-5. Most (all?) incoming PDF attachments are flagged
>>as corrupt.
>>
>>"The problem is even apparent with a local sender who is using Pine on a
>>Unix system. Zipping the PDF file appears to make no difference. It thus
>>appears that it is within MS or one of the modules that is processing
>>the PDF file that the problem occurs."
>>
>>I have two sample PDF files that cannot be sent from this site because
>>of the behaviour of MS or my configuring of it.
>>
>>Quentin
>>---
>>PHONE: +44 191 222 8209    Computing Service, University of Newcastle
>>FAX:   +44 191 222 8765    Newcastle upon Tyne, United Kingdom, NE1 7RU.
>>------------------------------------------------------------------------
>>"Any opinion expressed above is mine. The University can get its own."
>>
>> > -----Original Message-----
>> > From: Plant, Dean [mailto:dean.plant at ROKE.CO.UK]
>> > Sent: 28 July 2003 11:24
>> > To: MAILSCANNER at JISCMAIL.AC.UK
>> > Subject: Corrupt pdf files, any advice.
>> >
>> >
>> > Julian,
>> >
>> > I hope you had a good holiday.
>> >
>> > I am having a problem with pdf files corrupting when the
>> > disclaimer gets added to email passing through the MailScanner.
>> >
>> > RH 8.0
>> > Mailscanner 4.21-9
>> > Spamassassin/dcc/razor2
>> > F-prot 4.1.1
>> >
>> > Our setup is internal MS exchange 5.5 / outlook 2000 clients
>> > with Mailscanner as a relay to the internet.
>> >
>> > The problem only occurs when the mail is generated from MS
>> > exchange which sometimes encodes pdf files as quoted
>> > printable (looking on TechNet shows that MS exchange decides
>> > the encoding depending on the pdf file version). If the mail
>> > is generated with base64 the pdf file passes through correctly.
>> >
>> > I have found discussions of the problem within mailing lists
>> > of similar products to Mailscanner which also seem to suffer
>> > from this problem, so I think it must be down to the
>> > MIME::QuotedPrint perl module and the ambiguity of the QP
>> > standard on different OS's.
>> >
>>http://mailtools.anomy.net/archives/anomy-bugs/2002-06/0003.shtml
>>
>>As a quick fix I have advised users to zip all pdf files to ensure
>>correct delivery. I have also discussed the possibility of the MS
>>exchange people forcing the attachment encoding to base64 but they are
>>cautious in case something else breaks. I think they are unwilling to
>>change as it all worked before I put in the Mailscanner relay.
>>
>>Is there any advice you can offer on this problem? (Suggestions of
>>smashing up the MS exchange box cannot be accepted.)
>>
>>Thanks in advance.
>>
>>Dean Plant.
>>
>>--
>>Registered Office: Roke Manor Research Ltd, Siemens House, Oldbury,
>>Bracknell, Berkshire. RG12 8FZ
>>
>>The information contained in this e-mail and any attachments is
>>confidential to Roke Manor Research Ltd and must not be passed to any
>>third party without permission. This communication is for information
>>only and shall not create or change any contractual relationship.
>
>--
>Julian Field
>www.MailScanner.info
>Professional Support Services at www.MailScanner.biz
>MailScanner thanks transtec Computers for their support

--
Julian Field
www.MailScanner.info
Professional Support Services at www.MailScanner.biz
MailScanner thanks transtec Computers for their support


