PDF Woes

Julian Field mailscanner at ecs.soton.ac.uk
Thu May 27 16:48:41 IST 2004


At 09:22 27/05/2004, you wrote:
>Karl Bailey wrote:
> > Guys,
> >
> > I'm having a very frustrating problem. We run a production process
> > that uses mutt to mail PDFs to customers. Now I know mutt has some
> > known issues with PDFs, but the problems introduced are compounded
> > by adding a signature to the email after scanning using MailScanner.
> > The footer seems to corrupt the PDF to the point that it is
> > unusable in SOME CASES. I know this is because mutt uses the
> > quoted-printable content transfer encoding: if I use mutt
> > interactively and force the encoding type to base64, then everything
> > works. Attach from the command line and it all corrupts.
> >
>
>Below is information Julian posted after I found out our PDFs were getting
>mangled after passing through MailScanner. This is a problem with
>quoted-printable encoding combined with message signing. In our case MS
>Exchange incorrectly decides to encode some binary PDFs as quoted-printable,
>and that encoding is then corrupted when MailScanner signs the message.
>Base64 always passes through correctly. We took the view of always zipping
>up PDFs, which gets around the problem. Another thing to note is that PDFs
>created in different software are treated differently when being encoded in
>MS Exchange, so it seems that the PDF file version is also taken into
>consideration when the message is created.
>
>Hope this helps
>
>Dean Plant
>
>Previous post from Julian.
>
>Dean has kindly sent me the qf+df files from a message containing a PDF
>file that is corrupted. He has also sent me the original untouched PDF file
>to compare with the df file.
>
>Well, whatever generated the original quoted-printable message
>          X-Mailer: Internet Mail Service (5.5.2653.19)
>did it wrong.
>
>If you do an "od -c" on the test1.pdf file you get this:
>0000000   %   P   D   F   -   1   .   2  \r   %   â   ã   Ï   Ó  \r  \n
>0000020   6   3   2   6       0       o   b   j  \r   <   <      \r   /
>Note the \r\n at the end of the first line, just before the 6326.
>
>but if you do an "od -c" of the quoted-printable message contents (so you
>can see any embedded newline characters and so on), you get this:
>0000000   %   P   D   F   -   1   .   2   =   0   D   %   =   E   2   =
>0000020   E   3   =   C   F   =   D   3  \n   6   3   2   6       0
>0000040   o   b   j   =   0   D   <   <       =   0   D   /
>Now look what has happened to the data just before the 6326. The \r\n has
>been squashed into a single \n character, thereby destroying the \r in the
>original.
>
>I can only imagine that Outlook/Exchange saw the \r\n sequence near the
>start of the file, and concluded that it was a text-based file. It
>therefore saw nothing wrong in squashing \r\n into just \n, which would
>work fine on a text file. Unfortunately its original decision about the
>file was wrong in this case :-(
>
>This makes it
>a) Microsoft's fault
>and
>b) Not a problem I can work around, as their software has destroyed data
>that I cannot reconstruct.
>
>Outlook XP always appears to use Base64, so I suspect the problem may just
>exist in Exchange 5.5 and/or Outlook 97. Don't know about Outlook 2000.
>
>Whether Acrobat Reader (on some platforms) will continue to be able to use
>the damaged file is another matter entirely, something over which I have no
>control.
>
>All I can suggest is you request people using the particular troublesome
>versions always zip their PDF files to stop Outlook destroying them.
>
>If anyone has any ideas about a software workaround I could implement,
>please let me know as I can't think of any way of doing it right now.
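
The byte-level comparison quoted above can be reproduced with a short Python
sketch. It is only an illustration: the two byte strings are typed in from the
"od -c" dumps in the quoted post (just the first 32 bytes of test1.pdf), and
the names ORIGINAL, QP_BODY and first_difference are made up for this example.

```python
import quopri

# First 32 bytes of the original test1.pdf, copied from the first "od -c" dump.
ORIGINAL = b"%PDF-1.2\r%\xe2\xe3\xcf\xd3\r\n6326 0 obj\r<< \r/"

# The same region as it appeared in the quoted-printable message body,
# copied from the second "od -c" dump (note the bare \n before "6326").
QP_BODY = b"%PDF-1.2=0D%=E2=E3=CF=D3\n6326 0 obj=0D<< =0D/"


def first_difference(a, b):
    """Return the offset of the first differing byte, or None if identical."""
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    if len(a) != len(b):
        return min(len(a), len(b))
    return None


# Decode the quoted-printable text back to bytes and compare with the original.
decoded = quopri.decodestring(QP_BODY)
offset = first_difference(ORIGINAL, decoded)

print("original length:", len(ORIGINAL))
print("decoded length: ", len(decoded))
print("first difference at offset:", offset)
print("original bytes there:", ORIGINAL[offset:offset + 2])
print("decoded bytes there: ", decoded[offset:offset + 1])
```

The decoded data comes out one byte shorter than the original, and the first
mismatch is where the \r\n pair used to be. Nothing in the encoded text says a
\r was ever there, which is why it cannot be reconstructed afterwards.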

I have just tried it with a new PDF file from Acrobat 6, sent using Outlook 
2003, and it sent it as Base64 so I can't even investigate the problem any 
more :-(
And the PDF file I was using before (which Outlook 2003 sent as 
quoted-printable) turns out to be broken from the start, so I couldn't get 
any version to work.
I need a PDF file generated with Acrobat 5 that Outlook 2003 will send as
quoted-printable.
Then I stand a chance of being able to test it.

One thought I had was to traverse the MIME tree looking for
quoted-printable sections and change them to Base64 (or even just do it to
PDF attachments). Doing it to everything would make the message bigger and
is probably unnecessary; only PDF attachments are the problem.
-- 
Julian Field
www.MailScanner.info
MailScanner thanks transtec Computers for their support

PGP footprint: EE81 D763 3DB0 0BFD E1DC 7222 11F6 5947 1415 B654
