PDF Woes

Plant, Dean dean.plant at ROKE.CO.UK
Thu May 27 09:22:36 IST 2004


Karl Bailey wrote:
> Guys,
> 
> I'm having a very frustrating problem. We run a production process
> that uses mutt to mail PDFs to customers. I know mutt has some
> known issues with PDFs, but the problems are compounded when
> MailScanner adds a signature to the email after scanning.
> The footer seems to corrupt the PDF to the point that it is
> unusable in SOME CASES. I know this is because mutt uses
> quoted-printable content transfer encoding: if I use mutt
> interactively and force the encoding type to base64, everything
> works, but if I attach from the command line it all corrupts.
> 

Below is the information Julian posted after I found that our PDFs were
getting mangled when passing through MailScanner. It is a problem with
quoted-printable encoding combined with message signing. In our case MS
Exchange incorrectly decides to encode some binary PDFs as quoted-printable,
and that data is then corrupted when MailScanner signs the message. Base64
always passes through correctly. We decided to always zip PDFs, which gets
around the problem. Also note that PDFs created in different software are
treated differently when MS Exchange encodes them, so it seems that the PDF
file version is also taken into consideration when the message is created.
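
For anyone generating the mail from a script rather than from Exchange or
mutt, here is a minimal sketch of attaching a PDF as Base64 using Python's
standard email package. The addresses, subject, filename and sample bytes are
placeholders of mine, not anything from our setup.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage

# Placeholder bytes standing in for a real PDF; note the CRLF near the start.
pdf_bytes = b"%PDF-1.2\r%\xe2\xe3\xcf\xd3\r\n6326 0 obj\r<< \r/"

msg = EmailMessage()
msg["Subject"] = "Your document"        # hypothetical header values
msg["From"] = "sender@example.com"
msg["To"] = "customer@example.com"
msg.set_content("Please find the PDF attached.")

# Ask for Base64 explicitly so the attachment is never sent as
# quoted-printable.
msg.add_attachment(pdf_bytes, maintype="application", subtype="pdf",
                   filename="report.pdf", cte="base64")

# Serialise and re-parse the message to check the attachment survives.
parsed = message_from_bytes(msg.as_bytes(), policy=policy.default)
attachment = next(parsed.iter_attachments())
transfer_encoding = str(attachment["Content-Transfer-Encoding"])
round_tripped = attachment.get_content()
```

This only covers building the message; it says nothing about what a
particular mail client or gateway does with it afterwards.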

Hope this helps

Dean Plant

Previous post from Julian.

Dean has kindly sent me the qf+df files from a message containing a PDF 
file that is corrupted. He has also sent me the original untouched PDF file 
to compare with the df file.

Well, whatever generated the original quoted-printable message
         X-Mailer: Internet Mail Service (5.5.2653.19)
did it wrong.

If you do an "od -c" on the test1.pdf file you get this:
0000000   %   P   D   F   -   1   .   2  \r   %   â   ã   Ï   Ó  \r  \n
0000020   6   3   2   6       0       o   b   j  \r   <   <      \r   /
Note the \r\n at the end of the first line, just before the 6326.

but if you do an "od -c" of the quoted-printable message contents (so you 
can see any embedded newline characters and so on), you get this:
0000000   %   P   D   F   -   1   .   2   =   0   D   %   =   E   2   =
0000020   E   3   =   C   F   =   D   3  \n   6   3   2   6       0
0000040   o   b   j   =   0   D   <   <       =   0   D   /
Now look at what has happened to the data just before the 6326. The \r\n has 
been squashed into a single \n character, destroying the \r in the original.
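
The same comparison can be reproduced in a few lines of Python. This is only a
sketch: the two byte strings are typed in by hand from the dumps above, not
read from the real files.

```python
import quopri

# First bytes of the original PDF, as shown by the first "od -c" dump.
original = b"%PDF-1.2\r%\xe2\xe3\xcf\xd3\r\n6326 0 obj\r<< \r/"

# The same region of the quoted-printable body, from the second dump.
damaged_qp = b"%PDF-1.2=0D%=E2=E3=CF=D3\n6326 0 obj=0D<< =0D/"

# Decoding turns each =XX escape back into a byte; the bare \n stays a \n.
decoded = quopri.decodestring(damaged_qp)

# One byte (the \r) is missing after decoding.
bytes_lost = len(original) - len(decoded)

# The decoded data equals the original with every CRLF collapsed to LF.
crlf_was_squashed = decoded == original.replace(b"\r\n", b"\n")
```

Once the data is in this state, a decoder cannot tell which \n characters
used to be \r\n, since a PDF may legitimately contain bare \n bytes as well.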

I can only imagine that Outlook/Exchange saw the \r\n sequence near the 
start of the file, and concluded that it was a text-based file. It 
therefore saw nothing wrong in squashing \r\n into just \n, which would 
work fine on a text file. Unfortunately its original decision about the 
file was wrong in this case :-(
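
For contrast, here is a sketch of what a binary-safe quoted-printable encoding
of the same bytes looks like, using Python's binascii module with istext=False
so that CR and LF are escaped like any other non-printable byte. The sample
bytes are again typed in from the dump; this illustrates the encoding rule and
is not a description of how Exchange works internally.

```python
import binascii

# First bytes of the original PDF, taken from the "od -c" dump.
pdf_head = b"%PDF-1.2\r%\xe2\xe3\xcf\xd3\r\n6326 0 obj\r<< \r/"

# istext=False tells the encoder the input is binary, so \r and \n are
# written as =0D and =0A instead of being treated as line endings.
encoded = binascii.b2a_qp(pdf_head, istext=False)

# Decoding restores every byte, including the \r\n pair.
decoded = binascii.a2b_qp(encoded)
```

With the line ending escaped as =0D=0A there is no raw newline in this part of
the body for anything downstream to rewrite.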

This makes it
a) Microsoft's fault
and
b) Not a problem I can work around, as their software has destroyed data 
that I cannot reconstruct.

Outlook XP always appears to use Base64, so I suspect the problem may exist 
only in Exchange 5.5 and/or Outlook 97. I don't know about Outlook 2000.

Whether Acrobat Reader (on some platforms) will continue to be able to use 
the damaged file is another matter entirely, something over which I have no 
control.

All I can suggest is that you ask people using the troublesome versions to 
always zip their PDF files, to stop Outlook destroying them.
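
Where the PDFs are produced by a script, the zipping step can be automated. A
minimal sketch using Python's zipfile module follows; the function names and
the default member name are hypothetical, and it assumes (as reported on this
list, not something I have verified for every version) that the mail software
leaves zip attachments alone.

```python
import io
import zipfile


def zip_pdf(pdf_bytes: bytes, name: str = "document.pdf") -> bytes:
    """Return a zip archive, built in memory, holding the PDF under `name`."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        zf.writestr(name, pdf_bytes)
    return buf.getvalue()


def unzip_pdf(zip_bytes: bytes, name: str = "document.pdf") -> bytes:
    """Extract the PDF stored under `name` from an in-memory zip archive."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        return zf.read(name)


# Placeholder bytes standing in for a real PDF.
sample_pdf = b"%PDF-1.2\r%\xe2\xe3\xcf\xd3\r\n6326 0 obj\r<< \r/"
archive = zip_pdf(sample_pdf)
restored = unzip_pdf(archive)
```

The archive begins with the zip signature rather than the PDF header, so the
\r\n that appears to trigger the text guess is no longer near the start of
the attachment.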

If anyone has any ideas about a software workaround I could implement, 
please let me know as I can't think of any way of doing it right now.
-- 
Julian Field
www.MailScanner.info
MailScanner thanks transtec Computers for their support


