Sophos and Corrupt Files

Scott Adkins adkinss at OHIO.EDU
Wed Feb 5 18:57:20 GMT 2003

--On Wednesday, February 05, 2003 5:16 PM +0000 Julian Field 
<mailscanner at ECS.SOTON.AC.UK> wrote:

> What version of Sophos are you running?

Running version 3.66.

> The "corrupt" errors seem to disappear with 3.66 (i.e. the very latest off
> the web). I have seen 3.62 - 3.65 complain about documents that 3.66 is
> perfectly happy with. And the fact that absolutely no-one except Sophos
> users are having any corrupted file problems does slightly point the
> finger at Sophos. Maybe when asked to disinfect a file that it thinks is
> corrupt, it damages it? Just a thought.

Sophos just released a 3.66a version of their product.  Apparently, this
version fixes a bunch of issues with PDF documents, especially when the
PDF document contains errors or does anything with the .z libraries.  I
was told that Sophos should better handle PDF documents in general with
that release.  This version is not available on their web site right now,
but they did give me a direct link for download.  I am in the process of
installing it, so I will let you know what happens.

As far as Sophos vs the rest of the world goes, there is no doubt that this
is a Sophos issue, namely, Sophos is having problems scanning the document,
and if it can't scan it, it claims it is corrupt.  I have more information
about this below.

However, this also doesn't mean that MailScanner shouldn't allow me the
option of allowing these documents through anyways, much like how external
MIME attachments are handled.

> I know Sophos are blaming me for this problem.
> But it strikes me as very odd that only Sophos users are having file
> corruption problems...
> And I can't reproduce it in Sophos 3.66.

The only blame Sophos is placing on MailScanner is the fact that if for
some reason Sophos can't scan the file, MailScanner automatically assumes
it is a virus.  They provided a bunch of reasons why this may be the case:

  1) A header in a file may be corrupted... such as an EXE file that says
     it is something else or just happens to be invalid.

  2) Excel and Word documents may have locked forms, locked/password
     protected cells, etc.  Apparently, the UNIX versions of Sophos have
     problems with Microsoft documents that do forms/cell locking and
     password protection.  They are in the process of trying to fix this,
     but they have to apparently write extra code for UNIX, since they
     don't have access to the same API calls that happen to be built into
     Windows that do the same things... That is the basic gist.

     We have tried to duplicate this with Excel ourselves, and I have yet
     to cause a document to get flagged as corrupt... I will probably have
     to ask them for a sample document.

  3) PDF files with errors in them or PDF files that used the libz library
     caused problems.  This issue in particular should be dealt with more
     cleanly in 3.66a.  However, this problem is probably not completely

  4) We were told that there were actually about 15 different reasons why
     a document couldn't be scanned... I couldn't write them down fast
     enough.  We emailed them asking for a complete list.  I can post here
     if anyone is interested.

They also told us that Sophos has the ability to send back extended error
codes as to the reason why it couldn't scan the document.  I was told that
"sweep -eec" would do that.  This is far better than seeing just the error
message of "(corrupt)".  However, I don't know what all changes when using
that option... I am going to play with it today.
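
For illustration only, here is a minimal Python sketch of how a wrapper
could pull the parenthesised reason out of a scanner report line, rather
than treating every "could not scan" result the same way.  The line format
is an assumption modelled on the "(corrupt)" message mentioned above; the
actual output of "sweep -eec" is not shown here and may look quite
different.  The name extract_reason is a made-up helper, not part of Sophos
or MailScanner.

```python
import re

# ASSUMPTION: a report line ends with a parenthesised reason, for example
# "Could not scan report.pdf (corrupt)". The real scanner output may differ.
REASON_RE = re.compile(r"\(([^()]+)\)\s*$")


def extract_reason(line):
    """Return the trailing parenthesised reason in lower case, or None."""
    match = REASON_RE.search(line)
    if match is None:
        return None
    return match.group(1).strip().lower()


if __name__ == "__main__":
    # Hypothetical sample lines, written by hand for this sketch.
    for sample in ("Could not scan report.pdf (corrupt)",
                   "Scan finished without any message"):
        print(repr(sample), "->", extract_reason(sample))
```

Normalising the reason to lower case keeps any later lookup independent of
how the scanner happens to capitalise its messages.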

Ideally, we would like to distinguish out of those 15 possible reasons
which cases should allow the message through (maybe with a message report
attached describing that the message wasn't scanned and what they can do
to change that) and which cases should automatically deny the message from
getting through.

For example, if we see an error that is "(password protected document)" or
something like that, we can just pass the document on to the user with a
warning about the attachment not being scanned and if they want it scanned
to have the sender remove the password protection from the document and
resend.  Since I don't know what all the error messages could be, I don't
know if any of them would be more harmful if simply passed through.
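
The kind of per-reason policy described above could be sketched roughly
like this in Python.  Everything here is hypothetical: the reason strings
other than "corrupt" are guesses, the Action names and the decide and
warning_text helpers are invented for this example, and none of it
corresponds to an existing MailScanner option.

```python
from enum import Enum


class Action(Enum):
    DELIVER_WITH_WARNING = "deliver_with_warning"
    BLOCK = "block"


# ASSUMPTION: reason strings are hypothetical placeholders. Only "corrupt"
# appears in the discussion; blocking it mirrors the current behaviour,
# and an administrator could choose otherwise.
POLICY = {
    "password protected document": Action.DELIVER_WITH_WARNING,
    "corrupt": Action.BLOCK,
}


def decide(reason, policy=POLICY, default=Action.BLOCK):
    """Pick an action for an unscannable attachment; unknown reasons block."""
    if reason is None:
        return default
    return policy.get(reason.strip().lower(), default)


def warning_text(filename, reason):
    """Build the notice attached to a message that is delivered unscanned."""
    return (
        f"The attachment '{filename}' was not virus scanned ({reason}). "
        "If you want it scanned, ask the sender to remove the cause "
        "and resend it."
    )


if __name__ == "__main__":
    for reason in ("Password Protected Document", "corrupt", "something new"):
        print(reason, "->", decide(reason).value)
```

Unknown reasons fall back to blocking, so a new error code from the scanner
never lets a message through by accident; the administrator has to list a
reason explicitly before it is delivered with a warning.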

Anyways, that is what I know so far.


> At 15:50 05/02/2003, you wrote:
>> Julian,
>> I know we have had considerable discussion on this topic already, and I
>> need to find some resolution to it.
>> The issue seems to be that users are sending documents via attachments
>> that get flagged as corrupt by Sophos and labeled as a virus in
>> MailScanner. So far, all the documents I have managed to get my hands on
>> indicate that these documents are indeed in some way corrupt.  Most of
>> the time, I can't even open the documents myself on my desktop.
>> Periodically, I can find a PDF document that appears to open and look
>> fine without generating any errors, but scanning it with Sophos
>> indicates that the PDF is corrupt. This isn't necessarily untrue, as all
>> of the PDF tools that I have at my disposal (conversion utilities to
>> convert to postscript format, or other programs that can open and view
>> the document) also say that the document is corrupt and refuse to do
>> anything with it... It just happens to be that Adobe Acrobat Reader was
>> forgiving enough in that particular case to allow me to view it
>> successfully.
>> So, I see two problems here:
>>  1) Sophos is very strict in following the document format standards, and
>>     if the document doesn't follow that standard, it says that it can't
>>     scan the document and labels it corrupt.  I do not know how strict
>>     Sophos is on this, but most of the documents I have found do indeed
>>     have problems when I try to open them with whatever standard
>>     software is installed on my machine.
>>     Incidentally, Sophos claims that it couldn't find the start *and*
>>     end of the document and that is why it claims it can't scan the
>>     document.  I really don't believe this claim.  The errors I typically
>>     see when opening the documents myself are things like invalid
>>     variable names, etc.  This could be the result of a newer version of
>>     document formats that Sophos doesn't yet understand, or non-standard
>>     software used to create those documents to begin with.
>>  2) When Sophos comes back and says that the document couldn't be scanned
>>     for whatever reason, MailScanner simply labels the file as a virus
>>     and moves on.  I don't agree with this, as I think the administrator
>>     is the one that should decide how to handle these situations.  This
>>     is no different than how external MIME attachments are handled, since
>>     those attachments can't be scanned by the virus scanner either.
>> What are the solutions to this problem?
>>  1) Sophos probably should be a lot less restrictive when scanning some
>>     document formats.  Aren't virus patterns determined by the patterns
>>     themselves and not how closely a PDF document adheres to Adobe's
>>     format standards?  If you don't see the virus patterns, shouldn't
>>     you say the document is clean?  We are going to generate a support
>>     call to them on this later this morning.
>>  2) MailScanner should give us the option to allow documents that are
>>     unable to be scanned by the virus scanner through.  We are getting a
>>     lot of calls about this now to our Support Center, and it is being
>>     pushed through the higher ranks.  We are an educational institution,
>>     and whatever we think may be the right answer (i.e. no external MIME
>>     attachments, do filename checking, etc.), politics dictate the
>>     policies.  Anyways, I think we need an option in the config file to
>>     allow these documents through.
>> Thanks,
>> Scott
>> --
>> +-----------------------------------------------------------------------+
>>      Scott W. Adkins      
>>   UNIX Systems Engineer                  mailto:adkinss at
>>        ICQ 7626282                 Work (740)593-9478 Fax (740)593-1944
>> +-----------------------------------------------------------------------+
>>     PGP Public Key available at
> --
> Julian Field
> MailScanner thanks transtec Computers for their support

      Scott W. Adkins      
   UNIX Systems Engineer                  mailto:adkinss at
        ICQ 7626282                 Work (740)593-9478 Fax (740)593-1944
     PGP Public Key available at