Dangerous content detection with "file" command

Bruce R. Littlefield brucel at eece.maine.edu
Wed Apr 21 15:51:47 IST 2010


> From: <hugh.fraser <at> arcelormittal.com>
> Subject: Dangerous content detection with "file" command
> Newsgroups: gmane.mail.virus.mailscanner
> Date: 2009-09-29 17:05:15 GMT
>
> I have a word document that was mistakenly flagged as "executable".
> Adding some debugging into the "SweepOther.pm" code revealed that the
> document contained a Title property of "The Quest of the Self". The
> Linux "file" command used to identify file types returns this property
> (along with author and others) in its output as follows:
>
> Support.doc: CDF V2 Document, Little Endian, Os: Windows, Version 5.1,
> Code page: 1252, Title: The Quest of the Self, Author: johndoe,
> Template: Normal, Last Saved By: JOHN DOE, Revision Number: 2,
> Name of Creating Application: Microsoft Office Word,
> Total Editing Time: 01:00, Create Time/Date: Thu Sep 17 09:57:00 2009,
> Last Saved Time/Date: Thu Sep 17 09:57:00 2009, Number of Pages: 1,
> Number of Words: 2597, Number of Characters: 14289, Security: 0
>
> MailScanner does a simple regex compare of the output from the "file"
> command and sees the string "ELF" in it (in the word Self), and flags
> the file as executable. This will happen with any Word document that
> contains any matching strings in the title, subject, author, category,
> comments, or any other property fields.
>
> A simple change in the regex used in the CheckFileContentTypes to only
> capture the "file" command's output up to the first "," does the trick,
> and I've checked some other files in quarantine to see if it would be a
> problem. So far, I don't see a problem.
>
> The diffs for SweepOther.pm are as follows:
>
> 410c410
> <         $FileTypes{$1}{$2} = $3 if /^([^\/]+)\/([^:]+):\s*(.*)$/;
> ---
> >         $FileTypes{$1}{$2} = $3 if /^([^\/]+)\/([^:]+):\s*([^,]*),/;
>

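The false positive described above can be reproduced outside MailScanner. What follows is only a sketch in Python, not the SweepOther.pm code: the two patterns are the ones from the quoted diff (their syntax behaves the same in Python's re module), while the directory name "msgdir", the shortened sample line, and the case-insensitive "ELF" rule are assumptions standing in for whatever MailScanner actually feeds and checks.

```python
import re

# Sample line in the "directory/filename: description" shape that the quoted
# regexes expect. "msgdir" is a made-up placeholder and the property list is
# shortened from the output shown in the post.
line = ("msgdir/Support.doc: CDF V2 Document, Little Endian, Os: Windows, "
        "Version 5.1, Code page: 1252, Title: The Quest of the Self, "
        "Author: johndoe, Template: Normal")

# The two patterns from the quoted diff, carried over unchanged apart from
# dropping the escaping of "/" that the Perl delimiters required.
ORIGINAL = re.compile(r"^([^/]+)/([^:]+):\s*(.*)$")
PATCHED = re.compile(r"^([^/]+)/([^:]+):\s*([^,]*),")

# Assumption: a stand-in for the rule that flags executables. It has to be
# case-insensitive for "ELF" to match inside "Self" as the post describes.
RULE = re.compile(r"ELF", re.IGNORECASE)


def described_type(pattern, text):
    """Return the third capture group (the type description), or None."""
    match = pattern.match(text)
    return match.group(3) if match else None


full = described_type(ORIGINAL, line)   # whole description, title included
short = described_type(PATCHED, line)   # only the text before the first comma

print("original capture flagged:", RULE.search(full) is not None)
print("patched capture flagged: ", RULE.search(short) is not None)
print("patched capture:", short)
```

With the original pattern the captured description still contains the Title property, so the rule finds "elf" inside "Self"; with the patched pattern only "CDF V2 Document" is captured and the rule has nothing to match.
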
Did this ever get addressed? I checked the latest beta, and SweepOther.pm
still has the earlier code. I ran into the same problem on a Fedora 11
server running the MailScanner 4.79.11-1 RPM with sendmail, SpamAssassin,
and clamd. I found this change to be quite beneficial. Is it in the queue?
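
One detail of the patched pattern may be worth checking before it is adopted as posted: it requires a comma after the captured text, so a description that contains no comma at all does not match. The sketch below is again Python rather than the Perl in SweepOther.pm. The sample lines are illustrative, and RELAXED is a hypothetical variant that is not from the original post. Whether a missing entry matters depends on how the rest of SweepOther.pm uses %FileTypes, which the post does not show.

```python
import re

# Pattern from the quoted diff: captures up to the first comma and then
# requires that comma to be present.
PATCHED = re.compile(r"^([^/]+)/([^:]+):\s*([^,]*),")

# Hypothetical variant (not from the original post): the same capture, but
# without insisting on a trailing comma.
RELAXED = re.compile(r"^([^/]+)/([^:]+):\s*([^,]*)")


def first_field(pattern, text):
    """Return the third capture group, or None when the pattern fails."""
    match = pattern.match(text)
    return match.group(3) if match else None


# Illustrative lines; "msgdir" and the file names are made-up placeholders.
with_commas = "msgdir/prog: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV)"
no_commas = "msgdir/notes.txt: ASCII text"

for sample in (with_commas, no_commas):
    print(repr(first_field(PATCHED, sample)), repr(first_field(RELAXED, sample)))
```

Both patterns return the same leading field for a description that contains commas. For the comma-free line the patched pattern returns nothing, while the relaxed one still yields "ASCII text".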

-Bruce
-- 
Bruce R. Littlefield       Systems Manager/Lecturer
Tel: (207) 581-2238        Electrical and Computer Engineering
Fax: (207) 581-4531        University of Maine
brucel at eece.maine.edu      210 Barrows Hall
http://www.eece.maine.edu  Orono, Maine 04469-5708

     "Mastering MATLAB 7" (ISBN 0-13-143018-1)
   http://www.eece.maine.edu/mm   mm at eece.maine.edu

