FileType rules show executable even though file shows data -- Please help fix.

Peter Ong at
Thu Jul 8 15:29:16 IST 2010

Hey Mark,

So much of the time, I'm playing catch-up as I quell fires, and I miss the little details. Thanks for the edifying replies. Last night, I actually had some quiet time to read through the links you posted, and now I understand better. I still do not understand what "All Match" means, though, or how it applies or behaves in the case of the filetype rules file.

Initially, I thought it went down the line and stopped at the first match as described in "First Match", but the documentation clearly says otherwise. Also, based on the other replies, I had the mechanics of scanning all wrong; I learned that msg-1234-1.txt is scanned by file and file -i. Now I just don't know how that fits together with the All Match behavior: whether one field is ignored or both are accepted, whether the second of five fields is required if the third is filled in, etc. You've alluded to this already, but there was behavior last week that keeps me confused. I'll experiment more today.

I emailed Jules the original as he had requested. Maybe he will have something about it today.


----- Original Message -----

> From: "Mark Sapiro" <mark at>
> To: "Peter Ong" < at>
> Cc: mailscanner at
> Sent: Wednesday, July 7, 2010 4:41:40 PM
> Subject: Re: FileType rules show executable even though file shows data -- Please help fix.
> On Thu, Jul 08, 2010 at 04:32:33AM +0000, Peter Ong wrote:
> > Hi Mark,
> > 
> > Thanks for that. Help me clarify a few things:
> > 
> > > As it should because the output of "file msg-16388-1.txt" is
> > > "DOS executable (COM)" and that is matched by the regexp
> > > "executable" in the rule.
> > 
> > I see. And that would be the second of four fields counting from the
> > left, correct? I thought it was only regexp if they were enclosed
> > in slashes such as /executable/. Am I wrong?
> I don't know for sure. I'm going for regexp because that's what it says
> at the top of the file, but regexp or "substring match" would give the
> same result in this case with no "pattern characters". It seems clear
> it's not an "exact full string" match in any case.
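
A small Python sketch of the three matching styles mentioned above, using the strings from this thread. It is only an illustration: Python's re.search stands in for whatever matching MailScanner actually does (an assumption), and the variable names and the "ex.cutable" pattern are made up for the example.

```python
import re

file_output = "DOS executable (COM)"  # "file" output quoted in this thread
pattern = "executable"                # second field of the four field deny rule

# Assumption: re.search stands in for MailScanner's own matcher.
regex_hit = re.search(pattern, file_output) is not None
substring_hit = pattern in file_output
exact_hit = pattern == file_output

print(regex_hit, substring_hit, exact_hit)  # True True False

# With a pattern character the two styles diverge: "." matches any
# character in a regexp but is a literal dot in a substring test.
dotted = "ex.cutable"  # hypothetical pattern, not taken from the rules file
dotted_regex_hit = re.search(dotted, file_output) is not None
dotted_substring_hit = dotted in file_output
```

So for a plain word like "executable" the regexp and substring readings cannot be told apart from the outcome alone; only a rule containing pattern characters would show which one is in use.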
> > > > There are two lines that show "No programs allowed", but I changed
> > > > one to say "No executables allowed" so depending on the error message
> > > > I know that it failed on one of them, and it does fail on the "No
> > > > executables" line.
> > > > 
> > > > I only ran file on the msg file because Julian suggested it, and for
> > > > everyone's edification, I posted the result here. The fact that the
> > > > file command shows DOS executable (COM) should trigger the correct
> > > > line in the error message which is:
> > > > 
> > > > deny    -       x-dosexec       No DOS executables      No DOS programs allowed
> > 
> > I apologize. In my frustration, I pasted the wrong line from the
> > filetypes.conf.rules file. I meant to paste this one:
> > 
> > deny    executable      No executables          No executables allowed
> > 
> > This is where I had changed the word "programs" to "executables" so
> > I can determine which line is triggering.
> Right, and that's the rule you said matched and it matches because "file"
> says "DOS executable (COM)" which is matched by "executable".
> > > The hyphen in the above rule makes it a "5 field" rule in which case,
> > > the third field is matched against the mime type (output of file -i)
> > > which in this case is "text/x-mail" so no match.
> > 
> > Can someone explain how these fields work? The instructions on top
> > of the file are too terse for me.
> > 
> > The second of five fields is for the result of the "file" command,
> > and the third of five fields is for the output of "file -i". Do both
> > fields have to be filled out or just one?
> I think that's not quite right. I *think* if you want to match against
> the "file" output, you use a four field rule and the second field is the
> match, and if you want to match "file -i", you use a five field rule and
> the third field is the match. In the latter case, in the example, the
> second field is a "-" because, I think, it is ignored. Clearly the two
> field matches are not ANDed because the hyphen in the example wouldn't
> match and the rule wouldn't match. I don't think they are ORed either;
> I *think* in a five field rule the second field is merely a placeholder
> to make five fields and is ignored.
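
The reading described above can be written down as a short Python sketch. This is not MailScanner's code; it only encodes the guess that a four field rule tests the "file" output with its second field, and a five field rule tests the "file -i" output with its third field while ignoring the second. The function name rule_matches and the use of re.search are assumptions made for illustration; the rules and outputs are the ones quoted in this thread.

```python
import re

def rule_matches(fields, file_output, mime_output):
    """Return True if the rule matches under the reading guessed above."""
    if len(fields) == 4:
        # Four fields: the second field is tested against the "file" output.
        return re.search(fields[1], file_output) is not None
    if len(fields) == 5:
        # Five fields: the second field is treated as an ignored placeholder,
        # and the third field is tested against the "file -i" output.
        return re.search(fields[2], mime_output) is not None
    raise ValueError("expected a four or five field rule")

file_output = "DOS executable (COM)"  # output of "file" from this thread
mime_output = "text/x-mail"           # output of "file -i" from this thread

four_field = ["deny", "executable", "No executables", "No executables allowed"]
five_field = ["deny", "-", "x-dosexec", "No DOS executables", "No DOS programs allowed"]

four_field_hit = rule_matches(four_field, file_output, mime_output)
five_field_hit = rule_matches(five_field, file_output, mime_output)
print(four_field_hit, five_field_hit)  # True False
```

Under this reading the "No executables" rule fires and the x-dosexec rule does not, which agrees with the behavior reported earlier in the thread. It does not prove the reading is right.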
> > Are they evaluated as && or ||? I'm not sure. As you can see in my
> > original post, I tried to put in all combinations, just in case. Are
> > those fields always evaluated as regex? Because if so that means I
> > need to escape special characters, but I don't know whether it's
> > always regex or just as a string.
> I don't really know the answer to that.
> > I thought it went this way... there are two files in the folder. One
> > is named after a Postfix unique identifier... 012A34ABC and the other
> > is msg-1234-1.txt. I thought the first file was scanned by "file" and
> > the second scanned by "file -i". Tell me if I got this wrong.
> That's not the way it works in my quarantine. In mine, for messages with
> content issues I have a directory under the date directory named, e.g.
> "BB7596900BE.A6E7E", and under that there is a file named "message" which
> contains the entire raw message. This is not examined by either "file" or
> "file -i" because they just say "RFC 822 mail text" and "message/rfc822"
> respectively. Also under the "queue id + entropy" directory are one or
> more files, such as your msg-1234-1.txt file which are the contents of
> the message body and/or multiple MIME message parts. It is these message
> parts which are examined by "file" and/or "file -i".
> > > I think the reason your "allow - text/x-mail - -" rules don't work is
> > > that FileType Rules is an "all match" ruleset and not a "first match"
> > > ruleset.
> > 
> > Can you please explain what you mean by this?
> I did explain this somewhat in another reply, but basically, in this
> context, I think if any Deny rule matches, the message will be denied
> even if Allow rules that match precede or follow the matching Deny rule.
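
The difference between the two styles can be sketched in a few lines of Python. This only illustrates the reading given above ("I think if any Deny rule matches, the message will be denied"), not how MailScanner is actually implemented. The function names first_match and all_match and the (action, matched) representation are hypothetical.

```python
def first_match(results):
    """First match style: the first rule that matches decides the outcome."""
    for action, matched in results:
        if matched:
            return action
    return None  # no rule matched; what happens then is not covered here

def all_match(results):
    """All match style as read above: any matching deny wins, wherever it sits."""
    actions = [action for action, matched in results if matched]
    if "deny" in actions:
        return "deny"
    return actions[0] if actions else None

# The situation from this thread: an allow rule on text/x-mail matches,
# and a later deny rule on "executable" also matches.
results = [("allow", True), ("deny", True)]

first_result = first_match(results)
all_result = all_match(results)
print(first_result, all_result)  # allow deny
```

If this reading is right, putting an allow rule ahead of the deny rule cannot rescue the attachment; the deny rule itself would have to stop matching.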
> -- 
> Mark Sapiro <mark at>        The highway is for gamblers,
> San Francisco Bay Area, California    better use your sense - B. Dylan
