filtering file types vs. extensions
Mariano Absatz
mailscanner at LISTS.COM.AR
Fri Jun 6 23:14:51 IST 2003
Now that I look at it... it seems to be a module related to the Apache httpd
server, so the "magic" file format it expects is Apache's and not that of the
file command.
I don't know whether the mime-magic format is the same in Apache 1.3 and 2.0,
but the documentation for them is at:
http://httpd.apache.org/docs/mod/mod_mime_magic.html
and
http://httpd.apache.org/docs-2.0/mod/mod_mime_magic.html
respectively.
The file itself is included in both Apache httpd distributions. For the
record, I think it would be much better to get a mime-type answer and
process it with a file like filename.rules.conf (e.g. mime-type.rules.conf)
in a relatively independent way.
That is, I'd have two options in the config file:
Filename Rules = /opt/MailScanner/etc/filename.rules.conf
MIME-type Rules = /opt/MailScanner/etc/mime-type.rules.conf
Each file would hold a set of allow/deny rules with an optional message (just
like filename.rules.conf).
Obviously, if an attachment matches a deny rule in either file, the
attachment would be treated as dangerous and the proper action would be
triggered.
Example: I get a file called "funny-picture.jpg" that actually contains a DOS
executable. It would be allowed by an explicit rule in filename.rules.conf,
but then forbidden by an explicit rule in mime-type.rules.conf, and so it
would be replaced by a message that says "funny-picture.jpg seems to be an
application/octet-stream type. This type is considered dangerous".
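To make the "deny in either file wins" idea concrete, here is a minimal sketch. It is Python purely for illustration, it is not MailScanner code, and the rule-line layout (action, regular expression, optional message, separated by whitespace) is a simplification I am assuming rather than the real filename.rules.conf syntax. The names parse_rules, first_match and check_attachment are made up for the example.

```python
import re


def parse_rules(text):
    """Parse lines of the assumed form: <allow|deny> <regex> [message]."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split(None, 2)
        if len(parts) < 2:
            continue  # skip malformed lines
        action, pattern = parts[0].lower(), parts[1]
        message = parts[2] if len(parts) > 2 else ""
        rules.append((action, re.compile(pattern, re.IGNORECASE), message))
    return rules


def first_match(rules, value):
    """Return (action, message) of the first matching rule; allow by default."""
    for action, regex, message in rules:
        if regex.search(value):
            return action, message
    return "allow", ""


def check_attachment(filename, mime_type, filename_rules, mime_rules):
    """Check both rule sets independently; a deny in either one wins."""
    for rules, value in ((filename_rules, filename), (mime_rules, mime_type)):
        action, message = first_match(rules, value)
        if action == "deny":
            return "deny", message
    return "allow", ""


# Hypothetical rule sets standing in for the two config files.
FILENAME_RULES = parse_rules(r"""
allow \.jpe?g$
deny  \.exe$   Executable attachments are not allowed
""")
MIME_RULES = parse_rules(r"""
deny  ^application/octet-stream$  This type is considered dangerous
allow ^image/
""")

verdict = check_attachment("funny-picture.jpg", "application/octet-stream",
                           FILENAME_RULES, MIME_RULES)
```

With these made-up rules, the funny-picture.jpg case above passes the filename check and is then denied by the mime-type check, so the two files never need to know about each other.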
It seems the file command's "magic" file has some interesting data that
Apache's doesn't... maybe someone is willing to fit the file command's data
into the Apache one...
Or maybe even... take a look at the C source for the file command... geez...
I don't know if this is a good idea... it would take more than a weekend...
Back to CPAN... take a look at
http://search.cpan.org/author/SDAGUE/ppt-0.12/bin/file
It is a command, and not a library, but maybe...
In http://www.perl.com/language/ppt/src/file/index.html there is another
implementation.
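If it ends up being the external file command after all, as Julian describes in the message quoted below (a timeout, and output of the form filename, ":", one or more spaces, file type), that step might look roughly like the sketch below. Again this is Python only for illustration; file_type and parse_file_line are hypothetical names, the 10 second timeout is an arbitrary choice, and I am assuming the output really has that shape on every platform, which is exactly what Julian is asking.

```python
import subprocess


def parse_file_line(line, filename):
    """Extract the type from 'filename: type'.

    The known filename is stripped as a prefix instead of splitting at the
    first colon, because the filename itself may contain a colon.
    """
    prefix = filename + ":"
    if not line.startswith(prefix):
        return None  # output does not have the assumed shape
    return line[len(prefix):].strip()


def file_type(path, timeout=10):
    """Run the file command on path, giving up after timeout seconds."""
    try:
        proc = subprocess.run(["file", path], capture_output=True,
                              text=True, timeout=timeout)
    except (subprocess.TimeoutExpired, OSError):
        return None  # timed out, or the command could not be started
    lines = proc.stdout.splitlines()
    if not lines:
        return None
    return parse_file_line(lines[0], path)
```

A caller would probably have to treat a None result as "unknown type" and decide separately whether unknown means allow or deny.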
On 6 Jun 2003 at 19:16, Julian Field wrote:
> Not a good start.
> The latest File::MMagic module does not understand Linux /usr/share/magic
> files. It complains a lot about them, which makes it useless.
> So I will have to use the "file" command, with a timeout and all that c**p
> to stop DoS attacks on the file command.
>
> Does everyone's "file" command output the filename followed by a ":"
> followed by 1 or more spaces followed by the file type?
>
> It's going to rain all weekend here (surprise, surprise) so I may attack
> this feature soon.
>
> At 18:42 06/06/2003, you wrote:
> >At 18:29 06/06/2003, you wrote:
> >> > Does anyone know of a Perl module that uses the magic file? I
> >> > would very
> >> > much like to avoid having to write this, but I don't want to
> >> > have to crank
> >> > up the file command for every message batch if I can avoid it.
> >>
> >>maybe you missed Mariano's post with the link in (it ended up in a
> >>different thread in my mailreader) so here's the link he found..
> >>http://search.cpan.org/author/KNOK/File-MMagic-1.19/
> >
> >I hadn't seen his post when I replied.
> >
> >>Looks like this returns a mime type, which is probably the right way to go
> >>about this (saves processing the output from file too)
> >>
> >>Given mime types I think probably the easiest way would be to have a
> >>mimetypes.rules.conf which matches using RE's in the same way
> >>filename.rules.conf does.
> >>
> >>I guess you run into issues if the output from filename rules and mimetype
> >>rules conflict (reject takes precedence?)
> >>
> >>I don't think combining filename rules and mime types into one file would
> >>be very easy as it would be difficult to deal with wildcard matching,
> >>double extensions etc.
> >>
> >>One suggestion which although complicating the implementation would make
> >>it much easier to construct rulesets based on file type is to have both a
> >>filename rules and mimetype rules file which assign category names (rather
> >>than simple yes/no) then have a much simpler ruleset determining action
> >>based on category (again reject takes precedence). Category names need to
> >>be arbitrary so that users can extend the range of categories.
> >>
> >>I guess that's not easy - but it could be quite handy!
> >
> >I want to keep it very simple to use. Very few people ever change these
> >files, as they are complicated enough already. Mapping a mimetype or a
> >filename rule to another keyword, then deny/allow based on those keywords,
> >is a bit too complicated in my opinion.
> >
> >A file like filename.rules.conf that matches mimetypes (or possibly "file"
> >output) would be the easiest thing to do. But it would not manage to match
> >files in which the file content doesn't match the filename. But maybe this
> >isn't actually a problem. I think maybe that enforcing that is actually
> >going to cause you more trouble than it's worth anyway, so that might well
> >not be a problem.
> >
> >It needs to be fast, fairly easy to implement, but above all easy to use.
> >It doesn't need to be able to do absolutely everything, though that would
> >be nice :-)
> >--
> >Julian Field
> >www.MailScanner.info
> >Professional Support Services at www.MailScanner.biz
> >MailScanner thanks transtec Computers for their support
--
Mariano Absatz
El Baby
----------------------------------------------------------
Behind every successful man is a woman, behind her is his wife.
-- Groucho Marx