filtering file types vs. extensions
Julian Field
mailscanner at ecs.soton.ac.uk
Fri Jun 6 19:16:07 IST 2003
Not a good start.
The latest File::MMagic module does not understand Linux /usr/share/magic
files. It complains a lot about them, which makes it useless.
So I will have to use the "file" command, with a timeout and all that c**p
to stop DoS attacks on the file command.
Does everyone's "file" command output the filename followed by a ":"
followed by 1 or more spaces followed by the file type?
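To make the question concrete, here is a rough sketch of what I have in mind, written in Python purely for illustration (MailScanner itself is Perl). The function names and the 10-second timeout are my own assumptions, not anything that exists in MailScanner. It assumes the output format asked about above (filename, then ":", then whitespace, then the type), and it strips the known filename from the front of the line rather than splitting on the first ":", since a filename can itself contain a colon.

```python
import subprocess


def parse_file_output(line, filename):
    """Return the type description from one line of `file` output.

    Assumes the line looks like "<filename>: <type>". Returns None when
    the line does not start with the expected filename prefix.
    """
    prefix = filename + ":"
    if not line.startswith(prefix):
        return None
    # Tolerate any amount of padding (spaces or tabs) after the colon.
    return line[len(prefix):].strip()


def file_type(path, timeout_seconds=10):
    """Run `file` on one path, giving up if it takes too long.

    subprocess.run kills the child process when the timeout expires,
    which is the guard against a crafted attachment stalling `file`.
    """
    try:
        result = subprocess.run(
            ["file", path],
            capture_output=True,
            text=True,
            timeout=timeout_seconds,
        )
    except subprocess.TimeoutExpired:
        return None
    if result.returncode != 0 or not result.stdout:
        return None
    return parse_file_output(result.stdout.splitlines()[0], path)
```

A caller would treat a None result as "type unknown" and decide separately whether that means allow or deny.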
It's going to rain all weekend here (surprise, surprise) so I may attack
this feature soon.
At 18:42 06/06/2003, you wrote:
>At 18:29 06/06/2003, you wrote:
>> > Does anyone know of a Perl module that uses the magic file? I would very
>> > much like to avoid having to write this, but I don't want to have to crank
>> > up the file command for every message batch if I can avoid it.
>>
>>Maybe you missed Mariano's post with the link in it (it ended up in a
>>different thread in my mail reader), so here's the link he found:
>>http://search.cpan.org/author/KNOK/File-MMagic-1.19/
>
>I hadn't seen his post when I replied.
>
>>Looks like this returns a mime type, which is probably the right way to go
>>about this (saves processing the output from file too)
>>
>>Given MIME types, I think probably the easiest way would be to have a
>>mimetypes.rules.conf which matches using REs in the same way
>>filename.rules.conf does.
>>
>>I guess you run into issues if the output from filename rules and mimetype
>>rules conflict (reject takes precedence?)
>>
>>I don't think combining filename rules and mime types into one file would
>>be very easy as it would be difficult to deal with wildcard matching,
>>double extensions etc.
>>
>>One suggestion, which would complicate the implementation but make it much
>>easier to construct rulesets based on file type, is to have both a filename
>>rules file and a mimetype rules file assign category names (rather than a
>>simple yes/no), then have a much simpler ruleset determine the action based
>>on category (again, reject takes precedence). Category names need to be
>>arbitrary so that users can extend the range of categories.
>>
>>I guess that's not easy - but it could be quite handy!
>
>I want to keep it very simple to use. Very few people ever change these
>files, as they are complicated enough already. Mapping a mimetype or a
>filename rule to another keyword, then deny/allow based on those keywords,
>is a bit too complicated in my opinion.
>
>A file like filename.rules.conf that matches mimetypes (or possibly "file"
>output) would be the easiest thing to do. But it would not catch files whose
>content doesn't match their filename. I suspect that isn't actually a
>problem, though: enforcing that match is probably going to cause you more
>trouble than it's worth.
>
>It needs to be fast, fairly easy to implement, but above all easy to use.
>It doesn't need to be able to do absolutely everything, though that would
>be nice :-)
>--
>Julian Field
>www.MailScanner.info
>Professional Support Services at www.MailScanner.biz
>MailScanner thanks transtec Computers for their support
--
Julian Field
www.MailScanner.info
Professional Support Services at www.MailScanner.biz
MailScanner thanks transtec Computers for their support