Porn msg identification?

Julian Field mailscanner at ecs.soton.ac.uk
Thu Apr 10 15:29:10 IST 2003


At 15:22 10/04/2003, you wrote:
>On 9 Apr 2003 at 14:45, Richard D Alloway wrote:
>
> > On Tue, 8 Apr 2003, Mariano Absatz wrote:
> >
> > > Hi Rich,
> > >
> > > The point is that MailScanner doesn't know anything about scoring
> > > messages... the spam score you see in MailScanner is actually done by
> > > the SpamAssassin library that MailScanner optionally uses.
> >
> > This is, of course, quite true :)
> >
> > The reason I was suggesting it be part of MailScanner is the fact that
> > MailScanner takes the output of SpamAssassin and modifies the subject
> > and/or adds a header to the message.
> >
> > > Now, _that_ library, including the rules that come with it, is
> > > developed and optimized to tag as much spam as possible _avoiding_ as
> > > many false positives as it can.
> >
> > Well, I'm not necessarily looking to detect spam... legitimate email with
> > mature content might not be spam. :)
>Right, but my point is that, so far, MailScanner invokes SpamAssassin at most
>once, and thus, it only uses one set of SA rules that, by default, is
>configured to detect spam.
>
>It would be easy (only a matter of configuration, not programming) to change
>the SA rules (and/or their scoring) to detect adult content, and to modify
>MailScanner.conf so that the X-MailScanner-xxxx header and the Subject are
>modified to report the 'adulthood' rather than the 'spamhood' of the message.
>
>The problem is if you want the _same_ MailScanner to do _both_ spam & adult
>content detection.
>
>For that to work you should modify MS to invoke SA twice, with a different
>set of rules each time, and generate two sets of headers and Subject:
>modifications, based on what each of the two SA invocations yields.
>
>That would include duplicating some of MS's data structures representing
>messages with different names, configuration variables and their defaults,
>etc.
>
>A slower (from a performance point of view) but faster (from a development
>point of view) solution would be to run 2 instances of MailScanner on the
>same machine, one to do the usual spam & virus detection and the other one to
>do adult content detection.
>
>For this you'll have to set up another queue directory like
>/var/spool/mqueue.mid and set the first MS with that as the "output"
>directory and the second MS with that as the "input" directory...
>
>You should also change, for the second MS, all the messages that speak about
>"spam" to speak about "adult content", configure it to not query any RBL
>(either internally or via SA), to not check for viruses, and eliminate the
>internal MS content checks (IFRAME, attachment extensions, etc.) so as to
>avoid as much double-processing as you can....
>
>The first MS should also change its "Sendmail2" invocation... I don't know
>much about Sendmail and Exim, but, from what I see, it should be something
>like "/bin/true", since every file that the second MS finds in
>/var/spool/mqueue.mid (left there by the first MS) will automatically be
>processed by the second MailScanner without it needing to be invoked the way
>sendmail is...
>
>Am I wrong, Julian, Nick?

That should work fine.
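
To make the "two passes, two sets of headers" idea concrete, here is a minimal
sketch in Python. It is not MailScanner or SpamAssassin code: the Rule class,
the rule patterns, the scores, the thresholds and the X-Example-* header names
are all hypothetical, chosen only to illustrate scoring one message against
two independent rule sets.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    pattern: str  # regular expression searched for in the message body
    score: float  # added to the total on a match (may be negative)


def score_message(body, rules):
    """Sum the scores of every rule whose pattern occurs in the body."""
    return sum(r.score for r in rules if re.search(r.pattern, body, re.IGNORECASE))


def tag_message(body, passes):
    """Run each pass independently and return one header per pass.

    `passes` maps a label to a (rules, threshold) pair.
    """
    headers = {}
    for label, (rules, threshold) in passes.items():
        total = score_message(body, rules)
        verdict = "yes" if total >= threshold else "no"
        # Hypothetical header name, one per pass.
        headers[f"X-Example-{label}"] = (
            f"{verdict}, score={total:.1f}, required={threshold:.1f}"
        )
    return headers


# Hypothetical rule sets; real rules would be far more extensive.
SPAM_RULES = [
    Rule(r"click here", 2.5),
    Rule(r"unsubscribe", 1.0),
    Rule(r"meeting agenda", -1.0),
]
ADULT_RULES = [
    Rule(r"\bxxx\b", 3.0),
    Rule(r"adults only", 2.5),
]

PASSES = {"Spam": (SPAM_RULES, 3.0), "Adult": (ADULT_RULES, 3.0)}

if __name__ == "__main__":
    sample = "Adults only: XXX preview attached"
    for name, value in tag_message(sample, PASSES).items():
        print(f"{name}: {value}")
```

Each pass has its own rules and threshold, so a message can be flagged as adult
content without being flagged as spam, or the other way round. Running two
MailScanner instances in a chain gives the same separation without touching
the code.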


> >
> > > Thus, SpamAssassin scans the message looking for patterns and it adds or
> > > subtracts from the score as some conditions are met or not...
> >
> > Which is the same functionality I'd be looking for in a word/phrase
> > detection routine, but with a separate set of actions from the spam
> > portion.
> >
> > > You _could_ create a different set of rules for SpamAssassin and
> > > invoke it twice, once for spam detection and the other for
> > > "adulthood" detection, but that would imply at least modifying
> > > MailScanner and using a secondary set of SpamAssassin rules... it
> > > _will_ require some time and an effort to do it...
> >
> > It seems I may be one of the very few actually looking for this type of
> > feature...perhaps I will have to throw on the ol' coding hat for a while
> > :)
> >
> > Julian, if I am (or anybody else is) able to create a relatively
> > lightweight way of adding this feature to MailScanner, would you consider
> > adding it to the production version?
> >
> > Thanks again for everyone's feedback!
> >
> > -Rich
>
>--
>Mariano Absatz
>El Baby
>----------------------------------------------------------
>Honey, I Formatted the Kid!

-- 
Julian Field
www.MailScanner.info
MailScanner thanks transtec Computers for their support



