Porn msg identification?

G. Armour Van Horn vanhorn at whidbey.com
Wed Apr 16 20:54:55 IST 2003


Julian,

I had a fax yesterday from one of the beneficiaries of my MailScanner system
complaining about porn spam, and then I saw a note in the Politech list about the
subject, referring to this story:

http://news.com.com/2100-1032-995658.html

That suggests that companies handling work-related mail could, in some
jurisdictions (both important ones like Australia and minor ones like the US
<grin>), end up with different obligations in handling porn spam than in
handling all other kinds of solicitations.

We now have two levels of labeling we can apply, for messages that reach our
standard and our high spam score thresholds. Currently I am just labeling both
(the standard label for a score of 5, "{Grossly Blatant SPAM?}" for a score of 20),
but as a lot of the mail I carry ends up in the workplace (and my service is paid
for by employers) this article made me wonder.

My wondering led me to think about the spam tests that are identified on the
X-MailScanner-SpamCheck: line. Could there be a third tier based on a ruleset,
said ruleset being a list of SA's codes?

I'm not sure I trust SA enough to delete messages based on a single SpamCheck
code, but if I could give MS a list of codes that would be checked after messages
had hit my Spam threshold, I would be just delighted. I.e., if the message is
already declared spam, delete rather than re-subject if any of my list of
PornCheck codes is present.
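
To make the idea concrete, here is a rough sketch in Python of the decision
I have in mind. It is not MailScanner code and not an existing option. The
rule names in PORN_CHECK_CODES are made-up placeholders, the action names are
just labels for this sketch, and I am assuming the SpamCheck line lists the
SA test names in capitals after the score, which may not match every version.

```python
import re

# Hypothetical rule names; substitute the SpamAssassin test names you trust.
PORN_CHECK_CODES = {"PORN_EXAMPLE_A", "PORN_EXAMPLE_B"}

SPAM_THRESHOLD = 5.0    # standard label threshold mentioned above
HIGH_THRESHOLD = 20.0   # high-score label threshold mentioned above

# Assumption: SA test names are all-caps tokens such as CLICK_BELOW.
RULE_NAME = re.compile(r"\b[A-Z][A-Z0-9_]{2,}\b")


def rules_hit(spamcheck_header):
    """Return the set of all-caps rule names found in the header value."""
    return set(RULE_NAME.findall(spamcheck_header))


def choose_action(score, spamcheck_header):
    """Pick an action; the porn list is consulted only for declared spam."""
    if score < SPAM_THRESHOLD:
        return "deliver"          # not spam, so never deleted by a single code
    if rules_hit(spamcheck_header) & PORN_CHECK_CODES:
        return "delete"           # already spam AND a listed code is present
    if score >= HIGH_THRESHOLD:
        return "label-high"
    return "label"
```

For example, choose_action(7.2, "SpamAssassin (score=7.2, required 5,
PORN_EXAMPLE_A, CLICK_BELOW)") gives "delete", while the same header with a
score of 3.0 gives "deliver", since the message never reached the spam
threshold.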

Does this make sense?

Van

Julian Field wrote:

> At 15:22 10/04/2003, you wrote:
> >On 9 Apr 2003 at 14:45, Richard D Alloway wrote:
> >
> > > On Tue, 8 Apr 2003, Mariano Absatz wrote:
> > >
> > > > Hi Rich,
> > > >
> > > > The point is that MailScanner doesn't know anything about scoring
> > > > messages... the spam score you see in MailScanner is actually done by
> > > > the SpamAssassin library that MailScanner optionally uses.
> > >
> > > This is, of course, quite true :)
> > >
> > > The reason I was suggesting it be part of MailScanner is the fact that
> > > MailScanner takes the output of SpamAssassin and modifies the subject
> > > and/or adds a header to the message.
> > >
> > > > Now, _that_ library, including the rules that come with it, is
> > > > developed and optimized to tag as much spam as possible _avoiding_ as
> > > > many false positives as it can.
> > >
> > > Well, I'm not necessarily looking to detect spam... legitimate email with
> > > mature content might not be spam. :)
> >Right, but my point is that, so far, MailScanner invokes SpamAssassin at most
> >once, and thus, it only uses one set of SA rules that, by default, is
> >configured to detect spam.
> >
> >It would be easy (only a matter of configuration, not programming) to change
> >the SA rules (and/or their scoring) to detect adult content, and modify the
> >MailScanner.conf, so the X-MailScanner-xxxx and Subject be modified to report
> >'adulthood' rather than 'spamhood' of the message.
> >
> >The problem is if you want the _same_ MailScanner to do _both_ spam & adult
> >content detection.
> >
> >For that to work you should modify MS to invoke SA twice, with a different
> >set of rules, and generate two sets of headers and Subject: modifications,
> >based on what each of the two SA invocations yields.
> >
> >That would include duplicating some of MS's data structures representing
> >messages with different names, configuration variables and their defaults,
> >etc.
> >
> >A slower (from a performance point of view) but faster (from a development
> >point of view) solution would be to run 2 instances of MailScanner on the
> >same machine, one to do the usual spam & virus detection and the other one to
> >do adult content detection.
> >
> >For this you'll have to set up another queue directory like
> >/var/spool/mqueue.mid and set the first MS with that as the "output"
> >directory and the second MS with that as the "input" directory...
> >
> >You should also change, for the second MS, all the messages that speak about
> >"spam" to speak about "adult content", configure it to not query (either
> >internally or via SA) any RBL, to not check for viruses, and eliminate the
> >internal MS content checks (IFRAME, attachment extensions, etc.) so as to
> >avoid as much double-processing as you can....
> >
> >The first MS should also change its "Sendmail2" invocation... I don't know
> >much about Sendmail and Exim, but, for what I see, it should be kind of
> >"/bin/true" since every file that the second MS finds in
> >/var/spool/mqueue.mid (left there by the first MS) will automatically be
> >processed by the second MailScanner without it needing to be invoked as
> >sendmail does...
> >
> >Am I wrong, Julian, Nick?
>
> That should work fine.
>
> > >
> > > > Thus, SpamAssassin scans the message looking for patterns and it adds or
> > > > subtracts from the score as some conditions are met or not...
> > >
> > > Which is the same functionality I'd be looking for in a word/phrase
> > > detection routine, but with a separate set of actions from the spam
> > > portion.
> > >
> > > > You _could_ create a different set of rules for SpamAssassin and invoke
> > > > it twice, once for spam detection and the other for "adulthood"
> > > > detection, but that would imply at least modifying MailScanner and
> > > > using a secondary set of SpamAssassin rules... it _will_ require some
> > > > time and an effort to do it...
> > >
> > > It seems I may be one of the very few actually looking for this type of
> > > feature...perhaps I will have to throw on the ol' coding hat for a while
> > > :)
> > >
> > > Julian, if I am (or anybody else is) able to create a relatively
> > > lightweight way of adding this feature to MailScanner, would you consider
> > > adding it to the production version?
> > >
> > > Thanks again for everyone's feedback!
> > >
> > > -Rich
> >
> >--
> >Mariano Absatz
> >El Baby
> >----------------------------------------------------------
> >Honey, I Formatted the Kid!
>
> --
> Julian Field
> www.MailScanner.info
> MailScanner thanks transtec Computers for their support

--
----------------------------------------------------------
Sign up now for Quotes of the Day, a handful of quotations
on a theme delivered every morning.
Enlightenment! Daily, for free!
mailto:twisted at whidbey.com?subject=Subscribe_QOTD

For web hosting and maintenance,
visit Van's home page: http://www.domainvanhorn.com/van/
----------------------------------------------------------



