What's in a name? - "spam" / "not spam"

Mike Brudenell pmb1 at YORK.AC.UK
Wed Jun 30 11:00:07 IST 2004


Greetings -

Someone has just pointed out to me a slight problem with the text used to
identify spam and non-spam in the "X-Blah-MailScanner-SpamCheck:" header
when used with an IMAP server...

For non-spam this header looks something like this:

    X-Blah-MailScanner-SpamCheck: not spam, SpamAssassin (score=-4.8,
            required 8, BAYES_00 -4.90, BIZ_TLD 0.10)

and for spam something like this:

    X-Blah-MailScanner-SpamCheck: spam, SpamAssassin (score=22.781,
            required 8, autolearn=spam, BAYES_99 5.40,
            DATE_IN_PAST_12_24 0.75, ... )

The problem arises when trying to use client-side filtering with an IMAP
mail program.  Such a program can be set to query the IMAP server to check
the text of a particular message header.  However, the IMAP specification
stipulates that this check is a case-insensitive substring match.

Thus setting up a search to check the "X-Blah-MailScanner-SpamCheck" header
for the word "spam" matches both spam and not-spam messages.  (The converse
-- checking for non-spam messages by looking for "not spam" is fine.)
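
To make the overlap concrete, here is a rough Python sketch that imitates
the server-side test with a plain case-insensitive substring check.  The
function name imap_header_match is made up for illustration, the real
matching is done by the IMAP server rather than by code like this, and the
spam sample is the one quoted above with the elided part left out.

```python
def imap_header_match(header_value: str, needle: str) -> bool:
    """Imitate an IMAP header search: case-insensitive substring test.

    Hypothetical helper for illustration only; it is not part of
    MailScanner or of any IMAP library.
    """
    return needle.lower() in header_value.lower()


# Header values taken from the two examples above (spam one abbreviated).
not_spam_header = ("not spam, SpamAssassin (score=-4.8, required 8, "
                   "BAYES_00 -4.90, BIZ_TLD 0.10)")
spam_header = ("spam, SpamAssassin (score=22.781, required 8, "
               "autolearn=spam, BAYES_99 5.40, DATE_IN_PAST_12_24 0.75)")

# Searching for "spam" hits both messages: the substring occurs inside
# "not spam" (and, in these samples, inside "SpamAssassin" as well).
spam_hits = [imap_header_match(h, "spam")
             for h in (not_spam_header, spam_header)]

# Searching for "not spam" only hits the non-spam message.
not_spam_hits = [imap_header_match(h, "not spam")
                 for h in (not_spam_header, spam_header)]

print(spam_hits)
print(not_spam_hits)
```

So a filter keyed on "not spam" separates the two cases, while one keyed
on "spam" cannot.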

I am pondering changing the wording of one or both of these two strings in
the languages.conf file.  The aim is to use wording such that neither is a
case-insensitive substring of the other.

But choosing words that satisfy this whilst still being clear to users is
proving trickier than I'd at first thought.  Possibilities I've toyed with
so far are:

    spam                not spam
    ----                --------
    *spam*              not spam        (wildcards + problems if *s omitted)
    spammy              not spam        (too colloquial?)
    probable spam       not spam        (doesn't fit high-scoring spam well)
    spam                genuine         (implies approval of leak-thru spam)
    spam                legitimate      (ditto)
    spam                pukka           (do most staff/students know pukka?)
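
As a quick sanity check of the pairs above, a small Python sketch along
these lines tests the stated aim, namely that neither label is a
case-insensitive substring of the other.  The helper names are invented
for illustration, and it only compares the two labels with each other; it
says nothing about the rest of the header text or about what users might
actually type into their filters.

```python
def is_ci_substring(needle: str, haystack: str) -> bool:
    """True if needle occurs in haystack, ignoring case."""
    return needle.lower() in haystack.lower()


def labels_are_distinct(spam_label: str, ham_label: str) -> bool:
    """True if neither label is a case-insensitive substring of the other."""
    return (not is_ci_substring(spam_label, ham_label)
            and not is_ci_substring(ham_label, spam_label))


# The current wording plus the candidate pairs from the table above.
candidates = [
    ("spam", "not spam"),
    ("*spam*", "not spam"),
    ("spammy", "not spam"),
    ("probable spam", "not spam"),
    ("spam", "genuine"),
    ("spam", "legitimate"),
    ("spam", "pukka"),
]

results = {pair: labels_are_distinct(*pair) for pair in candidates}

for (spam_label, ham_label), ok in results.items():
    status = "ok" if ok else "overlap"
    print(f"{spam_label:<15} {ham_label:<12} {status}")
```

Only the current "spam" / "not spam" pair fails this particular test.  Bear
in mind that a pair can pass it and still trip up a user who searches for
plain "spam" when the spam label is, say, "spammy".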

Bearing in mind that this is a "difficult to change subsequently" setting
once people have started using it in their filters, I was wondering whether
any other sites have already changed the wording to address the problem?

Cheers,

Mike B-)

--
The Computing Service, University of York, Heslington, York YO10 5DD, UK
Tel:+44-1904-433811  FAX:+44-1904-433740

* Unsolicited commercial e-mail is NOT welcome at this e-mail address. *



