Hi David,

>>> But we find more emails than we would expect still escape being
spam-tagged: their spamscores seem strangely low.

I too have seen similar patterns of spam scoring strangely low, and spent
some time over the weekend using MailWatch to work out why this was.

I checked the IP addresses in the 'Received:' headers via OpenRBL.org and
realised that although these hosts were listed in quite a few RBLs,
SpamAssassin had not picked up on this.  I debugged further via:

spamassassin -D rbl=-3 -p /etc/MailScanner/spam.assassin.prefs.conf < message 2>&1 | less

and I discovered that for some reason SA was 'trusting' the first host on
the received line and not checking it against the RBLs.  I ended up adding:

trusted_networks 10/8 172.16/12 192.168/16 <<external mx>> <<external mx>>

in spam.assassin.prefs.conf and double-checked the settings by running SA in
debug across a range of messages to make sure that SA was checking the RBLs
as expected.
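
A minimal sketch of how that check across a range of messages could be
scripted, reusing the debug command above.  The function name
rbl_debug_dir, the SA and PREFS variables, and the sample directory are
hypothetical; the grep simply keeps debug lines that mention "rbl", which
assumes the debug output labels its RBL lookups that way.

```shell
#!/bin/sh
# Hypothetical helper, not part of MailScanner or SpamAssassin.
# PREFS and SA can be overridden from the environment.
PREFS="${PREFS:-/etc/MailScanner/spam.assassin.prefs.conf}"
SA="${SA:-spamassassin}"

# Run SA in debug mode over every regular file in a directory and
# print only the output lines that mention RBL activity.
rbl_debug_dir() {
    dir="$1"
    for msg in "$dir"/*; do
        [ -f "$msg" ] || continue      # skip subdirectories and empty globs
        echo "== $msg"
        "$SA" -D rbl=-3 -p "$PREFS" < "$msg" 2>&1 | grep -i 'rbl'
    done
}
```

Usage would be something like: rbl_debug_dir /path/to/sample-messages | less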

For good measure I also added:

# Manually add in the CBL until SA has it by default
header RCVD_IN_CBL      eval:check_rbl_txt('cbl', 'cbl.abuseat.org.')
describe RCVD_IN_CBL    Received via a relay in cbl.abuseat.org
tflags RCVD_IN_CBL      net
score RCVD_IN_CBL       5

And where these low-scoring spam were once slipping through, they aren't
any more.

Hope this helps.

Kind regards,

Executive summary:  Might a high value of MS "Required SpamAssassin
Score" interact adversely with SA Bayes?

We started site-wide use of MailScanner some time ago (mid-2001), and of
SpamAssassin back in 2002.  Because of our worries about false positives,
we adjusted the MailScanner.conf "Required SpamAssassin Score" from its
default of 5 up to 7.

Things have moved on, and we are now happily using SA 2.61 including its
Bayes aspects.  But we find more emails than we would expect still escape
being spam-tagged: their spamscores seem strangely low.  Might it be that
our artificially high "Required SpamAssassin Score = 7" is causing the
Bayes mechanism to auto-learn some "Score = 5" and "6" spams incorrectly
as hams, and perhaps then to cause future occurrences of these spams to be
marked down as hams (and thus escape being spam-tagged)?

I think we could reasonably confidently reduce "Required SA Score" from 7
down to 6 or 5, which would both catch a few more spams, and the
Bayes autolearn might then catch more (positive feedback).

Is the above reasoning basically sound?  Or is it fundamentally flawed?

A supplementary question: Our SA/Bayes is currently only self-learning.
Are there any nicely packaged schemes to allow us to supplement this with
emails from validated individuals?  A few of us could then redirect
(bounce) emails to, say, "sa-learn-ham at ..." and "sa-learn-spam at ..." (but
in such a way that it would verify the redirector/bouncer (or some
equivalent) against a list of trusted folk).
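
For illustration only, the learning step itself might look like the sketch
below once the redirected mail has been collected into two mbox files.  It
does not address verifying the redirector; it assumes the mbox files are
only written to by trusted folk.  The function name learn_from_mboxes, the
SA_LEARN variable and the mbox paths are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: feed collected ham and spam mboxes to sa-learn.
SA_LEARN="${SA_LEARN:-sa-learn}"

learn_from_mboxes() {
    ham_mbox="$1"
    spam_mbox="$2"
    # -s: only learn from files that exist and are non-empty
    [ -s "$ham_mbox" ]  && "$SA_LEARN" --ham  --mbox "$ham_mbox"
    [ -s "$spam_mbox" ] && "$SA_LEARN" --spam --mbox "$spam_mbox"
    return 0
}
```

Usage would be something like:
learn_from_mboxes /path/to/ham.mbox /path/to/spam.mbox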


