Ideas for improved bayes learning

Wed Sep 19 09:56:34 IST 2007

Bayes normally autolearns a mail as spam once its score passes the
autolearn threshold (bayes_auto_learn_threshold_spam, 12.0 by default).
This is configurable.
Many of us use RBLs on the MTA to reject known spam.

I was thinking that it might be useful, instead of rejecting the
RBL-listed mail, to accept it, train Bayes on it and then discard it.

However, I believe that the RBL checks that SpamAssassin performs are on
all the Received lines and not just the IP address our mail servers
received the mail from?
If that is correct then I cannot simply assign a high score to the RBL
checks and have MailScanner delete very high scoring mail, since a
listing on an earlier hop would trigger the rule as well.
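
To make the distinction concrete, here is a rough Python sketch of the
difference between "every Received line" and "only the host that
connected to us". It is not how SpamAssassin or MailScanner do it
internally. The function names, the sample message and the zone
dnsbl.example are made up for illustration, and it assumes the topmost
Received header was added by our own server and carries a bracketed
IPv4 address.

```python
import re
from email import message_from_string

# Matches a bracketed IPv4 literal such as [192.0.2.10] in a Received line.
IPV4_IN_BRACKETS = re.compile(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]")


def received_ips(raw_message):
    """Return the first bracketed IPv4 address of each Received header.

    Headers are returned in the order they appear, so the newest hop
    (the one our own server added) comes first.
    """
    msg = message_from_string(raw_message)
    ips = []
    for header in msg.get_all("Received", []):
        match = IPV4_IN_BRACKETS.search(header)
        if match:
            ips.append(match.group(1))
    return ips


def dnsbl_query_name(ip, zone):
    """Build the DNS name a DNSBL lookup queries: reversed octets plus zone."""
    return ".".join(reversed(ip.split("."))) + "." + zone


# Hypothetical two-hop message using documentation-only addresses.
SAMPLE = (
    "Received: from relay.example.net (relay.example.net [192.0.2.10])\n"
    "\tby mx.example.org with ESMTP; Wed, 19 Sep 2007 09:50:00 +0100\n"
    "Received: from pc (unknown [198.51.100.7])\n"
    "\tby relay.example.net with SMTP; Wed, 19 Sep 2007 09:49:00 +0100\n"
    "Subject: test\n"
    "\n"
    "body\n"
)

all_hops = received_ips(SAMPLE)                       # every relay in the chain
connecting_ip = all_hops[0] if all_hops else None     # only the last hop to us
```

A check over all_hops can fire because of an old hop far from our
server, while a check on connecting_ip alone matches what an MTA-level
RBL rejection would have looked at.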

Ideally, what I was thinking of would be a couple of enhancements to
MailScanner :-

1) Add a new action of sa-learn-spam so the mail can be learnt. You
could use a custom rule to fire this if an RBL matches, so the mail is
learnt and then deleted.

2) Incorporate MailScanner's RBL feature (I assume this one only checks
one Received header) into the rules which can be used when writing a
custom action.
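
As a rough illustration of what the "learn, then delete" action in 1)
might boil down to, here is a small Python sketch. It is not MailScanner
code and no such action exists today. The function name and the runner
parameter are invented for the example, and it assumes the message is
available as a single file on disk and that sa-learn is on the PATH.

```python
import os
import subprocess


def learn_spam_then_discard(path, runner=subprocess.run):
    """Feed one message file to sa-learn as spam, then delete the file.

    The file is removed only if sa-learn exits successfully, so a failed
    training run does not silently lose the message. The runner argument
    exists so the external command can be swapped out when testing.
    """
    result = runner(["sa-learn", "--spam", path])
    if result.returncode != 0:
        return False
    os.remove(path)
    return True
```

Keeping the message when training fails is a deliberate choice here: it
is cheaper to retry a learn than to lose the sample.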

It's only an idea and not a request for a new feature. Personally,
MailScanner is working very well for us, so at this time it is not worth
allowing all the extra mail in just to improve the Bayes effectiveness.