Is this really how bayes+autolearn works?

Scott Silva ssilva at
Tue Dec 12 22:45:11 GMT 2006

Furnish, Trever G spake the following on 12/12/2006 1:59 PM:
> My Bayes db seems to consistently start assigning BAYES_00 (-2.6) to
> messages that are simple plain text messages you'd think would be easily
> caught.  The messages are all seemingly almost identical.  They're
> coming from bots, and only a small percentage of them are caught by
> other SA rules.
> Does that mean that they're being auto-learned as Ham and then
> cancelling out my attempts to teach Bayes later that this is spam?  Out
> of the many thousands that flood in, I'm only able to retrain bayes
> using a small percentage (because only a small subset of my users drag
> spam into the retraining system consistently).
> The reason they're not caught by other SA rules is because they're
> coming from bots.  Many of the samples I've looked at also wouldn't have
> been caught by John Rudd's Botnet plugin.
> So Bayes is getting lots of messages that SA doesn't detect as spam, and
> only a few similar messages that I train it to treat as spam.  Is this a
> plausible explanation for why Bayes would consistently be misclassifying
> this mail?
> So far the floods start in the afternoon and the subject strings are
> consistent enough that I'm able to correct the damage by:
>     - removing my bayes database and retraining from archived spam
>       corpus (slow)
>     - creating custom rules to, for example, filter out
>       "Subject =~ /Good Morning/" (dangerous)
I also see a lot of spam coming from bots, but I consistently catch most of
it. Are you using some good add-on rules?
Do you have any samples that some of us could run through our systems to see
what we get?
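
On the question itself: the explanation is at least plausible in principle.
Here is a rough toy sketch of the effect described, where a flood of
near-identical messages learned as ham outweighs a handful of manual spam
trainings. This is NOT SpamAssassin's actual Bayes algorithm; the formula is
a simplified per-token estimate, and every count and name below is a made-up
assumption for illustration only.

```python
# Toy model only: per-token spam probability from learned-message counts.
# SpamAssassin's real Bayes code combines token evidence differently.

def token_spam_probability(spam_hits, ham_hits, total_spam, total_ham):
    """Estimate P(spam | token) from how often the token was learned
    in each class, normalised by the size of each class."""
    spam_rate = spam_hits / total_spam if total_spam else 0.0
    ham_rate = ham_hits / total_ham if total_ham else 0.0
    if spam_rate + ham_rate == 0:
        return 0.5  # token never seen: treat as neutral
    return spam_rate / (spam_rate + ham_rate)

# Hypothetical database sizes before the flood (assumed numbers).
TOTAL_SPAM = 5000
TOTAL_HAM = 5000

# Before the flood: the token appears only in 50 hand-trained spam.
before = token_spam_probability(50, 0, TOTAL_SPAM, TOTAL_HAM)

# After the flood: 2000 bot messages were learned as ham, while users
# retrained only 100 more as spam.
after = token_spam_probability(150, 2000,
                               TOTAL_SPAM + 100, TOTAL_HAM + 2000)

print(f"before flood: {before:.2f}")
print(f"after flood:  {after:.2f}")
```

In this toy model the token swings from strongly spammy to strongly hammy
once the ham-learned copies heavily outnumber the retrained ones. Whether
that is what is happening on your system depends on whether those messages
really are being auto-learned as ham.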


MailScanner is like deodorant...
You hope everybody uses it, and
you notice quickly if they don't!!!!
