Lots of spam gets through because of BAYES_00 -2.60
Greg Matthews
gmatt at nerc.ac.uk
Wed Sep 12 13:38:08 IST 2007
Gareth wrote:
> Personally I find that it is very difficult to make bayes particularly
> effective in a corporate environment because of the variety of mails
This is not a reflection on the usefulness of Bayes. Proper
configuration will make it an extremely useful part of the anti-spam
suite.
> people receive. Therefore I find the low scoring bayes rules give far
> too big a negative score. I tend to override the low and high scores with
> the following :-
>
> score BAYES_00 -0.5
> score BAYES_05 -0.1
> score BAYES_20 -0.01
> score BAYES_40 -0.01
> score BAYES_99 5.0
>
Interesting, your high-end scores aren't as conservative as your low
end. I wonder whether you are managing to auto-learn enough ham? You do
know that you can adjust the autolearn thresholds, don't you? It's quite
common for Bayes to have far more spam to learn from than ham, which,
without attention, results in having to skew the scores as you have above.
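As a rough sketch of what adjusting the autolearn thresholds looks like
(the values below are illustrative assumptions only, not settings taken
from my relays), something along these lines goes in your SpamAssassin
configuration:

    # Illustrative values only; tune them against your own mail flow.
    bayes_auto_learn 1
    # Mail scoring below this is auto-learned as ham (stock default is 0.1).
    bayes_auto_learn_threshold_nonspam 0.5
    # Mail scoring above this is auto-learned as spam (stock default is 12.0).
    bayes_auto_learn_threshold_spam 10.0

Raising the nonspam threshold a little lets more ham be learned
automatically, but setting it too high risks training low-scoring spam
as ham, so move it in small steps and watch the results.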
Personally, I have great success with Bayes on relays that filter around
20-30k messages per day across 20-30 domains and around 5000 mailboxes.
I am careful, though, to feed all false positives flagged by users
(perhaps as many as 5 per week) back into the system. I also feed back
all my own (personal) false negatives, which may be as many as 10 per
week (<1% of my mail).
In summary, if Bayes is not working for you, it's worth taking the time
to get it right rather than simply skewing the scores.
--
Greg Matthews 01491 692445
Head of UNIX/Linux, iTSS Wallingford