multiple garbage words/bayes

Peter Bonivart peter at UCGBOOK.COM
Mon Jan 26 22:02:41 GMT 2004


Dustin Baer wrote:
> As we all know, spammers try to get around bayes by putting in multiple
> words that have no meaning:
>
>         coolant drier cudgel belgrade baroness airlock actuate judas decision
> abbreviate betroth
>
> etc.

I have had really good luck with these two rules that someone posted a
week ago. I have been testing them with scores of 0.1 and 0.25
respectively (instead of the 0.5 and 2.5 shown below) and have not seen
any false positives yet. They often trigger when Bayes doesn't, which is
exactly what I'm looking for.

rawbody  CP_RANDOMWORD_10       /(?:\b(?!(?:from|even|more|were|with)\b)[a-z]{4,12}\s+){10}/
describe CP_RANDOMWORD_10       string of 10+ random words
score    CP_RANDOMWORD_10       0.5

rawbody  CP_RANDOMWORD_15       /(?:\b(?!(?:from|even|more|were|with)\b)[a-z]{4,12}\s+){15}/
describe CP_RANDOMWORD_15       string of 15+ random words
score    CP_RANDOMWORD_15       2.5
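
If you want to get a feel for what these patterns catch before loading
them, here is a rough Python sketch that runs the same two regexes
against a plain string. Python's re module accepts the syntax used in
the rules (non-capturing groups, negative lookahead, \b), so the
patterns are copied as-is. The helper names randomword_pattern and
rules_hit are made up for illustration, and the sketch does not try to
reproduce how SpamAssassin splits or decodes the body before applying
rawbody rules, so treat it only as a quick sanity check.

```python
import re

# Short common words excluded by the rules above (taken from the post).
STOPWORDS = "from|even|more|were|with"

# Rule name -> number of consecutive words required (from the post).
RULES = {"CP_RANDOMWORD_10": 10, "CP_RANDOMWORD_15": 15}


def randomword_pattern(count):
    """Build the regex from the rules for a run of `count` words.

    Each word must be 4-12 lowercase letters, must not be one of the
    excluded words, and must be followed by whitespace.
    """
    return re.compile(
        r"(?:\b(?!(?:%s)\b)[a-z]{4,12}\s+){%d}" % (STOPWORDS, count)
    )


def rules_hit(text):
    """Return the names of the rules whose pattern matches `text`.

    Hypothetical helper: it only checks the regex, not SpamAssassin's
    own handling of the message body.
    """
    return [name for name, n in RULES.items()
            if randomword_pattern(n).search(text)]


if __name__ == "__main__":
    # The garbage-word sample quoted from Dustin's mail (11 words).
    sample = ("coolant drier cudgel belgrade baroness airlock actuate "
              "judas decision\nabbreviate betroth\n")
    # Ordinary prose: short words, capitals and punctuation break the run.
    prose = ("I have had real good luck with these two rules someone "
             "posted a week ago.\n")
    print(rules_hit(sample))
    print(rules_hit(prose))
```

On the quoted sample only the 10-word rule should hit, since the sample
has 11 qualifying words in a row. Ordinary text rarely has that many
consecutive lowercase words of 4 to 12 letters without a short word,
capital letter or punctuation mark in between.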

--
/Peter Bonivart

--Unix lovers do it in the Sun

Sun Fire V210, Solaris 9, Sendmail 8.12.10, MailScanner 4.25-14,
SpamAssassin 2.61 + DCC 1.2.21, ClamAV 0.65 + GMP


