multiple garbage words/bayes

Dustin Baer dustin.baer at IHS.COM
Mon Jan 26 18:46:19 GMT 2004


As we all know, spammers try to get around Bayes by padding their
messages with strings of random, unrelated words:

        coolant drier cudgel belgrade baroness airlock actuate judas
        decision abbreviate betroth

etc.

Does anyone see anything wrong with the following rule?  It should match
30 consecutive words of four or more characters each, separated by single
spaces with no punctuation in between.  So far, one spam has triggered it.
The score is currently set low for testing.

body   MULTI_WORD /\w{4,} \w{4,} \w{4,} \w{4,} \w{4,} \w{4,} \w{4,}
\w{4,} \w{4,} \w{4,} \w{4,} \w{4,} \w{4,} \w{4,} \w{4,} \w{4,} \w{4,}
\w{4,} \w{4,} \w{4,} \w{4,} \w{4,} \w{4,} \w{4,} \w{4,} \w{4,} \w{4,}
\w{4,} \w{4,} \w{4,}/i
describe MULTI_WORD 30 consecutive words of 4+ letters, no punctuation
score MULTI_WORD 0.1

Since I am not a Perl master, can anyone suggest an easier way to write
it?
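
One shorter form that should be equivalent, as far as I can tell, uses a
counted group instead of spelling out every word.  This is an untested
sketch: it matches one word of four or more characters and then 29
repetitions of "a space plus another such word", which adds up to the same
30 words.  The /i flag is left off on the assumption that it changes
nothing here, since \w already matches both upper and lower case (along
with digits and the underscore).

```
# Sketch only, not yet run against real mail.
# \w{4,}            one word of four or more word characters
# (?: \w{4,}){29}   29 more such words, each preceded by a single space
body   MULTI_WORD /\w{4,}(?: \w{4,}){29}/
describe MULTI_WORD 30 consecutive words of 4+ letters, no punctuation
score MULTI_WORD 0.1
```

With the count in one place, changing the threshold means editing a single
number (29 is always the wanted word count minus one).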

Thanks,

Dustin


