multiple garbage words/bayes
Michele Neylon :: Blacknight Solutions
michele at BLACKNIGHTSOLUTIONS.COM
Mon Jan 26 20:09:33 GMT 2004
This kind of spam is a real pain. The only sane way of blocking it would
have to be some form of frequency analysis, though the punctuation or lack
thereof makes it quite unwieldy :/
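
One possible reading of "frequency analysis" here, offered only as a rough sketch: ordinary prose is full of short function words ("the", "of", "and"), while these garbage word lists contain almost none. The function names, the tiny word list, and both thresholds below are assumptions made up for illustration; nothing here is part of MailScanner or SpamAssassin.

```python
import re

# Hypothetical, deliberately tiny list of common English function words.
FUNCTION_WORDS = frozenset(
    "a an and are as at be but by for from has have in is it not of on or "
    "that the this to was we will with you".split()
)


def tokenize(text):
    """Split text into lowercase word tokens, ignoring punctuation."""
    return re.findall(r"[a-z']+", text.lower())


def function_word_ratio(text):
    """Return the fraction of tokens that are common function words."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    hits = sum(1 for t in tokens if t in FUNCTION_WORDS)
    return hits / len(tokens)


def looks_like_word_salad(text, min_tokens=10, max_ratio=0.05):
    """Flag text that is long enough and almost free of function words.

    min_tokens and max_ratio are illustrative guesses, not tuned values.
    """
    return (len(tokenize(text)) >= min_tokens
            and function_word_ratio(text) <= max_ratio)
```

A check like this does not care about punctuation at all, which sidesteps the unwieldy part, but the thresholds would need testing against real mail before trusting it.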
Mr. Michele Neylon
Blacknight Internet Solutions Ltd
http://www.blacknightsolutions.ie/
http://www.search.ie/
Tel. + 353 (0)59 9137101
Lowest price domains in Ireland
> -----Original Message-----
> From: MailScanner mailing list [mailto:MAILSCANNER at JISCMAIL.AC.UK] On Behalf Of Dustin Baer
> Sent: 26 January 2004 18:46
> To: MAILSCANNER at JISCMAIL.AC.UK
> Subject: multiple garbage words/bayes
>
>
> As we all know, spammers try to get around bayes by putting in multiple
> words that have no meaning:
>
> coolant drier cudgel belgrade baroness airlock actuate
> judas decision
> abbreviate betroth
>
> etc.
>
> Does anyone see anything wrong with the following rule? It should match
> 30 consecutive words of four or more letters, separated by single spaces
> and with no punctuation. So far, one spam has triggered it. The score is
> currently set low for testing.
>
> body MULTI_WORD /\w{4,} \w{4,} \w{4,} \w{4,} \w{4,} \w{4,} \w{4,}
> \w{4,} \w{4,} \w{4,} \w{4,} \w{4,} \w{4,} \w{4,} \w{4,} \w{4,} \w{4,}
> \w{4,} \w{4,} \w{4,} \w{4,} \w{4,} \w{4,} \w{4,} \w{4,} \w{4,} \w{4,}
> \w{4,} \w{4,} \w{4,}/i
> describe MULTI_WORD A long run of 4+ letter words, with no punctuation
> score MULTI_WORD 0.1
>
> Since I am not a Perl master, can anyone suggest an easier way to write
> it?
>
> Thanks,
>
> Dustin
>
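
On the question of an easier way to write the rule: the thirty repetitions can probably be collapsed with a counted group, i.e. one word followed by 29 more "space plus word" units. The sketch below uses Python's re module only to check that the compact pattern behaves like the long-hand one; the names COMPACT, LONGHAND and matches_both are made up for illustration. The same quantifier syntax exists in Perl regular expressions, so the rule body would presumably become /\w{4,}(?: \w{4,}){29}/i, though that has not been tested against SpamAssassin here.

```python
import re

# Compact form: one word of 4+ word characters, then 29 more words,
# each preceded by exactly one space.
COMPACT = re.compile(r"\w{4,}(?: \w{4,}){29}", re.IGNORECASE)

# The long-hand pattern from the original rule, rebuilt programmatically:
# thirty copies of \w{4,} joined by single spaces.
LONGHAND = re.compile(" ".join([r"\w{4,}"] * 30), re.IGNORECASE)


def matches_both(text):
    """Return (compact_matches, longhand_matches) for the given text."""
    return bool(COMPACT.search(text)), bool(LONGHAND.search(text))
```

Note that \w also matches digits and underscores, and that the /i flag makes no difference to \w, so it could likely be dropped. Any comma, full stop, double space, or word shorter than four characters breaks the run.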