how to detect koi8-r characters
Mark Nienberg
lists at tippingmar.com
Mon Sep 15 22:55:01 IST 2008
Kevin Howard wrote:
> We're receiving a lot of spam comprising Cyrillic characters in the
> subject line, example Subject: =?koi8-r?B?8sXLzGHNwSDXIOnObcXSzsVtxSA=?=
>
> and a message body which is 100% Cyrillic, some messages are plain
> text and some HTML.
>
> The plain-text messages are using:
>
> MIME-Version: 1.0
> Content-Type: text/plain;
> charset="koi8-r"
> Content-Transfer-Encoding: 8bit
>
>
> SpamAssassin doesn't seem to be able to detect these reliably despite
> us training Bayes on these messages and utilising language filters. So
> we're trying to use MCP to detect them but have had no success
> whatsoever to date.
>
> I have tried making a rule to detect "?koi8" in the subject line but
> MailScanner only seems to look at visible characters.
>
> Any ideas? My preference is to stop them with MCP if possible.
>
I use a SpamAssassin rule like this:
header LOCAL_CYRILLIC Subject:raw =~ /windows\-1251/i
describe LOCAL_CYRILLIC Cyrillic fonts
score LOCAL_CYRILLIC 3
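In your case, you probably need to replace windows-1251 with koi8-r. The
":raw" modifier is important: without it, SpamAssassin matches against the
decoded subject, where the charset name no longer appears.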
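
To see why matching the raw header matters, here is a small Python sketch (not part of SpamAssassin or MailScanner, just an illustration using the standard library). It takes the Subject from the quoted message and checks for the charset name before and after MIME decoding. The names RAW_SUBJECT, CHARSET_RE and decode_subject are made up for this example.

```python
import re
from email.header import decode_header

# Raw Subject value from the quoted message, as it appears in the header.
RAW_SUBJECT = "=?koi8-r?B?8sXLzGHNwSDXIOnObcXSzsVtxSA=?="

# Same idea as the rule above, with windows-1251 swapped for koi8-r.
CHARSET_RE = re.compile(r"koi8-r", re.IGNORECASE)


def decode_subject(raw):
    """Return the subject as display text, roughly what a mail client shows."""
    parts = []
    for data, charset in decode_header(raw):
        # Encoded words come back as bytes plus the declared charset.
        if isinstance(data, bytes):
            data = data.decode(charset or "ascii", errors="replace")
        parts.append(data)
    return "".join(parts)


decoded = decode_subject(RAW_SUBJECT)
raw_hit = bool(CHARSET_RE.search(RAW_SUBJECT))
decoded_hit = bool(CHARSET_RE.search(decoded))

print("raw header matches:    ", raw_hit)
print("decoded header matches:", decoded_hit)
```

The charset label only exists in the encoded form of the header, so a pattern run against the decoded, visible text has nothing to match. That would be consistent with the "only seems to look at visible characters" behaviour described in the question.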
Mark Nienberg