Blocking by character set

Brendan Pirie bpirie at rma.edu
Mon May 11 20:15:08 IST 2009


Denis Beauchemin wrote:
> Paul Lemmons wrote:
>> Is there any way to recognize a particular character set in a message 
>> and block based on it? We are a non-international company and 100% of 
>> the email containing non-English characters is spam. I would like to 
>> use that to my advantage and simply block mail containing (to us) 
>> foreign character sets.
> Paul,
>
> Maybe this SA option could do the trick (from man 
> Mail::SpamAssassin::Conf):
> ok_locales xx [ yy zz ... ] (default: all)
>    This option is used to specify which locales are considered OK for 
> incoming mail. Mail using the character sets that are allowed by this 
> option will not be marked as possibly being spam in a foreign language.
>
>    If you receive lots of spam in foreign languages, and never get any 
> non-spam in these languages, this may help. Note that all ISO-8859-* 
> character sets, and Windows code page character sets, are always 
> permitted by default.
>
>    Set this to all to allow all character sets. This is the default.
>
>    The rules CHARSET_FARAWAY, CHARSET_FARAWAY_BODY, and 
> CHARSET_FARAWAY_HEADERS are triggered based on how this is set.
>
>    Examples:
>
>      ok_locales all         (allow all locales)
>      ok_locales en          (only allow English)
>      ok_locales en ja zh    (allow English, Japanese, and Chinese)
>
>    Note: if there are multiple ok_locales lines, only the last one is 
> used.
>
>    Select the locales to allow from the list below:
>
> en - Western character sets in general
> ja - Japanese character sets
> ko - Korean character sets
> ru - Cyrillic character sets
> th - Thai character sets
> zh - Chinese (both simplified and traditional) character sets
>
> normalize_charset ( 0 | 1) (default: 0)
>    Whether to detect character sets and normalize message content to 
> Unicode. Requires the Encode::Detect module, HTML::Parser version 3.46 
> or later, and Perl 5.8.5 or later.
>
> Denis
>
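
For the ok_locales approach quoted above, a minimal local.cf sketch might
look like the following. The score is an illustrative value I am assuming,
not a recommendation, and the sibling rule names from the man page excerpt
should be checked against the ruleset you actually have installed before
you score them the same way.

```
# local.cf: treat only Western character sets as expected
ok_locales en

# Illustrative score (assumed value); tune it against your own mail.
# The other CHARSET_FARAWAY_* rules named in the man page excerpt can be
# scored the same way once you have confirmed their names locally.
score CHARSET_FARAWAY 3.5
```

Keep in mind that, per the man page text, ISO-8859-* and Windows code page
character sets are always permitted, so this will not catch foreign-language
spam sent in a Western character set.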

Another possible option is the TextCat plugin included with spamassassin.
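
A minimal sketch of enabling it, assuming a stock SpamAssassin 3.x layout
where plugins are loaded from a .pre file and site settings live in
local.cf. The exact .pre file name and the score are assumptions on my
part, so adjust both to your installation.

```
# In a .pre file (for example v310.pre; the file name is an assumption):
loadplugin Mail::SpamAssassin::Plugin::TextCat

# In local.cf: list the languages you expect to receive
ok_languages en

# Raise the score of the rule that fires on any other detected language
# (illustrative value)
score UNWANTED_LANGUAGE_BODY 3.0
```

TextCat guesses the language of the body text rather than looking at the
declared character set, so it can also flag foreign-language mail sent in
a Western character set. Very short messages may be misidentified, so a
moderate score is probably safer than an outright block.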

Brendan

