Blocking by character set

Denis Beauchemin Denis.Beauchemin at
Mon May 11 20:06:50 IST 2009

Paul Lemmons wrote:
> Is there any way to recognize a particular character set in a message
> and block based on it? We are a non-international company and 100% of
> the email containing non-English characters is spam. I would like to 
> use that to my advantage and simply block mail containing (to us) 
> foreign character sets.

Maybe this SA option could do the trick (from man Mail::SpamAssassin::Conf):
ok_locales xx [ yy zz ... ] (default: all)
    This option is used to specify which locales are considered OK for
    incoming mail. Mail using the character sets that are allowed by this
    option will not be marked as possibly being spam in a foreign language.

    If you receive lots of spam in foreign languages, and never get any
    non-spam in these languages, this may help. Note that all ISO-8859-*
    character sets, and Windows code page character sets, are always
    permitted by default.

    Set this to all to allow all character sets. This is the default.

    The CHARSET_FARAWAY rules (such as CHARSET_FARAWAY_HEADERS) are
    triggered based on how this is set.


      ok_locales all         (allow all locales)
      ok_locales en          (only allow English)
      ok_locales en ja zh    (allow English, Japanese, and Chinese)

    Note: if there are multiple ok_locales lines, only the last one is used.

    Select the locales to allow from the list below:

en - Western character sets in general
ja - Japanese character sets
ko - Korean character sets
ru - Cyrillic character sets
th - Thai character sets
zh - Chinese (both simplified and traditional) character sets
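
For the situation described in the question, a minimal sketch of what
this might look like in local.cf follows. The score value and the choice
to raise a score at all are assumptions for illustration, not part of the
man page. The rule name is copied from the excerpt above and may differ
in your installed rule set, so that line is left commented out.

    # Sketch only, not a tested configuration.
    # Treat only Western character sets as expected for incoming mail.
    # If several ok_locales lines exist, only the last one is used.
    ok_locales en

    # Optional and hypothetical: raise the score of a charset rule so
    # that a hit weighs more heavily. Uncomment only after confirming
    # that the rule name exists in your rule set.
    # score CHARSET_FARAWAY_HEADERS 5.0

Running "spamassassin --lint" after editing should report any unknown
rule name or syntax problem. Since the ISO-8859-* and Windows code page
character sets are always permitted, this setting will probably not catch
spam written in other Western European languages.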

normalize_charset ( 0 | 1 ) (default: 0)
    Whether to detect character sets and normalize message content to
    Unicode. Requires the Encode::Detect module, HTML::Parser version 3.46
    or later, and Perl 5.8.5 or later.


