FuzzyOcr working but not via MailScanner

Scott Silva ssilva at sgvwater.com
Wed Oct 18 22:10:27 IST 2006


Anthony Cartmell spake the following on 10/18/2006 1:05 PM:
>> Answer 10:    MailScanner by default only passes the first 30kb of the
>> mail to SpamAssassin.
> 
> Interesting. Most of the spam in question is less than 30kb in total
> size, though, and I don't see any error messages.
> 
>> Another thing to try
>> ====================
>> Also try setting 'focr_verbose 2' in the config file, most messages
>> report something like this..
> 
> I get a lot of
> 
> [2006-10-18 20:39:26] Debug mode: Set scansets to values:
>                       $gocr -i -
>                       $gocr -l 180 -d 2 -i -
> 
> But only get messages like:
> 
> [2006-10-18 16:17:11] Debug mode: Starting FuzzyOcr...
> [2006-10-18 16:17:11] Debug mode: Attempting to load personal wordlist...
> [2006-10-18 16:17:11] Debug mode: No personal wordlist found, skipping...
> [2006-10-18 16:17:11] Debug mode: FuzzyOcr ending successfully...
> 
> when I run the spamassassin test manually, not when it's run via
> MailScanner :(
> 
> The spam messages with inline GIFs are found by SARE_GIF_ATTACH, but
> aren't scoring high enough to be marked.
> 
> For example, a message that went through unmarked as spam, gets marked
> as spam if I run spamassassin manually:
> 
> spamassassin --debug -t < /var/spool/MailScanner/quarantine/20061018/nonspam/k9IHujkc027719
> 
> Hmmmm... it also gets a much higher score from this, as other tests also
> seem to be missed when run from MailScanner...
> 
> MailScanner score (1.508):
> 
> 0.75    SARE_GIF_ATTACH    Email has a inline gif
> 0.08    TW_DF    Odd Letter Triples with DF
> 0.08    TW_GG    Odd Letter Triples with GG
> 0.08    TW_GZ    Odd Letter Triples with GZ
> 0.08    TW_RG    Odd Letter Triples with RG
> 
> Manual spamassassin score (38.9):
> 
> 3.8 HELO_DYNAMIC_IPADDR2   Relay HELO'd using suspicious hostname (IP
> addr 2)
>  1.1 EXTRA_MPART_TYPE       Header has extraneous Content-type:...type=
> entry
>  0.1 TW_GZ                  BODY: Odd Letter Triples with GZ
>  0.1 TW_RG                  BODY: Odd Letter Triples with RG
>  0.1 TW_GG                  BODY: Odd Letter Triples with GG
>  0.1 TW_DF                  BODY: Odd Letter Triples with DF
>  1.8 TVD_FW_GRAPHIC_NAME_LONG BODY: TVD_FW_GRAPHIC_NAME_LONG
>  1.2 HTML_IMAGE_ONLY_20     BODY: HTML: images with 1600-2000 bytes of
> words
>  2.8 TVD_FW_GRAPHIC_ID1     BODY: TVD_FW_GRAPHIC_ID1
>  0.0 HTML_MESSAGE           BODY: HTML included in message
>  0.0 BAYES_50               BODY: Bayesian spam probability is 40 to
> 60%   [score: 0.4908]
>  0.8 SARE_GIF_ATTACH        FULL: Email has a inline gif
>  2.0 RCVD_IN_SORBS_DUL      RBL: SORBS: sent directly from dynamic IP
> address   [84.122.43.158 listed in dnsbl.sorbs.net]
>  2.6 RCVD_IN_DSBL           RBL: Received via a relay in list.dsbl.org  
> [<http://dsbl.org/listing?84.122.43.158>]
>  3.9 RCVD_IN_XBL            RBL: Received via a relay in Spamhaus XBL 
> [84.122.43.158 listed in sbl-xbl.spamhaus.org]
>  1.7 SARE_GIF_STOX          Inline Gif with little HTML
>   17 FUZZY_OCR              BODY: Mail contains an image with common
> spam text inside
>                             Words found:
>                             "alert" in 3 lines
>                             "news" in 1 lines
>                             "alert" in 3 lines
>                             "stock" in 1 lines
>                             "investor" in 2 lines
>                             "company" in 1 lines
>                             "trade" in 1 lines
>                             "service" in 1 lines
>                             "levitra" in 2 lines
>                             (15 word occurrences found)
> 
You must have some permission problems. I did the same thing and got nearly
identical scores, at least to the first decimal (32.7 in spamassassin, 32.73
in MailScanner).
Maybe Julian can confirm whether spamassassin called by MailScanner can still
load plugins that have their loadplugin line in a .cf file instead of a .pre
file. I seem to remember some sort of privilege change when spamassassin 3.0.0
or maybe 3.1.0 came out.
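
To see exactly which tests go missing under MailScanner, it may help to diff
the two rule lists quoted above. This is only a rough sketch: the rule names
are copied by hand from the two reports, and the helper name missing_rules is
made up for illustration, not part of SpamAssassin or MailScanner.

```python
# Rule names copied by hand from the two reports quoted in this thread.
mailscanner_hits = {"SARE_GIF_ATTACH", "TW_DF", "TW_GG", "TW_GZ", "TW_RG"}

manual_hits = {
    "HELO_DYNAMIC_IPADDR2", "EXTRA_MPART_TYPE", "TW_GZ", "TW_RG", "TW_GG",
    "TW_DF", "TVD_FW_GRAPHIC_NAME_LONG", "HTML_IMAGE_ONLY_20",
    "TVD_FW_GRAPHIC_ID1", "HTML_MESSAGE", "BAYES_50", "SARE_GIF_ATTACH",
    "RCVD_IN_SORBS_DUL", "RCVD_IN_DSBL", "RCVD_IN_XBL", "SARE_GIF_STOX",
    "FUZZY_OCR",
}


def missing_rules(manual, via_mailscanner):
    """Hypothetical helper: rules that fired manually but not via MailScanner."""
    return sorted(manual - via_mailscanner)


if __name__ == "__main__":
    # Print one missing rule per line for easy reading.
    for rule in missing_rules(manual_hits, mailscanner_hits):
        print(rule)
```

If the missing rules include things other than FUZZY_OCR (network tests,
Bayes, header tests), that would suggest the problem is not limited to the
FuzzyOcr plugin itself.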



-- 

MailScanner is like deodorant...
You hope everybody uses it, and
you notice quickly if they don't!!!!


