MS and SA differ

Alex Broens ms-list at alexb.ch
Fri Oct 6 14:01:33 IST 2006


On 10/6/2006 1:37 PM, Anthony Peacock wrote:
> Hi Alex,
> 
> Alex Broens wrote:
>> On 10/6/2006 6:53 AM, Garry Glendown wrote:
>>> Hi,
>>>
>>> I've just set up FuzzyOCR to take care of the Image spam that has
>>> increased recently ... after still receiving untagged stock spam, I've
>>> checked into the scores and stuff and noticed on a test message, that MS
>>> has a lot less rule hits (and therefore less score points) than when
>>> calling spamassassin directly ...
>>>
>>> Here's what I got originally from MS:
>>>
>>> X-nethinks-MailScanner-SpamCheck: not spam, SpamAssassin (Wertung=3.905,
>>>     benoetigt 5, HTML_10_20 1.35, HTML_IMAGE_ONLY_32 1.05,
>>>     HTML_MESSAGE 0.00, MIME_HTML_ONLY 0.00, RCVD_NUMERIC_HELO 1.50)
>>>
>>> whereas the -t run from SA resulted in:
>>>
>>> X-Spam-Status: Yes, score=25.2 required=5.0 tests=AWL,BAYES_99,
>>>  FORGED_RCVD_HELO,FUZZY_OCR,HTML_10_20,HTML_IMAGE_ONLY_32,HTML_MESSAGE,
>>>         MIME_HTML_ONLY,RCVD_NUMERIC_HELO,SARE_GIF_ATTACH autolearn=no
>>>
>>> MailScanner.conf points to the right SA directory
>>> (/etc/mail/spamassassin), there ARE image spams that get tagged with the
>>> OCR-tags, so I don't really get it why the scoring differs this much ...
>>> also with the Bayes score ... none on MS, 99 on SA ... !?
>>>
>>> I'm still running MS 4.50, SA is 3.1.5 ...
>>>
>>> Any idea where I could look for the cause of this?
>>
>> I know I'll be tarred & feathered for this comment (once again):
>>
>> I'd bet it's because MS sent only part of the message through SA,
>> cut it off too early, and so missed the attached images.
>>
>> You may have to increase the value in "Max SpamAssassin Size" to catch 
>> them.
>>
>> Alex
> 
> No tar and feathers, but I do think that you are wrong in your 
> assumption in this case. :-)
> 
> 
> There are lots of rules different between the two tests that can't be 
> explained by a truncated message being passed to SA.
> 
> AWL, BAYES, RCVD_ tests for instance.
> 
> To me this suggests that the SpamAssassin tests were run as a different 
> user than the user that MailScanner runs as.  So it picks up the BAYES 
> databases and the AWL databases.  It might also be that some tests are 
> being disabled in the MS setup.

Yes, but Garry asked about the missing OCR hit. SARE_GIF_ATTACH is a 
full rule, which probably didn't hit because of a cutoff, and the missing 
FUZZY_OCR score points in the same direction...
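
To see exactly which rules differ between the two headers Garry posted, 
one can simply diff the rule names. A minimal Python sketch, assuming the 
header layouts shown above; the helper names ms_rules and sa_rules are 
made up for illustration and are not part of MailScanner or SpamAssassin:

```python
import re

# Header values copied from Garry's message.
MS_HEADER = """not spam, SpamAssassin (Wertung=3.905,
    benoetigt 5, HTML_10_20 1.35, HTML_IMAGE_ONLY_32 1.05,
    HTML_MESSAGE 0.00, MIME_HTML_ONLY 0.00, RCVD_NUMERIC_HELO 1.50)"""

SA_HEADER = """Yes, score=25.2 required=5.0 tests=AWL,BAYES_99,
 FORGED_RCVD_HELO,FUZZY_OCR,HTML_10_20,HTML_IMAGE_ONLY_32,HTML_MESSAGE,
        MIME_HTML_ONLY,RCVD_NUMERIC_HELO,SARE_GIF_ATTACH autolearn=no"""


def ms_rules(header):
    """Rule names in a MailScanner SpamCheck header ("NAME score" pairs)."""
    return set(re.findall(r"\b([A-Z][A-Z0-9_]+)\s+-?\d+(?:\.\d+)?", header))


def sa_rules(header):
    """Rule names in the tests= list of an X-Spam-Status header."""
    match = re.search(r"tests=(.*?)(?:\s+autolearn=|$)", header, re.S)
    if match is None:
        return set()
    return {name.strip() for name in match.group(1).split(",") if name.strip()}


ms = ms_rules(MS_HEADER)
sa = sa_rules(SA_HEADER)

print("only in SA run:", sorted(sa - ms))
print("only in MS run:", sorted(ms - sa))
```

On these two headers the SA-only set is AWL, BAYES_99, FORGED_RCVD_HELO, 
FUZZY_OCR and SARE_GIF_ATTACH, and nothing is MS-only. The diff only lists 
the rules; it does not say whether a cutoff or a different user/config is 
the cause.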

And some messages are indeed scored by OCR, while others are...


And if he has AWL switched off in MS, running SA from the command line 
without -C filename will ignore that setting and send the message through AWL.

Or it could also be a bad FUZZY_OCR install, but I really doubt that.

Alex