MS and SA differ

Anthony Peacock a.peacock at chime.ucl.ac.uk
Fri Oct 6 14:09:54 IST 2006


Alex Broens wrote:
> On 10/6/2006 1:37 PM, Anthony Peacock wrote:
>> Hi Alex,
>>
>> Alex Broens wrote:
>>> On 10/6/2006 6:53 AM, Garry Glendown wrote:
>>>> Hi,
>>>>
>>>> I've just set up FuzzyOCR to take care of the Image spam that has
>>>> increased recently ... after still receiving untagged stock spam, I've
>>>> checked into the scores and stuff and noticed on a test message, that MS
>>>> has a lot less rule hits (and therefore less score points) than when
>>>> calling spamassassin directly ...
>>>>
>>>> Here's what I got originally from MS:
>>>>
>>>> X-nethinks-MailScanner-SpamCheck: not spam, SpamAssassin (Wertung=3.905,
>>>>     benoetigt 5, HTML_10_20 1.35, HTML_IMAGE_ONLY_32 1.05,
>>>>     HTML_MESSAGE 0.00, MIME_HTML_ONLY 0.00, RCVD_NUMERIC_HELO 1.50)
>>>>
>>>> whereas the -t run from SA resulted in:
>>>>
>>>> X-Spam-Status: Yes, score=25.2 required=5.0 tests=AWL,BAYES_99,
>>>>  FORGED_RCVD_HELO,FUZZY_OCR,HTML_10_20,HTML_IMAGE_ONLY_32,HTML_MESSAGE,
>>>>         MIME_HTML_ONLY,RCVD_NUMERIC_HELO,SARE_GIF_ATTACH autolearn=no
>>>>
>>>> MailScanner.conf points to the right SA directory
>>>> (/etc/mail/spamassassin), there ARE image spams that get tagged with the
>>>> OCR-tags, so I don't really get it why the scoring differs this much ...
>>>> also with the Bayes score ... none on MS, 99 on SA ... !?
>>>>
>>>> I'm still running MS 4.50, SA is 3.1.5 ...
>>>>
>>>> Any idea where I could look for the cause of this?
>>>
>>> I know I'll be tarred & feathered for this comment (once again):
>>>
>>> I'd bet it's because MS only sent part of the whole msg thru SA,
>>> cutting it off too early and missing the attached images.
>>>
>>> You may have to increase the value in "Max SpamAssassin Size" to
>>> catch them.
>>>
>>> Alex
>>
>> No tar and feathers, but I do think that you are wrong in your 
>> assumption in this case. :-)
>>
>>
>> There are lots of rules different between the two tests that can't be 
>> explained by a truncated message being passed to SA.
>>
>> AWL, BAYES, RCVD_ tests for instance.
>>
>> To me this suggests that the SpamAssassin tests were run as a 
>> different user than the user that MailScanner runs as.  So it picks up 
>> the BAYES databases and the AWL databases.  It might also be that some 
>> tests are being disabled in the MS setup.
> 
> yes but:  Garry asked about the missing OCR hit. SARE_GIF_ATTACH is a 
> full rule which probably wasn't parsed due to a cutoff and the missing 
> FUZZY_OCR score points in the same direction...
> 
> and some messages are indeed scored by OCR, while others are...
> 
> 
> and if he has AWL switched off in MS, passing SA thru the command line 
> without -C filename will ignore that setting and send msg thru AWL
> 
> or it could also be a bad FUZZY_OCR install, but that I really doubt.
> 
> Alex

Without more information from the OP, both theories are possible.
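
To make the comparison concrete, here is a rough Python sketch that lists the rules which hit in the command-line SA run but not in the MailScanner run. The two header strings are copied from Garry's message; the function names and regular expressions are my own illustration, not part of MailScanner or SpamAssassin, and they assume the header layouts shown in this thread.

```python
import re

# Headers copied from Garry's message earlier in this thread.
MS_HEADER = (
    "X-nethinks-MailScanner-SpamCheck: not spam, SpamAssassin (Wertung=3.905, "
    "benoetigt 5, HTML_10_20 1.35, HTML_IMAGE_ONLY_32 1.05, "
    "HTML_MESSAGE 0.00, MIME_HTML_ONLY 0.00, RCVD_NUMERIC_HELO 1.50)"
)
SA_HEADER = (
    "X-Spam-Status: Yes, score=25.2 required=5.0 tests=AWL,BAYES_99,"
    "FORGED_RCVD_HELO,FUZZY_OCR,HTML_10_20,HTML_IMAGE_ONLY_32,HTML_MESSAGE,"
    "MIME_HTML_ONLY,RCVD_NUMERIC_HELO,SARE_GIF_ATTACH autolearn=no"
)


def ms_rules(header):
    """Rule names from a MailScanner SpamCheck header ("NAME score" pairs).

    Hypothetical helper: assumes each rule is an upper-case name followed
    by a space and a decimal score, as in the header above.
    """
    return set(re.findall(r"\b([A-Z][A-Z0-9_]+) -?\d+\.\d+", header))


def sa_rules(header):
    """Rule names from the tests= field of an X-Spam-Status header.

    Hypothetical helper: assumes the header has been unfolded so the
    tests= list contains no whitespace.
    """
    match = re.search(r"tests=(\S+)", header)
    return set(match.group(1).split(",")) if match else set()


# Rules that fired under plain spamassassin but not under MailScanner.
missing_from_ms = sorted(sa_rules(SA_HEADER) - ms_rules(MS_HEADER))
print(missing_from_ms)
```

For the two headers above this prints AWL, BAYES_99, FORGED_RCVD_HELO, FUZZY_OCR and SARE_GIF_ATTACH. The first two depend on per-user databases, while the last two depend on the image attachment, so the list by itself does not settle which theory is right.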


-- 
Anthony Peacock
CHIME, Royal Free & University College Medical School
WWW:    http://www.chime.ucl.ac.uk/~rmhiajp/
"If you have an apple and I have  an apple and we  exchange apples
then you and I will still each have  one apple. But  if you have an
idea and I have an idea and we exchange these ideas, then each of us
will have two ideas." -- George Bernard Shaw

