Is this really how bayes+autolearn works?

Denis Beauchemin Denis.Beauchemin at USherbrooke.ca
Wed Dec 13 14:59:22 GMT 2006


Furnish, Trever G wrote:
>> -----Original Message-----
>> From: mailscanner-bounces at lists.mailscanner.info
>> [mailto:mailscanner-bounces at lists.mailscanner.info] On Behalf
>> Of Scott Silva
>> Sent: Tuesday, December 12, 2006 5:45 PM
>> To: mailscanner at lists.mailscanner.info
>> Subject: Re: Is this really how bayes+autolearn works?
>>
>> Furnish, Trever G spake the following on 12/12/2006 1:59 PM:
>>> So Bayes is getting lots of messages that SA doesn't detect as spam,
>>> and only a few similar messages that I train it to treat as spam.  Is
>>> this a plausible explanation for why Bayes would consistently be
>>> misclassifying this mail?
>>>
>>> So far the floods start in the afternoon and the subject strings are
>>> consistent enough that I'm able to correct the damage by:
>>>     - removing my bayes database and retraining from archived spam
>>>       corpus (slow)
>>>     - creating custom rules to, for example, filter out "Subject =~
>>>       /Good Morning/" (dangerous)
>>
>> I also see a lot of spam coming from bots, but I consistently
>> catch most of it. Are you using some good add-on rules?
>> Do you have any samples that some of us could run through our
>> systems to see what we get?
>
> Requested samples are attached.  They're very simple messages -- but
> they're flooding in without being caught and then Bayes starts to assign
> -2.60 to them. :-(
>
> --
> Trever
Trever,

They scored 10 here:

	SpamAssassin (not cached, score=10.448, required 4.5, BAYES_50 0.00,
	INFO_TLD 1.27, SARE_LWHUGE 1.00, SARE_LWSHORTT 0.79,
	SARE_MLB_Stock1 1.66, SARE_MLB_Stock2 1.66, SARE_MLB_Stock6 1.66,
	TVD_STOCK1 2.40)

The first rule comes from base SA, the last one comes from 80_additional
in /var/lib/spamassassin/3.001007/updates_spamassassin_org (auto updated
by SA), and all the ones in between come from 70_sare_stocks.
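
If it helps to compare reports between systems, here is a rough Python sketch that pulls the per-rule scores out of a report line like the one above and adds them up. The function name parse_rule_hits and the regular expression are my own assumptions for illustration, not anything MailScanner or SpamAssassin ships, and the pattern simply assumes rule names start with an uppercase letter.

```python
import re

# Report text copied from this message (wording of the "required" field
# does not matter to the parser below).
REPORT = """SpamAssassin (not cached, score=10.448, required 4.5, BAYES_50 0.00,
INFO_TLD 1.27, SARE_LWHUGE 1.00, SARE_LWSHORTT 0.79,
SARE_MLB_Stock1 1.66, SARE_MLB_Stock2 1.66, SARE_MLB_Stock6 1.66,
TVD_STOCK1 2.40)"""

# Assumption: a rule hit looks like "RULE_NAME 1.23", and rule names begin
# with an uppercase letter. That skips "score=..." and "required 4.5".
RULE_HIT = re.compile(r"\b([A-Z][A-Za-z0-9_]*)\s+(-?\d+\.\d+)")


def parse_rule_hits(report):
    """Return a list of (rule_name, score) pairs found in a report string."""
    return [(name, float(score)) for name, score in RULE_HIT.findall(report)]


hits = parse_rule_hits(REPORT)
total = sum(score for _, score in hits)

for name, score in hits:
    print(f"{name:20s} {score:6.2f}")
print(f"{'sum of listed scores':20s} {total:6.2f}")
```

The listed scores add up to 10.44 rather than the reported 10.448, presumably because each rule score is rounded to two decimals in the report. Note that BAYES_50 contributes 0.00 here, so on this system everything comes from the static rules.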

Denis

-- 
   _
  °v°   Denis Beauchemin, analyste
 /(_)\  Université de Sherbrooke, S.T.I.
  ^ ^   T: 819.821.8000x62252 F: 819.821.8045

