Is this really how bayes+autolearn works?

Wed Dec 13 19:03:16 GMT 2006

Denis Beauchemin spake the following on 12/13/2006 10:41 AM:
> Scott Silva wrote:
>> Content analysis details:   (33.4 points, 5.0 required)
>>
>>  pts rule name              description
>> ---- ---------------------- --------------------------------------------------
>>  0.0 BOTNET_CLIENTWORDS     Hostname contains client-like substrings
>>  0.0 BOTNET_IPINHOSTNAME    Hostname contains its own IP address
>>  1.7 SARE_MLB_Stock1        BODY: SARE_MLB_Stock1
>>  1.7 SARE_MLB_Stock2        BODY: SARE_MLB_Stock2
>>  1.0 SARE_LWHUGE            BODY: SARE_LWHUGE
>>  0.8 SARE_LWSHORTT          BODY: SARE_LWSHORTT
>>  1.7 SARE_MLB_Stock6        BODY: ML obfuscated ticker symbols
>>  2.4 TVD_STOCK1             BODY: Message looks like it's pushing a
>>                             stock...
>>  0.0 BAYES_50               BODY: Bayesian spam probability is 40 to 60%
>>                             [score: 0.5000]
>>  1.5 RAZOR2_CHECK           Listed in Razor2 (http://razor.sf.net/)
>>  1.5 RAZOR2_CF_RANGE_E4_51_100 Razor2 gives engine 4 confidence level
>>                             above 50%
>>                             [cf: 100]
>>  1.5 RAZOR2_CF_RANGE_51_100 Razor2 gives confidence level above 50%
>>                             [cf: 100]
>>  3.7 PYZOR_CHECK            Listed in Pyzor (http://pyzor.sf.net/)
>>  2.2 DCC_CHECK              Listed in DCC
>>                             (http://rhyolite.com/anti-spam/dcc/)
>>  2.0 RCVD_IN_SORBS_DUL      RBL: SORBS: sent directly from dynamic IP
>>                             address
>>                             [84.2.92.253 listed in dnsbl.sorbs.net]
>>  2.0 RCVD_IN_NJABL_DUL      RBL: NJABL: dialup sender did non-local SMTP
>>                             [84.2.92.253 listed in combined.njabl.org]
>>  2.5 DIGEST_MULTIPLE        Message hits more than one network digest
>>                             check
>>  2.8 RATWARE_OUTLOOK_NONAME Bulk email fingerprint (Outlook no name)
>>                             found
>>  0.0 BOTNET_CLIENT          Hostname looks like a client hostname
>>  1.9 RATWARE_MS_HASH        Bulk email fingerprint (msgid ms hash) found
>>  1.7 MSGID_DOLLARS          Message-Id has pattern used in spam
>>  2.0 BOTNET                 The submitting mail server looks like part
>>                             of a Botnet
>>
> I was wondering how you got a score so different than mine and realized
> I cited the score Trevor's message got with all its attachments
> included.  I saved one of the attachments and ran SA on it and got
> results similar to yours.
> 
> Denis
> 
I don't think a bare SpamAssassin install is going to be sufficient anymore,
at least until they add more rules.
I'm always looking for something that can catch that extra percent or so of
messages without too high a processing cost. But the best bang for the buck
has been dropping anything listed in sbl+xbl at the MTA. That is over 75% of
the traffic that never even needs to be run through SpamAssassin.
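
For anyone who hasn't set up a blocklist check before, here is a minimal
Python sketch of the kind of lookup an MTA does for each connecting IP:
reverse the IPv4 octets, append the list's zone, and treat a 127.0.0.x answer
as "listed". The function names, the injectable resolve parameter, and the
zone name sbl-xbl.spamhaus.org are illustrative assumptions; a real setup
would use the MTA's built-in DNSBL support rather than a script.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Build the DNSBL lookup name: reversed IPv4 octets followed by the zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip, zone, resolve=socket.gethostbyname):
    """Return True if the list answers with a 127.0.0.x address for this IP.

    `resolve` is a hypothetical hook so the lookup can be replaced in tests;
    by default it performs a real DNS query.
    """
    try:
        answer = resolve(dnsbl_query_name(ip, zone))
    except socket.gaierror:
        # No such name: the IP is not on the list (or the lookup failed).
        return False
    # Assumption: only 127.0.0.x answers count as a listing.
    return answer.startswith("127.0.0.")
```

For example, dnsbl_query_name("84.2.92.253", "sbl-xbl.spamhaus.org") builds
the name 253.92.2.84.sbl-xbl.spamhaus.org. Whether that address is actually
on that list is not something the report above shows.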
-- 

MailScanner is like deodorant...
You hope everybody uses it, and
you notice quickly if they don't!!!!