crm question
Koopmann, Jan-Peter
jan-peter at koopmann.eu
Wed Dec 19 23:51:48 GMT 2007
Hi,
I set up CRM114 (crm) today. Two things puzzle me:
1. spam.css reports a lot of learned documents, more than the number of
spams that scored above the spam threshold.
2. The accuracy right now is poor:
Id:1J4v83-000EIG-Mn SA Score:50.184 CRM114 Score:-0.15
Id:1J4vSG-000Eng-8j SA Score:7.415 CRM114 Score:-0.56
Id:1J4vVR-000Esm-GH SA Score:8.595 CRM114 Score:-0.10
Id:1J4vZO-000Eva-MO SA Score:14.439 CRM114 Score:-0.33
Id:1J4vah-000ExY-GO SA Score:21.129 CRM114 Score:14.96
Id:1J4vbt-000F0R-Fq SA Score:6.463 CRM114 Score:-1.52
Id:1J4vgb-000F5b-QG SA Score:6.401 CRM114 Score:-0.40
Id:1J4vgf-000F5Q-Jr SA Score:6.439 CRM114 Score:-0.36
Id:1J4vgf-000F5S-HH SA Score:6.401 CRM114 Score:-0.40
Id:1J4vsE-000FNu-1k SA Score:7.616 CRM114 Score:-0.05
Id:1J4w68-000FeL-Hw SA Score:10.054 CRM114 Score:-0.09
Id:1J4xFR-000Hpa-Ca SA Score:12.041 CRM114 Score:-0.33
Id:1J4xqF-000Ipa-0i SA Score:46.701 CRM114 Score:-0.22
Id:1J4y6U-000JJk-Ph SA Score:47.709 CRM114 Score:0.18
Id:1J4y8u-000JQj-CW SA Score:10.574 CRM114 Score:0.46
Id:1J4yLX-000Jjj-HP SA Score:39.997 CRM114 Score:0.90
Id:1J4yjL-000KTm-BX SA Score:24.844 CRM114 Score:0.36
Id:1J4yjB-000KTa-94 SA Score:11.199 CRM114 Score:-0.02
Id:1J4ywH-000L0s-4V SA Score:17.607 CRM114 Score:-0.03
Id:1J4yyY-000L5j-6R SA Score:20.166 CRM114 Score:-0.29
Id:1J4z0m-000LC5-Jm SA Score:18.519 CRM114 Score:-0.40
Id:1J4zRY-000M1J-Ur SA Score:7.77 CRM114 Score:0.65
Id:1J4zhl-000Mbf-8h SA Score:11.473 CRM114 Score:0.47
Id:1J4znT-000MlI-UO SA Score:12.217 CRM114 Score:2.63
Id:1J4zqC-000MvI-9w SA Score:31.898 CRM114 Score:-0.11
Id:1J4zrk-000N1K-Hz SA Score:21.626 CRM114 Score:0.73
Id:1J506N-000NWJ-4K SA Score:35.986 CRM114 Score:0.75
Id:1J50eF-000Ocu-3U SA Score:8.262 CRM114 Score:-0.48
Id:1J51He-000PxA-Uo SA Score:21.383 CRM114 Score:0.81
Id:1J51WW-0000Qq-Su SA Score:7.187 CRM114 Score:-0.40
Id:1J525Y-0001I4-Ik SA Score:43.963 CRM114 Score:1.06
Id:1J52Aw-0001QU-0c SA Score:40.391 CRM114 Score:0.66
Id:1J52Sj-0001wY-8b SA Score:25.329 CRM114 Score:0.66
Id:1J54Ir-0004JM-Iv SA Score:18.02 CRM114 Score:0.02
Id:1J54ph-0005HH-Tk SA Score:30.694 CRM114 Score:0.81
Id:1J54th-0005Ph-IZ SA Score:28.211 CRM114 Score:1.05
Id:1J55q8-0006Zw-Vp SA Score:46.701 CRM114 Score:-0.22
Id:1J55qB-0006bS-VP SA Score:36.446 CRM114 Score:0.65
Id:1J561i-0006p1-3i SA Score:18.866 CRM114 Score:0.36
Id:1J56cJ-0007at-M1 SA Score:21.154 CRM114 Score:0.48
Id:1J57fa-0008qe-BH SA Score:15.118 CRM114 Score:0.09
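To put a number on "poor", something like the following Python sketch could tally how many CRM114 scores in such a log sit close to zero. The function name `tally` and the `unsure_margin` of 1.0 are assumptions made for illustration; they are not CRM114 or MailScanner settings, and the sketch deliberately does not assume which sign means spam.

```python
import re

# Anchors only on the two score fields, so anything before "SA Score:"
# (the message id) is ignored.
SCORE_RE = re.compile(
    r"SA Score:(-?\d+(?:\.\d+)?)\s+CRM114 Score:(-?\d+(?:\.\d+)?)"
)

def tally(lines, unsure_margin=1.0):
    """Return (parsed, undecided): how many lines carried both scores and
    how many of those have a CRM114 score within +/- unsure_margin of zero.
    unsure_margin is a hypothetical cut-off chosen for illustration."""
    parsed = undecided = 0
    for line in lines:
        match = SCORE_RE.search(line)
        if match is None:
            continue  # skip lines that are not score records
        parsed += 1
        if abs(float(match.group(2))) < unsure_margin:
            undecided += 1
    return parsed, undecided

# Three records taken from the log above.
sample = [
    "Id:1J4v83-000EIG-Mn SA Score:50.184 CRM114 Score:-0.15",
    "Id:1J4vah-000ExY-GO SA Score:21.129 CRM114 Score:14.96",
    "Id:1J4znT-000MlI-UO SA Score:12.217 CRM114 Score:2.63",
]
parsed, undecided = tally(sample)
print(f"{undecided} of {parsed} CRM114 scores are within the margin")
```

Feeding the whole log through `tally` would show how much of the mail CRM114 is effectively undecided about.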
Is the poor accuracy due to the small number of documents learned on my
server so far?
proxy:/server-root/spamlearn/crm # cssutil -b -r spam.css
Sparse spectra file spam.css statistics:
Total available buckets : 1048577
Total buckets in use : 26847
Total in-use zero-count buckets : 0
Total buckets with value >= max : 0
Total hashed datums in file : 30840
Documents learned : 645
Features learned : 30841
Average datums per bucket : 1.15
Maximum length of overflow chain : 4
Average length of overflow chain : 1.04
Average packing density : 0.03
proxy:/server-root/spamlearn/crm # cssutil -b -r nonspam.css
Sparse spectra file nonspam.css statistics:
Total available buckets : 1048577
Total buckets in use : 55127
Total in-use zero-count buckets : 0
Total buckets with value >= max : 0
Total hashed datums in file : 62625
Documents learned : 666
Features learned : 62626
Average datums per bucket : 1.14
Maximum length of overflow chain : 5
Average length of overflow chain : 1.08
Average packing density : 0.05
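As a sanity check on the two reports, the derived figures can be recomputed from the raw counts. This is a minimal Python sketch, assuming that "Average datums per bucket" is hashed datums divided by buckets in use and that "Average packing density" is buckets in use divided by available buckets. Those formulas are an assumption that happens to match the rounded values above, not something taken from the cssutil source. The helper name `derived` and the features-per-document ratio are likewise only illustrative.

```python
def derived(buckets_total, buckets_used, datums, documents, features):
    """Recompute summary ratios from the raw cssutil counts.
    The formulas are assumed, not taken from the cssutil source."""
    return {
        "datums_per_bucket": round(datums / buckets_used, 2),
        "packing_density": round(buckets_used / buckets_total, 2),
        # Not printed by cssutil; added here for comparison only.
        "features_per_document": round(features / documents, 1),
    }

# Raw counts copied from the two cssutil reports above.
spam = derived(1048577, 26847, 30840, 645, 30841)
nonspam = derived(1048577, 55127, 62625, 666, 62626)

print("spam.css   ", spam)
print("nonspam.css", nonspam)
```

With these counts the sketch reproduces 1.15 and 0.03 for spam.css and 1.14 and 0.05 for nonspam.css. It also shows nonspam.css holding roughly twice as many features per learned document (about 94 versus about 48), although both files have learned a similar number of documents.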
Will this improve automatically or is there something wrong with my
setup?
Kind regards,
JP