<br><br><div class="gmail_quote">2009/8/12 Mauricio Tavares <span dir="ltr"><<a href="mailto:raubvogel@gmail.com">raubvogel@gmail.com</a>></span><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
<div><div></div><div class="h5">Glenn Steen wrote:<br>
<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
2009/8/12 Mauricio Tavares <<a href="mailto:raubvogel@gmail.com" target="_blank">raubvogel@gmail.com</a>>:<br>
<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
Glenn Steen wrote:<br>
<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
2009/8/12 Mauricio Tavares <<a href="mailto:raubvogel@gmail.com" target="_blank">raubvogel@gmail.com</a>>:<br>
<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
Jules Field wrote:<br>
<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
Did you run sa-learn as the same user you run MailScanner as? ("Run As<br>
User" in MailScanner.conf). Otherwise your Bayes database you've been<br>
training will be in the wrong place.<br>
<br>
</blockquote>
I see your point. I was indeed running sa-learn as root, not as<br>
postfix, which should be the user MailScanner runs as. So I guess I<br>
should run it as postfix then. Now, should I delete the root-created<br>
database? Also, where will it save the database?<br>
<br>
</blockquote>
You should delete the one for root if it resides in root's home<br>
directory, since it will be no help at all there... Or move it. But I see<br>
you have configured it to reside somewhere sane, so all you need to do is<br>
make it all owned by postfix.<br>
</blockquote>
Here is an update: I wrote a script that went through all the virtual<br>
email accounts (/var/spool/vmail/<a href="http://domain.com" target="_blank">domain.com</a>) and scanned the spam (placed in<br>
the .Spam folder) and the ham (placed in all the other mail folders). Since<br>
I am running it as postfix:postfix and that directory is owned by<br>
virtual:virtual, I did not get everyone. Is there a way to let the<br>
postfix-owned script check all the mails in the virtual-owned ones? Make<br>
postfix part of the virtual group? I think that is what the sticky bit is<br>
for, right? In any case, here is the output:<br>
<br>
postfix@mail /etc/postfix $ sa-learn --dump magic<br>
0.000 0 3 0 non-token data: bayes db version<br>
0.000 0 1837 0 non-token data: nspam<br>
0.000 0 179092 0 non-token data: nham<br>
0.000 0 3104505 0 non-token data: ntokens<br>
0.000 0 1053729759 0 non-token data: oldest atime<br>
0.000 0 1250081652 0 non-token data: newest atime<br>
0.000 0 1250081434 0 non-token data: last journal sync atime<br>
0.000 0 1250034247 0 non-token data: last expiry atime<br>
0.000 0 0 0 non-token data: last expire atime delta<br>
0.000 0 0 0 non-token data: last expire reduction count<br>
postfix@mail /etc/postfix $<br>
<br>
<br>
As you can see, there is a lot more ham than spam. I wonder how much harm<br>
that would cause in my Bayesian filtering...<br>
<br>
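To put a number on the ham/spam imbalance, here is a minimal sketch that reads `sa-learn --dump magic` output on stdin and reports the two counters. The function name `bayes_counts` is hypothetical, and it assumes the message count sits in the third whitespace-separated column, as in the dump above.<br>

```shell
# Hypothetical helper: summarize nspam/nham from `sa-learn --dump magic` output on stdin.
# Assumes the count is the third whitespace-separated field, as in the dump quoted above.
bayes_counts() {
    awk '
        /non-token data: nspam/ { nspam = $3 }
        /non-token data: nham/  { nham  = $3 }
        END {
            printf "nspam=%d nham=%d\n", nspam, nham
            # Only print a ratio when there is at least one spam message learned.
            if (nspam > 0) printf "ham per spam: %.1f\n", nham / nspam
        }
    '
}
```

Intended use, as the user MailScanner runs as: `sa-learn --dump magic | bayes_counts`.<br>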
<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
If you also use MailWatch, you'll need to make the Apache user's group the<br>
"group owner" for the base directory and all the files, and set the<br>
GID bit for the directory (/var/spool/MailScanner/bayes in your case),<br>
so that any new files get the correct group ownership. Once you've<br>
done that, things should start cooking:-).<br>
</blockquote>
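The group-ownership and GID-bit steps described above could be sketched roughly like this. The function name `share_bayes_dir` is hypothetical, and granting group read/write on the files is an assumption about what the second reader needs; the group to pass in depends on the system.<br>

```shell
# Hypothetical helper: hand a directory tree to a group and make new files inherit that group.
# Usage: share_bayes_dir DIRECTORY GROUP
share_bayes_dir() {
    dir=$1
    group=$2
    # Make GROUP the group owner of the directory and everything in it.
    chgrp -R "$group" "$dir" || return 1
    # Assumption: the group needs read/write; X adds execute only on directories.
    chmod -R g+rwX "$dir" || return 1
    # Set the setgid bit on directories so new files get the directory's group.
    find "$dir" -type d -exec chmod g+s {} +
}
```

For the directory named in the thread this would be called as `share_bayes_dir /var/spool/MailScanner/bayes GROUP`, with GROUP replaced by whatever group the web server runs under.<br>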
Thanks for the suggestion! If I ever use MailWatch, I will try to<br>
remember to use that. =)<br>
<br>
<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
One more thing: Always run your tests (spamassassin --lint and stuff<br>
like that) as your postfix user, to avoid some subtleties that might<br>
otherwise bite.<br>
</blockquote>
postfix@mail /etc/postfix $ spamassassin --lint<br>
[19591] warn: config: warning: score set for non-existent rule WANTS_CREDIT_CARD<br>
[19591] warn: config: warning: score set for non-existent rule FORGED_RCVD_HELO<br>
[19591] warn: lint: 2 issues detected, please rerun with debug enabled for more information<br>
postfix@mail /etc/postfix $<br>
<br>
</blockquote>
Hm, I wonder if your postfix user really can read all the .cf files...<br>
Do as it suggests and see what debug will tell you (spamassassin<br>
--lint -D, as the PF user). Also try running a message through, or<br>
else it will not test bayes for you:<br>
spamassassin -t -D < /path/to/email/file<br>
... and look carefully at what it says about bayes. You might want to<br>
pipe the output to a file (or less). Don't forget to redirect STDERR<br>
as well ( 2>&1).<br>
<br>
Cheers<br>
</blockquote>
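Since the debug output is long, one way to follow the advice above is to merge STDERR into STDOUT and keep only the lines that mention Bayes. This is a small sketch; the function name `bayes_debug_lines` is hypothetical, and the exact wording of the debug lines is not assumed, only that they contain the word "bayes".<br>

```shell
# Hypothetical helper: keep only the lines that mention Bayes from debug output on stdin.
bayes_debug_lines() {
    grep -i 'bayes'
}

# Intended use, run as the MailScanner user (path left as in the thread):
#   spamassassin -t -D < /path/to/email/file 2>&1 | bayes_debug_lines | less
```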
<br></div></div>
Some interesting findings (to me):<br>
<br>
postfix@mail /home/raub/Spam $ spamassassin -D < spam9.eml<br>
<br>
Content analysis details: (10.2 points, 5.0 required)<br>
<br>
pts rule name description<br>
---- ---------------------- --------------------------------------------------<br>
1.8 BAD_ENC_HEADER Message has bad MIME encoding in the header<br>
3.2 CHARSET_FARAWAY_HEADER A foreign language charset used in headers<br>
0.0 BAYES_50 BODY: Bayesian spam probability is 40 to 60%<br>
[score: 0.5000]<br>
1.4 MIME_QP_LONG_LINE RAW: Quoted-printable line longer than 76 chars<br>
0.9 RCVD_IN_SORBS_DUL RBL: SORBS: sent directly from dynamic IP address<br>
[202.132.194.31 listed in <a href="http://dnsbl.sorbs.net" target="_blank">dnsbl.sorbs.net</a>]<br>
0.9 RCVD_IN_PBL RBL: Received via a relay in Spamhaus PBL<br>
[202.132.194.31 listed in <a href="http://zen.spamhaus.org" target="_blank">zen.spamhaus.org</a>]<br>
2.0 RCVD_IN_BL_SPAMCOP_NET RBL: Received via a relay in <a href="http://bl.spamcop.net" target="_blank">bl.spamcop.net</a><br>
[Blocked - see <<a href="http://www.spamcop.net/bl.shtml?202.132.194.31" target="_blank">http://www.spamcop.net/bl.shtml?202.132.194.31</a>>]<br>
0.1 RDNS_DYNAMIC Delivered to trusted network by host with<br>
dynamic-looking rDNS<br>
0.0 MISSING_MIMEOLE Message has X-MSMail-Priority, but no X-MimeOLE<br>
<br>
But, as me:<br>
<br>
raub@mail ~/Spam $ spamassassin -D < spam9.eml<br>
[...]<br>
<br>
Content analysis details: (12.7 points, 5.0 required)<br>
<br>
pts rule name description<br>
---- ---------------------- --------------------------------------------------<br>
2.9 BAD_ENC_HEADER Message has bad MIME encoding in the header<br>
3.2 CHARSET_FARAWAY_HEADER A foreign language charset used in headers<br>
1.8 MIME_QP_LONG_LINE RAW: Quoted-printable line longer than 76 chars<br>
0.9 RCVD_IN_PBL RBL: Received via a relay in Spamhaus PBL<br>
[202.132.194.31 listed in <a href="http://zen.spamhaus.org" target="_blank">zen.spamhaus.org</a>]<br>
1.6 RCVD_IN_SORBS_DUL RBL: SORBS: sent directly from dynamic IP address<br>
[202.132.194.31 listed in <a href="http://dnsbl.sorbs.net" target="_blank">dnsbl.sorbs.net</a>]<br>
2.2 RCVD_IN_BL_SPAMCOP_NET RBL: Received via a relay in <a href="http://bl.spamcop.net" target="_blank">bl.spamcop.net</a><br>
[Blocked - see <<a href="http://www.spamcop.net/bl.shtml?202.132.194.31" target="_blank">http://www.spamcop.net/bl.shtml?202.132.194.31</a>>]<br>
0.1 RDNS_DYNAMIC Delivered to trusted network by host with<br>
dynamic-looking rDNS<br>
0.0 MISSING_MIMEOLE Message has X-MSMail-Priority, but no X-MimeOLE<br>
<br>
So, I guess the above means that Bayes was not run when I ran spamassassin as me, because my user did not have the rights to access the database. I can live with that.<br>
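A quick way to compare two such reports is to check whether a BAYES_NN rule shows up at all. The sketch below does only that; the function name `has_bayes_hit` is hypothetical, and it assumes the report format shown above, where the rule name is a separate whitespace-delimited column.<br>

```shell
# Hypothetical helper: succeed if the report on stdin lists a BAYES_NN rule.
has_bayes_hit() {
    grep -Eq '(^|[[:space:]])BAYES_[0-9]+([[:space:]]|$)'
}

# Intended use:
#   spamassassin -t < spam9.eml | has_bayes_hit && echo "Bayes was consulted"
```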
<br>
On a related note, why is it saying 5.0 points required if in MailScanner.conf I have<br>
<br>
Required SpamAssassin Score = 4.7<br>
<br>
Do I also have to define required_hits 4.70 in spam.assassin.prefs.conf?<div><div></div><div class="h5"><br>
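</div></div></blockquote></div><br>Hi<br><br>There are two settings in MailScanner.conf for SA scores: a required score and a high score. This lets you tag mail that reaches the lower score as "maybe spam" and still deliver it, while treating mail that reaches the high score as definite spam and just dropping it. <br clear="all">
<br>These thresholds are applied by MailScanner and differ from SA's own view of the world, so the spamassassin command line still reports its own required score.<br><br>-- <br>Martin Hepworth<br>Oxford, UK<br>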
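The two-threshold idea can be sketched as a small classifier. This is only an illustration: the function name `classify_score` is hypothetical, the 4.7 comes from the thread, the high score of 10 in the usage line is an assumed value, and treating a score equal to a threshold as reaching it is also an assumption.<br>

```shell
# Hypothetical helper: classify a score against a required and a high threshold.
# Usage: classify_score SCORE REQUIRED HIGH
classify_score() {
    awk -v s="$1" -v req="$2" -v high="$3" 'BEGIN {
        # "+ 0" forces numeric comparison. Assumption: equality counts as reaching the threshold.
        if (s + 0 >= high + 0)     print "high-scoring spam"
        else if (s + 0 >= req + 0) print "spam"
        else                       print "not spam"
    }'
}
```

For example, `classify_score 10.2 4.7 10` would put the message scored above into the high-scoring bucket, while a message scoring 6.0 would be tagged and delivered under the same assumed thresholds.<br>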