<br><br><div class="gmail_quote">2009/8/12 Mauricio Tavares <span dir="ltr">&lt;<a href="mailto:raubvogel@gmail.com">raubvogel@gmail.com</a>&gt;</span><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

<div><div></div><div class="h5">Glenn Steen wrote:<br>

<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

2009/8/12 Mauricio Tavares &lt;<a href="mailto:raubvogel@gmail.com" target="_blank">raubvogel@gmail.com</a>&gt;:<br>

<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

Glenn Steen wrote:<br>

<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

2009/8/12 Mauricio Tavares &lt;<a href="mailto:raubvogel@gmail.com" target="_blank">raubvogel@gmail.com</a>&gt;:<br>

<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

Jules Field wrote:<br>

<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

Did you run sa-learn as the same user you run MailScanner as? (&quot;Run As<br>

User&quot; in MailScanner.conf). Otherwise your Bayes database you&#39;ve been<br>

training will be in the wrong place.<br>

<br>

</blockquote>

      I see your point. I was indeed running sa-learn as root, not as<br>

postfix, which should be the user MailScanner runs as. So I guess I<br>

should run it as postfix instead. Now, should I delete the root-created<br>

database? Also, where will the database be saved?<br>

<br>

</blockquote>

You should delete the one for root, if it resides in root&#39;s home<br>

directory, since that will be no help at all... Or move it. But I see<br>

you have configured it to reside somewhere sane, so all you need to do is<br>

make it all owned by postfix.<br>

</blockquote>

       Here is an update: I wrote a script that went through all the virtual<br>

email accounts (/var/spool/vmail/<a href="http://domain.com" target="_blank">domain.com</a>) and scanned the spam (placed in<br>

the .Spam folder) and the ham (placed in all the other mail folders). Since<br>

I am running it as postfix:postfix and that directory is owned by<br>

virtual:virtual, I did not get everyone. Is there a way to let the<br>

postfix-owned script check all the mails in the virtual-owned ones? Make<br>

postfix part of the virtual group? I think that is what the sticky bit is<br>

for, right? In any case, here is the output:<br>

<br>

postfix@mail /etc/postfix $ sa-learn --dump magic<br>

0.000          0          3          0  non-token data: bayes db version<br>

0.000          0       1837          0  non-token data: nspam<br>

0.000          0     179092          0  non-token data: nham<br>

0.000          0    3104505          0  non-token data: ntokens<br>

0.000          0 1053729759          0  non-token data: oldest atime<br>

0.000          0 1250081652          0  non-token data: newest atime<br>

0.000          0 1250081434          0  non-token data: last journal sync atime<br>

0.000          0 1250034247          0  non-token data: last expiry atime<br>

0.000          0          0          0  non-token data: last expire atime delta<br>

0.000          0          0          0  non-token data: last expire reduction count<br>

postfix@mail /etc/postfix $<br>

<br>

<br>

As you can see, there is a lot more ham than spam. I wonder how much harm<br>

that would cause in my Bayesian filtering...<br>

<br>

<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

If you also use MailWatch, you&#39;ll need to make the apache user&#39;s group the<br>

&quot;group owner&quot; for the base directory and all the files, and set the<br>

GID bit for the directory (/var/spool/MailScanner/bayes in your case),<br>

so that any new files get the correct group ownership. Once you&#39;ve<br>

done that, things should start cooking:-).<br>

</blockquote>

       Thanks for the suggestion! If I ever use MailWatch, I will try to<br>

remember to use that. =)<br>

<br>

<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

One more thing: Always run your tests (spamassassin --lint and stuff<br>

like that) as your postfix user, to avoid some subtleties that might<br>

otherwise bite.<br>

</blockquote>

postfix@mail /etc/postfix $ spamassassin --lint<br>

[19591] warn: config: warning: score set for non-existent rule WANTS_CREDIT_CARD<br>

[19591] warn: config: warning: score set for non-existent rule FORGED_RCVD_HELO<br>

[19591] warn: lint: 2 issues detected, please rerun with debug enabled for more information<br>

postfix@mail /etc/postfix $<br>

<br>

</blockquote>

Hm, I wonder if your postfix user really can read all the .cf files...<br>

Do as it suggests and see what debug will tell you (spamassassin<br>

--lint -D, as the PF user). Also try running a message through, or<br>

else it will not test bayes for you:<br>

spamassassin -t -D &lt; /path/to/email/file<br>

... and look carefully at what it says about bayes. You might want to<br>

pipe the output to a file (or less). Don&#39;t forget to redirect STDERR<br>

as well ( 2&gt;&amp;1).<br>

<br>

Cheers<br>

</blockquote>

<br></div></div>

        Some interesting findings (to me):<br>

<br>

postfix@mail /home/raub/Spam $ spamassassin -D &lt; spam9.eml<br>

<br>

Content analysis details:   (10.2 points, 5.0 required)<br>

<br>

 pts rule name              description<br>

---- ---------------------- --------------------------------------------------<br>

 1.8 BAD_ENC_HEADER         Message has bad MIME encoding in the header<br>

 3.2 CHARSET_FARAWAY_HEADER A foreign language charset used in headers<br>

 0.0 BAYES_50               BODY: Bayesian spam probability is 40 to 60%<br>

                            [score: 0.5000]<br>

 1.4 MIME_QP_LONG_LINE      RAW: Quoted-printable line longer than 76 chars<br>

 0.9 RCVD_IN_SORBS_DUL      RBL: SORBS: sent directly from dynamic IP address<br>

                            [202.132.194.31 listed in <a href="http://dnsbl.sorbs.net" target="_blank">dnsbl.sorbs.net</a>]<br>

 0.9 RCVD_IN_PBL            RBL: Received via a relay in Spamhaus PBL<br>

                            [202.132.194.31 listed in <a href="http://zen.spamhaus.org" target="_blank">zen.spamhaus.org</a>]<br>

 2.0 RCVD_IN_BL_SPAMCOP_NET RBL: Received via a relay in <a href="http://bl.spamcop.net" target="_blank">bl.spamcop.net</a><br>

              [Blocked - see &lt;<a href="http://www.spamcop.net/bl.shtml?202.132.194.31" target="_blank">http://www.spamcop.net/bl.shtml?202.132.194.31</a>&gt;]<br>

 0.1 RDNS_DYNAMIC           Delivered to trusted network by host with<br>

                            dynamic-looking rDNS<br>

 0.0 MISSING_MIMEOLE        Message has X-MSMail-Priority, but no X-MimeOLE<br>

<br>

But, as me:<br>

<br>

raub@mail ~/Spam $ spamassassin -D &lt; spam9.eml<br>

[...]<br>

<br>

Content analysis details:   (12.7 points, 5.0 required)<br>

<br>

 pts rule name              description<br>

---- ---------------------- --------------------------------------------------<br>

 2.9 BAD_ENC_HEADER         Message has bad MIME encoding in the header<br>

 3.2 CHARSET_FARAWAY_HEADER A foreign language charset used in headers<br>

 1.8 MIME_QP_LONG_LINE      RAW: Quoted-printable line longer than 76 chars<br>

 0.9 RCVD_IN_PBL            RBL: Received via a relay in Spamhaus PBL<br>

                            [202.132.194.31 listed in <a href="http://zen.spamhaus.org" target="_blank">zen.spamhaus.org</a>]<br>

 1.6 RCVD_IN_SORBS_DUL      RBL: SORBS: sent directly from dynamic IP address<br>

                            [202.132.194.31 listed in <a href="http://dnsbl.sorbs.net" target="_blank">dnsbl.sorbs.net</a>]<br>

 2.2 RCVD_IN_BL_SPAMCOP_NET RBL: Received via a relay in <a href="http://bl.spamcop.net" target="_blank">bl.spamcop.net</a><br>

              [Blocked - see &lt;<a href="http://www.spamcop.net/bl.shtml?202.132.194.31" target="_blank">http://www.spamcop.net/bl.shtml?202.132.194.31</a>&gt;]<br>

 0.1 RDNS_DYNAMIC           Delivered to trusted network by host with<br>

                            dynamic-looking rDNS<br>

 0.0 MISSING_MIMEOLE        Message has X-MSMail-Priority, but no X-MimeOLE<br>

<br>

So, I guess the above means that Bayes was not run when I ran spamassassin as myself, because my own user does not have the rights to access the database. I can live with that.<br>

<br>

On a related note, why is it saying 5.0 points required if in MailScanner.conf I have<br>

<br>

Required SpamAssassin Score = 4.7<br>

<br>

Do I also have to define required_hits 4.70 in spam.assassin.prefs.conf?<div><div></div><div class="h5"><br>


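</div></div></blockquote></div><br>Hi<br><br>There are two settings in MailScanner.conf for SA scores: &quot;Required SpamAssassin Score&quot; and &quot;High SpamAssassin Score&quot;. This gives you the opportunity to mark mail above the first threshold as &quot;maybe spam&quot; and still deliver it, and to treat mail above the high score as definitely spam and just drop it. <br clear="all">

<br>This differs from SA&#39;s view of the world, which has a single threshold (required_score, 5.0 by default). That is the figure spamassassin reports when you run it by hand.<br><br>-- <br>Martin Hepworth<br>Oxford, UK<br>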
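<br>As a rough illustration of the two-threshold idea, here is a minimal shell sketch. It is not MailScanner's code: the function name classify_score and the labels are made up for the example, 4.7 is the required score quoted earlier in the thread, and 10 for the high score is only an assumed value.<br>

```shell
#!/bin/sh
# Hypothetical helper: classify a score against two thresholds.
# Arguments: score, required threshold, high threshold.
classify_score() {
    score=$1
    required=$2
    high=$3
    # awk handles the decimal comparison; "+ 0" forces numeric context.
    awk -v s="$score" -v r="$required" -v h="$high" 'BEGIN {
        if (s + 0 >= h + 0)      print "high-spam"
        else if (s + 0 >= r + 0) print "spam"
        else                     print "ham"
    }'
}

# 10.2 is the score from the run as postfix in the thread;
# 4.7 comes from the thread, 10 is an assumed high score.
classify_score 10.2 4.7 10
# A borderline message: spam at 4.7, but not at a threshold of 5.0.
classify_score 4.8 4.7 10
classify_score 4.8 5.0 10
```

The last two calls show why the threshold in use matters: the same 4.8 score is tagged as spam with a required score of 4.7 but passes as ham against 5.0.<br>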