FuzzyOcr working but not via MailScanner

Rick Cooper rcooper at dwford.com
Thu Oct 19 14:17:24 IST 2006


 

> -----Original Message-----
> From: mailscanner-bounces at lists.mailscanner.info 
> [mailto:mailscanner-bounces at lists.mailscanner.info] On Behalf 
> Of Glenn Steen
> Sent: Thursday, October 19, 2006 4:42 AM
> To: MailScanner discussion
> Subject: Re: FuzzyOcr working but not via MailScanner
> 
> On 19/10/06, Glenn Steen <glenn.steen at gmail.com> wrote:
> > On 19/10/06, Anthony Cartmell <ajcartmell at fonant.com> wrote:
> > > > I think you are on to something there Scott. I'll offer a guess...
> > > > Anthony, are you by any chance running Postfix?
> > >
> > > Nope, sendmail.
> >
> > OK. Was just a thought:-).
> >
> > > It seems to be a "search path" issue: MailScanner skips a lot of setup
> > > stuff that spamassassin does from the command line. From MailScanner, the
> > > whole /var/lib/spamassassin/3.001003 directory was being missed and hence
> > > a whole load of default rules.
[...]

> (just proving my PF "roots"....)
> 
> I looked at SpamAssassin.pm, and this snippet is the clincher:
> -----
> @default_rules_path = (
>   '__local_state_dir__/__version__',
>   '__def_rules_dir__',
>   '__prefix__/share/spamassassin',
>   '/usr/local/share/spamassassin',
>   '/usr/share/spamassassin',
> );
> -----
> As you can see, the normal way to find the sa-updated files is via
> "__local_state_dir__/__version__", where __local_state_dir__ defaults
> to /var/lib/spamassassin .... and __version__ is set (just above that)
> to something like 3.00100X ... If you did an upgrade (and perhaps
> didn't do an sa-update afterwards) I suppose you could end up in a
> situation where the new SA version couldn't find the updated files, I
> suppose (someone a bit more fluent (than me) in how SA is instantiated
> will probably be able to tell if this supposition is correct).
> I suppose running sa-update should clear any such problem... And you
> might clear the FuzzyOcr problem by resetting the MailScanner option
> for site rules.

I can tell you from experience that if you do not run sa-update following an
SA upgrade, your /var/lib/spamassassin/<version> directory is not created, and
when SA does its path checks it will (there is a thread on this on one of
the SA lists):

1. Check for /var/lib/spamassassin/<version>.
2. If step 1 fails, use the local site dir.

It's unnecessarily complicated in my opinion compared to how it used to be.
One would think you could do something like the old 
	$defaultrules = $test->{default_rules_path};
	$defaultrules ||= $test->first_existing_path(@Mail::SpamAssassin::default_rules_path);

and use 

	$defaultupdaterules = $test->{default_update_rules_path};
	$defaultupdaterules ||= $test->first_existing_path(@Mail::SpamAssassin::default_update_rules_path);

If I am remembering the balance of the thread it's now something like

	my $SAVersion = $Mail::SpamAssassin::VERSION;
	# Declare first: "my $x = ... if COND" has undefined behaviour in Perl.
	my $defaultrules;
	$defaultrules = "/var/lib/spamassassin/$SAVersion" if -d "/var/lib/spamassassin/$SAVersion";
	$defaultrules ||= $test->first_existing_path(@Mail::SpamAssassin::default_rules_path);
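
To make the fallback easier to see, here is a rough model of that lookup. It
is written in Python rather than Perl and is only a sketch of the behaviour
described in this thread, not SpamAssassin's or MailScanner's actual code. The
names pick_default_rules and first_existing_path (as a free function) are made
up for illustration, and the directories live in a temporary tree standing in
for /var/lib and /usr/share.

```python
import os
import tempfile


def first_existing_path(candidates):
    """Return the first candidate that is an existing directory, else None."""
    for path in candidates:
        if os.path.isdir(path):
            return path
    return None


def pick_default_rules(state_dir, version, fallback_paths):
    """Prefer state_dir/version (the directory sa-update creates);
    otherwise fall back to the first existing path in fallback_paths."""
    versioned = os.path.join(state_dir, version)
    if os.path.isdir(versioned):
        return versioned
    return first_existing_path(fallback_paths)


# Throwaway tree standing in for the real filesystem (assumption for the demo).
root = tempfile.mkdtemp()
state_dir = os.path.join(root, "var", "lib", "spamassassin")
share_dir = os.path.join(root, "usr", "share", "spamassassin")
os.makedirs(os.path.join(state_dir, "3.001001"))  # left over from an older release
os.makedirs(share_dir)

# Before sa-update: no directory for the new version, so we fall back.
before = pick_default_rules(state_dir, "3.001007", [share_dir])

# After sa-update has (hypothetically) created the versioned directory.
os.makedirs(os.path.join(state_dir, "3.001007"))
after = pick_default_rules(state_dir, "3.001007", [share_dir])

print(before)
print(after)
```

Note that the leftover 3.001001 directory is never considered: only the
directory matching the running version is checked before the fallback.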

In my case I forgot to run sa-update after upgrading to 3.1.7, so
/var/lib/spamassassin/3.001007 was not created and SA defaulted to
/usr/share/spamassassin, even though /var/lib/spamassassin/3.001001 was still
there (yes, I skipped some updates; time constraints). I did not change the
"SpamAssassin Default Rules Dir" setting from the default (blank), so I am
wondering whether the proper /var/lib/spamassassin/3.001007 directory is
being used when SA is run from MailScanner. Julian, can you answer that
question?
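
For anyone puzzled by the directory names: they follow SpamAssassin's numeric
version form, in which release 3.1.7 appears as 3.001007. The small Python
sketch below shows the mapping as I understand it; sa_version_dir is a
hypothetical helper name, not something SA provides.

```python
def sa_version_dir(release):
    """Turn a dotted release such as '3.1.7' into the '3.001007' form
    used for directories under /var/lib/spamassassin."""
    major, minor, patch = (int(part) for part in release.split("."))
    # Minor and patch are each zero-padded to three digits.
    return "%d.%03d%03d" % (major, minor, patch)


print(sa_version_dir("3.1.7"))  # 3.001007
print(sa_version_dir("3.1.1"))  # 3.001001
```

So a quick check after an upgrade is whether the directory named by the new
version exists; if it does not, running sa-update should create it.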

Also, along the lines of the thread: when I first installed FuzzyOcr and was
testing it, I forgot to set the value of focr_autodisable_score (in
FuzzyOcr.cf) lower (I think the default is 10 or something like that), so it
appeared that Fuzzy wasn't doing anything because the spams with images were
already scoring above that. I lowered the score to 2 (temporarily) and then
Fuzzy was hitting. Of course I raised it back to my hit range so it wouldn't
waste resources when the message was already scored as spam.

Rick

