Starter Bayes database for SA 3.0 is available

Steve Swaney Steve.Swaney at FSL.COM
Thu Sep 23 20:21:44 IST 2004


> -----Original Message-----
> From: MailScanner mailing list [mailto:MAILSCANNER at JISCMAIL.AC.UK] On
> Behalf Of Matt Kettler
> Sent: Thursday, September 23, 2004 2:17 PM
> To: MAILSCANNER at JISCMAIL.AC.UK
> Subject: Re: Starter Bayes database for SA 3.0 is available
>
> At 01:53 PM 9/23/2004, Steve Swaney wrote:
> >The bad news is I had trouble upgrading a Bayes database from SA 2.64 ->
> >3.0. The good news is that as a result, I was forced to build a new SA 3.0
> >DB from our saved ham and spam libraries.
> >
> >This database is for Linux only and is available at www.fsl.com/support
> >
> >With luck I'll have the FreeBSD database available shortly, late today or
> >over the weekend.
>
> Hmm.. not to be overly critical, but doesn't the concept of using a starter
> bayes database defeat some of the purpose of using bayes in the first place?
>
> Remember, the primary advantage bayes has is that it learns the email
> patterns of your site, which are very different from other people's sites.
> While spam training is sharable in a reasonable way, ham training contains
> considerable site-to-site variances.
>
> I suppose in some situations a starter is better than nothing, and it's a
> really good thing to have out there for people that need it. However I
> think it would be wise to at least advise people that this is a rather
> sub-optimal way to start a bayes DB, rather than give them the false
> impression this is a good idea with no drawbacks.
>
> Just my $0.02.
>

In theory you're more than $0.02 right, but we've had good results using this
database at many sites over the last year. It seems to match up fairly well
for general business and academic use. When used in conjunction with
bayes_auto_learn and regular database expiry, it seems to learn well enough
to match the site's email characteristics over time.
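
As a rough illustration of that setup (a sketch under assumptions, not the
exact configuration used at these sites), auto-learning and expiry are
controlled by SpamAssassin options along the following lines. The file
location is an assumption: local.cf on a plain SpamAssassin install, or
typically spam.assassin.prefs.conf under MailScanner. The threshold and size
values shown are believed to be the stock defaults and are listed only so
they are easy to find and adjust.

```
# Enable the Bayes classifier and let it train itself on confident verdicts
use_bayes 1
bayes_auto_learn 1

# Scores below/above these thresholds are auto-learned as ham/spam
# (assumed stock defaults; tune per site)
bayes_auto_learn_threshold_nonspam 0.1
bayes_auto_learn_threshold_spam 12.0

# Expire old tokens automatically once the database grows past this size
bayes_auto_expire 1
bayes_expiry_max_db_size 150000
```

Running `spamassassin --lint` after editing should report any misspelled
option names.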

Also, over the last year, I can remember only one database "poisoning" across
all the sites that have used this starter database, and I can't be sure the
starter database caused the problem.

My personal experience with Bayes is that custom feeding of missed ham and
spam definitely increases correct spam detection and reduces false
positives. At the same time, using the starter database and then putting
Bayes on auto-pilot works well at sites where there is not enough time or
resources to hand feed the Bayes database.

Steve

Steve Swaney
President
Fortress Systems Ltd.
www.fsl.com
steve.swaney at fsl.com

> ------------------------ MailScanner list ------------------------
> To unsubscribe, email jiscmail at jiscmail.ac.uk with the words:
> 'leave mailscanner' in the body of the email.
> Before posting, read the MAQ (http://www.mailscanner.biz/maq/) and
> the archives (http://www.jiscmail.ac.uk/lists/mailscanner.html).
>
> --
> This message has been scanned for viruses and
> dangerous content by MailScanner, and is
> believed to be clean.
>
> Fortress Systems Ltd.
> www.fsl.com
>





