SpamAssassin preferences for every domain

paddy paddy at PANICI.NET
Fri Dec 31 18:16:21 GMT 2004


On Fri, Dec 31, 2004 at 04:12:14PM +0100, Felix Schwarz wrote:
> Hi all,
>
> if I understand the docs correctly I can't set up a SpamAssassin
> preference file for every domain I'm hosting (only 1 MailScanner
> instance running). That is because MS uses SA as a library and SA
> initializes all options at the beginning (MS start).
>
> Are there any plans to circumvent this limitation?
> What I would like to do: Hosting multiple domains (customers) on one
> server. Every customer should have its own SA preferences (especially
> bayes database).
>
>
> When browsing the SA sourcecode I found the signal_user_changed method
> (especially the user_dir option is interesting to me). IMHO that method
> would be sufficient in order to implement the behavior described
> above.
>
> I know that this would be less efficient than the normal MS operation
> mode because the bayes db and the whole stuff must be loaded, verified
> etc. but as I'm in the process of installing a server that won't have
> much load (5-10K mails/day) I think that would be worth it.
>
> Furthermore there are some optimizations that could be done (such as
> having multiple SA instances that are specific for a certain
> domain/user so that they don't have to reload their databases all the
> time).

I think I know what you mean.  A couple of thoughts:

(Note that I am far from being an expert on the subject, and the following
will likely need substantial correction from Julian!)

When SA is run, for example, from your .procmailrc, it knows which user it
is running for because it is already delivery time.  AFAIK MailScanner
deals with recipient addresses, which is not quite the same thing, but
may be close enough for your liking.

Similarly, MailScanner is built around a batch processing design that
gives good scalability, but as I understand it, the consequence of combining
that with SA is that you'd be stuck with the same prefs for at least the
lifetime of a batch.  signal_user_changed certainly sounds interesting.

If you were stuck, what could you do?  Changing batch creation so that all
the contents of a batch belong to the same domain would be simple enough
to do, and not absurdly expensive at batch creation time, but it would very
likely make a nonsense of having a batch processing design in the first
place.  With a small load, that may not be much of an issue for you, and you
could consider various strategies such as waiting for a load and falling
back to the batch mode at some critical load/queue size.  As to the cost of
an SA load when it happens, pre-forking and that sort of thing, I'd need to
look, but we're not in outer space yet as far as I can see.  How many domains?
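
To make the batching idea above a bit more concrete, here is a rough sketch
in Python.  It is not MailScanner code: make_batches, recipient_domain, the
threshold value and the (message_id, recipient) queue layout are all made up
for illustration, and it assumes one recipient per message, which real mail
does not guarantee.

```python
from collections import defaultdict

# Hypothetical threshold: at or above this queue size, stop building
# per-domain batches and fall back to ordinary mixed batches.
CRITICAL_QUEUE_SIZE = 100


def recipient_domain(address):
    """Return the lower-cased domain part of a recipient address."""
    return address.rsplit("@", 1)[-1].lower()


def make_batches(queue, max_batch=30, critical=CRITICAL_QUEUE_SIZE):
    """Split a queue of (message_id, recipient) pairs into batches.

    Under light load every batch holds mail for a single domain, so one
    set of preferences could apply to the whole batch.  Once the queue
    reaches `critical` entries, plain mixed batches are returned instead.
    The result is a list of (domain_or_None, [message_id, ...]) tuples.
    """
    if len(queue) >= critical:
        ids = [msg_id for msg_id, _ in queue]
        return [(None, ids[i:i + max_batch])
                for i in range(0, len(ids), max_batch)]

    # Group message ids by recipient domain, keeping arrival order.
    by_domain = defaultdict(list)
    for msg_id, rcpt in queue:
        by_domain[recipient_domain(rcpt)].append(msg_id)

    batches = []
    for domain, ids in by_domain.items():
        for i in range(0, len(ids), max_batch):
            batches.append((domain, ids[i:i + max_batch]))
    return batches
```

With only a handful of domains the per-domain batches stay reasonably full;
with many domains and little mail, most batches would hold a single message,
which is the "nonsense" case mentioned above.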

Even if signal_user_changed would work, you'd be into approximately the
same things to get the benefits of batching against SA, but at least you
might more easily retain batched virus scanning.
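
A sketch of what that might look like, again in Python and again purely
hypothetical: the batch stays mixed (so virus scanning could still see it
whole), but the spam check walks it domain by domain and switches prefs only
when the domain changes.  switch_prefs and scan are placeholders supplied by
the caller; whether signal_user_changed could actually play the role of
switch_prefs inside MailScanner is untested.

```python
from itertools import groupby


def scan_batch(batch, switch_prefs, scan):
    """Spam-check a mixed batch with one prefs switch per domain.

    batch        -- list of (message_id, domain) pairs
    switch_prefs -- placeholder callable, invoked once per domain
    scan         -- placeholder callable, invoked once per message id
    Returns a dict mapping message_id to whatever scan() returned.
    """
    results = {}
    # Sort by domain so that each domain's messages are adjacent.
    ordered = sorted(batch, key=lambda item: item[1])
    for domain, group in groupby(ordered, key=lambda item: item[1]):
        switch_prefs(domain)  # reload prefs once for this domain
        for msg_id, _ in group:
            results[msg_id] = scan(msg_id)
    return results
```

The number of prefs reloads per batch is then the number of distinct domains
in it, not the number of messages.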

Also, I understand that amavisd-new endeavours to do this sort of thing, but
I haven't used it, mainly because (to my untutored eye) it appears to be
written in an obscure dialect of Martian that I don't speak. :)

> If you don't want to implement that by yourself what do you think:
> + How much work would it be for a moderately skilled perl programmer?

Disclaimer: I am not even a skilled perl programmer!

Not an enormous amount.

> + Would you (Julian) accept a patch that would implement the feature
> described above (without changing the default behavior of course)?

Disclaimer: I am (still) not Julian :)

How intrusive is the patch?

Without actually looking at the SA stuff, I can't see why it would have
to be especially bad.

What would be the consequences in terms of the way MailScanner is used?
For instance, would it cause more problems than it solved?

(I'll let someone else answer that one, I really don't know)

Why am I answering these questions?

Because I too came to MailScanner expecting to find something like this.
Now I'm inclined to ask 'Are you sure you really need that?'.

Perhaps you do; I realised I didn't.

Last quick thoughts:

Bayes is a good example of a system that works better at the periphery of
the network than at the center, but that doesn't mean you can't do both.

If you're deeply wedded to the per-user/per-domain idea, then you should
also look at how the MailScanner configuration works: it's not a
~/.mailscanner.conf sort of thing, but more like, say, sendmail.

Good Luck, and Happy New Year!

Regards,
Paddy
--
Perl 6 will give you the big knob. -- Larry Wall
