An express checkout? [was: Re: Postfix and Mailscanner sitting in a tree k-iss-ing]

paddy paddy at PANICI.NET
Fri Dec 31 23:44:19 GMT 2004


Julian,

I'm glad I'm the only one of us that's the sad-act posting here on
new year's eve ;) (hell, easier than apologising for replying to myself!)

I'll try to improve on my last thoughts _and_ keep it short. With any luck
you'll read this one first _and_ it'll make sense :)

the tail|sed gives me a rolling indication of latency on the mail queue.
  (In practice I'm interested in the time to get through mailscanner, which
  under ordinary conditions is a non-issue, and even under critical conditions
  I have found to be very acceptable).  I see there is now a LogSpeed option,
  never noticed that before (probably been there forever :)

Upon reflection I can't see a 'simple criterion' that's cheap enough to be
a no-brainer to use unless you can do some processing before the incoming mail
first goes to disk.

  (My first choice would be originating IP. I did briefly, in desperation,
  consider size).  Anything else is just equivalent to what MailScanner
  already does (dispatch RBL queries early, etc) only my suggestions
  were weaker :)

  I'm also imagining that any processing before the mail hits disk
  is at a premium in a DoS/highload situation, although that may not be the
  case if the cpu is not the bottleneck ...

I don't think the express checkout idea is necessarily a totally lost cause:

  sure, the cost of scheduling can easily drown the value, but a system
  where the order of operations affects the cost is a promising target.

  the original intention - differential QoS based on approximate spamminess -
  still seems good.  The problem is implementing it at acceptable costs.
  (remember Magnus Pike?)
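
  To make the handwaving a bit more concrete, here is a rough sketch of an
  express lane keyed on originating IP. Everything in it is hypothetical
  (ExpressQueue, cheap_score, the trusted-IP set); none of it is MailScanner
  code, and the heap push is exactly the kind of scheduling cost that could
  drown the value.

```python
import heapq
from itertools import count


class ExpressQueue:
    """Hypothetical two-lane queue: lowest score first, FIFO within a score."""

    def __init__(self):
        self._heap = []
        self._seq = count()  # tie-breaker that preserves arrival order

    def push(self, msg_id, score):
        # O(log n) per message: this is the scheduling cost to weigh up.
        heapq.heappush(self._heap, (score, next(self._seq), msg_id))

    def pop_batch(self, n):
        """Remove and return up to n message ids, express lane first."""
        batch = []
        while self._heap and len(batch) < n:
            batch.append(heapq.heappop(self._heap)[2])
        return batch


def cheap_score(source_ip, trusted_ips):
    """Assumed pre-scan criterion: 0 = express lane, 1 = normal lane."""
    return 0 if source_ip in trusted_ips else 1


if __name__ == "__main__":
    trusted = {"192.0.2.1"}  # documentation-range address, purely illustrative
    q = ExpressQueue()
    q.push("a", cheap_score("203.0.113.9", trusted))
    q.push("b", cheap_score("192.0.2.1", trusted))
    q.push("c", cheap_score("203.0.113.10", trusted))
    print(q.pop_batch(2))  # ['b', 'a']
```

  The score has to be known before the message is queued, which is the
  "processing before the mail hits disk" problem all over again.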

  <more insane and pointless handwaving>
    I also had this vague idea that using directories for the elevator in the
    CriticalQueue condition might be cheaper than sorting by date, but the
    problem is obvious ....

What I realise is:

  I don't really understand the trade-off between batch size and MaxChildren

  I'd certainly appreciate it if you, or anyone for that matter :), could help
  me with this.  Since they are both limits, I imagine that describing the
  limiting conditions will help.

  I'm just re-reading the notes in the conf file.

    Does a mailscanner child really consume ~20MB ?  Why ?

  based on your 'try 5 children per CPU' comment, I'm guessing that more
  children = more cpu heavy (which makes sense anyway).
  (must fix my CPU utilisation logging! :)
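
  As a back-of-envelope way of describing the limiting conditions, the
  sketch below takes the ~20MB-per-child and 5-children-per-CPU figures at
  face value and returns whichever limit bites first. The function name and
  the reserve_mb allowance for everything else on the box are my own
  assumptions, not anything from the conf file.

```python
def max_children(cpus, ram_mb, mb_per_child=20, children_per_cpu=5,
                 reserve_mb=256):
    """Return the smaller of the CPU-based and memory-based child limits.

    mb_per_child and children_per_cpu come from the figures quoted above;
    reserve_mb is an assumed allowance for the MTA, the OS, and so on.
    """
    by_cpu = cpus * children_per_cpu
    by_ram = max(0, (ram_mb - reserve_mb) // mb_per_child)
    return min(by_cpu, by_ram)


if __name__ == "__main__":
    print(max_children(cpus=2, ram_mb=1024))  # 10 (CPU-bound)
    print(max_children(cpus=4, ram_mb=512))   # 12 (memory-bound)
```

  This says nothing about batch size, which is the half of the trade-off I
  still don't understand.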

  Is there even a BatchSize type option? Is MailScanner even batch-oriented
  in the way I had imagined? is MaxUnscannedMessagesPerScan it ?

  I'm also amused to discover (see previous mail) I have

    Max Normal Queue Size = 5000

  This reminds me of the 'per-user SpamAssassin' thread tonight.  There are
  already so many options, no doubt for each one there is somebody who
  really needs it, but nobody could really need them all (could they?),
  and the idea that anybody needs a new one should at least attract a
  little skepticism.  But then, I expect I'm preaching to the priest !

  would any of the options make sense in multiple units?
    for (over)simplified example: 5000 mails or 5 mails per GHz of cpu
    perhaps this is best left to admin and configuration tools?
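
    If it were left to a configuration tool, the conversion could be as
    small as the sketch below. The '5/GHz' spelling is invented purely for
    illustration; as far as I know MailScanner accepts nothing of the kind,
    and resolve_limit is a made-up name.

```python
def resolve_limit(spec, cpu_ghz):
    """Turn '5000' or a hypothetical '5/GHz' into an absolute message count.

    The '/GHz' suffix is an invented notation, not a MailScanner feature.
    Fractional results are truncated towards zero.
    """
    spec = spec.strip()
    suffix = "/GHz"
    if spec.endswith(suffix):
        per_ghz = float(spec[:-len(suffix)])
        return int(per_ghz * cpu_ghz)
    return int(spec)


if __name__ == "__main__":
    print(resolve_limit("5000", cpu_ghz=2.0))   # 5000
    print(resolve_limit("5/GHz", cpu_ghz=2.0))  # 10
```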

And it's easy to think you (I mean me, of course!) know what's going on, until ....

I wish you a Very Happy New Year !

Regards,
Paddy
--
Perl 6 will give you the big knob. -- Larry Wall

------------------------ MailScanner list ------------------------
To unsubscribe, email jiscmail at jiscmail.ac.uk with the words:
'leave mailscanner' in the body of the email.
Before posting, read the MAQ (http://www.mailscanner.biz/maq/) and
the archives (http://www.jiscmail.ac.uk/lists/mailscanner.html).

Support MailScanner development - buy the book off the website!



