RFC: calculating scan times for messages.

Wed Mar 5 16:49:27 GMT 2003

At 15:31 05/03/2003, you wrote:
>On 5 Mar 2003 at 13:57, David wrote:
>
> > -----BEGIN PGP SIGNED MESSAGE-----
> > Hash: RIPEMD160
> >
> > Hello.
> >
> > I was wondering if any of you have an idea how I could time the
> > scanning process for a message.
> >
> > I am using sendmail and I was thinking about using the delay= data, but
> > that would not be too accurate.
> > This is for a little private project of mine.
> >
> > I would like to estimate the following:
> >
> > With checks XX and Sophos enabled, and SpamAssassin running checks XXX,
> >
> > scanning a 500-byte message takes an average of XX seconds (and so on)
> >
> > Does this make sense at all?
>Well... it's not that it doesn't make sense, but it wouldn't measure anything
>very useful...
>
>The point is that you can't extrapolate useful info from that data... In
>order to measure something useful, you would have to bombard your server with a
>good mixture of mails including spam, ham and viruses and keep an eye on the
>queues... once you have steadily growing queues you should make a couple of
>marks in the logs and measure the number of messages per time unit that are
>passing through MailScanner.
>
>That should give you a rough estimate of performance... it doesn't make
>much sense to measure how long any specific message takes.
>
>Note that you need at least 3 machines to do this... the actual test machine,
>a sending machine (the emisor) and a receiving machine (the receptor).
>
>The test machine should be configured to route all its outgoing mail to the
>receptor machine. The receptor machine should have a very fast mail server
>configured to accept and delete every message unconditionally (in effect, its
>incoming mail queue should be /dev/null :-)
>
>The emisor machine is the hardest... maybe you'll have to hack a small fast
>program to send the mail. Or you can take something like qmail (which I think
>sends 1 message per session even though they may be going to the same place),
>stop the SMTP client, fill the outgoing queue with your very large collection
>of mixed spam, ham and virus e-mails and... start the SMTP client.
>
>It might be a fun process, and I would definitely like to see the results
>if you do it... maybe also the programs/configuration used.
>
>For the SMTP client (the emisor) you might also want to take a look at Russell
>Coker's postal http://www.coker.com.au/postal/ (the receptor machine is what
>he calls an SMTP sink; if you do it, I guess he'll be glad to know about it).
>
>Postal generates garbage for the mail data, but maybe you can modify it so it
>takes the messages from somewhere. It has a nice set of options for number of
>simultaneous connections, max number of messages per connection, max message
>size, rate limitation, etc.

This is exactly the test setup I already use. I have a test set of 60,000 
messages. The emisor uses 10 parallel copies of a Perl script to squirt 
mail as fast as it possibly can to the MailScanner. The limiting factor 
here is disk I/O and the lousy I/O scheduler Linux has (it is being 
re-written for the 2.6 kernels, thank heavens). The emisor's limit is about 
8 million messages per day.

The MailScanner machine runs Exim and MailScanner, in a pretty much vanilla 
configuration, except that the MailScanner/incoming directory is on tmpfs 
to remove all that nasty disk I/O.

It then sends all its output to a Perl SMTP sink I wrote in about 10 
minutes, which speaks just enough SMTP to convince Exim that it's a real 
mail server. The sink forks off children to handle traffic, and there are 
quite often nearly 100 running simultaneously. They throw away everything 
they are sent.
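
As a rough illustration of what such a sink involves (this is not the Perl script in the attachment), here is a minimal Python sketch. It uses one thread per connection rather than fork, and the names SinkHandler, SinkServer and the accepted counter are made up for the example. It answers 250 to every command, swallows the message body after DATA, and keeps nothing.

```python
import socketserver
import threading


class SinkHandler(socketserver.StreamRequestHandler):
    """Speak just enough SMTP to accept a message, then discard it."""

    def reply(self, text):
        self.wfile.write(text.encode("ascii") + b"\r\n")

    def handle(self):
        self.reply("220 sink ESMTP")
        in_data = False
        for raw in self.rfile:
            line = raw.rstrip(b"\r\n")
            if in_data:
                # Message body: read and discard until the lone "." line.
                if line == b".":
                    in_data = False
                    with self.server.lock:
                        self.server.accepted += 1
                    self.reply("250 OK")
                continue
            verb = line[:4].upper()
            if verb == b"DATA":
                in_data = True
                self.reply("354 Go ahead")
            elif verb == b"QUIT":
                self.reply("221 Bye")
                return
            else:
                # HELO, EHLO, MAIL, RCPT, RSET, NOOP: accept everything.
                self.reply("250 OK")


class SinkServer(socketserver.ThreadingTCPServer):
    """Threaded TCP server that counts the messages it has thrown away."""

    allow_reuse_address = True
    daemon_threads = True

    def __init__(self, address):
        super().__init__(address, SinkHandler)
        self.accepted = 0
        self.lock = threading.Lock()


if __name__ == "__main__":
    # Port 2525 is an arbitrary unprivileged choice for this sketch.
    with SinkServer(("0.0.0.0", 2525)) as server:
        server.serve_forever()
```

The counter is only there so a test run can confirm how many messages reached the sink; a real sink could drop it.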

Speed control is done by varying an optional delay in the emisor script, 
and changing the number of emisor scripts that are run in parallel. It's 
pretty coarse but is good enough.
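
The two speed controls described above (a per-message delay and the number of parallel senders) can be sketched in Python as follows. This is only an illustration under assumptions: it uses threads where the real harness runs separate copies of a script, and run_sender, run_parallel and the send callable are hypothetical names. In practice send would wrap something like smtplib.SMTP.sendmail.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_sender(messages, send, delay=0.0):
    """Send each message in turn, pausing `delay` seconds after each one."""
    sent = 0
    for msg in messages:
        send(msg)
        sent += 1
        if delay:
            time.sleep(delay)
    return sent


def run_parallel(messages, send, workers=10, delay=0.0):
    """Split the corpus across `workers` senders and run them concurrently."""
    # Every worker gets an interleaved slice, so each sees a similar mix.
    chunks = [messages[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        counts = pool.map(lambda chunk: run_sender(chunk, send, delay), chunks)
        return sum(counts)
```

Raising delay or lowering workers slows the offered load; both knobs are as coarse here as they are in the description.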

Tweak the speed until the queue *just* doesn't grow without bounds. That's 
about the limit of what the MailScanner can handle.
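
One way to judge "growing" from periodic queue-depth readings is to fit a straight line through them and look at the slope. The Python sketch below assumes you have already collected (seconds, queue depth) pairs by some means; queue_slope, is_growing and the tolerance parameter are illustrative names, not part of MailScanner or Exim.

```python
def queue_slope(samples):
    """Least-squares slope of queue depth over time, in messages per second.

    `samples` is a list of (seconds, depth) pairs.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_q = sum(q for _, q in samples) / n
    num = sum((t - mean_t) * (q - mean_q) for t, q in samples)
    den = sum((t - mean_t) ** 2 for t, _ in samples)
    if den == 0:
        raise ValueError("samples must span more than one point in time")
    return num / den


def is_growing(samples, tolerance=0.0):
    """True when the queue is growing faster than `tolerance` messages/second."""
    return queue_slope(samples) > tolerance
```

A small positive tolerance keeps ordinary jitter in the readings from being reported as growth.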

Running MailScanner, Sophos, SpamAssassin (2.44 or 2.50, it doesn't matter 
much) and 3 RBLs, the MailScanner can do about 1.5 million messages per 
day. Just running MailScanner and SpamAssassin, it can handle 4.4 million 
per day.
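
For comparison with per-second rates, the daily figures convert by simple division by 86,400 seconds. A small Python sketch of that arithmetic (the per_second helper is just for illustration):

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def per_second(messages_per_day):
    """Convert a messages-per-day figure to average messages per second."""
    return messages_per_day / SECONDS_PER_DAY


figures = [
    ("emisor limit", 8_000_000),
    ("MailScanner + SpamAssassin", 4_400_000),
    ("MailScanner + Sophos + SpamAssassin + 3 RBLs", 1_500_000),
]
for label, per_day in figures:
    print(f"{label}: {per_second(per_day):.1f} msg/s")
```

That works out to roughly 92.6, 50.9 and 17.4 messages per second respectively, averaged over a full day.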

In case you are interested, I have attached a little zip file containing 
the emisor test "harness" (and the shell script that runs them in parallel) 
and the SMTP sink. 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: SpeedTests.zip
Type: application/zip
Size: 2651 bytes
Desc: not available
Url : http://lists.mailscanner.info/pipermail/mailscanner/attachments/20030305/ee1fdc06/SpeedTests.zip
-------------- next part --------------
--
Julian Field
www.MailScanner.info
MailScanner thanks transtec Computers for their support