mailscanner with heavy load

Sat Jan 31 18:53:32 GMT 2009

Martin Hepworth wrote:
>>> From: Brent Addis <brent.addis at spit.gen.nz>
>>>
>>> Those examples are by now quite old (I remember seeing those at least 3
>>> years ago)
>>>
>>> Does anyone have any real world examples of large scale deployments,
>>> using current spam types and newer plugins (ocr scanning etc) on more
>>> modern hardware?

Current spam types don't require OCR; image spam isn't common any more.

>>>> Paulo Roncon wrote:
>>>>> Hello everyone,
>>>>>
>>>>> Can you please tell how many msgs/day and MB/day your mailscanner
>>>>> filters?
>>>>> I'm designing a large deployment and have some concerns about its
>>>>> capability of handling heavy loads...
>>>>> In my case the box will face about 2MB/s incoming and 60msg/s !!
>>>>> How many servers (HP G5, quadcore, 16GB RAM) should I install? (not
>>>>> using DCC, Razor, Pyzor.)
>>>>> Thanks!

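60 messages/sec sustained around the clock == 5,184,000 messages per day.
The sizing below is worked for roughly 500,000 messages per day (about 6
messages/sec on average).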

The key metric for MailScanner is the average time to scan a single
message; on a tuned system this can take anywhere between 1 and 8
seconds maximum depending on the message.  This includes SA (with
compiled rulesets), ClamAV, FProt6, Razor2, DCC and all the default
DNSBL/URIBL lookups in SA and writing the data to MailWatch.

Disabling DCC, Razor2 and all untrusted DNSBLs would decrease the scan
time considerably.  Note that to get reasonable scan times you *cannot*
use *any* command-line virus scanner that doesn't use sockets or a
persistent daemon.

If you base the default at 8 seconds per message (which is
super-conservative) then:

1 child can process 10,800 msgs/day, therefore you would require ~47
MailScanner children to process 500,000 messages per day.
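
The arithmetic above can be checked with a short sketch.  The function
names are made up for illustration and are not part of MailScanner; the
8-second scan time and the 500,000 msgs/day volume are the figures used
in this post, and the sketch assumes each child scans messages back to
back all day.

```python
import math

SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def msgs_per_child_per_day(scan_seconds):
    """Messages one child can scan per day at a fixed time per message."""
    return SECONDS_PER_DAY / scan_seconds


def children_required(msgs_per_day, scan_seconds):
    """Children needed for a daily volume, rounded up to a whole child."""
    return math.ceil(msgs_per_day / msgs_per_child_per_day(scan_seconds))


print(msgs_per_child_per_day(8))       # 10800.0
print(children_required(500_000, 8))   # 47
```

Plugging in a lower average scan time (the tuned range quoted above is 1
to 8 seconds) shows how quickly the child count drops.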

Based on the tuning metric of 5 children per GB of RAM and per CPU, you
would need 10 CPUs and 10 GB of RAM minimum to process that load based
on a default configuration.
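
As a rough sketch of that rule of thumb (5 children per CPU and 5
children per GB of RAM, as stated above), the resource minimums for 47
children work out as follows.  The constant and function names are
hypothetical, chosen only for this example.

```python
import math

CHILDREN_PER_CPU = 5  # tuning metric from this post
CHILDREN_PER_GB = 5   # tuning metric from this post


def resources_for(children):
    """Return (cpus, ram_gb) needed to host the given number of children."""
    cpus = math.ceil(children / CHILDREN_PER_CPU)
    ram_gb = math.ceil(children / CHILDREN_PER_GB)
    return cpus, ram_gb


print(resources_for(47))  # (10, 10)
```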

So three boxes of that specification would suffice to handle the
required load with some overhead to spare.  You would also need to make
sure each box got an equal load of the input messages, so some sort of
load balancer would be required.
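
The box count can be sketched the same way.  Reading the quoted server
spec as 4 cores and 16 GB of RAM per box is an assumption on my part (a
"quadcore" G5 could have more than one socket), and the function names
are hypothetical.  Each box is limited by whichever of CPU or RAM allows
fewer children.

```python
import math

CHILDREN_PER_CPU = 5  # tuning metric from this post
CHILDREN_PER_GB = 5   # tuning metric from this post


def children_per_box(cpus, ram_gb):
    """Children one box can host; the scarcer resource sets the limit."""
    return min(cpus * CHILDREN_PER_CPU, ram_gb * CHILDREN_PER_GB)


def boxes_required(children, cpus, ram_gb):
    """Boxes needed for the given child count, rounded up."""
    return math.ceil(children / children_per_box(cpus, ram_gb))


# Assumed spec: 4 cores, 16 GB RAM per box.
per_box = children_per_box(cpus=4, ram_gb=16)
boxes = boxes_required(47, cpus=4, ram_gb=16)
spare = boxes * per_box - 47
print(per_box, boxes, spare)  # 20 3 13
```

Under these assumptions the CPU count is the limiting factor, and three
boxes leave room for about 13 extra children, provided the load is
spread evenly across them.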

I would also recommend buying Spamhaus, URIBL and SURBL datafeeds and
running rbldnsd locally on your network, as you will be way over the
threshold for using the public mirrors; this will also prevent the
lookups from hurting scan performance.  I seriously recommend looking
at my firm's BarricadeMX product, which can sit in front of MailScanner
and considerably reduce the message input to your MTA and into
MailScanner, avoiding any nasty spikes and improving efficiency,
performance and catch-rate.

Hope that gives you a rough guide.

Kind regards,
Steve.

--
Steve Freegard
Development Director
Fort Systems Ltd.