mailscanner at ecs.soton.ac.uk
Fri Jan 10 09:21:10 GMT 2003
At 07:40 10/01/2003, you wrote:
>I am interested in what other large sites have done to optimize the
>processing of spam and virus scanning. I currently run with 20 MailScanner
>processes, since we have 4 CPUs. From what I can tell, it pulls in 100
>messages at a time to process in a large batch and then sends them on their
>way. Doing it this way shows that disk IO gets slammed, and when it does
>recover, the CPU gets slammed, and then it starts all over again. I am
>thinking that maybe processing smaller chunks of emails might even out the
>load a little and maybe make things run a bit better.
The idea was that the processes all start at different times, and should
therefore be out of phase with each other. So while one process is doing lots
of disk IO, another is doing lots of CPU work, and another is doing lots of
network activity.
If you find them all running doing the same thing at the same time (so lots
of processes are collecting new batches, then they all do SA together, then
they all virus scan together, etc) then you are seeing a very strange
symptom that I have seen on my dual-Xeon box here. I haven't the foggiest
idea how it happens; there's nothing wrong with the code (I've had some
computer science experts stare at it).
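
To make the "out of phase" idea concrete, here is a toy model in Python. It is only a sketch: the phase names, their durations, and the start offsets are all invented for illustration and are not taken from MailScanner. It counts how many workers are in the same phase at a given moment, first with staggered start times and then with every worker in lock-step, which is the symptom described above.

```python
# Toy model only: phase names and durations are made up for illustration,
# they are not measurements of MailScanner.
PHASES = [
    ("collect batch (disk IO)", 2),
    ("spam check (CPU)", 3),
    ("virus scan (CPU)", 3),
    ("deliver (network)", 2),
]
CYCLE = sum(duration for _, duration in PHASES)


def phase_at(t, start_offset):
    """Return the phase a worker is in at time t, or None if not started yet."""
    if t < start_offset:
        return None
    pos = (t - start_offset) % CYCLE
    for name, duration in PHASES:
        if pos < duration:
            return name
        pos -= duration


def busiest_phase_count(t, offsets):
    """Return how many workers share the most crowded phase at time t."""
    counts = {}
    for offset in offsets:
        name = phase_at(t, offset)
        if name is not None:
            counts[name] = counts.get(name, 0) + 1
    return max(counts.values(), default=0)


staggered = [0, 2, 5, 8]  # hypothetical start times, spread over one cycle
lockstep = [0, 0, 0, 0]   # every worker doing the same thing at once

print(busiest_phase_count(20, staggered))
print(busiest_phase_count(20, lockstep))
```

With the staggered offsets each resource is used by one worker at a time; in lock-step all four workers hit the same resource together, which matches the "disk gets slammed, then CPU gets slammed" pattern.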
But, I did find a way around it. If you put the incoming directory
("incoming", not "mqueue.in") in RAM using tmpfs, the problem disappears.
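
If you want to check whether the incoming directory really is on tmpfs, something like the following Python sketch may help. It assumes a Linux-style mount table (the format of /proc/mounts), and the path /var/spool/MailScanner/incoming is an assumption used for the example; substitute whatever your own configuration uses. The helper name filesystem_type is hypothetical, and the sketch ignores escaped characters in mount point names.

```python
import os


def filesystem_type(path, mounts_text):
    """Return the fs type of the longest mount point containing path.

    mounts_text is text in /proc/mounts format:
    device mount_point fs_type options dump pass
    """
    path = os.path.normpath(path)
    best_len, best_type = -1, None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fs_type = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        # Match the mount point itself or anything beneath it.
        if (path + "/").startswith(prefix) and len(mount_point) > best_len:
            best_len, best_type = len(mount_point), fs_type
    return best_type


# Sample mount table; the device names and paths are assumptions.
SAMPLE_MOUNTS = """\
/dev/sda1 / ext3 rw 0 0
tmpfs /var/spool/MailScanner/incoming tmpfs rw 0 0
"""

print(filesystem_type("/var/spool/MailScanner/incoming", SAMPLE_MOUNTS))
print(filesystem_type("/var/spool/mqueue.in", SAMPLE_MOUNTS))
```

On a live Linux box you would pass in the contents of /proc/mounts instead of the sample text.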
>Another thought is with SpamAssassin. I know it has the capability to run
>in daemon mode (spamd). Does MailScanner even support this? Does running
>spamd in daemon mode give you any performance advantage at all?
The spamd daemon merely provides a (narrow) route to the SpamAssassin code,
which is all written in perl. MailScanner talks to the perl code directly,
which is considerably faster than having to poke all the files down a
socket to it. Using spamd would be slower.
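
The difference between the two routes can be sketched in Python. This is an illustration of the general point only: score_message is a made-up stand-in scorer, not SpamAssassin, and score_via_socket does not speak the real spamd protocol. It just shows that the socket route has to copy the whole message to another party and read a reply back, while the direct route is a plain function call on data already in memory.

```python
import socket
import threading


def score_message(message: bytes) -> int:
    """Stand-in scorer (hypothetical): count occurrences of a trigger word."""
    return message.lower().count(b"viagra")


def score_via_socket(message: bytes) -> int:
    """Get the same score by pushing the whole message down a socket."""
    client, server = socket.socketpair()

    def serve():
        # Daemon side: read the full message, score it, send the result back.
        chunks = []
        while True:
            data = server.recv(4096)
            if not data:
                break
            chunks.append(data)
        server.sendall(str(score_message(b"".join(chunks))).encode())
        server.close()

    worker = threading.Thread(target=serve)
    worker.start()
    client.sendall(message)           # every byte is copied through the socket
    client.shutdown(socket.SHUT_WR)   # signal end of message
    reply = b""
    while True:
        data = client.recv(64)
        if not data:
            break
        reply += data
    worker.join()
    client.close()
    return int(reply)


message = b"Buy VIAGRA now, viagra cheap\n" * 1000
print(score_message(message) == score_via_socket(message))
```

Both routes give the same answer; the socket route simply does extra copying and waiting to get there.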
MailScanner thanks transtec Computers for their support