A quick and easy performance improvement

Logan Shaw lshaw at emitinc.com
Wed Jul 26 19:41:25 IST 2006


On Wed, 26 Jul 2006, DAve wrote:
> Maybe I am showing my ignorance but how? I'm not seeing any performance 
> issues myself, just curious.

That's sort of my question about this whole thing.  From my
perspective, if you're running SpamAssassin and a virus checker,
then MailScanner is going to be mostly CPU bound and/or bound
by network delays on the network tests (assuming a sufficient
number of children).

I guess what I'm wondering is how often MailScanner is
really I/O bound.  I would think not that often: on my server
(which is admittedly low-volume), a batch of 2 or 3 messages
processes something like 100 kilobytes (if they are large
messages) and takes something like 1 or 2 seconds of CPU time.
(It's a slow CPU...)  At that rate, the I/O system should have
no trouble keeping up.
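
To put a rough number on that, here is a back-of-the-envelope sketch in
Python. The function name and the "passes" factor (how many times each
byte gets read or written while a batch is processed) are my own
assumptions for illustration, not anything measured from MailScanner:

```python
def required_io_rate(batch_bytes, cpu_seconds, passes=1):
    """Bytes per second the disks must sustain to keep pace with the CPU.

    passes is a guess at how many times each byte is read or written
    while a batch is processed; it is not a measured figure.
    """
    if cpu_seconds <= 0:
        raise ValueError("cpu_seconds must be positive")
    return batch_bytes * passes / cpu_seconds


batch_bytes = 100 * 1024  # "something like 100 kilobytes" per batch
for cpu_seconds in (1.0, 2.0):  # "1 or 2 seconds of CPU time"
    rate = required_io_rate(batch_bytes, cpu_seconds, passes=4)
    print(f"{cpu_seconds:.0f} s per batch: about {rate / 1024:.0f} KB/s of disk traffic")
```

Even with the guessed factor of 4, that works out to a few hundred
kilobytes per second per child, so the number of children and the seek
pattern would have to matter a lot before the disks became the limit.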

> I currently have bayes on one controller/disk 
> pair and the queues on another controller/disk pair. I've always believed 
> that to be about the best you could do.

You can probably get even more performance with RAID and volume
managers.  A striped volume with a pretty wide stripe width
(wide enough to fit entire e-mail messages) should allow an
entire message to be written to one disk without the other disk
being involved (not even having to seek) at all, in many cases.
Then you can just keep adding disks to this stripe set and
getting increased performance (up to a point, probably).
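
As a sketch of why the stripe width matters, assuming a plain RAID-0
style layout where consecutive stripe units go to consecutive disks
(the function and the sizes below are made up for illustration, and a
real volume manager may lay things out differently):

```python
def disks_touched(offset, length, stripe_unit, num_disks):
    """Return the set of disk indexes that a single write lands on.

    Assumes simple round-robin striping: stripe unit k lives on
    disk k % num_disks. Sizes are in bytes; length must be positive.
    """
    if length <= 0:
        raise ValueError("length must be positive")
    first_unit = offset // stripe_unit
    last_unit = (offset + length - 1) // stripe_unit
    return {unit % num_disks for unit in range(first_unit, last_unit + 1)}


KB = 1024
message = 20 * KB  # a hypothetical 20 KB message

# Narrow stripe unit: the message is spread over both disks.
print(sorted(disks_touched(0, message, 4 * KB, 2)))

# Wide stripe unit: the whole message fits on one disk.
print(sorted(disks_touched(0, message, 64 * KB, 2)))

# Wide stripe unit, but the write straddles a unit boundary.
print(sorted(disks_touched(60 * KB, message, 64 * KB, 2)))
```

The last case is the reason for the "in many cases" above: a message
that happens to cross a stripe unit boundary still involves two disks.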

There are also volume managers and filesystems that can write
the filesystem's journal to a completely separate disk.  With a
setup like that, you can get close to 100% sequential access.
(I believe Solaris ZFS can get near-100% sequential access even
without a separate disk, because of its copy-on-write style of
updating the disk.)

Of course, if I'm right that I/O capacity isn't the bottleneck,
then none of that matters...  :-)

   - Logan

