Moving Bayes database to tmpfs
infernix
infernix at infernix.net
Wed Jun 3 22:36:49 IST 2009
Kai Schaetzl wrote:
> I forgot to look at these data. These are low figures. If you already have
> a performance problem, then not because of Bayes. As I said I'm sure going
> to SQL is better than using tmpfs.
I have seen the exact opposite (and have read elsewhere that SQL is
really very expensive cpu-wise for what the Bayes engine does in SA),
but my use case is perhaps non-standard.
We have 4 nodes doing about 1 million messages a day. Before I moved the
bayes db to tmpfs, I had very large amounts of iowait, even though
everything else (sendmail spool, mailscanner incoming+spool dirs) was
already on tmpfs. Now, the SATA disks in these boxes aren't great, but I
had expected better performance. Apparently the number of concurrent
messages we get at peak times is just too big to handle on disk platters
(at least with single disks), so the move to tmpfs helped enormously.
In contrast, when I converted one of the bayes dbs to SQL and
configured all nodes to use one MySQL server, the MySQL box couldn't
handle it. Net effect was that scanning messages was taking 5-15 seconds
longer than before.
Right now, with everything in tmpfs, I am running 40 children on each
box and iowait during peak hours is 0-1%. I could increase children if I
wanted to but the current mail volume does not warrant that.
For data protection (it is tmpfs after all) I have written an init
script that backs up the tmpfs on shutdown and restores it on bootup.
I'm also making an emergency-backup.tar.gz of the tmpfs folder every 15
minutes and use that tar file when the server crashes; at bootup I check
for a shutdown-generated tar and if it's not there I revert to the
emergency-backup tar.
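
The backup and restore logic described above could be sketched roughly as follows. This is not the poster's init script (which is not shown in the message) but a minimal Python illustration of the same idea. The function names and tar paths are hypothetical, and it assumes the shutdown tar is deleted once it has been restored, so that after a later crash only the emergency tar is found.

```python
import tarfile
from pathlib import Path
from typing import Optional


def backup_dir(src_dir: Path, tar_path: Path) -> None:
    """Archive src_dir into a gzip tar, replacing tar_path in one step."""
    # Write to a temporary name first so a crash mid-backup never leaves
    # a truncated archive under the real name.
    tmp_path = tar_path.with_name(tar_path.name + ".tmp")
    with tarfile.open(tmp_path, "w:gz") as tar:
        # arcname="." stores member paths relative to src_dir.
        tar.add(src_dir, arcname=".")
    tmp_path.replace(tar_path)


def pick_restore_tar(shutdown_tar: Path, emergency_tar: Path) -> Optional[Path]:
    """Prefer the clean-shutdown tar; fall back to the periodic one."""
    if shutdown_tar.is_file():
        return shutdown_tar
    if emergency_tar.is_file():
        return emergency_tar
    return None


def restore_dir(dest_dir: Path, shutdown_tar: Path,
                emergency_tar: Path) -> Optional[Path]:
    """Unpack the best available tar into dest_dir and return its path."""
    chosen = pick_restore_tar(shutdown_tar, emergency_tar)
    if chosen is None:
        return None  # nothing to restore; the tmpfs directory stays empty
    dest_dir.mkdir(parents=True, exist_ok=True)
    with tarfile.open(chosen, "r:gz") as tar:
        tar.extractall(dest_dir)
    if chosen == shutdown_tar:
        # Assumption: consume the shutdown tar so that a stale copy is not
        # mistaken for a fresh one after the next crash.
        shutdown_tar.unlink()
    return chosen
```

In a setup like the one described, `backup_dir` would be called from the shutdown hook (writing the shutdown tar) and from a 15-minute cron job (writing the emergency tar), while `restore_dir` would run at boot, after the tmpfs is mounted and before the scanner starts.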
Just my $0.02.