Strange HI Load

campbell at campbell at
Fri Jun 16 23:42:50 IST 2006

Quoting Thomas Chamtieh <tchamtieh at>:

> Hi all,
> After I upgraded from 4.46 to 4.54 I started seeing high load on 2
> servers. Looking at the processes, I noticed that after a couple of
> hours I have 30-40 MailScanner processes in "waiting for messages" mode.
> I have to restart every 30 mins. We process over 200K emails a day. I try
> as much as I can to take load off MailScanner; for example, I use
> sbl-xbl in sendmail and RBL checking in SpamAssassin, and I'm not using
> RulesDuJour. So it shouldn't be acting that way.
> Your help is appreciated. I have to check on these 2 servers every 2
> hours and restart MailScanner to get rid of the hung processes.
> Thanks,
> -Thomas


I don't think I'm running that version, maybe 4.52, maybe 4.54. I'm not at work
now, and I forget.

I can say that Mailwatch reports at least 5 MailScanner processes - more anytime
there is something in the input queue. These may be subprocesses, but I have 5
children set in my config file. This doesn't seem to make much of a difference
in load.

What I do see, though, is what looks like incoming sendmail connections, and as
these go up with no increase in MS processes, the load does go up. When I have a
quiet server with nothing in input, load is around 1.7-2.5. As soon as sendmail
is triggered into doing something, the load jumps at least a point and a half.
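
For anyone who wants to watch the same thing on their own box, here is a
minimal sketch that takes one sample of the 1-minute load average next to
the sendmail and MailScanner process counts. The function names are made up
for illustration, and the process names passed in are assumptions: what
`ps` reports in its `comm` column depends on the platform and on how the
daemons are started, so adjust them to match your own `ps` output.

```python
import os
import subprocess
from collections import Counter


def count_commands(ps_lines, names):
    """Count exact matches of each name in lines of `ps -eo comm=` output."""
    counts = Counter(line.strip() for line in ps_lines if line.strip())
    return {name: counts.get(name, 0) for name in names}


def sample(names=("sendmail", "MailScanner")):
    """Return (1-minute load average, {process name: count}) for one sample.

    The default names are assumptions; check what `ps -eo comm=` prints
    on your system before relying on them.
    """
    out = subprocess.run(
        ["ps", "-eo", "comm="],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    load1, _, _ = os.getloadavg()
    return load1, count_commands(out.splitlines(), names)
```

Calling `sample()` once a minute from cron and appending the result to a
file would give enough data points to see whether load follows the sendmail
count or the MailScanner count.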

I have recently blocked at the MTA one fifth of my incoming spam, and this seems
to have helped keep the LA down below 10, regardless of any huge influx of new
email. I am running 8.12 sendmail, but plan on upgrading this to 8.13 next week.
I have also lowered most of the sendmail timeouts to very low levels, and this
helped cut out the junk connections. So sendmail processes stay pretty low now all
the time. But this machine used to handle 40-60 sendmail processes at an 11+ LA
prior to attempting to tune/discover anything. So the LA seems to grow
logarithmically with the number of sendmail processes, not linearly.
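
That logarithmic-versus-linear impression is only from eyeballing, and I have
not fitted anything. A rough way to check it, assuming you have collected pairs
of (sendmail process count, load average), is to fit both shapes by least
squares and compare the leftover error. The sketch below is plain Python with
hypothetical function names; it needs process counts greater than zero and at
least two different count values.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x. Returns (a, b, sse)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return a, b, sse


def compare_models(proc_counts, loads):
    """Sum of squared errors for load = a + b*n and load = a + b*log(n)."""
    _, _, sse_linear = fit_line(proc_counts, loads)
    _, _, sse_log = fit_line([math.log(p) for p in proc_counts], loads)
    return {"linear": sse_linear, "log": sse_log}
```

Whichever entry of the returned dict is smaller fits the samples better. With
only a handful of samples the difference will not mean much.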

85% of all mail is high scoring, so it is quarantined. My databases are fairly
large, but not huge, and I only keep 9 days' worth of Mailwatch stuff. I'm not
sure how much the MySQL stuff is causing this, but things show up in Mailwatch's
DBs fairly quickly, so I doubt there is much caching or pending writes there.

I'm still stumped, but if I find out anything and why this started out of the
blue, I will post it. Hope you care to do the same.

Load averages are so misleading, except when sendmail says it's time to stop
accepting connections due to a high LA, that sometimes I wonder if this is all
worth investigating.

Whew - sure is windy here tonight.

