Strange HI Load

Steve Campbell campbell at cnpapers.com
Sat Jun 17 02:39:16 IST 2006


Quoting Thomas Chamtieh <tchamtieh at nayzak.com>:

> Steve,
> 
> Thanks for your insight. It's totally weird, I have 4 other servers
> running the same version and all identical. These were running fine
> before the upgrade. When I say high LA I'm talking about 70-85%, almost
> killing the server. On the other 4 servers I have, the LA never goes
> above 1.7 and usually is about 0.4-0.7, and these servers handle a lot
> more mail than the troubled ones.

Yeah, that's the same here. I have knocked these down to about 25K each per day
now. One runs at about 1.0 LA, the other shoots up really high. The timing
changes stopped the extra MS children, as you showed in your last post, but the
LA still doesn't come down properly. Those timing changes are the main
difference in the config files: the good machine runs the nearly standard RH
sendmail.cf; the bad one used to, but now has the timing changes, which the
high load made necessary.
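
To put a number on the "extra MS children", the process list can be tallied by
state. This is only a rough sketch: it assumes each MailScanner child shows up
in ps with a title of the form "MailScanner: <state>" (the "waiting for
messages" state is the one quoted in this thread; the exact prefix is my
assumption), and the function names are made up for illustration.

```python
import subprocess
from collections import Counter

# Assumed process-title prefix; adjust if your ps output looks different.
MARKER = "MailScanner:"


def count_mailscanner_states(ps_output: str) -> Counter:
    """Tally MailScanner processes by the state text after the marker."""
    states = Counter()
    for line in ps_output.splitlines():
        idx = line.find(MARKER)
        if idx == -1:
            continue  # not a MailScanner process line
        state = line[idx + len(MARKER):].strip()
        states[state or "(no state)"] += 1
    return states


def current_mailscanner_states() -> Counter:
    """Run 'ps -eo args' (command lines only) and tally the result."""
    out = subprocess.run(
        ["ps", "-eo", "args"], capture_output=True, text=True, check=True
    ).stdout
    return count_mailscanner_states(out)


# Example use on a live box:
#   for state, n in current_mailscanner_states().most_common():
#       print(n, state)
```

Logging this every few minutes on both the good and the bad machine would show
whether the idle children really stopped piling up after the timing changes.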

I've tried to find something else that is causing this, but when I stop MS
(along with everything else it is doing like MySQL, SA, sendmail, etc) the load
drops immediately to normal.

These machines were built with the exact same stuff, are identical in hardware,
everything. The differences are the domains they handle and the size of the
Bayes files. (Obviously the emails they handle are different.)
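
Rather than comparing raw file sizes, the Bayes databases can be compared by
token count. The sketch below parses the text printed by "sa-learn --dump
magic". It assumes the layout I remember from SpamAssassin 3.x, where each
summary line ends in "non-token data: <name>" and the value sits in the third
column; check that against your own output before trusting it. The function
name is hypothetical.

```python
def parse_bayes_magic(dump_text: str) -> dict:
    """Extract the summary counters from 'sa-learn --dump magic' output.

    Assumed line shape (value in the third whitespace-separated column):
        0.000          0     123456          0  non-token data: ntokens
    """
    stats = {}
    for line in dump_text.splitlines():
        if "non-token data:" not in line:
            continue
        fields, label = line.split("non-token data:", 1)
        parts = fields.split()
        if len(parts) < 3:
            continue  # not the layout we expect; skip rather than guess
        try:
            stats[label.strip()] = int(parts[2])
        except ValueError:
            continue  # non-numeric value; skip it
    return stats


# Example use: save the output on each machine, then compare.
#   good = parse_bayes_magic(open("good.txt").read())
#   bad = parse_bayes_magic(open("bad.txt").read())
#   print(bad["ntokens"] / good["ntokens"])
```

Comparing "ntokens" between the two machines gives the ratio directly, without
guessing from the size of the files on disk.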

This has been a gradually increasing thing. It used to run fine, then started to
climb a little, but not enough to worry about, then a little more, and again,
and again, until now it sometimes hits the 12.0 mark. Then sendmail stops
accepting mail.
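
The 12.0 mark lines up with sendmail's stock load limits: as far as I know the
default RefuseLA is 12 and the default QueueLA is 8. Here is a simplified
sketch of that behaviour, assuming those defaults are still in effect on these
machines (sendmail's real queueing decision also weighs message priority,
which is ignored here). The function names are made up for illustration.

```python
import os


def sendmail_state(load_1min: float,
                   queue_la: float = 8.0,
                   refuse_la: float = 12.0) -> str:
    """Rough model of how sendmail reacts to the 1-minute load average.

    queue_la and refuse_la stand in for the QueueLA and RefuseLA options;
    the defaults here are assumed, not read from sendmail.cf.
    """
    if load_1min >= refuse_la:
        return "refusing connections"
    if load_1min >= queue_la:
        return "queueing only"
    return "accepting and delivering"


def current_sendmail_state() -> str:
    """Apply the model to this machine's current 1-minute load average."""
    return sendmail_state(os.getloadavg()[0])


# Example use on a live box:
#   print(os.getloadavg()[0], current_sendmail_state())
```

If the limits have been changed in sendmail.cf, pass the real values in place
of the defaults.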

Really strange. Something is different about the machines, I just can't seem to
pinpoint it. But I will find it.

Keep me posted on anything you find!

Steve
> 
> Thanks,
> 
> -Thomas
>  
> > 
> > 
> > > Hi all,
> > > 
> > > After I upgraded from 4.46 to 4.54 I started seeing high load on 2
> > > servers. Looking at the processes, I noticed that after a couple of
> > > hours I have 30-40 MailScanner processes in "waiting for messages"
> > > mode. I have it restart every 30 mins. We process over 200K emails a
> > > day. I try as much as I can to take load off MailScanner: for
> > > example, I use sbl-xbl in sendmail and RBL checking in SpamAssassin,
> > > and I'm not using RulesDuJour. So it shouldn't be acting that way.
> > > 
> > > Your help is appreciated. I have to check on these 2 servers every 2
> > > hours and restart MailScanner to get rid of the hung processes.
> > 
> > As an afterthought, I have an almost identical server. Its
> > message count per day is very close to the problem server's. I
> > have always had bayes expiry files on the problem server, and
> > almost never on the properly behaving one.
> > 
> > I see that I have about 4 times the number of tokens in the
> > Bayes database on the problem machine that I have on the
> > proper one. The number of expired tokens on the two machines
> > is extraordinarily different during an expiry.
> > 
> > I used to run a cron job to delete the Bayes expire files
> > just to keep the directory clean, but just turned that off in
> > case I was deleting real, valid files ... so we'll see.
> > 
> > Steve  
> > >  
> > > 
> > > Thanks,
> > > 
> > > -Thomas
> > > 
> > 



