FW: Strange HI Load
Thomas Chamtieh
tchamtieh at nayzak.com
Sat Jun 17 01:43:34 IST 2006
Here's the output of 'ps aux'. Look at all these MailScanner
processes:
1 ? S 0:51 init
2 ? SW 0:00 [migration/0]
3 ? SW 0:00 [migration/1]
4 ? SW 0:00 [migration/2]
5 ? SW 0:00 [migration/3]
6 ? SW 0:00 [keventd]
7 ? SWN 0:00 [ksoftirqd/0]
8 ? SWN 0:00 [ksoftirqd/1]
9 ? SWN 0:00 [ksoftirqd/2]
10 ? SWN 0:00 [ksoftirqd/3]
13 ? SW 0:01 [bdflush]
11 ? SW 48:48 [kswapd]
12 ? SW 60:09 [kscand]
14 ? SW 1:06 [kupdated]
15 ? SW 0:00 [mdrecoveryd]
23 ? SW 27:14 [kjournald]
74 ? SW 0:00 [khubd]
315 ? SW 0:00 [kjournald]
693 ? S 14:02 syslogd -m 0
697 ? S 0:00 klogd -x
707 ? S 1:14 irqbalance
715 ? S 0:00 portmap
734 ? S 0:00 rpc.statd
745 ? S 0:05 mdadm --monitor --scan -f
758 ? SL 0:02 mdmpd
769 ? S 0:11 /usr/bin/perl /usr/libexec/usermin/miniserv.pl /etc/usermin/miniserv.conf
776 ? S 0:11 /usr/bin/perl /usr/libexec/webmin/miniserv.pl /etc/webmin/miniserv.conf
1206 ? S 59:25 hpasmd
1255 ? S 9:53 cmahostd -p 15 -s OK
1256 ? S 0:29 cmathreshd -p 5 -s OK
1258 ? S 0:02 cmapeerd
1280 ? S 0:09 cmastdeqd -p 30
1298 ? S 5:59 cmaperfd -p 30 -s OK
1312 ? S 3:56 cmahealthd -p 30 -s OK -t OK -i
1430 ? S 17:13 cmaeventd
1460 ? S 18:24 cmaidad -p 15 -s OK
1482 ? S 0:04 cmafcad -p 15 -s OK
1484 ? S 0:16 cmaided -p 15 -s OK
1570 ? S 0:00 /usr/sbin/snmpd -Lsd -Lf /dev/null -p /var/run/snmpd -a
1626 ? S 117:24 /usr/sbin/named -u named
1642 ? S 0:01 /usr/sbin/sshd
1656 ? S 0:00 xinetd -stayalive -pidfile /var/run/xinetd.pid
1668 ? S 0:00 /bin/sh /usr/bin/safe_mysqld --defaults-file=/etc/my.cnf --pid-file=/var/run/mysqld/mysqld.pid
1697 ? S 102:00 /usr/libexec/mysqld --defaults-file=/etc/my.cnf --basedir=/usr --datadir=/var/lib/mysql --user=mysql --pi
1806 ? S 0:00 gpm -t imps2 -m /dev/psaux
1994 ? S 0:19 /opt/hp/hpsmh/sbin/hpsmhd -DSSL -f /opt/hp/hpsmh/conf/smhpd.conf
2004 ? S 0:00 /opt/hp/hpsmh/sbin/hpsmhd -DSSL -f /opt/hp/hpsmh/conf/smhpd.conf
2034 ? S 0:20 /usr/sbin/httpd
2045 ? S 0:10 crond
2162 ? S 0:01 /usr/sbin/atd
2180 ? S 0:00 cmanicd
2653 ? S 0:00 /opt/hp/vcagent/bin/vcagentd
2654 ? S 0:06 /opt/hp/vcagent/bin/vcagentd
2655 ? S 0:00 /opt/hp/vcagent/bin/vcagentd
2663 tty1 S 0:00 /sbin/mingetty tty1
2664 tty2 S 0:00 /sbin/mingetty tty2
2665 tty3 S 0:00 /sbin/mingetty tty3
2666 ? S 0:00 /opt/hp/vcagent/bin/vcagentd
31538 ? S 0:00 /var/dcc/libexec/dccifd -tCMN,5, -llog -wwhiteclnt -Uuserdirs
31539 ? S 62:59 /var/dcc/libexec/dccifd -tCMN,5, -llog -wwhiteclnt -Uuserdirs
4403 ? S 0:16 cupsd
4445 ? S 0:04 /usr/sbin/httpd
4446 ? S 0:03 /usr/sbin/httpd
4447 ? S 0:04 /usr/sbin/httpd
4448 ? S 0:05 /usr/sbin/httpd
4449 ? S 0:05 /usr/sbin/httpd
4450 ? S 0:03 /usr/sbin/httpd
4451 ? S 0:03 /usr/sbin/httpd
4452 ? S 0:04 /usr/sbin/httpd
16640 ? S 0:04 /usr/sbin/httpd
31317 ? S 0:03 /usr/sbin/httpd
4838 ? S 0:02 /usr/sbin/httpd
4839 ? S 0:03 /usr/sbin/httpd
4840 ? S 0:02 /usr/sbin/httpd
4841 ? S 0:02 /usr/sbin/httpd
4842 ? S 0:02 /usr/sbin/httpd
4843 ? S 0:03 /usr/sbin/httpd
4844 ? S 0:02 /usr/sbin/httpd
21765 ? S 0:27 sendmail: accepting connections
21770 ? S 0:00 sendmail: Queue runner at 00:15:00 for /var/spool/clientmqueue
21776 ? S 0:00 sendmail: Queue runner at 00:15:00 for /var/spool/mqueue
21799 ? S 0:00 MailScanner: starting child
11399 ? S 0:00 MailScanner: starting child
15110 ? S 0:00 MailScanner: starting child
14531 ? S 0:00 MailScanner: starting child
16689 ? S 0:00 MailScanner: starting child
18976 ? S 0:00 MailScanner: starting child
28368 ? S 0:00 sendmail: k5GMGEj9028368 dsl5400AB1C.pool.t-online.hu [84.0.171.28]: DATA
17108 ? S 0:00 MailScanner: starting child
19020 ? S 0:00 sendmail: server 63.102.177.60.broad.hz.zj.dynamic.cndata.com [60.177.102.63] cmd read
19189 ? S 0:00 sendmail: server 63.102.177.60.broad.hz.zj.dynamic.cndata.com [60.177.102.63] cmd read
22961 ? S 0:02 MailScanner: waiting for messages
24589 ? S 0:02 MailScanner: waiting for messages
25133 ? S 0:00 sendmail: k5GNG6gE025133 cc986512-a.assen1.dr.home.nl [82.74.88.79]: data
25180 ? S 0:00 sendmail: server pc97.broad.dynamic.xm.fj.cn.cndata.com [59.57.186.97] (may be forged) cmd read
27287 ? S 0:02 MailScanner: waiting for messages
27626 ? S 0:03 MailScanner: waiting for messages
27872 ? S 0:02 MailScanner: waiting for messages
28267 ? S 0:02 MailScanner: waiting for messages
28656 ? S 0:02 MailScanner: waiting for messages
29298 ? S 0:02 MailScanner: waiting for messages
29482 ? S 0:02 MailScanner: waiting for messages
29815 ? S 0:02 MailScanner: waiting for messages
29955 ? S 0:03 MailScanner: waiting for messages
29995 ? S 0:02 MailScanner: waiting for messages
30640 ? S 0:00 sendmail: k5GNSLHT030640 [82.201.230.204]: DATA
30918 ? S 0:03 MailScanner: waiting for messages
32300 ? S 0:02 MailScanner: waiting for messages
32396 ? S 0:01 MailScanner: waiting for messages
32465 ? S 0:01 MailScanner: waiting for messages
32602 ? S 0:02 MailScanner: waiting for messages
32725 ? S 0:02 MailScanner: waiting for messages
622 ? S 0:02 MailScanner: waiting for messages
677 ? S 0:02 MailScanner: waiting for messages
778 ? S 0:02 MailScanner: waiting for messages
865 ? S 0:01 MailScanner: waiting for messages
1079 ? S 0:02 MailScanner: waiting for messages
1136 ? S 0:02 MailScanner: waiting for messages
1322 ? S 0:01 MailScanner: waiting for messages
1635 ? S 0:01 MailScanner: waiting for messages
1893 ? S 0:00 sendmail: server pc201.broad.dynamic.qz.fj.cn.cndata.com [218.85.167.201] (may be forged) cmd read
2358 ? S 0:03 MailScanner: waiting for messages
2517 ? S 0:01 MailScanner: waiting for messages
2703 ? S 0:01 MailScanner: waiting for messages
2762 ? S 0:02 MailScanner: waiting for messages
3070 ? S 0:01 MailScanner: waiting for messages
3127 ? S 0:00 sendmail: server [206.74.10.56] cmd read
3163 ? S 0:01 MailScanner: waiting for messages
3185 ? S 0:01 MailScanner: waiting for messages
3193 ? S 0:02 MailScanner: waiting for messages
3458 ? S 0:00 sendmail: server 9.167.71.218.broad.nb.zj.dynamic.cndata.com [218.71.167.9] cmd read
3459 ? S 0:00 sendmail: server 9.167.71.218.broad.nb.zj.dynamic.cndata.com [218.71.167.9] cmd read
3524 ? S 0:02 MailScanner: waiting for messages
3578 ? S 0:02 MailScanner: checking with SpamAssassin
4055 ? S 0:02 MailScanner: waiting for messages
4147 ? S 0:02 MailScanner: waiting for messages
4573 ? S 0:02 MailScanner: waiting for messages
4829 ? S 0:02 MailScanner: waiting for messages
5081 ? S 0:01 MailScanner: waiting for messages
5102 ? S 0:01 MailScanner: waiting for messages
5214 ? S 0:01 MailScanner: waiting for messages
5220 ? S 0:01 MailScanner: waiting for messages
5313 ? S 0:02 MailScanner: waiting for messages
5317 ? S 0:02 MailScanner: waiting for messages
5345 ? S 0:01 MailScanner: waiting for messages
5458 ? S 0:01 MailScanner: waiting for messages
5714 ? S 0:01 MailScanner: waiting for messages
5748 ? S 0:01 MailScanner: waiting for messages
5780 ? S 0:00 sendmail: k5GNk1Bv005780 host112170.metrored.net.mx [200.53.121.170] (may be forged): DATA
5828 ? S 0:00 sendmail: k5GNk9Yo005828 host112170.metrored.net.mx [200.53.121.170] (may be forged): DATA
5931 ? S 0:01 MailScanner: waiting for messages
5983 ? S 0:01 MailScanner: waiting for messages
6150 ? S 0:01 MailScanner: waiting for messages
6203 ? S 0:01 MailScanner: waiting for messages
6337 ? S 0:00 sendmail: server 20151226076.user.veloxzone.com.br [201.51.226.76] cmd read
6338 ? S 0:01 MailScanner: waiting for messages
6471 ? S 0:01 MailScanner: waiting for messages
6489 ? S 0:01 MailScanner: waiting for messages
6559 ? S 0:01 MailScanner: waiting for messages
7047 ? S 0:01 MailScanner: waiting for messages
7104 ? S 0:01 MailScanner: waiting for messages
7274 ? S 0:00 sendmail: server 171.57.112.125.broad.jh.zj.dynamic.cndata.com [125.112.57.171] cmd read
7321 ? S 0:00 sendmail: server pc85.broad.dynamic.qz.fj.cn.cndata.com [218.5.122.85] (may be forged) cmd read
7389 ? S 0:01 MailScanner: waiting for messages
7786 ? S 0:01 MailScanner: waiting for messages
7805 ? S 0:01 MailScanner: waiting for messages
7836 ? S 0:01 MailScanner: waiting for messages
7860 ? S 0:00 sendmail: server mx11.sac.fedex.com [199.81.193.118] cmd read
8458 ? S 0:01 MailScanner: waiting for messages
8489 ? S 0:01 MailScanner: waiting for messages
8942 ? S 0:01 MailScanner: checking with SpamAssassin
8982 ? S 0:00 sendmail: server welcome.aexp.com [193.32.34.30] cmd read
9113 ? S 0:01 MailScanner: waiting for messages
9122 ? S 0:01 MailScanner: waiting for messages
9192 ? S 0:01 MailScanner: waiting for messages
9194 ? S 0:00 MailWatch SQL
9327 ? S 0:00 sendmail: startup with 209.9.184.36
9357 ? S 0:00 sshd: root@pts/0
9361 pts/0 S 0:00 -tcsh
9425 ? S 0:00 sendmail: server pool-70-110-225-43.phil.east.verizon.net [70.110.225.43] cmd read
9426 ? S 0:00 sendmail: server pool-70-110-225-43.phil.east.verizon.net [70.110.225.43] cmd read
9441 ? S 0:00 sendmail: server 1667203209.coffeefilterdreams.com [209.203.67.16] cmd read
9449 ? S 0:00 MailScanner: checking with SpamAssassin
9454 ? S 0:00 MailScanner: checking with SpamAssassin
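To see at a glance how many MailScanner children sit in each state, a listing like the one above can be tallied with a few lines of Python. This is only a rough sketch: the function name `count_mailscanner_states` is made up for illustration, the sample rows are copied from the listing, and it assumes each process is on a single line with the status text following "MailScanner:".

```python
from collections import Counter


def count_mailscanner_states(ps_lines):
    """Count MailScanner processes by the status text after 'MailScanner:'."""
    marker = "MailScanner:"
    counts = Counter()
    for line in ps_lines:
        idx = line.find(marker)
        if idx == -1:
            continue  # not a MailScanner process title
        status = line[idx + len(marker):].strip()
        counts[status] += 1
    return counts


# A few rows copied from the listing above (not the full output).
sample = [
    "21799 ? S 0:00 MailScanner: starting child",
    "22961 ? S 0:02 MailScanner: waiting for messages",
    "24589 ? S 0:02 MailScanner: waiting for messages",
    "3578 ? S 0:02 MailScanner: checking with SpamAssassin",
    "9194 ? S 0:00 MailWatch SQL",
    "21765 ? S 0:27 sendmail: accepting connections",
]

for status, n in count_mailscanner_states(sample).most_common():
    print(f"{n:3d}  {status}")
```

Feeding it the complete 'ps' output instead of the sample would give the real per-state totals.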
'top' gives the following. I noticed that the server is starting to
use swap at this point:
17:04:48 up 7 days, 17:46, 1 user, load average: 3.66, 2.00, 1.38
271 processes: 269 sleeping, 2 running, 0 zombie, 0 stopped
CPU states: cpu user nice system irq softirq iowait idle
total 43.8% 0.0% 7.3% 0.0% 0.0% 48.7% 0.0%
cpu00 31.3% 0.0% 5.8% 0.0% 0.0% 62.7% 0.0%
cpu01 45.0% 0.0% 13.7% 0.0% 0.0% 41.1% 0.0%
cpu02 37.2% 0.0% 1.9% 0.0% 0.0% 60.7% 0.0%
cpu03 61.7% 0.0% 7.8% 0.0% 0.0% 30.3% 0.0%
Mem:  3082456k av, 3017552k used,   64904k free,       0k shrd,   24568k buff
      2327328k actv,  431144k in_d,   39764k in_c
Swap: 4194232k av,  620560k used, 3573672k free                  236836k cached
  PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME CPU COMMAND
13862 root      22   0 38828  30M  2528 R    16.9  1.0   0:01   3 MailScanner
13785 root      16   0 45444  39M  2568 S     2.9  1.3   0:00   2 MailScanner
13777 root      15   0 44532  25M  2340 D     1.9  0.8   0:00   2 MailScanner
13874 root      21   0  3304 3304  1884 S     1.2  0.1   0:00   3 pyzor
  778 root      17   0 43452  14M  1780 S     0.7  0.4   0:02   3 MailScanner
13524 root      15   0  1276 1276   840 R     0.7  0.0   0:00   0 top
 1460 root      15   0  1112  480   264 S     0.4  0.0  18:25   1 cmaidad
32300 root      15   0 42228 5268   608 S     0.4  0.1   0:02   0 MailScanner
 5102 root      17   0 42804  35M  1620 S     0.4  1.1   0:02   0 MailScanner
 9357 root      15   0   604  548   184 S     0.4  0.0   0:00   2 sshd
   23 root      15   0     0    0     0 SW    0.2  0.0  27:16   2 kjournald
 1434 root      15   0   688  424   252 S     0.2  0.0  44:11   1 hpasmd
 1630 named     15   0 41872  39M   780 S     0.2  1.3  25:27   1 named
29955 root      15   0 41756 7416   608 S     0.2  0.2   0:04   0 MailScanner
 1635 root      15   0 42776  13M  1684 S     0.2  0.4   0:02   2 MailScanner
 3070 root      15   0 42748  36M  1776 S     0.2  1.2   0:02   1 MailScanner
 4055 root      15   0 43788  36M  1680 S     0.2  1.2   0:02   2 MailScanner
 9790 root      15   0 42976  37M  1816 S     0.2  1.2   0:02   2 MailScanner
12365 root      15   0 42192  41M  1448 S     0.2  1.3   0:01   2 MailScanner
12687 root      15   0 43528  42M  2692 S     0.2  1.4   0:01   2 MailScanner
    1 root      22   0   500  468   440 S     0.0  0.0   0:51   2 init
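To put rough numbers on the memory pressure in that 'top' snapshot, the summary lines can be turned into percentages. This is a minimal sketch, not a monitoring tool: `parse_kb_fields` is a made-up helper, the two input strings are copied from the summary above, and the CPU count of 4 is read off the cpu00..cpu03 rows.

```python
import re


def parse_kb_fields(line):
    """Return {label: kilobytes} for fields such as '3082456k av' in a top summary line."""
    return {label: int(kb) for kb, label in re.findall(r"(\d+)k\s+(\w+)", line)}


# Summary lines copied from the 'top' output above.
mem = parse_kb_fields("Mem: 3082456k av, 3017552k used, 64904k free, 0k shrd, 24568k buff")
swap = parse_kb_fields("Swap: 4194232k av, 620560k used, 3573672k free 236836k cached")

mem_used_pct = 100.0 * mem["used"] / mem["av"]
swap_used_pct = 100.0 * swap["used"] / swap["av"]

load_1min = 3.66  # first value of 'load average: 3.66, 2.00, 1.38'
cpus = 4          # cpu00..cpu03 in the CPU states table
load_per_cpu = load_1min / cpus

print(f"memory used: {mem_used_pct:.1f}%")
print(f"swap used:   {swap_used_pct:.1f}%")
print(f"1-min load per CPU: {load_per_cpu:.2f}")
```

Dividing the load average by the CPU count is only a rule of thumb for comparing machines. It says nothing by itself about whether the time is going to CPU or to I/O wait.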
Thanks,
-Thomas
> -----Original Message-----
> From: Thomas Chamtieh
> Sent: Friday, June 16, 2006 5:00 PM
> To: 'MailScanner discussion'
> Subject: RE: Strange HI Load
>
> Steve,
>
> Thanks for your insight. It's totally weird: I have 4 other
> servers running the same version, all identical. These were
> running fine before the upgrade. When I say high LA, I'm talking
> about 70-85%, almost killing the server. On the other 4
> servers I have, the LA never goes above 1.7 and is usually
> about 0.4-0.7, and these servers handle a lot more mail than
> the troubled ones.
>
> Thanks,
>
> -Thomas
>
> >
> >
> > > Hi all,
> > >
> > > After I upgraded from 4.46 to 4.54 I started seeing high load on 2
> > > servers. Looking at the processes, I noticed that after a couple
> > > of hours I have 30-40 MailScanner processes in "waiting for
> > > messages" mode.
> > > I have it set to restart every 30 mins. We process over 200K emails
> > > a day. I try as much as I can to take load off MailScanner; for
> > > example, I use sbl-xbl in sendmail and RBL checking in SpamAssassin,
> > > and I'm not using RulesDuJour. So it shouldn't be acting that way.
> > >
> > > Your help is appreciated. I have to check on these 2 servers every 2
> > > hours and restart MailScanner to get rid of the hung processes.
> >
> > As an afterthought, I have an almost identical server. Its message
> > count per day is very close to the problem server's. I have always had
> > bayes expiry files on the problem server, and almost never on the
> > properly acting one.
> >
> > I see that I have about 4 times the number of tokens in the Bayes
> > database on the problem machine as I have on the proper one. The
> > number of expired tokens on the two machines is really extraordinarily
> > different during an expiry.
> >
> > I used to run a cron job to delete the Bayes expire files just to keep
> > the directory clean, but just turned that off in case I was
> > deleting real, valid files, ... so we'll see.
> >
> > Steve
> > >
> > >
> > > Thanks,
> > >
> > > -Thomas
> > >
> >