Debugging system load - How to start?
glenn.steen at gmail.com
Wed Oct 28 10:32:38 GMT 2009
2009/10/27 Götz Reinicke - IT-Koordinator <goetz.reinicke at filmakademie.de>:
> we run an "old" mailserver system which was set up a couple of years
> ago. The system does "everything" we need(ed). Over the last days I
> noticed an abnormal increase of the system load up to 10 and lots of
> users told me that their mail client connections (sending and receiving)
> are dropped from time to time.
> I was planning to replace the server, or rather distribute the services,
> in the near future anyway, but I'm interested in what causes the load
> now or where the system "hangs" :-)
> There is a lot of work to do and the setup is not very well designed,
> but at the time I started the mailsystem at our place, I had only this
> one server and a lot of user requests ...
> The System:
> Intel Pentium D 3.20GHz, 8 GB RAM, 3Ware 4*320GB Sata II Raid Level 5,
> Gigabit LAN.
> Red Hat EL 5.4 (still 32 Bit)
> About 700 users, 1 GB mailbox quota (mbox), webmail system Horde, an
> average of 3,200 messages per day over the last 12 months.
> The Services and setup:
> Dovecot imap(s) & pop3(s), mailscanner, spamassassin, mysql, bind,
> httpd, sendmail.
> Because there are so many config parameters I'll summarise this a little:
> Mails are checked against two blacklists (heise.de and Spamhaus) by
> sendmail; mailscanner and spamassassin use mostly the default settings.
> Virusscanning is done by avira AV.
> Logging for mailwatch to mysql is activated.
> What tools or logfiles may give me a clue which settings should be
> checked or changed?
> Thanks and best regards,
The usual tools, of course;-)
- "top" or "ps", to see how many processes are in state D... and how
the system is performing (CPU-wise, mostly) just now.
- "vmstat 2" to see if you are memory-starved, to the point where you
are swapping.
- iostat, if you have it, can give a clue on average disk IO problems
("hot spots" etc.). Watch the average queue time and average queue
size.
- sar, with a sane setup, will give you a clue as to what is normal
and what is not.
- smartmon (or similar, depending on RAID controller etc.) will give
you a clue about how your drives are faring.
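
As a rough illustration of the first item in the list, here is a small
Python sketch that counts processes per state and lists the ones stuck in
state D (uninterruptible sleep, usually waiting on disk IO). It assumes a
procps-style "ps" that accepts "-eo state=,pid=,comm="; the function names
are made up for this example and are not part of any tool mentioned here.

```python
import subprocess
from collections import Counter


def count_states(ps_output):
    """Count processes per one-letter state from `ps -eo state=,pid=,comm=` text."""
    counts = Counter()
    for line in ps_output.splitlines():
        fields = line.split(None, 2)
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        counts[fields[0][0]] += 1
    return counts


def blocked_processes(ps_output):
    """Return (pid, command) pairs for every process in state D."""
    blocked = []
    for line in ps_output.splitlines():
        fields = line.split(None, 2)
        if len(fields) == 3 and fields[0].startswith("D"):
            blocked.append((int(fields[1]), fields[2].strip()))
    return blocked


def live_snapshot():
    """Run ps on the local machine (assumes procps option syntax)."""
    result = subprocess.run(
        ["ps", "-eo", "state=,pid=,comm="],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Calling blocked_processes(live_snapshot()) every few seconds while the load
is high should show whether the same daemons keep turning up in state D,
which would point towards disk IO rather than CPU.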
Depending on what you find, you might need to use other tools to
fine-tune specific programs/subsystems.
If you are tight for memory, consider lowering the amount of MS
children dramatically, as a stopgap to get things flowing.
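
To put a number on "lowering the amount of children", a back-of-the-envelope
calculation may help. The sketch below is only arithmetic: the per-child
memory figure and the reserve are values you would have to measure or pick
yourself (for instance from the RSS column in "ps"), and the function name
is invented for this example. The result would be a candidate value for the
child limit in MailScanner.conf (assumed here to be the "Max Children"
setting).

```python
def affordable_children(available_mb, per_child_mb, reserve_mb=0):
    """Estimate how many scanner children fit into memory without swapping.

    available_mb: memory you are willing to give to the children (measured)
    per_child_mb: observed resident size of one child (measured, hypothetical here)
    reserve_mb:   headroom kept back for other services on the same box
    """
    if per_child_mb <= 0:
        raise ValueError("per_child_mb must be positive")
    usable = available_mb - reserve_mb
    # Never go below one child, otherwise no mail is processed at all.
    return max(1, int(usable // per_child_mb))
```

For example, with 1024 MB available, 256 MB held back and children of about
150 MB each, affordable_children(1024, 150, 256) gives 5.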
Also consider any MTA "max load average" cut-off settings... They may
be set too low, if you have an SMP system, and if so ... might explain
the dropped connections.
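
A simplified sketch of how such cut-offs interact with the load average,
assuming sendmail's QueueLA and RefuseLA options are the settings in
question. The threshold values used in the usage note are hypothetical, not
the defaults of any distribution, and the real QueueLA test in sendmail also
takes message priority into account, so treat this as a rough model only.

```python
def load_per_cpu(load_average, cpu_count):
    """Normalise the load average by the number of CPUs/cores."""
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    return load_average / cpu_count


def classify(load_average, queue_la, refuse_la):
    """Rough model of what the MTA does at a given load average.

    queue_la:  load above which mail is queued instead of delivered at once
    refuse_la: load above which new SMTP connections are refused
    """
    if load_average > refuse_la:
        return "refusing-connections"
    if load_average > queue_la:
        return "queue-only"
    return "normal"
```

With hypothetical thresholds of 8 and 12, classify(10.0, 8, 12) returns
"queue-only" and classify(13.0, 8, 12) returns "refusing-connections". On a
dual-core machine a load of 10 is load_per_cpu(10.0, 2) == 5.0 per core, so
thresholds chosen for a single CPU can trigger earlier than intended.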
email: glenn < dot > steen < at > gmail < dot > com
work: glenn < dot > steen < at > ap1 < dot > se