Mysterious MailScanner hangs

Eric McClelland emcc-mailscanner at CTCNET.COM
Fri Jul 11 19:25:36 IST 2003

Hi All,

I have a sporadic problem where MailScanner mysteriously stops picking up inbound MTA spool files.  A 'service MailScanner restart' clears the problem temporarily, for the most part (the inbound MTA queue, normally 0-10 messages, still hovers between 30 and 95 afterwards).  When the problem occurs, there is invariably one MailScanner process taking >90% of the CPU (load usually 1-3), and the problem persists until I intervene (i.e. MailScanner does not kill and restart itself periodically as it normally does).  I've poked around the MTA and MailScanner queues and noticed nothing amiss with any of the messages (except that some appear never to get processed), and the logs offer no clue either.
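To catch the runaway process when the hang happens, something like the following could be run from cron.  This is only a rough sketch: the sample `ps` listing, the PIDs and the /usr/sbin path are made up for illustration, and `find_busy_processes` and `snapshot` are hypothetical helper names, not part of MailScanner.  Note that the %CPU column from `ps` is, on procps at least, averaged over the life of the process, so a process that has only just started spinning may not cross the threshold right away.

```python
import subprocess

# Hypothetical sample of `ps -eo pid,pcpu,args` output (PIDs and paths invented).
SAMPLE = """\
  PID %CPU COMMAND
 1201  0.3 /usr/sbin/MailScanner /etc/MailScanner/MailScanner.conf
 1207 93.5 /usr/sbin/MailScanner /etc/MailScanner/MailScanner.conf
 1544 95.0 /usr/bin/perl other-job
"""

def find_busy_processes(ps_output, name="MailScanner", cpu_threshold=90.0):
    """Return [(pid, cpu)] for processes matching `name` above the CPU threshold."""
    busy = []
    for line in ps_output.splitlines():
        fields = line.split(None, 2)          # pid, %cpu, full command line
        if len(fields) < 3:
            continue
        pid, pcpu, args = fields
        try:
            cpu = float(pcpu)                 # the header line fails here and is skipped
        except ValueError:
            continue
        if name in args and cpu > cpu_threshold:
            busy.append((int(pid), cpu))
    return busy

def snapshot():
    """Take a live process listing in the same three-column format as SAMPLE."""
    result = subprocess.run(["ps", "-eo", "pid,pcpu,args"],
                            capture_output=True, text=True, check=True)
    return result.stdout

if __name__ == "__main__":
    for pid, cpu in find_busy_processes(snapshot()):
        print("possible runaway MailScanner: pid %d at %.1f%% CPU" % (pid, cpu))
```

Logging the PID at the moment of the hang would at least make it possible to look at that one process before restarting the service.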

At this point I'm trying to decide on the next troubleshooting step.  Setting "Debug = yes" in MailScanner.conf merely stops the scanning again, and I see no output.  Then again, I haven't found much documentation on debugging, so perhaps I'm not looking in the right place.

My current setup:
Six servers in a DNS round-robin (i.e. one hostname mapping to six different machine IP addresses).
CPU:            Pentium III / 733 MHz
RAM:            512MB on two servers, 256MB on one, 128MB on three
Distribution:   RedHat 7.3, up2date'd periodically
MailScanner:    4.20-3
All six servers run MailScanner + Postfix + McAfee; no spam checking at this time.
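
For anyone wanting to confirm the round-robin part of a setup like this, a minimal sketch follows.  It assumes the hostname publishes one A record per server; `resolve_all` is a hypothetical helper, and the hostname in the usage comment is a placeholder, not our real one.

```python
import socket

def resolve_all(hostname, port=25):
    """Return the sorted, de-duplicated IPv4 addresses a hostname resolves to."""
    infos = socket.getaddrinfo(hostname, port,
                               family=socket.AF_INET,
                               type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr is (ip, port).
    return sorted({info[4][0] for info in infos})

# Usage (placeholder hostname): a six-server round-robin should list six addresses.
# print(resolve_all("mail.example.com"))
```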

The hardware setup is certainly not ideal, especially since I'm using IDE drives; I have access to suboptimal hardware, but a lot of it.  It's actually easier for me to throw a whole box into the mix than to get a single DIMM larger than 128MB.  For the most part the quantity-over-quality strategy has worked fine, and I've seen this problem occur on all the boxes (again, sporadically), so I don't think the issue is hardware.

Sheer load does not appear to be the issue, either:  a 'service MailScanner restart' will result in an inbound MTA queue being whittled from several thousand messages to under 100 in minutes.  We did ramp up a lot of traffic on these servers on Monday, but the problem did not appear until Tuesday evening / Wednesday morning.
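
The queue behaviour described above could be tracked with a sketch along these lines.  The queue directory is deliberately left as a parameter, since it depends on how the MTA and MailScanner are wired together; `queue_depth` and `looks_stalled` are hypothetical helpers, and the threshold of 10 simply mirrors the "normally 0-10" figure mentioned earlier.

```python
import os

def queue_depth(queue_dir):
    """Count the files under queue_dir, including any hashed subdirectories."""
    return sum(len(files) for _root, _dirs, files in os.walk(queue_dir))

def looks_stalled(depths, normal_max=10):
    """Heuristic: every sample is above normal and the queue never shrinks.

    `depths` is a list of successive queue_depth() readings, oldest first.
    """
    if len(depths) < 2:
        return False
    never_shrinks = all(later >= earlier
                        for earlier, later in zip(depths, depths[1:]))
    return never_shrinks and min(depths) > normal_max
```

A queue that drains after a restart (say 3000, then 800, then 90) is not flagged, while one that only grows is, which matches the difference between heavy load and an actual hang.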

FWIW, we originally saw similar symptoms several weeks ago, back when we ran ClamAV in conjunction with McAfee, but we discovered that the clamav-autoupdate script was hanging; killing that script caused MailScanner to wake up with no need for a restart.  I removed clamav from the Virus Scanners list at that time (I've since seen some postings about this in the list archives).  When the problem occurs now, I see no update scripts running.
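
Checking for a hung update script by hand is easy to get wrong, so here is a rough sketch of how it could be automated.  It assumes `ps -eo pid,etime,args` output, where elapsed time is printed as [[dd-]hh:]mm:ss; the sample listing, the script paths and the 10-minute limit are invented for illustration, and `etime_to_seconds` and `stale_updaters` are hypothetical helpers.

```python
# Hypothetical sample of `ps -eo pid,etime,args` output (PIDs and paths invented).
SAMPLE = """\
    PID     ELAPSED COMMAND
   2310       00:42 /bin/sh /usr/lib/MailScanner/mcafee-autoupdate
   2355  1-02:03:04 /bin/sh /usr/lib/MailScanner/clamav-autoupdate
"""

def etime_to_seconds(etime):
    """Convert a ps elapsed-time string ([[dd-]hh:]mm:ss) to seconds."""
    days = 0
    if "-" in etime:
        day_part, etime = etime.split("-", 1)
        days = int(day_part)
    parts = [int(p) for p in etime.split(":")]
    while len(parts) < 3:                     # pad missing hours with zero
        parts.insert(0, 0)
    hours, minutes, seconds = parts
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds

def stale_updaters(ps_output, pattern="autoupdate", max_age=600):
    """Return [(pid, command)] for matching processes older than max_age seconds."""
    stale = []
    for line in ps_output.splitlines():
        fields = line.split(None, 2)          # pid, elapsed, full command line
        if len(fields) < 3 or not fields[0].isdigit():
            continue                          # skips the header and blank lines
        pid, etime, args = fields
        if pattern in args and etime_to_seconds(etime) > max_age:
            stale.append((int(pid), args))
    return stale
```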

Any suggestions would be appreciated.  Hopefully I've provided enough info without being long-winded. :)


