performance issues tonite

David Vosburgh vosburgh at DALSEMI.COM
Tue Oct 28 13:22:11 GMT 2003

Well, the situation got a whole lot worse that night and the next
morning (several thousand messages backed up) before I finally figured
out what was going on.  Maybe others could find this useful.  It turns
out that an ex-employee, prior to leaving, had set up his corporate mail
to forward to an account he maintained at hotmail.  Apparently, the
hotmail account reached its quota, which caused a message to be bounced
back saying that it could not be delivered.  This message kept going in
a circle, getting progressively larger with each loop.  Others piled on
as he received more email at his corporate address.  Once I stopped
fiddling with the MailScanner config and started looking at the actual
messages in the queue, I noticed well over a hundred that were
roughly the same size (~80 KB).  Looking at the contents clued me in to
what was happening.  The user's alias was subsequently pointed to
/dev/null and his messages cleared out in short order.  The processing
was still extremely slow until I restarted MS.
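
For anyone who wants to check for the same symptom, the "lots of queued
files of roughly the same size" test can be scripted.  What follows is only
a rough sketch, not anything MailScanner provides: the function name, the
4 KB bucket width and the threshold of 100 files are all assumptions chosen
for illustration, and the queue directory you point it at depends on your
own MTA setup.

```python
import os
import tempfile
from collections import Counter


def size_clusters(queue_dir, bucket_bytes=4096, min_count=100):
    """Group files in queue_dir into size buckets and return
    {bucket_start_in_bytes: file_count} for buckets holding at least
    min_count files.  A big cluster of same-sized messages is a hint
    (not proof) that one message is looping."""
    counts = Counter()
    for entry in os.scandir(queue_dir):
        if entry.is_file():
            size = entry.stat().st_size
            # Round the size down to the start of its bucket.
            counts[(size // bucket_bytes) * bucket_bytes] += 1
    return {start: n for start, n in counts.items() if n >= min_count}


if __name__ == "__main__":
    # Demo on a throwaway directory, NOT a real mail queue:
    # 120 files of about 80 KB each plus a few small ones.
    with tempfile.TemporaryDirectory() as demo:
        for i in range(120):
            with open(os.path.join(demo, f"big{i}"), "wb") as fh:
                fh.write(b"x" * (80_000 + i))
        for i in range(5):
            with open(os.path.join(demo, f"small{i}"), "wb") as fh:
                fh.write(b"x" * 2_000)
        print(size_clusters(demo))
```

Fixed-width buckets are crude (a cluster that straddles a bucket boundary
gets split in two), but they are enough to make a pile-up like this one
stand out without opening any of the messages.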

I'm thinking that with our relatively default install, MS was biting off
thirty-message chunks of these large(r) files and wasn't able to process
them quickly enough, causing the SA time-outs we were seeing.  I have
since increased the SA time-out to 40 seconds (from 20) and decreased
the Max Unsafe Messages Per Scan to 10 (from 30).

Any MS experts have any useful suggestions/comments on any other changes
that could help prevent this in the future (besides critiquing our
egress policies AND so long as it doesn't involve $'s)?  Also, is there
any way that MS could be modified to catch a situation like this before
it degenerates too far?  Could MCP be used for this purpose? Thanks.


David Vosburgh wrote:

> I'm using MS 4.21-9 and SA 2.60 (upgraded two weeks ago), no
> DCC/razor/pyzor on a Sun 220R with 1GB memory and 2x450MHz.  Tonight I
> noticed that our incoming mail queue started backing up about 6pm CDT,
> and delivery times were growing quickly.  It peaked about 9:30pm and has
> been working its way back down slowly since then.  I've only got about
> 150 messages in the queue now.  During this time, the maillog is
> indicating that SA is timing out:
> Oct 23 22:03:36 xxxxxxx root: [ID 702911]
> MailScanner:SpamAssassin timed out and was killed, consecutive failure 1
> of 20
> I haven't seen more than one consecutive failure.  I've read on this
> list about this being related to SA's RBL checks, but I've got them
> disabled (skip_rbl_checks 1) in spam.assassin.prefs.conf.  The load on
> the system is about what it usually is (load average between 4 and 6,
> with CPU smacked pretty good but ~85% user).  There's not an abnormal
> amount of mail being processed.  Nothing like with the Sobig outbreak.
> FWIW, I've got Max Children = 5.  Any suggestions on what else could
> cause the timeouts?  Thanks.
> Dave


Dave Vosburgh
Sr. Unix System Administrator
Dallas Semiconductor
vosburgh at  972-371-4418
