MailScanner lock up Problem

Steve Freegard steve.freegard at fsl.com
Fri Sep 7 15:33:34 IST 2007


Hi Paul,

Paul Kelly :: Blacknight Solutions wrote:
> Hi All,
> 
> One of our boxes has developed a weird lock up problem in the last 24 hours.
> 
> Version: 4.63.8
> SA: 3.2.3
> Perl: 5.8.5
> 
> Once started MailScanner will process a few 100 mails and then simply
> lock up.
> 
> e.g.
> 
> mail     22296  0.2  1.3 63840 58056 ?       S    14:03   0:06
> MailScanner: dangerous content scanning
> mail     22325  0.3  1.4 64288 58524 ?       S    14:03   0:06
> MailScanner: waiting for messages
> mail     22384  0.2  1.3 63104 57344 ?       S    14:04   0:05
> MailScanner: checking with SpamAssassin
> mail     22452  0.3  1.4 64412 58632 ?       S    14:04   0:06
> MailScanner: waiting for messages
> mail     22506  0.3  1.4 64364 58520 ?       S    14:04   0:06
> MailScanner: finishing batch
> mail     22543  0.2  1.3 62660 56900 ?       S    14:04   0:05
> MailScanner: checking with SpamAssassin
> mail     22588  0.2  1.3 63504 57652 ?       S    14:04   0:05
> MailScanner: spam checks
> mail     22608  0.2  1.4 64212 58372 ?       S    14:04   0:06
> MailScanner: waiting for messages
> mail     21668  0.0  0.5 26948 21112 ?       Ss   14:03   0:00
> MailScanner: master waiting for children, sleeping
> mail     21669  0.2  1.3 63088 57272 ?       S    14:03   0:05
> MailScanner: waiting for messages
> mail     22176  0.2  1.4 64648 58836 ?       S    14:03   0:06
> MailScanner: dangerous content scanning
> 
> 
> They will stay like that for ever. If you do a restart, the processes
> never die and the init script is unhappy and keeps saying:
> 
> Waiting for MailScanner to die gracefully
> ...................................
> 
> 
> it'll keep going like that, till I have to force a restart.
> 
> An strace on some of the PIDs shows:
> 
> Parent process:
> 
> # strace -p 21668
> Process 21668 attached - interrupt to quit
> waitpid(-1,  <unfinished ...>
> 
> Children:
> 
> # strace -p 21669
> Process 21669 attached - interrupt to quit
> write(5, "<22>Sep  7 14:08:37 MailScanner["..., 77 <unfinished ...>
> 
> # strace -p 22176
> Process 22176 attached - interrupt to quit
> write(5, "<22>Sep  7 14:19:50 MailScanner["..., 113 <unfinished ...>
> 
> No other activity from them. We're running this on CentOS 4.5 and most
> of the important perl modules are current.
> 
> If I run MailScanner --debug --debug-sa, the process exits once its
> batch is done, with no obvious errors or anything.
> 
> Any of you got any ideas??

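By the looks of the strace, my gut tells me that the problem is related
to syslog: the children are all stuck in a write() of what looks like a
syslog message (note the "<22>" priority prefix), which suggests that
whatever should be reading the log socket has stopped doing so.
Hopefully that points you in the right direction.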
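
As a rough illustration of why those write() calls look like syslog
traffic, the leading "<22>" can be decoded as a standard syslog priority
value (facility * 8 + severity). The sketch below is only a helper for
reading strace output; the function name decode_pri is made up for this
example and is not part of MailScanner.

```python
# Syslog facility and severity names, indexed by their numeric codes (RFC 3164).
FACILITIES = ["kern", "user", "mail", "daemon", "auth", "syslog", "lpr", "news",
              "uucp", "cron", "authpriv", "ftp"]
SEVERITIES = ["emerg", "alert", "crit", "err", "warning", "notice", "info", "debug"]


def decode_pri(line):
    """Split the leading <PRI> of a syslog line into (facility, severity).

    decode_pri is a hypothetical helper name used only for this example.
    """
    if not line.startswith("<") or ">" not in line:
        raise ValueError("no syslog priority prefix")
    pri = int(line[1:line.index(">")])
    # The priority is facility * 8 + severity, so divmod recovers both parts.
    facility, severity = divmod(pri, 8)
    if facility < len(FACILITIES):
        name = FACILITIES[facility]
    else:
        name = "facility%d" % facility
    return name, SEVERITIES[severity]


# The string strace showed being written on fd 5:
print(decode_pri("<22>Sep  7 14:08:37 MailScanner["))  # ('mail', 'info')
```

A result of mail/info is what you would expect from a mail scanner
logging through syslog, which fits the idea that the children are
blocked while logging rather than while scanning.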

Also, make sure that 'nscd' isn't running, as that's caused me no end of
trouble in the past with similar symptoms.
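
One way to check for a running nscd, sketched under the assumption of a
Linux /proc filesystem and an available Python 3 interpreter (neither is
stated in the original report; running_pids is a hypothetical helper
name). It reads /proc/<pid>/stat rather than /proc/<pid>/comm, because
the latter does not exist on older 2.6 kernels.

```python
import glob


def running_pids(name):
    """Return the PIDs whose command name in /proc/<pid>/stat equals name.

    running_pids is a hypothetical helper name used only for this example.
    """
    pids = []
    for path in glob.glob("/proc/[0-9]*/stat"):
        try:
            with open(path) as handle:
                stat = handle.read()
        except OSError:
            continue  # the process exited while we were looking
        # The stat line looks like "pid (command) state ...". The command
        # may itself contain parentheses, so use the outermost pair.
        start, end = stat.find("("), stat.rfind(")")
        if start != -1 and end > start and stat[start + 1:end] == name:
            pids.append(int(path.split("/")[2]))
    return sorted(pids)


pids = running_pids("nscd")
print("nscd PIDs: %s" % pids if pids else "nscd is not running")
```

If this does report a PID, stopping nscd and restarting MailScanner
would be a cheap way to rule it out.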

Kind regards,
Steve.

