Upgraded to 4.67.6, MailScanner scans a batch then hangs at 100 percent CPU
Rick Cooper
rcooper at dwford.com
Thu Mar 13 00:43:14 GMT 2008
> -----Original Message-----
> From: mailscanner-bounces at lists.mailscanner.info
> [mailto:mailscanner-bounces at lists.mailscanner.info] On
> Behalf Of Julian Field
> Sent: Wednesday, March 12, 2008 5:51 PM
> To: MailScanner discussion
> Subject: Re: Upgraded to 4.67.6, MailScanner scans a batch
> then hangs at 100 percent CPU
>
[...]
> >> In which case go into "sub Explode" in
> >> /usr/lib/MailScanner/MailScanner/Message.pm, and add some
> >> "print STDERR" lines to generate tracing output so you can see how
> >> far it gets. When you do a "MailScanner --debug" it will show you
> >> the STDERR debug output in the terminal session.
> >>
> >
> >
> > OK, here is what's happening. It's using Explode in MessageBatch.pm
> > and not Message.pm.
> > Here is where it dies in MessageBatch.pm:
> >
> > sub Explode {
> > my $this = shift;
> > print STDERR "messagebatch\n"; #crumley
> >
> > my($key, $message);
> >
> > # jjh 2004-03-12 reap as many as we can.
> > # JKF Test 2004-11-23 1 until waitpid(-1, &POSIX::WNOHANG) == -1;
> > print STDERR "about to hang\n";
> > 1 until waitpid(-1, WNOHANG) == -1;
> > print STDERR "we never get here\n";
> >
> But as the comments in the code show, this code hasn't been touched
> since 2004. So I don't understand why you are just seeing a change in
> behaviour. I would suspect you have upgraded something else in your
> system.
I missed a good part of this thread, and I could go back and read it, but
I will ask instead: have you looked at what the hanging process is doing
with lsof yet? In particular, lsof +r -p with the PID of the hung process?
>
> Are other people seeing the same problem?
> What OS, distro, version, kernel, etc are you running?
> Is anyone else running an identical system?
> If so, are they seeing the same symptoms?
>
> From the "perlfunc" man page:
>     waitpid PID,FLAGS
>         Waits for a particular child process to terminate and returns
>         the pid of the deceased process, or "-1" if there is no such
>         child process.
> so it should reap processes until there aren't any left to be reaped.
> What does the documentation for waitpid say on your system? This is a
> POSIX function, so should be the same across most systems.
>
> If you take out the waitpid() call, you will collect <defunct>
> processes, as they are terminating but never being reaped. So this
> call is very necessary.
>
> I'm not going to touch this code with a 10-foot barge pole unless I
> have *very* good reason to.
>
> Jules
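
One thing that may be worth checking while tracing this: with WNOHANG,
waitpid returns 0 (not -1) when a child exists but has not exited yet, so
a loop that only stops on -1 keeps polling for as long as any child is
still alive. Below is a minimal sketch of the three return values,
assuming a POSIX system with a working non-blocking waitpid. The names
reap_finished_children and demo are made up for illustration and are not
MailScanner code; whether this is what is actually happening on the
affected box is something only the trace output or lsof can confirm.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use POSIX qw(WNOHANG);

# Hypothetical helper, not part of MailScanner: reap every child that has
# already exited, without blocking and without spinning.
# Returns the number of children reaped.
sub reap_finished_children {
    my $reaped = 0;
    while (1) {
        my $pid = waitpid(-1, WNOHANG);
        # -1: no child processes at all.
        #  0: children exist, but none of them has exited yet.
        last if $pid <= 0;
        $reaped++;
    }
    return $reaped;
}

# Hypothetical demonstration: fork one child that stays alive briefly and
# record what waitpid reports before and after it exits.
sub demo {
    my $child = fork();
    die "fork failed: $!" unless defined $child;
    if ($child == 0) {
        sleep 1;        # child: stay alive for a moment
        exit 0;
    }

    my $while_running = waitpid(-1, WNOHANG);       # child is still asleep
    my $reaped_now    = reap_finished_children();   # nothing to reap yet
    waitpid($child, 0);                             # blocking wait for the child
    my $after         = waitpid(-1, WNOHANG);       # no children remain
    return ($while_running, $reaped_now, $after);
}

my ($while_running, $reaped_now, $after) = demo();
print STDERR "while child runs: $while_running\n";
print STDERR "reaped so far:    $reaped_now\n";
print STDERR "after child exit: $after\n";
```

The helper stops on 0 as well as on -1, which is the same pattern the
perlfunc entry for waitpid uses in its own example
(do { ... } while $kid > 0). It still reaps everything that has already
terminated, so no <defunct> processes are left behind by it.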