Upgraded to 4.67.6, MailScanner scans a batch then hangs at 100 percent CPU
Steve Crumley
scrumley at secure-enterprise.com
Wed Mar 12 20:59:47 GMT 2008
> -----Original Message-----
> From: mailscanner-bounces at lists.mailscanner.info
> [mailto:mailscanner-bounces at lists.mailscanner.info] On Behalf
> Of Julian Field
> Sent: Tuesday, March 11, 2008 6:50 PM
> To: MailScanner discussion
> Subject: Re: Upgraded to 4.67.6, MailScanner scans a batch
> then hangs at 100 percent CPU
>
> * PGP Signed by an unverified key: 03/11/08 at 18:50:26
>
>
>
> Steve Crumley wrote:
> >
> >
> >
> >> -----Original Message-----
> >> From: mailscanner-bounces at lists.mailscanner.info
> >> [mailto:mailscanner-bounces at lists.mailscanner.info] On Behalf
> >> Of Glenn Steen
> >> Sent: Tuesday, March 11, 2008 4:32 PM
> >> To: MailScanner discussion
> >> Subject: Re: Upgraded to 4.67.6,MailScanner scans a batch
> >> then hangs at 100 percent CPU
> >>
> >> On 11/03/2008, Steve Crumley <scrumley at secure-enterprise.com> wrote:
> >>
> >>> > -----Original Message-----
> >>> > From: mailscanner-bounces at lists.mailscanner.info
> >>> > [mailto:mailscanner-bounces at lists.mailscanner.info] On Behalf
> >>> > Of Glenn Steen
> >>> > Sent: Tuesday, March 11, 2008 1:21 PM
> >>> > To: MailScanner discussion
> >>> > Subject: Re: Upgraded to 4.67.6,MailScanner scans a batch
> >>> > then hangs at 100 percent CPU
> >>> >
> >>> > On 11/03/2008, Steve Crumley <scrumley at secure-enterprise.com> wrote:
> >>> > >
> >>> > >
> >>> > > > -----Original Message-----
> >>> > > > From: mailscanner-bounces at lists.mailscanner.info
> >>> > > > [mailto:mailscanner-bounces at lists.mailscanner.info] On Behalf
> >>> > > > Of --[ UxBoD ]--
> >>> > > > Sent: Tuesday, March 11, 2008 11:29 AM
> >>> > > > To: MailScanner discussion
> >>> > > > Subject: Re: Upgraded to 4.67.6, MailScanner scans a batch
> >>> > > > then hangs at 100 percent CPU
> >>> > > >
> >>> > >
> >>> > > > do you have strace installed on the server ? if so when the
> >>> > > > process is running at 100% CPU connect to it and see what it
> >>> > > > is doing. I had this before, but for the life of me I cannot
> >>> > > > remember what I changed to fix it :(
> >>> > > >
> >>> > > > Things to check :-
> >>> > > >
> >>> > > > 1) Permissions, are they all correct
> >>> > > > 2) Check MailScanner.conf again just to make sure no typos
> >>> > > >
> >>> > > > Regards,
> >>> > > >
> >>> > > > --
> >>> > >
> >>> > >
> >>> > > Here is the output from strace:
> >>> > >
> >>> > > waitpid(-1, 0xbff09448, WNOHANG) = 0
> >>> > > waitpid(-1, 0xbff09448, WNOHANG) = 0
> >>> > > waitpid(-1, 0xbff09448, WNOHANG) = 0
> >>> > > waitpid(-1, 0xbff09448, WNOHANG) = 0
> >>> > > waitpid(-1, 0xbff09448, WNOHANG) = 0
> >>> > > waitpid(-1, 0xbff09448, WNOHANG) = 0
> >>> > > waitpid(-1, 0xbff09448, WNOHANG) = 0
> >>> > > waitpid(-1, 0xbff09448, WNOHANG) = 0
> >>> > > waitpid(-1, 0xbff09448, WNOHANG) = 0
> >>> > > waitpid(-1, 0xbff09448, WNOHANG) = 0
> >>> > > waitpid(-1, 0xbff09448, WNOHANG) = 0
> >>> > > waitpid(-1, 0xbff09448, WNOHANG) = 0
> >>> > > waitpid(-1, 0xbff09448, WNOHANG) = 0
> >>> > > waitpid(-1, 0xbff09448, WNOHANG) = 0
> >>> > > waitpid(-1, 0xbff09448, WNOHANG) = 0
> >>> > > waitpid(-1, 0xbff09448, WNOHANG) = 0
> >>> > > waitpid(-1, 0xbff09448, WNOHANG) = 0
> >>> > > waitpid(-1, 0xbff09448, WNOHANG) = 0
> >>> > > waitpid(-1, 0xbff09448, WNOHANG) = 0
> >>> > > waitpid(-1, 0xbff09448, WNOHANG) = 0
> >>> > > waitpid(-1, 0xbff09448, WNOHANG) = 0
> >>> > > waitpid(-1, 0xbff09448, WNOHANG) = 0
> >>> > >
> >>> > >
> >>> > >
> >>> > >
> >>> > > The system had been running fine for over a year. I can't find any
> >>> > > permission or setting change that's doing this, but I could be
> >>> > > overlooking something.
> >>> > > Thanks,
> >>> > > -Steve
> >>> > >
> >>> > Could perhaps be a busted SQLite SA cache? What does analyse_s<TAB> (I
> >>> > don't remember if it is sacache or spamassassin_cache ... the command
> >>> > completion should take care of it:-) say? If it looks fishy, simply
> >>> > delete the SA cache file and restart MS.
> >>> >
> >>> > You've run MailScanner --lint, right? Nothing obvious from that?
> >>> >
> >>> > Oh, and what av scanners do you use? Obviously not clamavmodule, but
> >>> > perhaps clamav or clamd? are those OK?
> >>> >
> >>> > Cheers
> >>> > --
> >>> > -- Glenn
> >>> > email: glenn < dot > steen < at > gmail < dot > com
> >>> > work: glenn < dot > steen < at > ap1 < dot > se
> >>>
> >>>
> >>>
> >>> analyse_SpamAssassin_cache looks clean, MailScanner --lint is clean too.
> >>> I'm running clamd for AV but I've set virus scanning to no while working
> >>> on this.
> >>>
> >>> Thanks,
> >>> -Steve
> >>>
> >> Couldn't be something easily mended, huh:-)....
> >>
> >> What you seem to have attached to above (with strace) would be the
> >> main MailScanner process, since it basically just waits for its
> >> children to end... Or is it? What does a ps listing show (one that
> >> shows the command argument list, since Jules rewrites it to show what it
> >> thinks it is basically doing)?
> >> Do the children restart endlessly when hung? How many children are
> >> there, and in what state?
> >> Cheers
> >> -- Glenn
> >>
> >
> >
> >
> > When I first started it with 8 children, they all ended up quickly hanging
> > and consuming CPU. For now, I've set it to 1 child and I've been
> > running in debug mode. The ps gives us a good clue! It's the only
> > mailscanner process and it reports "MailScanner: extracting attachments"
> >
> > Thanks,
> > -Steve
> >
> In which case go into "sub Explode" in
> /usr/lib/MailScanner/MailScanner/Message.pm, and add some "print STDERR"
> lines to generate tracing output so you can see how far it gets. When
> you do a "MailScanner --debug" it will show you the STDERR debug output
> in the terminal session.
OK, here is what's happening. It's using Explode in MessageBatch.pm and
not Message.pm.
Here is where it hangs in MessageBatch.pm:
sub Explode {
  my $this = shift;
  print STDERR "messagebatch\n"; #crumley
  my($key, $message);
  # jjh 2004-03-12 reap as many as we can.
  # JKF Test 2004-11-23 1 until waitpid(-1, &POSIX::WNOHANG) == -1;
  print STDERR "about to hang\n";
  1 until waitpid(-1, WNOHANG) == -1;
  print STDERR "we never get here\n";
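
For reference, here is a minimal sketch of why a loop of that shape can spin. It is not MailScanner's code and not a confirmed fix. With WNOHANG, Perl's waitpid returns 0 while a child exists but has not exited, and -1 only once there are no children left, which is consistent with the "= 0" results in the strace output quoted above. The helper name reap_finished_children and the sleeping child are made up for the illustration.

```perl
use strict;
use warnings;
use POSIX qw(WNOHANG);

# Hypothetical helper: reap every child that has already exited, without
# blocking and without spinning. waitpid returns a pid (> 0) for a reaped
# child, 0 when children exist but none has exited yet, and -1 when there
# are no children at all, so the loop stops on either 0 or -1.
sub reap_finished_children {
    my $reaped = 0;
    while ((my $pid = waitpid(-1, WNOHANG)) > 0) {
        $reaped++;
    }
    return $reaped;
}

# Demonstration with one child that is still running.
my $child = fork();
die "fork failed: $!" unless defined $child;
if ($child == 0) {
    sleep 30;    # stand-in for a child that has not finished yet
    exit 0;
}

# With a live child this is 0, not -1, so a loop written as
# "1 until waitpid(-1, WNOHANG) == -1" would keep polling here.
my $first = waitpid(-1, WNOHANG);
print "waitpid with a live child returned $first\n";

# The helper returns at once because nothing has exited yet.
my $reaped_before = reap_finished_children();
print "children reaped while one is still running: $reaped_before\n";

# Clean up: stop the child, reap it with a blocking wait, then poll again.
kill 'TERM', $child;
waitpid($child, 0);
my $after = waitpid(-1, WNOHANG);
print "waitpid with no children returned $after\n";
```

Whether a change like this is right for MessageBatch.pm depends on why a child is still around at that point, which the trace so far does not show.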