Upgraded to 4.67.6, MailScanner scans a batch then hangs at 100 percent CPU

Glenn Steen glenn.steen at gmail.com
Wed Mar 12 19:05:34 GMT 2008


On 12/03/2008, Steve Crumley <scrumley at secure-enterprise.com> wrote:
>
>
>  > -----Original Message-----
>  > From: mailscanner-bounces at lists.mailscanner.info
>  > [mailto:mailscanner-bounces at lists.mailscanner.info] On Behalf
>
> > Of Julian Field
>  > Sent: Tuesday, March 11, 2008 6:50 PM
>  > To: MailScanner discussion
>  > Subject: Re: Upgraded to 4.67.6, MailScanner scans a batch
>  > then hangs at 100 percent CPU
>  >
>
> > * PGP Signed by an unverified key: 03/11/08 at 18:50:26
>  >
>  >
>  >
>  > Steve Crumley wrote:
>  > >
>  > >
>  > >
>  > >> -----Original Message-----
>  > >> From: mailscanner-bounces at lists.mailscanner.info
>  > >> [mailto:mailscanner-bounces at lists.mailscanner.info] On Behalf
>  > >> Of Glenn Steen
>  > >> Sent: Tuesday, March 11, 2008 4:32 PM
>  > >> To: MailScanner discussion
>  > >> Subject: Re: Upgraded to 4.67.6,MailScanner scans a batch
>  > >> then hangs at 100 percent CPU
>  > >>
>  > >> On 11/03/2008, Steve Crumley
>  > <scrumley at secure-enterprise.com> wrote:
>  > >>
>  > >>>  > -----Original Message-----
>  > >>>  > From: mailscanner-bounces at lists.mailscanner.info
>  > >>>  > [mailto:mailscanner-bounces at lists.mailscanner.info] On Behalf
>  > >>>
>  > >>>
>  > >>>> Of Glenn Steen
>  > >>>>
>  > >>>  > Sent: Tuesday, March 11, 2008 1:21 PM
>  > >>>  > To: MailScanner discussion
>  > >>>  > Subject: Re: Upgraded to 4.67.6,MailScanner scans a batch
>  > >>>  > then hangs at 100 percent CPU
>  > >>>  >
>  > >>>  > On 11/03/2008, Steve Crumley
>  > >>>
>  > >> <scrumley at secure-enterprise.com> wrote:
>  > >>
>  > >>>  > >
>  > >>>  > >
>  > >>>  > >  > -----Original Message-----
>  > >>>  > >  > From: mailscanner-bounces at lists.mailscanner.info
>  > >>>  > >  > [mailto:mailscanner-bounces at lists.mailscanner.info]
>  > >>>
>  > >> On Behalf
>  > >>
>  > >>>  > >  > Of --[ UxBoD ]--
>  > >>>  > >
>  > >>>  > > > Sent: Tuesday, March 11, 2008 11:29 AM
>  > >>>  > >  > To: MailScanner discussion
>  > >>>  > >  > Subject: Re: Upgraded to 4.67.6, MailScanner scans a batch
>  > >>>  > >  > then hangs at 100 percent CPU
>  > >>>  > >  >
>  > >>>  > >
>  > >>>  > >  > do you have strace installed on the server ? if so when the
>  > >>>  > >  > process is running at 100% CPU connect to it and see what it
>  > >>>  > >  > is doing.  I had this before, but for the life of me I cannot
>  > >>>  > >  > remember what I changed to fix it :(
>  > >>>  > >  >
>  > >>>  > >  > Things to check :-
>  > >>>  > >  >
>  > >>>  > >  > 1) Permissions, are they all correct
>  > >>>  > >  > 2) Check MailScanner.conf again just to make sure no typos
>  > >>>  > >  >
>  > >>>  > >  > Regards,
>  > >>>  > >  >
>  > >>>  > >  > --
>  > >>>  > >
>  > >>>  > >
>  > >>>  > > Here is the output from strace:
>  > >>>  > >
>  > >>>  > >  waitpid(-1, 0xbff09448, WNOHANG)        = 0
>  > >>>  > >  waitpid(-1, 0xbff09448, WNOHANG)        = 0
>  > >>>  > >  waitpid(-1, 0xbff09448, WNOHANG)        = 0
>  > >>>  > >  waitpid(-1, 0xbff09448, WNOHANG)        = 0
>  > >>>  > >  waitpid(-1, 0xbff09448, WNOHANG)        = 0
>  > >>>  > >  waitpid(-1, 0xbff09448, WNOHANG)        = 0
>  > >>>  > >  waitpid(-1, 0xbff09448, WNOHANG)        = 0
>  > >>>  > >  waitpid(-1, 0xbff09448, WNOHANG)        = 0
>  > >>>  > >  waitpid(-1, 0xbff09448, WNOHANG)        = 0
>  > >>>  > >  waitpid(-1, 0xbff09448, WNOHANG)        = 0
>  > >>>  > >  waitpid(-1, 0xbff09448, WNOHANG)        = 0
>  > >>>  > >  waitpid(-1, 0xbff09448, WNOHANG)        = 0
>  > >>>  > >  waitpid(-1, 0xbff09448, WNOHANG)        = 0
>  > >>>  > >  waitpid(-1, 0xbff09448, WNOHANG)        = 0
>  > >>>  > >  waitpid(-1, 0xbff09448, WNOHANG)        = 0
>  > >>>  > >  waitpid(-1, 0xbff09448, WNOHANG)        = 0
>  > >>>  > >  waitpid(-1, 0xbff09448, WNOHANG)        = 0
>  > >>>  > >  waitpid(-1, 0xbff09448, WNOHANG)        = 0
>  > >>>  > >  waitpid(-1, 0xbff09448, WNOHANG)        = 0
>  > >>>  > >  waitpid(-1, 0xbff09448, WNOHANG)        = 0
>  > >>>  > >  waitpid(-1, 0xbff09448, WNOHANG)        = 0
>  > >>>  > >  waitpid(-1, 0xbff09448, WNOHANG)        = 0
>  > >>>  > >
>  > >>>  > >
>  > >>>  > >
>  > >>>  > >
>  > >>>  > >  The system had been running fine for over a year, I can't find
>  > >>>  > >  any permission or setting change that's doing this, but I could
>  > >>>  > >  be overlooking something.
>  > >>>  > >  Thanks,
>  > >>>  > >  -Steve
>  > >>>  > >
>  > >>>  > Could perhaps be a busted SQLite SA cache? What does analyse_s<TAB>
>  > >>>  > (I don't remember if it is sacache or spamassassin_cache ... the
>  > >>>  > command completion should take care of it:-) say? If it looks
>  > >>>  > fishy, simply delete the SA cache file and restart MS.
>  > >>>  >
>  > >>>  > You've run MailScanner --lint, right? Nothing obvious from that?
>  > >>>  >
>  > >>>  > Oh, and what av scanners do you use? Obviously not clamavmodule,
>  > >>>  > but perhaps clamav or clamd? are those OK?
>  > >>>  >
>  > >>>  > Cheers
>  > >>>  > --
>  > >>>  > -- Glenn
>  > >>>  > email: glenn < dot > steen < at > gmail < dot > com
>  > >>>  > work: glenn < dot > steen < at > ap1 < dot > se
>  > >>>
>  > >>>
>  > >>>> --
>  > >>>>
>  > >>>  > MailScanner mailing list
>  > >>>  > mailscanner at lists.mailscanner.info
>  > >>>  > http://lists.mailscanner.info/mailman/listinfo/mailscanner
>  > >>>  >
>  > >>>  > Before posting, read http://wiki.mailscanner.info/posting
>  > >>>  >
>  > >>>  > Support MailScanner development - buy the book off the website!
>  > >>>  >
>  > >>>
>  > >>>
>  > >>>
>  > >>> analyse_SpamAssassin_cache looks clean, MailScanner --lint is
>  > >>> clean too.  I'm running clamd for AV but I've set virus scanning
>  > >>> to no while working on this.
>  > >>>
>  > >>> Thanks,
>  > >>>  -Steve
>  > >>>
>  > >> Couldn't be something easily mended, huh:-)....
>  > >>
>  > >> What you seem to have attached to above (with strace) would be the
>  > >> main MailScanner process, since it basically just waits for its
>  > >> children to end... Or is it? What does a ps listing show (one that
>  > >> shows the command argument list, since Jules rewrites it to show
>  > >> what it thinks it is basically doing)?
>  > >> Do the children restart endlessly when hung? How many children are
>  > >> there, and in what state?
>  > >> Cheers
>  > >> -- Glenn
>  > >>
>  > >
>  > >
>  > >
>  > > When I first started it with 8 children, they all ended up quickly
>  > > hanging and consuming CPU.  For now, I've set it to 1 child and I've
>  > > been running in debug mode.  The ps gives us a good clue!  It's the
>  > > only mailscanner process and it reports "MailScanner: extracting
>  > > attachments"
>  > >
>  > > Thanks,
>  > > -Steve
>  > >
>  > In which case go into "sub Explode" in
>  > /usr/lib/MailScanner/MailScanner/Message.pm, and add some
>  > "print STDERR" lines to generate tracing output so you can see how
>  > far it gets. When you do a "MailScanner --debug" it will show you
>  > the STDERR debug output in the terminal session.
>  >
>  > Jules
>  >
>
>
>  There's something very screwed up with my perl.  I've put "print"s in
>  MailScanner around the call to Explode and I put a print first thing in
>  Explode.  I get the output right before the call but nothing from
>  Explode itself and we never return to MailScanner.
>
>  I really appreciate everyone's help with this.
>  Thanks,
>  -Steve
>
I wonder if STDERR is unbuffered (too lazy/tired to go look it up...:)
... Jules? If it isn't, you might need to make it unbuffered to get
reliable error printing...
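
Something along these lines is what I have in mind: a minimal sketch,
assuming the tracing is done with plain Perl prints. As far as I know Perl
leaves STDERR unbuffered by default while STDOUT is buffered, so a bare
"print" (which goes to STDOUT) is the more likely one to lose its output
when the process hangs. The trace_line and trace helpers are made up for
illustration, they are not part of MailScanner:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use IO::Handle;    # provides the autoflush() method on filehandles

# Turn on autoflush so each print is written out immediately instead of
# sitting in a buffer that is lost if the process hangs or is killed.
STDERR->autoflush(1);
STDOUT->autoflush(1);

# Hypothetical helper (not part of MailScanner): builds one trace line
# tagged with the process id and the current time.
sub trace_line {
    my ($msg) = @_;
    return sprintf("TRACE pid=%d time=%d %s\n", $$, time(), $msg);
}

# Hypothetical helper: writes the trace line to STDERR.
sub trace {
    my ($msg) = @_;
    print STDERR trace_line($msg);
    return;
}

# Example markers, as one might place around and inside a suspect call.
trace("about to call Explode");
trace("first line inside Explode");
```

With the two autoflush lines near the top of the script, the same calls
dropped in around the call to Explode and as the first statement inside it
should show up in the "MailScanner --debug" terminal as soon as they run.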

Cheers
-- 
-- Glenn
email: glenn < dot > steen < at > gmail < dot > com
work: glenn < dot > steen < at > ap1 < dot > se

