Fwd: Mailscanner child freezes

Glenn Steen glenn.steen at gmail.com
Fri Nov 21 15:10:09 GMT 2008


Guys,

I know my quoting style will drive you nuts, but ... please look at this.
It's a heads-up for 4.72.5: keep a lookout for children busy-looping
while "cleaning messages".
Hopefully Jules, or one of you, will have a solution ... really quick.

Cheers
-- Glenn


---------- Forwarded message ----------
From: Glenn Steen <glenn.steen at gmail.com>
Date: 2008/11/21
Subject: Re: Mailscanner child freezes
To: Jeffrey Haas <jeff at life.illinois.edu>


2008/11/21 Glenn Steen <glenn.steen at gmail.com>:
> 2008/11/21 Glenn Steen <glenn.steen at gmail.com>:
>> Right, I've now been able to reproduce this on a machine (I just
>> thought I had updated to 4.72.5:-). It is, as you say, very consistent.
>> The error is located in Message.pm around line 3837, where we enter a
>> while loop trying to parse a message so that we reliably can "clean
>> out" bad attachments. For some reason, this parsing never terminates
>> (the call never returns "") so it in effect becomes an endless loop. The
>> code in question, with a little added debug print, looks like:
>>
>>        # Find the top-level parent's entity
>>        while ($this->{file2parent}{$file} ne "") {
>>          print STDERR ".";   # added debug: one dot per hop up the chain
>>          $file = $this->{file2parent}{$file};
>>        }
>>
>> I haven't been able (yet) to determine what's wrong here, and am
>> thinking of looking at the latest beta to try to determine if that code
>> looks the same there.
>> Perhaps a smarter debug....:-). Hm. Seems that this is a hash
>> constructed when parsing the MIME message, so that "" would be at the
>> root... But in this case "letter.zip" points back to "letter.zip". No
>> "" in sight... Maybe there was a blurb about this after the latest
>> release, but I don't think so. Probably has something to do with MIME
>> parsing etc, the code (at a glance) looks pretty solid.
>>
>> Jules!
>> Could you take a look, pretty please?
>> In the meantime, I'll download 4.73.1 to see if this code has changed.
>>
>> Cheers
>> -- Glenn
>>
>
> Jules and Jeff!
>
> I now think I know what's making these loop forever. The message
> typically contains a zip file named XXXX which in turn contains a zip
> file XXXX (that is: the same name), which all gets handled by renaming
> the "inner" zip file with a number tagged on to the "base name"...
> That zip in turn contains an obfuscated executable file (long run of
> whitespace before the double extension).
> The filename is handled when unpacking, but not when constructing the
> filename->parent hash chain. So we end up with a chain looking like
> "letter.zip"->"letter.zip", since the second time we store the
> filename ... it'll overwrite the preexisting "letter.zip"->"".
>
> I'm off to the last performance of Show Boat (www.showboat.se, if any
> of you fancy reading about it ... in Swedish:-), so deciding on how to
> fix this, either by using a sanitized name in the hash or by some
> loop-detection when using it, is entirely in your court Jules... As
> always:-):-).
>
> Strange that so few have noticed this bug.
>
> Cheers
> -- Glenn
>
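
To make the overwrite concrete, here is a minimal Perl sketch. It is not
MailScanner's code: build_parent_map and the [name, parent] pairs are made
up for illustration, and the real hash is filled while the message is being
unpacked. It only shows why keying by the name inside the archive loses the
root entry, while keying by the renamed on-disk name (letter1.zip) keeps it.

```perl
use strict;
use warnings;

# Hypothetical helper, not MailScanner code: build a child->parent map from
# [name, parent] pairs, in the order the files were unpacked.
sub build_parent_map {
    my (@pairs) = @_;
    my %file2parent;
    $file2parent{ $_->[0] } = $_->[1] for @pairs;   # a later store wins
    return \%file2parent;
}

# Keyed by the name inside the archive: the inner zip replaces the root entry.
my $raw = build_parent_map([ "letter.zip", "" ], [ "letter.zip", "letter.zip" ]);

# Keyed by the name the unpacker wrote to disk: the root entry survives.
my $renamed = build_parent_map([ "letter.zip", "" ], [ "letter1.zip", "letter.zip" ]);

printf "raw:     letter.zip -> '%s'\n", $raw->{"letter.zip"};
printf "renamed: letter.zip -> '%s'\n", $renamed->{"letter.zip"};
```

With the raw names the only entry left is "letter.zip" => "letter.zip", so a
walk towards "" never ends; with the renamed key the chain still terminates.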

An example of how it looks in the quarantine:

[root at apmx06 52D0E1008122.7666A]# unzip -lv letter.zip
Archive:  letter.zip
 Length   Method    Size  Ratio   Date   Time   CRC-32    Name
--------  ------  ------- -----   ----   ----   ------    ----
  29078  Stored    29078   0%  11-14-08 12:30  029488a5  letter.zip
--------          -------  ---                            -------
  29078            29078   0%                            1 file
[root at apmx06 52D0E1008122.7666A]# unzip -lv letter1.zip
Archive:  letter1.zip
 Length   Method    Size  Ratio   Date   Time   CRC-32    Name
--------  ------  ------- -----   ----   ----   ------    ----
  28864  Stored    28864   0%  11-14-08 12:30  6d9e5302  letter.htm
                                        .exe
--------          -------  ---                            -------
  28864            28864   0%                            1 file
[root at apmx06 52D0E1008122.7666A]# ls -l
totalt 131
-rw-rw---- 1 postfix apache 29078 nov 21 14:49 letter1.zip
-rw-rw---- 1 postfix apache 28864 nov 21 14:49 letter.htm.exe
-rw-rw---- 1 postfix apache 29196 nov 21 14:49 letter.zip
-rw-rw---- 1 postfix apache 40492 nov 21 14:49 message
[root at apmx06 52D0E1008122.7666A]#
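
A loop-detection guard, one of the two fixes suggested above, could look
roughly like this. Again a sketch under assumptions: find_top_level_parent
is a hypothetical helper, not a function in Message.pm, and returning undef
when a cycle is found is just one possible policy.

```perl
use strict;
use warnings;

# Hypothetical helper, not MailScanner code: walk a child->parent hash up to
# the root (the entry whose parent is ""), but give up if a name repeats.
sub find_top_level_parent {
    my ($file2parent, $file) = @_;
    my %seen;
    while (defined $file2parent->{$file} && $file2parent->{$file} ne "") {
        # Visiting the same name twice means the chain loops back on itself.
        return undef if $seen{$file}++;
        $file = $file2parent->{$file};
    }
    return $file;
}

# Healthy chain: the inner zip is stored under its renamed form.
my %good = (
    "letter.zip"     => "",
    "letter1.zip"    => "letter.zip",
    "letter.htm.exe" => "letter1.zip",
);

# Broken chain: the inner "letter.zip" overwrote the root entry.
my %bad = (
    "letter.zip"     => "letter.zip",
    "letter.htm.exe" => "letter.zip",
);

my $top_good = find_top_level_parent(\%good, "letter.htm.exe");
my $top_bad  = find_top_level_parent(\%bad,  "letter.htm.exe");

print "good chain: $top_good\n";
print "bad chain:  ", (defined $top_bad ? $top_bad : "cycle detected"), "\n";
```

The guard only stops the busy loop; the caller would still have to decide
what to do with a message whose chain cannot be resolved.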

Cheers
-- Glenn

>
>> 2008/11/21 Jeffrey Haas <jeff at life.illinois.edu>
>>>
>>> When the MailScanner process is stuck, 'ps auwx' shows:
>>> postfix   6281 73.6  1.2  55256 51356 ?        R    12:40 233:52 MailScanner: cleaning messages
>>>
>>> 'top' looks like:
>>>  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>>>
>>>  6281 postfix   25   0 55256  50m 3820 R  100  1.3 235:41.34 MailScanner
>>>
>>> I installed MailScanner and clamd from the tarballs from mailscanner.info.
>>>
>>> I suppose I won't be surprised if it's something odd on my systems, but these systems are dedicated to running postfix and MailScanner, and I try to keep the configuration as simple as possible.  The basic procedure to set them up is to load Ubuntu Server 8.04 (LTS) with default options.  Then the minimal changes to configure postfix for our needs, then MailScanner/Perl modules/ClamAV/SpamAssassin from the tarballs and then minimal configuration changes again.  That's about it.
>>>
>>> I think spam.assassin.prefs.conf is the most heavily edited, since we add some rules there.  I could send diffs or config options that are changed from the default for filename.rules.conf, MailScanner.conf and spam.assassin.prefs.conf if anyone would want to look at them.
>>>
>>> We continue to receive messages that cause the children to freeze. Usually 1 - 3 per day.
>>>
>>> Actually here are all the settings changed in MailScanner.conf as a start:
>>> %org-name% = UIUC-Life-Sciences
>>> %org-long-name% = UIUC Life Sciences
>>> %web-site% = www.life.uiuc.edu
>>> Max Children = 10
>>> Run As User = postfix
>>> Run As Group = postfix
>>> Incoming Queue Dir = /var/spool/postfix/hold
>>> Outgoing Queue Dir =  /var/spool/postfix/incoming
>>> MTA = postfix
>>> Incoming Work Group = clamav
>>> Incoming Work Permissions = 0640
>>> Max Unscanned Messages Per Scan = 1
>>> Max Unsafe Messages Per Scan = 1
>>> Virus Scanners = clamd
>>> Clamd Socket = /tmp/clamd.socket
>>> Find Phishing Fraud = no
>>> Use Stricter Phishing Net = no
>>> Allow WebBugs = yes
>>> SpamScore Number Instead Of Stars = yes
>>> Information Header Value = Please contact help at life.illinois.edu for more information
>>> Always Include SpamAssassin Report = yes
>>> Sign Clean Messages = no
>>> Notify Senders = no
>>> Disarmed Modify Subject = no
>>> Is Definitely Spam = %rules-dir%/spam.blacklist.rules
>>> High SpamAssassin Score = 25
>>> SpamAssassin Auto Whitelist = no
>>> Rebuild Bayes Every = 86400
>>> Wait During Bayes Rebuild = yes
>>> High Scoring Spam Actions = delete
>>> Syslog Facility = local0
>>> Log Spam = yes
>>> Log Non Spam = yes
>>> SpamAssassin User State Dir = /var/spool/MailScanner/spamassassin
>>>
>>> The only configuration I've made to the clamav.conf file was to set:
>>> LogFile /var/log/clamav/clamav.log
>>>
>>> and in freshclam.conf:
>>> UpdateLogFile /var/log/clamav/freshclam.log
>>>
>>> As always, thanks for any ideas.
>>>
>>> --jeff
>>>
>>> Glenn Steen wrote:
>>>>
>>>> (Sorry for the resend... forgot to "Reply all"... Sigh:)
>>>>
>>>> 2008/11/18 Jeffrey Haas <jeff at life.illinois.edu>
>>>>>
>>>>> Hi Glenn -
>>>>>
>>>> Hi Jeff,
>>>>
>>>> Sorry for the somewhat late reply... I've been ... busy with work...
>>>>
>>>>> Hope you had fun singing.  That sounds like a great gig!
>>>>
>>>> Always fun... Last performance tomorrow (I've done 13-14 shows of the
>>>> 40 they're giving), so it'll be a strange mix of loss (it's real fun)
>>>> and relief (since it "eats" a lot of time and energy). Oh well.
>>>>
>>>>
>>>>> I tried to test the SpamAssassin cache theory.
>>>>>
>>>>> My SpamAssassin cache settings are:
>>>>> Cache SpamAssassin Results = yes
>>>>> SpamAssassin Cache Database File = /var/spool/MailScanner/incoming/SpamAssassin.cache.db
>>>>>
>>>>> I tried stopping MailScanner, deleting the SpamAssassin.cache.db file (everything in /var/spool/MailScanner/incoming actually).  Then I restarted MailScanner and sent myself the troublesome message with:
>>>>> sendmail -t jeff at life.illinois.edu < bad.msg
>>>>>
>>>>> One MailScanner child process went to 100% and stopped processing mail.
>>>>>
>>>>> I stopped MailScanner, deleted the message with postsuper and set:
>>>>> Cache SpamAssassin Results = no
>>>>> to try eliminating the cache altogether.  But that test had the same result.  I connected to the troublesome process with strace, but there was no output.
>>>>>
>>>>> I also performed these tests on both our Ubuntu 7.10 & 8.04 systems.  I find some comfort in the idea that the results are consistent and repeatable (on my systems anyway).
>>>>>
>>>> I've tested the ones you sent... none of them throws my system ... it
>>>> detects the viruses quite OK actually...
>>>> Might this be something to do with your clamd? What does the
>>>> MailScanner process say it's doing while eating all the CPU (MS
>>>> rewrites the command line, as you know, so hitting "c" in top... or
>>>> using "ps -ef" or "ps auxww" would show what the child thinks it is
>>>> supposed to do)?
>>>>
>>>> Might be that my testbed isn't enough like yours, so I'll see if I get
>>>> any time for that tomorrow (had planned to look further at this this
>>>> evening, but the SecurID ACE server and the VPN decided to hate each
>>>> other:-).
>>>>
>>>>> I've cc'd Jules on this message in case he'd like to access the bad messages or the strace output.  I'd be happy to provide more info or run tests.  Just let me know what you think might be useful.
>>>>>
>>>>> --jeff
>>>>
>>>> If Jules can tear himself away from playing with Root and Cisco...:-).
>>>> Anyway, all good things willing, I'll know more tomorrow. Gut feeling
>>>> is that it is something specific to your installs, or else there would
>>>> be more on the list about this... MyDoom variants are pretty common,
>>>> after all. You installed MS from the tarball? And clamd from ...?
>>>>
>>>> Cheers
>>>> --
>>>> -- Glenn
>>>>>
>>>>> Glenn Steen wrote:
>>>>>>
>>>>>> 2008/11/16 Jeffrey Haas <jeff at life.illinois.edu>:
>>>>>>>
>>>>>>> Here are the URLs
>>>>>>>
>>>>>>> Bad messages at:
>>>>>>> <http://www.life.uiuc.edu/jeff/ms_freeze.tgz>
>>>>>>>
>>>>>>> strace output at:
>>>>>>> <http://www.life.uiuc.edu/jeff/ms_strace.tgz>
>>>>>>>
>>>>>>> Thanks again.
>>>>>>>
>>>>>>> --jeff
>>>>>>>
>>>>>> Hi Jeff,
>>>>>>
>>>>>> I'll not have time to look into these until tomorrow, unfortunately....
>>>>>> I'm on my way to my hobby project (singing backstage on a professional
>>>>>> musical... Show Boat... with a mixed South African/Swedish crew... Lot
>>>>>> of work, loads of fun:-), which'll take the rest of the day. You could
>>>>>> try two things:
>>>>>> - Send the same links to Jules (mailscanner at ecs.soton.ac.uk), and
>>>>>> - clear your SpamAssassin result cache database. You do that by
>>>>>> removing the SQLite files... Seeing as in the other message (that I
>>>>>> don't have time to look thoroughly at now) your findings do point a
>>>>>> finger at that... Then try dropping in the "problem queue files"
>>>>>> again.
>>>>>>
>>>>>> Cheers
>>>>
>>>>
>>>>
>>>>
>>
>>
>>
>> --
>> -- Glenn
>> email: glenn < dot > steen < at > gmail < dot > com
>> work: glenn < dot > steen < at > ap1 < dot > se
>>

