Race condition on restart

David While dwhile at while.org.uk
Wed Apr 26 16:38:52 UTC 2017


There is a bug in MailScanner which leads to a race condition.

The init script which starts/stops & restarts MailScanner removes the 
PID file when MailScanner is stopped.

Unfortunately, when MailScanner receives a SIGTERM (as it does on a 
restart or stop), it also removes the PID file itself (line 1419 in 
MailScanner).

What is happening is that the init script issues the kill and then 
continues. The kill does not wait for MailScanner to exit; it only sends 
the signal, and MailScanner does some processing before dying. 
Consequently there is a race as to who will remove the PID file. The 
worst case is that it is removed after MailScanner has been restarted, 
leaving the new instance without a PID file. This leads to the hourly 
ms-check constantly restarting MailScanner.
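
The asynchronous behaviour of kill can be seen in a small stand-alone 
sketch. The background subshell below is only a stand-in for MailScanner 
and the temporary file a stand-in for its PID file; neither is taken from 
the real init script, and the delays are arbitrary.

```shell
#!/bin/bash
# Stand-in for the PID file (hypothetical; not the real $PIDFILE).
pidfile=$(mktemp)

# Stand-in for the daemon: on SIGTERM it does some work (sleep 1),
# then removes the PID file and exits.
(
    trap 'sleep 1; rm -f "$pidfile"; exit 0' TERM
    while :; do sleep 0.1; done
) &
daemon_pid=$!

sleep 0.3                   # give the subshell time to install its trap
kill -TERM "$daemon_pid"    # returns as soon as the signal is sent

# The process is still cleaning up, so the PID file has not gone yet.
if [ -f "$pidfile" ]; then still_present=yes; else still_present=no; fi
echo "PID file present right after kill: $still_present"

# Only once the process has really exited is the file gone.
wait "$daemon_pid"
if [ -f "$pidfile" ]; then after_exit=yes; else after_exit=no; fi
echo "PID file present after the process exited: $after_exit"
```

Anything the init script does between the kill and the real exit, such 
as starting a new MailScanner, overlaps with the old process's cleanup.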

I fixed this in the init script by:

1. Removing the following lines from do_stop

                     # remove pid file
                     if [ -f $PIDFILE ] ; then
                         rm -f $PIDFILE
                     fi

2. Moving the following lines from restart into do_stop, in place of the 
lines removed above

         s='-\|/';
         x=0
         i=0
         while [ "$x" -lt 300 -a -f $PIDFILE ]; do
            x=$((x+1));
            i=$(( (i+1) %4 ));
            printf "\r${s:$i:1}";
            sleep .1;
         done

What this does is wait for MailScanner to die and remove the PID file 
before continuing.

I don't think this is a permanent fix, as there ought to be a check that 
the wait has not timed out before continuing.
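
A minimal sketch of what such a timeout check might look like. The 
function name wait_for_pidfile_removal and its arguments are my own 
invention for illustration and are not part of the MailScanner init 
script; the limit of 300 polls of 0.1 seconds mirrors the loop above.

```shell
#!/bin/bash
# Hypothetical helper (not from the MailScanner init script).
# Waits until the PID file disappears, polling every 0.1 seconds.
#   $1 = path to the PID file
#   $2 = maximum number of polls (default 300, i.e. about 30 seconds)
# Returns 0 if the file went away, 1 if the wait timed out.
wait_for_pidfile_removal() {
    local pidfile=$1
    local max_polls=${2:-300}
    local polls=0

    while [ -f "$pidfile" ]; do
        if [ "$polls" -ge "$max_polls" ]; then
            return 1    # timed out; the PID file is still there
        fi
        polls=$((polls + 1))
        sleep 0.1
    done
    return 0
}

# Possible use inside do_stop (assumed, untested against the real script):
#
#   if ! wait_for_pidfile_removal "$PIDFILE" 300; then
#       echo "MailScanner did not remove its PID file in time" >&2
#       return 1
#   fi
```

With a return status like this, restart could refuse to start a new 
MailScanner while the old one is still shutting down, rather than 
carrying on regardless once the loop ends.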

David While


