Sendmail problems on RHEL5 (and solution)

Plant, Dean dean.plant at roke.co.uk
Tue Sep 11 10:03:00 IST 2007


Denis Beauchemin wrote:
> Hello all,
> 
> Ever since I switched to my new RHEL5 MS servers I was noticing many
> errors like these:
> Sep  7 00:10:36 132.210.244.13 sendmail[6929]: l873tB1s006929:
> collect: premature EOM: unexpected close
> Sep  7 00:10:36 132.210.244.13 sendmail[6929]: l873tB1s006929:
> collect: unexpected close on connection from pobox.sfu.ca,
> sender=<someone at sfu.ca> 
> 
> I could get thousands of these in a day, and they caused delivery
> delays that were starting to seriously annoy my users, because the
> connections were coming from legitimate servers.  I was also annoyed
> because the boxes were running more and more sendmail processes.
> 
> We finally tracked it down to a faulty TCP/IP default setup on RHEL5!
> To correct the problem I had to:
> sysctl -w net.ipv4.tcp_wmem="4096 16384 131072"
> sysctl -w net.ipv4.tcp_rmem="4096 87380 174760"
> 
> and modify /etc/sysctl.conf :
> net.ipv4.tcp_wmem = 4096 16384 131072
> net.ipv4.tcp_rmem = 4096 87380 174760
> 
> For some unknown reason the TCP/IP stack was telling some remote hosts
> to use a really small window size and this resulted in some equipment
> down the line breaking the connection.  It happened more often with
> big emails (the ones with attachments).
> 
> I don't know if this bug is also present on CentOS5, but it might
> be... 
> 
> The following commands might help you find out if you have the problem
> (quick hack):
> grep "unexpected close on connection" /var/log/maillog | perl -ne '
>   next unless /collect: unexpected close on connection from ([^,]+),/;
>   $f{$1}++;
>   END{
>     foreach $i (sort keys %f){
>       printf "%25s : %d\n", $i, $f{$i};
>     }
>   }' | sort -k3n | tail
> 
> If you see some servers with hundreds of errors, you may have the
> problem... 
> 
> Denis

This might be related: when we moved to CentOS 5 we had issues with TCP
connections stalling, and traced them to a broken firewall mishandling
TCP window scaling. This only happened when transmitting larger amounts
of data.

This is a known symptom of some broken firewalls which rewrite (rather
than remove) this option. This means that one end thinks a different
window scale is being used to the other, and things break.
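
To make the mismatch concrete, here is a rough sketch of the arithmetic.
Under RFC 1323 each end takes the 16-bit window field from the TCP
header and shifts it left by the scale agreed in the SYN. The function
name effective_window and the numbers are made up for illustration;
they are not from our captures.

```shell
#!/bin/sh
# effective_window RAW SHIFT  (hypothetical helper, for illustration)
# The usable window is the 16-bit window field shifted left by the
# scale taken from the SYN (RFC 1323 allows a shift count of 0 to 14).
effective_window() {
    raw=$1
    shift_count=$2
    echo $(( raw << shift_count ))
}

# Illustrative numbers only, not taken from a real capture.
# The receiver negotiated a shift of 7 and advertises a raw window of 46.
echo "receiver means: $(effective_window 46 7) bytes"
# A firewall rewrote the option, so the sender believes the shift is 0.
echo "sender assumes: $(effective_window 46 0) bytes"
```

With these numbers the receiver is offering 5888 bytes but the sender
believes it may only have 46 bytes in flight, which would fit with
small transfers working and large ones crawling or stalling.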

To see if this is affecting you, you can
echo 0 > /proc/sys/net/ipv4/tcp_window_scaling on the RHEL 5 box. This
was our workaround until we had a patch from the firewall vendor.
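
Before changing anything it may be worth noting the current values, so
you can put them back afterwards. A minimal sketch that prints the
three settings mentioned in this thread follows. The function name
show_tcp_settings and its optional directory argument are my own
additions (the argument only exists so the function can be pointed at
a copy of the files); by default it reads /proc/sys.

```shell
#!/bin/sh
# show_tcp_settings [ROOT]  (hypothetical helper, for illustration)
# Prints the TCP settings discussed in this thread.
# ROOT defaults to /proc/sys; reading these files does not need root.
show_tcp_settings() {
    root="${1:-/proc/sys}"
    for key in tcp_window_scaling tcp_rmem tcp_wmem; do
        f="$root/net/ipv4/$key"
        if [ -r "$f" ]; then
            printf 'net.ipv4.%s = %s\n' "$key" "$(cat "$f")"
        else
            printf 'net.ipv4.%s = (not readable)\n' "$key"
        fi
    done
}

# Print the values currently in effect on this machine.
show_tcp_settings
```

Keep in mind that a value written with echo into /proc is lost at the
next reboot, so it is only suitable as a test.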

Dean

