sendmail message splitting defeats bandwidth savings?

Mon Nov 3 16:47:57 GMT 2003

Furnish, Trever G wrote:

> I've had the sendmail message splitting running fine for a while, since it
> was the only way to get MailScanner whitelisting to be controlled granularly
> (per user instead of per message).
>
> However, I'm concerned that the resulting increase in mail-related bandwidth
> consumption is many times greater than the actual savings being gained by
> filtering out spam.  (Yes, I realize bandwidth savings aren't the only
> reason to filter spam, but they are important.)
>
> I'm unclear on why, technically, it makes sense to use sendmail queue
> groups to split every message instead of having MS split messages only
> when needed.  It seems MS is already decoding the messages (MCP, HTML tag
> checking, etc), so the increase in CPU ought to be negligible.  All
> messages landing in mqueue should pass through MS, so MS can easily ensure
> it doesn't create duplicate queue IDs.
>
> Hopefully I'm just incorrect in my current understanding of what happens
> when a message is split by sendmail (please correct me if so), but this is
> how I think things change when queue groups are used:
>
> Without queue group message splitting:
> 1. One message comes in meant for many recipients at the same domain.
> 2. Sendmail writes one queue file pair.
> 3. MailScanner scans and re-queues that message.
> 4. Sendmail delivers the message, sending it ONLY ONCE over the wire to the
> next MX.
>
> With queue group message splitting:
> 1. One message comes in meant for many recipients at the same domain.
> 2. Sendmail writes many queue file pairs.
> 3. MailScanner scans and re-queues all of the (now many) messages.
> 4. Sendmail delivers the messages, one copy per recipient, resulting in the
> original message being sent MANY TIMES over the wire to the next MX.
>
> The message splitting feature applies to ALL messages, not just spam.  This
> means that we may drastically increase our bandwidth usage just by turning
> it on, regardless of whether we're doing spam checking.  I've already seen a
> few instances where the reason our internal WAN links were pegged for an
> hour could be directly traced to this change in delivery architecture.
>
> By contrast, what I'd prefer MS to do is: if a message comes in bound for
> multiple recipients and only a few of those recipients should be handled
> specially (whitelisted), create separate copies of the message for those
> recipients, queuing the files into mqueue by generating its own IDs.
>
> To be fair, I realize this is probably not a big concern for most sites, but
> for a site with mail delivery to remote mailbox servers over many expensive
> WAN links, this can be a significant problem.
>
> Any suggestions or corrections would be appreciated.
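
As a rough illustration of the two delivery paths described in the quoted message, the sketch below compares the bytes sent to the next MX with and without splitting. It assumes, as the quoted message does, that splitting yields one copy per recipient, and it ignores SMTP and TCP overhead. The function name, message size, and recipient count are hypothetical, chosen only for the example.

```python
def bytes_on_wire(message_bytes, recipients, split):
    """Approximate bytes sent to the next MX for one inbound message.

    Assumption: with splitting, one full copy is sent per recipient;
    without it, a single copy is sent. Protocol overhead is ignored.
    """
    copies = recipients if split else 1
    return message_bytes * copies


size = 2_000_000  # hypothetical 2 MB message
rcpts = 50        # hypothetical number of recipients at one domain

unsplit_total = bytes_on_wire(size, rcpts, split=False)
split_total = bytes_on_wire(size, rcpts, split=True)

# The ratio between the two totals equals the recipient count.
print(unsplit_total, split_total, split_total // unsplit_total)
```

Under these assumptions the extra traffic scales linearly with the recipient count, which is consistent with the WAN saturation described above for large multi-recipient messages.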

It might be worthwhile to put MS/SA machines at the remote sites, or
centralize the mail stores a bit and then put MS/SA machines at the
remote sites?

Some things you can do to help smooth out the load, though they will
increase message delay a bit:

1. Set your incoming sendmail to limit 'max recipients' per message to
some smaller number (10?).
2. Set your incoming sendmail 'max recipient throttle' to some lower
value, like 2 or 3, so that if a message with 10 recipients comes in and
the first 2 are 'user unknown', sendmail will pause a second before
accepting the next.
3. Set your incoming sendmail to defer messages based on load average
(this only works if the machine's load average is significantly
affected by the splitting of messages).
4. Use firewall/ipfw/whatever rules to control the bandwidth available
to SMTP, so your WAN links aren't saturated with mail.
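
Items 1 to 3 might look something like the following in the incoming sendmail's .mc file. This is only a sketch: the mapping of 'max recipient throttle' to confBAD_RCPT_THROTTLE and of load-based deferral to confQUEUE_LA / confREFUSE_LA is an assumption about which options are meant, and all of the values are hypothetical starting points rather than recommendations.

```m4
dnl Hypothetical values; tune for your own site and sendmail version.
dnl 1. Cap the number of recipients accepted per message.
define(`confMAX_RCPTS_PER_MESSAGE', `10')dnl
dnl 2. Slow down after this many rejected recipients in one envelope.
define(`confBAD_RCPT_THROTTLE', `2')dnl
dnl 3. Queue instead of delivering above this load average,
dnl    and refuse new connections above a higher one.
define(`confQUEUE_LA', `8')dnl
define(`confREFUSE_LA', `12')dnl
```

Rebuild sendmail.cf from the .mc file and restart sendmail for the changes to take effect. Item 4 is handled outside sendmail, in whatever traffic-shaping tool the site uses.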

Ken A.
Pacific.Net
