thoughts? Would this defeat botnets?

Sun Nov 19 01:12:32 GMT 2006

Something like milter-error can block inbound connections based on past
failures.

Julian's IPBlock function can block inbound connections based on past
message transmission rate.

I'd like to solicit thoughts on an approach that takes those ideas a
bit further.  Is the approach valid (or would I be wasting my time by
trying to implement it)?

The idea would be similar to that used in Ironport's Senderbase, albeit
much simpler.

Problem: There are reportedly 75,000 Spamthru bots out there, and that's
just one botnet.  If I only let through one spam from each of those bots
per week, I'll still be overwhelmed.

Supposition #1: Most bots run on unmanaged systems that should never be
connecting to my mail server in the first place.  If I ever receive a
message from those systems, it'll be spam.

Idea: Keep a score for each sending IP address, forever.  If that
address sends me spam without sending me ham, it's blocked, permanently
(or until manual intervention on my part).  Reported false negatives
could be parsed to contribute to the scores.

For example, if we consider a score <0 to mean the connecting system
should be blocked, then I would score each inbound message like so:
	- Each sender IP address score defaults to 0.
	- If the message is ham, add 1 to the score for the sender's IP
	  address.
	- If the message is spam, subtract 1 from the score of the
	  sender's IP address.
	- If the message is a reported false negative, subtract 2 from
	  the score for the sender's IP address (to counteract the 1 we
	  added originally).
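
To make the scoring rules above concrete, here is a minimal sketch in
Python.  The names (IPScoreboard, record, is_blocked) are made up for
illustration, and it keeps scores in memory only; a real version would
need persistent storage and a hook into the MTA or filter, neither of
which is shown here.

```python
from collections import defaultdict

# A score below this threshold means the connecting system is blocked.
BLOCK_THRESHOLD = 0

# Score adjustments, taken directly from the rules listed above.
DELTAS = {
    "ham": 1,              # message classified as ham
    "spam": -1,            # message classified as spam
    "false_negative": -2,  # reported later; undoes the +1 and applies -1
}


class IPScoreboard:
    """Hypothetical in-memory score table keyed by sender IP address."""

    def __init__(self):
        # Every sender IP address starts at a score of 0.
        self.scores = defaultdict(int)

    def record(self, ip, verdict):
        """Apply the adjustment for one message and return the new score."""
        self.scores[ip] += DELTAS[verdict]
        return self.scores[ip]

    def is_blocked(self, ip):
        """True if the address has sent more spam than ham."""
        # .get() avoids creating an entry for addresses never seen before.
        return self.scores.get(ip, 0) < BLOCK_THRESHOLD


board = IPScoreboard()
board.record("192.0.2.1", "spam")            # first contact is spam
board.record("192.0.2.2", "ham")             # passed the filter as ham...
board.record("192.0.2.2", "false_negative")  # ...but was reported as spam
```

In this sketch both example addresses end up at -1, so both would be
refused on their next connection, while an address that has never
connected stays at 0 and is allowed in.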

Obviously this breaks some things:
	- Forwarders: If the connecting server isn't the original
	  sender, then he is either a forwarder, a secure relay or an
	  open relay.  If he's an open relay, I'm happy he's blocked.
	  If he's a forwarder...I think I'm ok with blocking him by
	  default.  If he's a secure relay (someone who only relays for
	  his customers), I'm still ok with blocking him by default,
	  provided I can override that with exceptions later.

	- Outbound mail from clients: I don't care -- my inbound relays
	  only scan inbound mail, they don't deliver for clients or
	  touch outbound mail.

	- Outbound mail from my own users on their home machines: I'm
	  already trying to prevent that by using SPF, and if this helps
	  spot a bot, so much the better.

It will also obviously fail for sender IP addresses that send both
spam and ham without any acceptable choice in the matter, such as secure
relays and mailing list servers.  For those I do business with, I think
I'd be ok putting in exceptions.  Other rules can still be applied to
identify the rest of the spam.
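
The exceptions could be checked before the score is consulted.  A rough
sketch, assuming the exceptions are kept as a list of networks; the
function name should_block and the example network are hypothetical, and
the network shown is a documentation range, not a real relay:

```python
import ipaddress

# Hypothetical exception list: relays and list servers I do business with.
EXEMPT_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
]


def should_block(ip, score, exempt=EXEMPT_NETWORKS):
    """Decide whether to refuse a connection from this sender IP address."""
    addr = ipaddress.ip_address(ip)
    # Exempt senders are never blocked on score; other rules still apply.
    if any(addr in net for net in exempt):
        return False
    # Everyone else is blocked once the score drops below 0.
    return score < 0
```

Mail from an exempt address would then skip only the IP score check and
still go through the normal content scanning.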

I anticipate someone will suggest using dynablock or other RBLs that
target dynamic IP addresses.  I'm already using dynablock within
spamassassin, but I'm still getting a lot of image spam.  I could use
imageinfo and fuzzyocr, but I really just do not consider those
sustainable long-term solutions.  All of the techniques spammers have
ever used in text can easily be applied to images, and I can't accept
the idea of multiplying my anti-spam server count by 100 to cope with
the additional overhead of applying OCR to everything first when we
eventually escalate to that level.

--
Trever