Spamware Impatience

This data was generated by instituting a 90-second waiting period before the 220 greeting is given to any host whose PTR name “looks funny” (that is, looks like a hostname automatically generated for a customer by an ISP, or otherwise looks particularly likely to belong to a spammer). The wait terminates early if the remote end closes the connection or sends data prematurely; when this happens, the time the remote host waited is logged.
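The delay described above can be sketched in a few lines. This is a minimal illustration, not the program actually used on this host: the function name `delayed_greeting`, the banner text, and the outcome labels are assumptions made for the example. It relies on the fact that `select()` reports a socket as readable both when data arrives and when the peer closes the connection.

```python
import select
import socket
import time

def delayed_greeting(conn, delay=90.0,
                     banner=b"220 mail.example.org ESMTP\r\n"):
    """Hold back the 220 greeting for up to `delay` seconds.

    Returns (outcome, seconds_waited); outcome is "greeted", "closed"
    or "early-data". Names and banner are illustrative assumptions.
    """
    start = time.monotonic()
    # Readable means: the peer sent data before the greeting, or it
    # closed the connection (which shows up as end-of-file).
    readable, _, _ = select.select([conn], [], [], delay)
    waited = time.monotonic() - start
    if not readable:
        conn.sendall(banner)      # the peer was patient; proceed normally
        return "greeted", waited
    try:
        # MSG_PEEK inspects pending data without consuming it.
        data = conn.recv(1, socket.MSG_PEEK)
    except ConnectionError:
        data = b""                # a reset counts as a close
    return ("early-data" if data else "closed"), waited
```

The returned `seconds_waited` is the value that would be logged for hosts that give up or talk early.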

The intent is to require hosts with a spam-correlated name form to show that the mail they wish to send us has some nontrivial value, by accepting the minor burden of waiting 90 seconds for the SMTP connection to proceed. The great majority of legitimate hosts are expected to wait at least that long: RFC 2821 specifies that a sending host SHOULD wait 5 minutes for the 220 greeting. For the purpose of analysis, a host that gives up within this period is taken to be a spam host.

Over the initial 48-hour test period, 361,345 SMTP connections were made by 92,140 remote hosts. Of these, 112,811 connections from 33,231 hosts were accepted; the others were rejected because the remote hosts were on blacklists. Of those accepted, 98,090 connections from 29,487 hosts were given the delayed 220 treatment. Of those, 93,294 connections from 28,189 hosts were closed by the remote end during the 90-second wait period.
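The proportions implied by these figures are easy to lose among the raw counts. The short calculation below derives them; the variable names and the helper `pct` are introduced here for illustration, and the numbers are exactly those quoted above. Roughly 95% of the delayed connections, and of the delayed hosts, gave up before the greeting arrived.

```python
# Figures from the 48-hour test period, as (connections, hosts).
total    = (361_345, 92_140)
accepted = (112_811, 33_231)   # not on a blacklist
delayed  = (98_090, 29_487)    # given the delayed 220 treatment
dropped  = (93_294, 28_189)    # closed by the remote end during the wait

def pct(part, whole):
    """Percentage of `whole` represented by `part`."""
    return 100.0 * part / whole

for label, part, whole in [
    ("accepted / all", accepted, total),
    ("delayed / accepted", delayed, accepted),
    ("dropped / delayed", dropped, delayed),
]:
    print(f"{label:20s} connections {pct(part[0], whole[0]):5.1f}%"
          f"   hosts {pct(part[1], whole[1]):5.1f}%")
```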

The above plot shows how long presumed spamware waits for a 220 message before closing its connection, based on SMTP transfer attempts from 109,285 unique hosts that closed the connection during the first 90 seconds.

In another form:
 Count       %   Cumulative %   Seconds
  3934    3.60           3.60         0
  1053    0.96           4.56         5
  4205    3.85           8.41        10
 14312   13.10          21.51        15
 16439   15.04          36.55        20
 29012   26.55          63.09        25
 26081   23.86          86.96        30
  3253    2.98          89.94        35
  4805    4.40          94.33        40
  1232    1.13          95.46        45
   595    0.54          96.00        50
  1089    1.00          97.00        55
  1339    1.23          98.23        60
   311    0.28          98.51        65
  1148    1.05          99.56        70
   127    0.12          99.68        75
   115    0.11          99.78        80
   235    0.22         100.00        85
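The percentage and cumulative columns follow directly from the counts. The sketch below recomputes them from the per-bucket counts in the table; the list name `buckets` is introduced here for illustration. The counts sum to 109,285, the number of unique hosts quoted for the plot.

```python
# (seconds, count) pairs taken from the table above.
buckets = [
    (0, 3934), (5, 1053), (10, 4205), (15, 14312), (20, 16439),
    (25, 29012), (30, 26081), (35, 3253), (40, 4805), (45, 1232),
    (50, 595), (55, 1089), (60, 1339), (65, 311), (70, 1148),
    (75, 127), (80, 115), (85, 235),
]

total = sum(count for _, count in buckets)

running = 0
rows = []
for seconds, count in buckets:
    running += count
    # Each row: count, share of all hosts, cumulative share, bucket.
    rows.append((count, 100.0 * count / total,
                 100.0 * running / total, seconds))

for count, share, cumulative, seconds in rows:
    print(f"{count:6d} {share:7.2f} {cumulative:14.2f} {seconds:9d}")
```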

A possible improvement would be to record those hosts that pass the “wait test” and allow them to connect without delay in the future. This leaves open the possibility that spamware might be “quick-fixed” to take advantage of such a rule, though in the long run it seems likely that if spambags find delayed 220 to be a problem, the fix for it will be to improve spamware to allow it to keep a larger number of simultaneous connections open.
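One way such a record might be kept is sketched below. This is an assumption-laden illustration rather than anything implemented on this host: the class name `PatientHosts`, its methods, and in particular the expiry period (`ttl`) are all invented for the example. Expiring entries limits how long a host that passed once can keep skipping the delay.

```python
import time

class PatientHosts:
    """Remember hosts that waited out the delay (hypothetical helper)."""

    def __init__(self, ttl=30 * 24 * 3600):
        self.ttl = ttl        # assumed expiry; the source specifies none
        self._passed = {}     # IP address -> time the host last passed

    def record_pass(self, ip, now=None):
        self._passed[ip] = time.time() if now is None else now

    def delay_for(self, ip, full_delay=90.0, now=None):
        """Return 0 for a host with a recent pass, else the full delay."""
        now = time.time() if now is None else now
        passed_at = self._passed.get(ip)
        if passed_at is not None and now - passed_at < self.ttl:
            return 0.0
        return full_delay
```

A server would call `delay_for()` when a connection arrives and `record_pass()` once the host has waited for the full delay and received its greeting.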

After 17 days and 109 thousand unique hosts effectively blocked by this method, two cases of legitimate mail being blocked have been identified. In one case, the remote administrator adjusted his SMTP timeout appropriately; in the other, an improvement was made to the “whitelist” pattern that is consulted before hosts are considered for the delayed 220 treatment.
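The order of checks, whitelist first and then the “looks funny” test, can be illustrated as follows. The actual patterns used on this host are not given here, so both regular expressions below are hypothetical stand-ins chosen only to show the structure; real rules would need tuning against real traffic.

```python
import re

# HYPOTHETICAL patterns, for illustration only.
# Names that look like deliberately named mail servers.
WHITELIST = re.compile(r"(^|\.)(mail|smtp|mx)\d*\.", re.IGNORECASE)

# Names that look automatically generated: an embedded IP address,
# or a label typical of dynamic or consumer address pools.
GENERATED = re.compile(
    r"\d{1,3}[-.]\d{1,3}[-.]\d{1,3}[-.]\d{1,3}"
    r"|(^|[.-])(adsl|dsl|dhcp|dialup|cable|pool|ppp)([.-]|\d)",
    re.IGNORECASE,
)

def wants_delay(ptr_name):
    """True if a host with this PTR name should get the delayed 220."""
    if WHITELIST.search(ptr_name):
        return False          # the whitelist is consulted first
    return bool(GENERATED.search(ptr_name))
```

With these stand-in patterns, `wants_delay("adsl-68-12-34-56.dsl.example.net")` is true, while `wants_delay("mail.example.com")` is false.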

In the above plot, note the number of hosts that time out after 0 seconds. This is partly a consequence of the way the delay is implemented: it terminates when the remote host either closes the connection OR sends data. An SMTP sender SHOULD wait for the 220 message before sending anything. To see what remote hosts send when they do not wait, the delay program reads the first line of pending data whenever this occurs. Of the hosts that send data in the first second, 95% send “HELO dotted-quad-address”. This is presumably an attempt at a blind SMTP session: a precanned HELO/MAIL/RCPT/DATA sequence is sent down the pipe without any regard for what the other end sends back. This will probably work with most MTAs. The dotted-quad address given with HELO may be an attempt to speed up the session further: if the receiving host does a forward lookup on the name given with HELO, its resolver will handle a dotted-quad address internally without doing any potentially time-consuming DNS lookups.
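Recognizing the “HELO dotted-quad-address” form in the first line of early data might look like the sketch below. The function name is an assumption, and accepting EHLO alongside HELO is a choice made for the example rather than something reported above.

```python
import ipaddress

def helo_is_dotted_quad(line):
    """True if `line` (bytes) is a HELO/EHLO with a bare IPv4 argument."""
    parts = line.decode("ascii", "replace").split()
    if len(parts) != 2 or parts[0].upper() not in ("HELO", "EHLO"):
        return False
    try:
        # Accepts only a well-formed dotted quad such as 192.0.2.1.
        ipaddress.IPv4Address(parts[1])
    except ValueError:
        return False
    return True
```

A bracketed address literal such as `[192.0.2.1]` is deliberately not matched here; only the bare dotted-quad form counts.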

However, such attempts are easily detected. This host now implements a one-second 220 delay for all hosts and rejects transfer attempts from any that send data during that period.
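A minimal sketch of such a check follows. The function name `greet_or_reject` and the reply texts are assumptions made for illustration. The 554 code is used for the rejection because RFC 2821 permits a 554 response in place of 220 at connection opening; the reply actually sent by this host is not stated.

```python
import select
import socket

def greet_or_reject(conn, grace=1.0):
    """Greet a well-behaved peer; reject one that talks or quits early.

    Returns True if the 220 greeting was sent, False otherwise.
    """
    # Readable within the grace period means early data or a close.
    readable, _, _ = select.select([conn], [], [], grace)
    if readable:
        try:
            conn.sendall(b"554 Data sent before greeting\r\n")
        except OSError:
            pass              # the peer has already gone away
        return False
    conn.sendall(b"220 mail.example.org ESMTP\r\n")
    return True
```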

This plot shows how the spam load (in messages/day) received for my personal address has changed over the last 15 months. The rightmost part shows the effect of instituting the 220 delay, together with the simultaneous institution of a complete block on hosts that have no PTR record. This has effectively restored my spam load to where it was a bit over a year ago.

After 31 days and 1.8 million connections by 220,000 unique hosts blocked by the no-PTR rule, two instances of legitimate mail being blocked by this rule had been reported.
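A no-PTR test can be approximated with the standard resolver interface, as sketched below. The function name `has_ptr` and the injectable `lookup` parameter are assumptions made for the example. Note also that `socket.gethostbyaddr` goes through the system resolver, which may consult sources other than DNS (such as a local hosts file), so this is only an approximation of a strict PTR lookup.

```python
import socket

def has_ptr(ip, lookup=socket.gethostbyaddr):
    """True if a reverse lookup of `ip` yields a hostname.

    `lookup` is injectable so the check can be exercised without
    touching the network.
    """
    try:
        name, _aliases, _addresses = lookup(ip)
    except (socket.herror, socket.gaierror):
        return False          # no reverse mapping was found
    return bool(name)
```

A server applying the rule would refuse the connection when `has_ptr()` returns false for the peer address.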

John DuBois, April 23 - May 5 2004