Readings: Free Email, Oligopolies, Cascades, Perfumes, etc.

[This is a long post, for which I’m sort of but not really sorry. I started trying to figure something out for my own purposes, and that turned into this, as happens.]

Why isn’t email free-er? To most observers it might seem free, but it isn’t. The monthly fees you pay at some of the largest bulk delivery companies to send emails to a paltry 5,000 people a day are hefty, pushing $1,000 a year. And they soar from there into many (many) thousands of dollars for larger numbers of contacts and messages.

How is it that something so technically straightforward, with so little marginal cost, and so well understood, continues to cost so much? Why hasn’t competition driven the price of this (seeming) commodity business to zero-ish?

There are at least three reasons one might posit:

1. It isn’t as cheap as it looks

Maybe providing email services isn’t actually all that cheap. Maybe there are hidden costs that people don’t realize, and those keep costs high. I think of it kind of like the following graph. 

The trouble is that it’s not obvious what the causes of that Mysterious Price Difference might be. The main costs in running an email server (or a cluster of servers) are the software, the hardware, the storage, and the bandwidth. All four have been getting cheaper for decades, and they continue to decline faster than the year-over-year growth in email traffic or email accounts. It’s hard to imagine a cost that isn’t captured by those four factors, yet some such hidden cost is about all you’re left with if the explanation is on the cost front.

2. It isn’t as easy as it looks 

While having email transit from A to B, even via a host of other mail servers, might seem technically straightforward, it isn’t. Specifically, the real complexity in getting email from place A to place B isn’t the A to B part. No, it’s making sure that email that should go from A to B does do that, and that email that shouldn’t (we often call this sort of email “spam”) doesn’t. Differentiating spam from ham is usually thought of as non-trivial, and that creates an advantage for companies that are good at it. And getting good at it requires you to have billions of emails to work with (a corpus, in tech speak), so it’s hard to enter the market, etc.

Email isn’t as easy as it looks, dude.

If you don’t believe me, just tell your favorite techie that you are thinking of setting up your own email server. You will get the look of pity, disgust, and horror that orthopedic surgeons normally reserve for people thinking of, say, resurfacing their own hip in the bathroom.

Granted, setting up an email server isn’t easy, but it also isn’t hard. Or at least it’s not hard in the usual sense of hard, like an NP-hard problem, or the halting problem, or why the square root of -9 is equal to 3i, rather than not being equal to anything at all, etc. There are many perfectly straightforward guides to putting an email server in the cloud somewhere, or on a Raspberry Pi, and you can even purchase appliances with pre-built email servers on them. The tricky part about running an email server has mostly to do with three things: maintaining it, filtering spam, and convincing other email hosting companies (and other email servers) that you’re not such a bad guy, so they accept (or transfer) your email.

Putting maintenance aside, which is too often treated as harder than it is, the first real problem, identifying spam, isn’t as hard as it once was. Open source software like SpamAssassin (when properly configured) does so well at this, with something like 97% effectiveness in one study I saw, that you’d be hard-pressed to call detecting spam from textual or email envelope cues a good example of “not as easy as it looks”. The second problem, convincing other email hosts to trust you, shouldn’t be that hard, but turns out to be trickier, so I return to it in the next section.
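To make the “textual cues” idea concrete, here is a minimal sketch of additive rule scoring, the general approach that filters in the SpamAssassin mold take: each rule that matches adds points, and a message is flagged once the total crosses a threshold. The three rules, their names, and their scores are made up by me for illustration; real filters ship hundreds of tuned rules plus statistical classifiers. The 5.0 threshold mirrors what I understand to be SpamAssassin’s default required score, but treat that as an assumption too.

```python
import re

# Hypothetical rules: (name, pattern, score). Invented for illustration only.
RULES = [
    ("SUBJECT_ALL_CAPS", re.compile(r"^Subject: [A-Z !]{10,}$", re.M), 2.0),
    ("MENTIONS_FREE_MONEY", re.compile(r"free money", re.I), 2.5),
    ("MANY_EXCLAMATIONS", re.compile(r"!{3,}"), 1.5),
]

# Assumed threshold; messages scoring at or above it are treated as spam.
THRESHOLD = 5.0


def spam_score(raw_message):
    """Return (total score, names of the rules that matched)."""
    hits = [(name, score) for name, pattern, score in RULES
            if pattern.search(raw_message)]
    total = sum(score for _, score in hits)
    return total, [name for name, _ in hits]


def is_spam(raw_message, threshold=THRESHOLD):
    """Flag a message when its summed rule score reaches the threshold."""
    total, _ = spam_score(raw_message)
    return total >= threshold


spam = "Subject: CLAIM YOUR PRIZE NOW!!!\n\nGet FREE MONEY today!!!"
ham = "Subject: Lunch on Friday?\n\nAre you free at noon?"

print(spam_score(spam))
print(spam_score(ham))
```

The point of the toy is only that the mechanism is simple. The hard-won part is tuning the rules and scores against a large corpus, not the scoring loop itself.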

3. It’s a natural monopoly/oligopoly

While no one seems to argue this, it’s worth putting out there: Maybe email provision is, despite looking like a commodity, a natural monopoly. After all, the big three email hosting companies (Google/Yahoo/Hotmail) have something like 70% of the market; the big three bulk email providers (SendGrid/MailChimp/Amazon SES) probably account for only a little less of the total message market (data is bad, so it’s hard to know, but SendGrid alone sends more than 40 billion emails a month).

Why might this be a natural oligopoly? In large part because of the way the industry uses longevity as a proxy for reputation, which in turn drives your credibility score as an email service provider. Most new email servers are, well, new, and they don’t start off with a neutral reputation, as you might expect, but with a negative one. It takes very little to make that worse, but an active effort from recipients to make it better. People must seek out your messages in their spam folder and tell their email host that your message is actually not spam. Most people don’t do that, so new email servers start with a crappy default reputation and go nowhere good from there. The odds are stacked even worse against new providers that grow: the more “successful” you are, the worse the odds get, as it becomes more likely that you acquire the reputation of a bulk sender. (This, in part, explains the endless lawsuits involving Spamhaus, an email host rating outfit, and various bulk email providers, who think the former mostly protects incumbents.)
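Here is a toy model of that asymmetry, just to show how lopsided the arithmetic can get. Everything in it is my own assumption: the starting score of -1.0, the rates, and the idea that one complaint costs five times what one “not spam” rescue earns. No real email host publishes a formula like this, so read it as a sketch of the dynamic described above, not of anyone’s actual scoring system.

```python
def simulate_reputation(days, start, rescue_rate, complaint_rate,
                        rescue_gain=1.0, complaint_penalty=5.0):
    """Return the day-by-day reputation of a hypothetical new mail server.

    rescue_rate:    assumed share of recipients who mark a message "not spam"
    complaint_rate: assumed share of recipients who report a message as spam
    """
    reputation = start
    history = [reputation]
    for _ in range(days):
        # Rescues nudge the score up; complaints push it down harder.
        reputation += rescue_gain * rescue_rate - complaint_penalty * complaint_rate
        history.append(reputation)
    return history


def days_until_neutral(history):
    """First day on which reputation is at least 0, or None if it never is."""
    for day, reputation in enumerate(history):
        if reputation >= 0:
            return day
    return None


# A typical sender: few recipients bother to dig messages out of spam.
typical = simulate_reputation(365, start=-1.0, rescue_rate=0.01, complaint_rate=0.005)

# A sender whose recipients are unusually diligent about rescuing messages.
heroic = simulate_reputation(365, start=-1.0, rescue_rate=0.05, complaint_rate=0.005)

print(days_until_neutral(typical))
print(days_until_neutral(heroic))
```

With these made-up numbers, the typical sender never climbs back to neutral within a year, while the sender with five times the rescue rate gets there in roughly 40 days. The specific figures mean nothing; what matters is that a negative starting point combined with passive recipients gives you a slope that points the wrong way.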

So, to answer my original question (why isn’t email cheaper?): it should be, and perhaps could be, but the industry has gone down a path that rewards incumbents for their incumbency, and makes it very difficult for new providers to enter the market at scale, especially if they want to compete on price and volume.

<><><><><><><><><><><><><><><><><><><><>

Here are some more articles and papers worth reading: