John Markoff’s NY Times article on Monday is a good starting point for understanding the dramatic changes in Web traffic patterns in the last two years. (I know, everything about the Internet is dramatic.) Some of the data referred to in the Markoff piece come from the ATLAS Internet Observatory 2009 Annual Report, which summarizes data collected from over 100 ISPs worldwide.
The report (a collaborative effort involving Arbor Networks, the University of Michigan, and Merit Network) provides hard quantitative data to back gut feelings. Enterprise data is moving to the cloud, right? That’s validated by a heavy-tailed graph showing that a few hosting entities dominate. Another key trend is also confirmed: we’re making fewer hops to get our content since we’re pulling directly from content distribution networks (CDNs) belonging to Google, Yahoo, and Facebook, the new Hyper Giants.
I’ve pulled out the best charts and graphs from the report for your consideration.
If you have the time, Arbor Networks’ Chief Scientist, Dr. Craig Labovitz, can walk you through the report’s findings that were presented at the North American Network Operators’ Group (NANOG) meeting in October 2009.
If not, here are the most important findings, summarized graphically. The first chart shows the rise, since 2007, of the Hyper Giants. The Giants are content providers that have muscled into the top 10 traffic destinations, a list usually dominated by the ISPs. Translation: Google and others have set up their own caching infrastructure at service provider locations. So instead of IP traffic passing through many routers spread out across domains, packets make just one or two hops to land on a server controlled by a content owner.
The next graph shows that Internet traffic is concentrated in about 150 ASNs. (An autonomous system number is just a way to map disparate IP routing information to a single owner.) In 2007, the cumulative distribution of content traffic was a very, very long tail spread out over thousands of entities. In 2009, there was a shift towards enterprise content hosted in the cloud and further consolidation in content ownership (YouTube is part of Google, for example). The cumulative distribution curve reveals that 50% of the traffic on the Internet is now controlled by a mere 100 or so owners.
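To make the cumulative distribution idea concrete, here is a minimal Python sketch of how you might count the owners behind a given share of traffic. The AS labels and traffic volumes are made up for illustration (they are not from the ATLAS report), and `owners_for_share` is a hypothetical helper name, not anything the report's authors used.

```python
from itertools import accumulate


def owners_for_share(traffic_by_owner, target=0.5):
    """Count how many of the largest owners are needed to reach
    `target` (a fraction between 0 and 1) of total traffic."""
    # Rank owners from largest to smallest traffic volume.
    volumes = sorted(traffic_by_owner.values(), reverse=True)
    total = sum(volumes)
    # Walk the running (cumulative) sum until it crosses the target.
    for count, running in enumerate(accumulate(volumes), start=1):
        if running >= target * total:
            return count
    return len(volumes)


# Hypothetical traffic volumes per owner, in arbitrary units.
sample = {
    "AS-A": 30, "AS-B": 20, "AS-C": 10, "AS-D": 10, "AS-E": 10,
    "AS-F": 5, "AS-G": 5, "AS-H": 5, "AS-I": 3, "AS-J": 2,
}

print(owners_for_share(sample, 0.5))   # owners behind half the traffic
print(owners_for_share(sample, 0.75))  # owners behind three quarters
```

The steeper the curve climbs at the start, the smaller the first number gets, which is the consolidation the graph is showing.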
(long pause) Wow.
When you speak of the “edge” of the internet, do you mean the point beyond which the internet no longer goes? Is there anyone out there trying to figure out the shape of the internet or its borders? Sure, we all know that it is not infinite, but is that where curiosity stops?
We’re discussing this on our blog today and it seems like you could help us out with an expert opinion: http://www.chickenmonkeydog.com/the-edge-of-the-internet/
For the purposes of this post, my curiosity ended at the point where the CDNs are positioned, which is not all that many hops from a consumer’s DSL or cable modem, at the “edge”. CDNs are positioned close to these endpoints for performance reasons. The edge of the carrier’s network refers to the start of their core network, a place where all your network neighbors, and their neighbors, and so on, are monster routers and other network components.
I suppose in a graph-oriented view, you can say that the edge is the average number of hops it takes to get from one node to any other. If the average “diameter” is, say, 5, then I can get to any node in 5 hops, on average.
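For the curious, here is a rough Python sketch of that graph-oriented view. It uses breadth-first search to count hops between every pair of nodes in a tiny, made-up network; the node names and links are purely hypothetical. One caveat on terminology: in graph theory the diameter is the longest of the shortest paths, while the figure I describe above is usually called the average path length, so the sketch reports both.

```python
from collections import deque


def hop_counts(graph, source):
    """Breadth-first search: fewest hops from `source` to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def average_and_max_hops(graph):
    """Return (average hop count, longest hop count) over all reachable pairs."""
    total = pairs = longest = 0
    for source in graph:
        for target, hops in hop_counts(graph, source).items():
            if target != source:
                total += hops
                pairs += 1
                longest = max(longest, hops)
    if pairs == 0:
        return 0.0, 0
    return total / pairs, longest


# Hypothetical four-node chain: modem - edge router - core router - CDN server.
network = {
    "modem": ["edge"],
    "edge": ["modem", "core"],
    "core": ["edge", "cdn"],
    "cdn": ["core"],
}

average, diameter = average_and_max_hops(network)
print(f"average hops: {average:.2f}, diameter: {diameter}")
```

On a real topology you would feed in measured router adjacencies instead of a toy chain, but the counting works the same way.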