When both our Amsterdam 1 and Amsterdam 2 data centres, separate facilities at opposite ends of town, began to experience similar levels of packet loss, it became clear that something outside of our control was amiss.
Both facilities are connected to the internet by a handful of global transit providers, two of which they share: Level3 Communications and Cogent Communications. A quick search on Twitter, the best place to find trending issues, revealed that Level3 was the problem.
Level3 is a Tier 1 network. This means it operates many of the inter-country and inter-continent cables that are vital for keeping everybody connected online. Individual home and office broadband providers sign agreements with one or more Tier 1 networks, either directly or through partner companies, to connect their customers with the rest of the web. As part of this, the companies agree both to receive traffic from and to deliver traffic to the houses and businesses on their lines; it is a two-way process.
Telekom Malaysia, an internet service provider in Malaysia, is a customer of Level3. Early this morning, probably due to human error, it began giving Level3 incorrect routing information about which destinations it could deliver traffic to and how much it could handle. In effect, its systems gave Level3 the go-ahead to dump tremendous amounts of traffic on it, which crippled its infrastructure, slowed down the internet for millions of people and left website owners unsure as to who could and couldn’t visit them. As a rough illustration, for a short period of time we could say that “25 percent of people online could not access 25 percent of the internet” due to this.
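To make the mechanism a little more concrete, here is a minimal sketch of how a transit provider might end up preferring a bad announcement from a customer, and how a simple prefix filter would stop it. This is a toy model, not how Level3 or any real router is configured: the `Route` class, the preference values, the AS numbers and the prefixes (all taken from ranges reserved for documentation) are assumptions made up for illustration.

```python
from dataclasses import dataclass
from ipaddress import ip_network


@dataclass(frozen=True)
class Route:
    prefix: str        # destination network being announced
    learned_from: str  # "customer" or "peer" (hypothetical relationship labels)
    neighbour: str     # who sent us the announcement


# Assumed policy: routes from paying customers are preferred over routes from peers.
LOCAL_PREF = {"customer": 200, "peer": 100}


def best_route(routes, prefix):
    """Pick the most preferred route for an exact prefix."""
    candidates = [r for r in routes if r.prefix == prefix]
    return max(candidates, key=lambda r: LOCAL_PREF[r.learned_from])


def filter_customer(routes, allowed):
    """Drop customer announcements for prefixes the customer has not registered."""
    allowed_nets = [ip_network(p) for p in allowed]
    kept = []
    for r in routes:
        if r.learned_from == "customer":
            net = ip_network(r.prefix)
            if not any(net.subnet_of(a) for a in allowed_nets):
                continue  # not the customer's own address space, so ignore it
        kept.append(r)
    return kept


routes = [
    Route("198.51.100.0/24", "peer", "AS64500"),      # legitimate path
    Route("198.51.100.0/24", "customer", "AS64496"),  # leaked by the customer
    Route("192.0.2.0/24", "customer", "AS64496"),     # the customer's own network
]

# Without filtering, the leaked customer route wins and attracts the traffic.
print(best_route(routes, "198.51.100.0/24").neighbour)    # AS64496

# With a filter listing only the customer's registered prefix, the peer route wins.
filtered = filter_customer(routes, allowed=["192.0.2.0/24"])
print(best_route(filtered, "198.51.100.0/24").neighbour)  # AS64500
```

The point of the sketch is only that one unfiltered announcement from a trusted customer can redirect traffic for networks that customer has nothing to do with.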
Because our facilities are connected directly to multiple Tier 1 networks, we are able to respond to these problems fairly promptly. In Amsterdam 1, the NOC immediately severed our connection to Level3, diverting all traffic that would otherwise have been lost through our other providers. In Amsterdam 2, whilst we don’t have the official write-up yet, we saw traffic flowing in again very promptly, so we imagine the NOC took similar steps.
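The decision the NOC made can be sketched in a few lines. This is only an illustration of the idea of dropping a misbehaving upstream and spreading traffic over the rest; it is not the NOC's actual tooling. The function names, the 2 percent loss threshold, the packet loss figures, the even split and the third provider "Transit C" are all hypothetical.

```python
def usable_upstreams(loss_by_provider, max_loss=0.02):
    """Return the providers whose measured packet loss is within the threshold."""
    return sorted(name for name, loss in loss_by_provider.items() if loss <= max_loss)


def share_traffic(total_mbps, upstreams):
    """Split traffic evenly across the remaining providers (a simplification)."""
    if not upstreams:
        raise RuntimeError("no healthy upstream provider left")
    share = total_mbps / len(upstreams)
    return {name: share for name in upstreams}


# Made-up measurements for illustration only: fraction of packets lost per provider.
measured_loss = {"Level3": 0.30, "Cogent": 0.001, "Transit C": 0.002}

healthy = usable_upstreams(measured_loss)
print(healthy)                      # ['Cogent', 'Transit C']
print(share_traffic(900, healthy))  # {'Cogent': 450.0, 'Transit C': 450.0}
```

This only works because there is more than one upstream to fall back on; with a single provider the list of healthy upstreams would be empty.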
This does open our eyes to how fragile the internet can be, and to how a misconfiguration between just two companies can disconnect large parts of the internet from each other. We are lucky enough to own our own hardware and to work directly with facilities that are connected to Tier 1 networks, which gives us flexibility and the ability to control our own fate. Clients of a reseller, or even of a reseller of a reseller, may find that in scenarios like this they are left offline or confused for a much longer period of time.