An outage at an Amazon Web Services (AWS) data center last week took a large number of online services offline. Because AWS provides cloud infrastructure to many websites and apps, the services that depend on it experienced issues or went offline outright.
Services affected by the outage included, but were not limited to, Amazon, Duolingo, Fortnite, Hulu, Outlook, Pinterest, Reddit, Roblox, Signal, Xbox and Zoom. Also affected was Canvas, notable for serving as a learning management system for education systems across the globe, including UW-Platteville.
The errors are believed to have originated in AWS’s US-East-1 data center region, located primarily in Northern Virginia. During the outage, services could not access DynamoDB, an AWS database service used to store and manage user information and other important data.
The underlying issue turned out to be with AWS’s use of the Domain Name System, or DNS, which translates domain names, such as google.com, into IP addresses so that computers can connect to the right servers and use their services.
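For readers curious what that translation looks like in practice, here is a minimal Python sketch that asks the operating system's resolver for the IP addresses behind a name. The function name `resolve` is made up for this illustration; `socket.getaddrinfo` is part of Python's standard library.

```python
import socket


def resolve(hostname):
    """Return the sorted, de-duplicated IP addresses the system resolver reports."""
    # getaddrinfo returns one tuple per (family, socket type) combination;
    # the address string is the first element of the sockaddr in position 4.
    infos = socket.getaddrinfo(hostname, None)
    return sorted({info[4][0] for info in infos})
```

On a machine with network access, calling `resolve("google.com")` returns whatever addresses the resolver currently has for that name. If the DNS entry is missing or empty, the call raises `socket.gaierror` instead, which is roughly what a program sees when a name cannot be resolved.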
The error occurred after two programs attempted to write the same DNS entry at the same time. The two writes cancelled each other out and left an empty entry. Without a valid entry, services could not locate AWS’s DynamoDB database, so DynamoDB and every service that relied on it failed, ultimately leading to the massive outage witnessed last Monday.
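This kind of conflict between two writers is commonly called a race condition. The toy Python model below shows one way two uncoordinated writers could leave an entry empty. It is an illustration only, not Amazon's code: the class `DnsTable`, the plan numbers, the cleanup step, the host name and the IP addresses are all assumptions invented for this sketch.

```python
class DnsTable:
    """Hypothetical in-memory stand-in for a DNS record store."""

    def __init__(self):
        self.records = {}  # hostname -> (plan_id, list of IP addresses)

    def apply(self, host, plan_id, ips):
        # Unconditional write: nothing checks whether a newer plan is already live.
        self.records[host] = (plan_id, list(ips))

    def cleanup(self, host, newest_plan_id):
        # Empty the live entry if it came from a plan older than the newest one.
        plan_id, _ = self.records.get(host, (newest_plan_id, []))
        if plan_id < newest_plan_id:
            self.records[host] = (plan_id, [])

    def lookup(self, host):
        return self.records.get(host, (None, []))[1]


def simulate_race():
    table = DnsTable()
    host = "db.example.internal"          # made-up name
    table.apply(host, 2, ["10.0.0.2"])    # writer B applies the newer plan
    table.apply(host, 1, ["10.0.0.1"])    # delayed writer A overwrites it with an older plan
    table.cleanup(host, 2)                # writer B removes the "stale" plan, which is now live
    return table.lookup(host)
```

In this sketch `simulate_race()` returns an empty list: the name still exists, but it no longer points anywhere. A common guard against this pattern is to make each write conditional on the version currently stored, so that an older plan cannot replace a newer one.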
Since the event, Amazon has been working to fix its systems, including ensuring that the condition that led to the massive outage doesn’t occur again.
Massive AWS Outage