Cloudy with a chance of lag

A “network congestion issue” has been blamed for a multi-hour outage of Google Cloud Platform (GCP) on Sunday.

The outage – which began around 19:25 UTC and persisted for more than four hours – rendered various Google and third-party services either laggy or unavailable.

The snafu affected Google Cloud, G Suite, and YouTube. Third-party services including Vimeo, Uber, and Snapchat were also said to have been impacted.

Google engineers who battled to restore services blamed a “network congestion issue in eastern USA” for the problems with GCP.

No root cause beyond increased traffic has yet been offered, but Google has promised to release a post-mortem once it gets to the bottom of the problem.

A statement on the Google Cloud Status Dashboard offers a sitrep.

“The network congestion issue in eastern USA, affecting Google Cloud, G Suite, and YouTube has been resolved for all affected users as of 4:00pm US/Pacific,” it said.

“We will conduct an internal investigation of this issue and make appropriate improvements to our systems to help prevent or minimize future recurrence.”

Bird’s-eye view

Network monitoring firm ThousandEyes backed up Google’s assessment that networking – rather than a software glitch or hardware fault – caused the problem.

“ThousandEyes can confirm Google’s report of network congestion as a likely root cause of Sunday’s massive 4-hour outage, as we started seeing elevated packet loss in Google’s network as early as 12pm PT between sites on the eastern US, including Ashburn, Atlanta and Chicago, and various Google-hosted services,” ThousandEyes said.

“These issues started to impact users globally approximately 20 minutes prior to their public announcement of the issue, showing an early indication of what was to come.

“For the majority of the duration of the 4+ hour outage, ThousandEyes detected 100% packet loss for certain Google services from 249 of our global vantage points in 170 cities around the world. Starting at around 3:30pm PT, we started to see services slowly become reachable again, and the issue appeared to fully resolve by 4:45pm PT,” it added.

ThousandEyes further reported that the outage appeared to be centered on Chicago.

In response to a request for comment on the underlying reasons for its weekend GCP problems, Google reiterated what it said overnight, indicating it has nothing further to offer (at least publicly) for now.

“The network congestion issue in eastern USA affecting Google Cloud, G Suite and YouTube has been resolved for all affected users as of 4:00pm US/Pacific,” a company spokesperson told The Daily Swig.

“We will conduct a post mortem and make appropriate improvements to our systems to prevent this from happening again. We sincerely apologize to those that were impacted by yesterday’s issues. Customers can always find the most recent updates on our systems on our status dashboard.”