- Google Cloud's API management service was responsible for the massive outage
- Most regions were back online within 40 minutes, but some took longer
- The company has promised to guard against future outages and improve communication
After the recent widespread Google Cloud outage, which took sites like Spotify, Cloudflare and Discord offline, the company has released a detailed incident report in which it apologized to users.
The company says the root cause was a code problem in Service Control, a part of its API management and policy checking system.
In particular, an incorrect automated quota update combined with a lack of proper error handling triggered a global crash loop, which caused 503 errors not only in Google Cloud services but also in services that rely on its APIs.
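To illustrate the failure mode described here, the following is a minimal Python sketch of validating a quota update before applying it, so that a malformed update is rejected rather than crashing the process. The `QuotaPolicy` record, its field names, and both functions are hypothetical and chosen for illustration; they do not reflect Service Control's actual code or schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class QuotaPolicy:
    # Hypothetical shape of a quota policy record; not Google's schema.
    project: str
    limit_per_minute: int


def parse_policy(raw: dict) -> Optional[QuotaPolicy]:
    """Return a validated policy, or None if the update is malformed."""
    project = raw.get("project")
    limit = raw.get("limit_per_minute")
    if not isinstance(project, str) or not project:
        return None
    # bool is a subclass of int in Python, so exclude it explicitly.
    if not isinstance(limit, int) or isinstance(limit, bool) or limit < 0:
        return None
    return QuotaPolicy(project=project, limit_per_minute=limit)


def apply_update(current: QuotaPolicy, raw_update: dict) -> QuotaPolicy:
    """Keep the last known-good policy when an update fails validation."""
    updated = parse_policy(raw_update)
    if updated is None:
        # Reject the bad update instead of crashing the serving process.
        return current
    return updated
```

With this approach, an update containing blank or missing fields leaves the previous policy in force, so a single bad record cannot put every instance that reads it into a restart loop.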
Google Cloud outage caused by API problem
The outage affected Google Cloud infrastructure as well as popular Google Workspace apps such as Drive, Docs, Gmail and Calendar. Third-party sites that rely on Google Cloud APIs were also hit, including the music streaming platform Spotify, where 678 users were affected, as well as some Cloudflare services.
"On May 29, 2025, a new feature was added to Service Control for additional quota policy checks," the company wrote in its incident report. "The issue with this change was that it did not have appropriate error handling nor was it feature flag protected."
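The two safeguards the report says were missing, a feature flag and error handling, can be sketched in a few lines of Python. This is an illustrative sketch only: the `FLAGS` dictionary, the flag name, the request shape, and both functions are assumptions made for the example and are not taken from Google's code. The choice to allow the request when the check itself fails ("fail open") is likewise one possible design, not a statement of what Service Control does.

```python
import logging

logger = logging.getLogger("quota_check")

# Hypothetical in-process flag store; real systems usually read flags
# from a configuration service so they can be flipped without a deploy.
FLAGS = {"extra_quota_policy_check": False}


def extra_quota_policy_check(request: dict) -> bool:
    """Hypothetical new check; raises if the policy data is missing."""
    return request["policy"]["remaining"] > 0


def allow_request(request: dict) -> bool:
    """Run the new check only when the flag is on, and never crash on bad data."""
    if not FLAGS.get("extra_quota_policy_check", False):
        # Flag off: the new code path is never exercised.
        return True
    try:
        return extra_quota_policy_check(request)
    except (KeyError, TypeError) as exc:
        # Assumed policy: fail open so a broken check does not block traffic.
        logger.warning("quota policy check failed, failing open: %s", exc)
        return True
```

Keeping the flag off by default lets a new code path be enabled gradually and switched off quickly if it misbehaves, while the `try`/`except` keeps unexpected input from taking the process down.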
Google Cloud was keen to point out that its Site Reliability Engineering team began triaging the incident within two minutes and identified the root cause within 10 minutes. "The red-button (to disable the serving path) was ready to roll out ~25 minutes from the start of the incident," Google said, with the rollout completed within 40 minutes.
Although smaller regions recovered relatively quickly, others took longer to come back online, about two hours and 40 minutes in the case of one particular region.
On the day of the outage, Google Cloud promised in its mini incident report to do better. Its more detailed report commits to the usual remediation steps, such as improving static analysis and testing practices and auditing Service Control's architecture to prevent future incidents, and the company has also promised to improve its external communications.