
Beware the multi-headed hydra of cloud resilience

by Lydia Leong
October 27, 2020
Photo by Digital Buggu from Pexels: https://www.pexels.com/photo/colorful-toothed-wheels-171198/

Clients have recently been asking a lot more questions about the comparative resilience of cloud providers.

Identity services are a particular point of concern (for instance, the Azure AD outage of October 1st and the Google Cloud IAM outage of March 26th). When identity is down, the customer can’t access the cloud provider’s control plane, and service use in general may be affected. There’s also generally no way for the customer to work around such issues.

The good news is that hyperscale cloud providers do a pretty good job of being robust. However, the risk with smaller, more hosting-like providers can be much higher, and there are notable differences between the hyperscalers, too.

Operations folks know: Everything breaks. Physical stuff fails, software is buggy, and people screw up (a lot). A provider can try its best to reduce the number of failures, limit the “blast radius” of a problem, limit the possibility of “cascading failures”, and find ways to mitigate the impact on users. But you can’t avoid failure entirely. Systems that are resilient recover quickly from failure.

If you chop off the head of a hydra, it grows back — quickly. We can think about five key factors — heads of the hydra — that influence the robustness, resilience, and observed (“real world”) availability of cloud services:

  • Physical design: The design of physical things, such as the data center and the hardware used to deliver services.
  • Logical (software) design: The design of non-physical things, especially software. This covers all aspects of the service architecture that are not related to a physical element.
  • Implementation quality: The robustness of the actual implementation, encompassing implementation skill, care and meticulousness, and the effectiveness of quality-assurance (QA) efforts.
  • Deployment processes: The rollout of service changes is the single largest cause of operational failures in cloud services. The quality of these processes, the automation used in the processes, and the degree to which humans are given latitude to use good judgment (or poor judgment) thus have a material impact on availability.
  • Operational processes: Other operational processes, such as monitoring, incident management — and, most importantly, problem management — impact the cloud provider’s ability to react quickly to problems, mitigate issues, and ensure that the root causes of incidents are addressed. Both proactive and reactive maintenance efforts can have an impact on availability.

A sixth factor, Transparency, isn’t directly related to keeping the hydra alive, but matters to customers as they plan for their own application architectures and risk management — contributing to customer resilience.

Transparency includes making architectural information available to customers, as well as delivering outage-related visibility and insight. Customers need real-world info, like current and historical outage reports and the root-cause-analysis post-mortems that offer insight into what went wrong and why (and what the provider is doing about it).
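As one illustration of how historical outage reports might be put to use, a customer could estimate a service’s observed availability over a reporting window from the outage start and end times a provider publishes. The following is a minimal sketch, not a provider’s method: the function name `observed_availability` and its parameters are hypothetical, and it assumes outages are given as (start, end) timestamp pairs that may overlap.

```python
from datetime import datetime, timedelta


def observed_availability(outages, window_start, window_end):
    """Return the fraction of the window during which the service was up.

    `outages` is a list of (start, end) datetime pairs. Overlapping
    intervals are merged so that downtime is not counted twice.
    This is an illustrative helper, not part of any provider's tooling.
    """
    window = (window_end - window_start).total_seconds()
    if window <= 0:
        raise ValueError("window_end must be after window_start")

    # Clip each outage to the window and drop those entirely outside it.
    clipped = sorted(
        (max(start, window_start), min(end, window_end))
        for start, end in outages
        if end > window_start and start < window_end
    )

    downtime = timedelta(0)
    current_start = current_end = None
    for start, end in clipped:
        if current_end is None or start > current_end:
            # A new, non-overlapping outage begins: bank the previous one.
            if current_end is not None:
                downtime += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlaps the outage being tracked: extend it.
            current_end = max(current_end, end)
    if current_end is not None:
        downtime += current_end - current_start

    return 1.0 - downtime.total_seconds() / window
```

A number like this only describes how often a service was down. It says nothing about which of the factors above caused each outage, which is why the post-mortems matter as much as the outage log itself.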

When you think about cloud service resilience (or the resilience of your own systems), think about it in terms of those factors. Don’t think about it like you think about on-premises systems, where people often think primarily about hardware failures or a fire in the data centre. Rather, you’re dealing with systems where software issues are almost always the root cause. Physical robustness still matters, but the other four factors are largely about software.

First published on Gartner Blog Network

Lydia Leong

Lydia Leong is a Distinguished VP and Analyst with Gartner for Technical Professionals (GTP). Ms. Leong's coverage is focused on cloud computing and infrastructure strategies, particularly infrastructure as a service (IaaS), along with platform as a service (PaaS) as it intersects IaaS. She also covers a constellation of related topics, such as cloud strategy, management and governance; cloud managed service providers (MSPs); and cloud operations including DevOps. Because cloud computing is reshaping the IT landscape, her research covers a broad range of topics related to the transformation of IT organizations, data centres and technology providers. Over the course of her Gartner career, she has worked in all three of Gartner's major research divisions, advising business and technical leadership at end-user organizations and vendors, as well as investors. She was Gartner's Analyst of the Year in 2010.
