Cloud vulnerability teardown: what's important and what you can ignore
Breaking down the challenges of vulnerabilities in the cloud and how to identify if your team is at risk
TL;DR
Explore the unique challenges of vulnerability management in the cloud, including dynamic resources, complex, multi-layered and multi-cloud environments, and cloud-native development
Identify where your team is struggling with vulnerability management to better focus your remediation efforts
Software vulnerabilities are one of the most common attack vectors—and one of the most frustrating for victims. In many cases, patches for the flaws attackers exploit have been available for months or longer, and more timely remediation might have prevented breaches.
On the other hand, the scale and diversity of modern computing environments make it impractical to expect every vulnerability to be addressed at the moment of discovery. That’s even truer for cloud environments, where misconfigurations are more common, change is more frequent, and visibility is more challenging.
The truth, however, is that many vulnerabilities are highly unlikely to lead to problems. Instead of aiming for 100% remediation—an unrealistic goal of questionable value—organizations should prioritize the vulnerabilities that pose the greatest exploit risk.
In this blog, we’ll discuss vulnerability management for cloud environments. We’ll look at the challenges it poses, and how Expel helps security teams quickly identify and address the most critical vulnerabilities.
Vulnerability management is different in the cloud
The premise of vulnerability management (VM) is simple: Identify the security flaws or weaknesses in an organization’s software, and then work to reduce or eliminate the risks they pose. In traditional on-prem computing environments, applications reside within a perceived secure perimeter, change infrequently, and have relatively few interdependencies and integrations—making VM fairly straightforward.
It’s a different story in the cloud. Businesses are moving beyond sequential development, monolithic applications, and data center deployment. And many security and IT teams find their established VM scanning tools and practices are poorly suited for the dynamic, complex, and decentralized world of cloud-native applications. As potential security gaps proliferate, they often struggle with several challenges.
Dynamic and ephemeral resources
By design, cloud infrastructure changes constantly. Applications are deployed, updated, and moved. Demand fluctuates. And instances of cloud resources—such as storage and compute—spin up and down continuously. This constant change can introduce new vulnerabilities at a pace unmatchable by traditional assessment cycles. It also makes it difficult to maintain visibility and real-time monitoring of the vulnerabilities teams do identify.
Signs your VM practice is struggling in this more dynamic environment include:
Vulnerability management backlogs: Your environment hosts a high and rising volume of vulnerabilities that haven’t been assessed or prioritized yet.
Confusion about vulnerability age: The age of identified vulnerabilities keeps resetting as related ephemeral resources like containers are replaced. This makes it difficult to track key metrics, such as mean open vulnerability age (MOVA).
Static remediation service level agreements (SLAs): The SLA for the remediation of a vulnerability is based solely on its severity. It fails to account for resource availability, changes in system criticality, or new intelligence on the changing nature of a threat. All of these real-time factors can affect the remediation timeline.
Reappearing vulnerabilities: The same vulnerability, identified by its Common Vulnerabilities and Exposures (CVE) ID, keeps popping up in your backlog. This typically happens when you remediate vulnerabilities only in production, not at their source in development.
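One way to keep vulnerability age from resetting is to key findings on something more stable than the ephemeral resource. The sketch below is a minimal illustration, not a description of any particular tool: the Finding fields, the placeholder CVE IDs, and the choice of (CVE ID, image) as the stable key are all assumptions, and every finding in the sample is treated as still open.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Finding:
    cve_id: str       # placeholder IDs below, not real CVEs
    image: str        # stable identity: the image the container was built from
    instance_id: str  # ephemeral identity: changes on every replacement
    seen_on: date     # date the scanner reported this finding


def first_seen_by_stable_key(findings):
    """Earliest sighting per (cve_id, image), ignoring instance churn."""
    first_seen = {}
    for f in findings:
        key = (f.cve_id, f.image)
        if key not in first_seen or f.seen_on < first_seen[key]:
            first_seen[key] = f.seen_on
    return first_seen


def mean_open_vulnerability_age(findings, today):
    """MOVA in days, averaged over distinct (cve_id, image) pairs."""
    first_seen = first_seen_by_stable_key(findings)
    if not first_seen:
        return 0.0
    ages = [(today - seen).days for seen in first_seen.values()]
    return sum(ages) / len(ages)


findings = [
    Finding("CVE-0000-0001", "web:1.4", "container-a", date(2024, 1, 1)),
    # Same flaw, same image, but the container was replaced on Jan 20.
    Finding("CVE-0000-0001", "web:1.4", "container-b", date(2024, 1, 20)),
    Finding("CVE-0000-0002", "api:2.0", "container-c", date(2024, 1, 21)),
]
mova = mean_open_vulnerability_age(findings, today=date(2024, 1, 31))
print(mova)  # 20.0
```

Keyed on the instance ID, the first vulnerability would appear to be 11 days old. Keyed on the image, it keeps its true age of 30 days.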
Complex, multi-layered environments
Cloud environments encompass a stack of services, applications, and network layers across various cloud providers. Each has its own potential vulnerabilities. This can create numerous interdependencies across a broad landscape, making it hard to determine which vulnerabilities are most important. Teams spend too much time trying to understand the environment and can’t give enough attention to prioritization.
Complexity may be an issue for your VM team if you’re seeing:
Ineffective prioritization: Teams are setting priorities for remediation based only on the severity of each vulnerability. This narrow approach overlooks essential considerations, such as:
The likelihood of exploitation
The criticality of the relevant asset
The public accessibility of the asset
The blast radius of a potential attack
The reachability of the detected vulnerability by attackers (i.e., whether it can even be exploited within the environment)
Unprioritized vulnerabilities: Some critical vulnerabilities aren’t prioritized at all, leaving the affected assets at risk of exploitation.
Intelligence-blind prioritization: Teams aren’t factoring the latest threat intelligence into vulnerability prioritization. They’re making decisions based on theoretical risk, rather than insight into which of the vulnerabilities are being actively exploited or targeted in the wild.
A narrow focus on patching: Remediation efforts immediately focus on patching the vulnerability. They don’t evaluate other risk mitigation methods that could be more efficient or effective in the long term, such as implementing compensating security controls.
Lackluster results: Validation measures—such as pentest findings, compliance audits, and insurance audits—find that too many critical vulnerabilities are posing an unacceptable level of risk in your environment.
No evaluation of effort versus impact: Teams have no way to consider the value of a fix in the context of the level of effort it requires. Would a critical production service have to be taken down? Would the team have to remove a third-party library, breaking the application?
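To make the contrast with severity-only prioritization concrete, here is a minimal sketch of a context-aware score. The VulnContext fields, the weights, and the multipliers are illustrative assumptions rather than a standard or a vendor formula. Blast radius and remediation effort are left out to keep the example short.

```python
from dataclasses import dataclass


@dataclass
class VulnContext:
    name: str
    severity: float            # base severity on a 0-10 scale
    exploit_likelihood: float  # 0-1, estimated chance of exploitation
    asset_criticality: float   # 0-1, importance of the affected asset
    publicly_accessible: bool
    reachable: bool            # can an attacker reach the flaw at all?
    actively_exploited: bool   # threat intel reports exploitation in the wild


def priority_score(v: VulnContext) -> float:
    """Illustrative score; the weights are assumptions, not a standard."""
    if not v.reachable:
        return 0.0  # unreachable findings drop to the bottom of the queue
    score = v.severity * v.exploit_likelihood * (0.5 + 0.5 * v.asset_criticality)
    if v.publicly_accessible:
        score *= 1.5
    if v.actively_exploited:
        score *= 2.0
    return round(score, 2)


backlog = [
    VulnContext("critical-internal", 9.8, 0.05, 0.2, False, True, False),
    VulnContext("medium-public", 5.5, 0.6, 1.0, True, True, True),
    VulnContext("high-unreachable", 8.0, 0.9, 1.0, True, False, False),
]
ranked = sorted(backlog, key=priority_score, reverse=True)
for v in ranked:
    print(v.name, priority_score(v))
```

With these sample values, the medium-severity finding on a public, critical, actively targeted asset ranks first, ahead of a critical-severity finding on a low-value internal system. The unreachable finding ranks last despite its high severity.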
Decentralized and multi-cloud environments
Organizations using multiple cloud providers or a hybrid environment will face inconsistencies as teams work across different platforms, each with individual sets of tools, security protocols, and configurations. Vendor-dependent patching adds to the problem, as many vulnerabilities in SaaS solutions can only be patched by the vendor, delaying remediation.
When teams lack an effective way to work across platforms, the problem often manifests as:
Inconsistent security posture: Teams are unable to take a unified and holistic approach to VM, so they leave some environments more secure than others.
Fragmented visibility: Instead of a comprehensive and organized view of vulnerabilities across environments, teams must switch between separate screens for each platform. This makes it hard to understand, prioritize, and communicate the risk facing individual business units and the organization as a whole. Additionally, cloud resources aren’t always online for VM evaluations, so visibility may be limited.
Manual tracking: Teams use spreadsheets to track vulnerabilities, risk acceptances, and remediation measures—a burdensome and error-prone method.
Lack of metrics: When VM is siloed by platform, organizations lack aggregated metrics to understand the overall status and effectiveness of the practice across their hybrid or multi-cloud environment.
Inconsistent or absent SLAs: With different ways of working across different platforms, teams can’t establish uniform SLAs for remediation. And sometimes, these SLAs don’t exist at all.
Frustrating vulnerability review meetings: These sessions to review the organization’s accumulation of unaddressed vulnerabilities are extremely long and include unpleasant surprises.
Lack of standardized secure resources: Inconsistency across platforms can make it difficult or impossible to implement golden images, infrastructure-as-code (IaC) templates, vulnerability scanning technologies, and landing zones that already incorporate security best practices.
No process for unpatchable platforms: Some platforms can’t be patched. They may be too critical to take offline, require third-party patching, or lack an available patch. For such cases, teams need an alternative way to handle the risk.
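Fragmented visibility and missing metrics often come down to each platform reporting findings in its own shape. A rough sketch of the alternative is to map every export into one schema and compute metrics from the combined list. The two raw formats below are hypothetical, as are the field names and the placeholder CVE IDs; real scanners and cloud providers use their own formats.

```python
from collections import Counter, defaultdict

# Hypothetical raw exports from two platforms.
platform_a = [
    {"cve": "CVE-0000-0001", "sev": "HIGH", "resource": "vm-1"},
]
platform_b = [
    {"vulnId": "CVE-0000-0001", "severityLabel": "high", "assetName": "pod-7"},
    {"vulnId": "CVE-0000-0002", "severityLabel": "critical", "assetName": "pod-9"},
]


def normalize_a(record):
    """Map a platform A record onto the shared schema."""
    return {
        "cve_id": record["cve"],
        "severity": record["sev"].lower(),
        "asset": record["resource"],
        "platform": "a",
    }


def normalize_b(record):
    """Map a platform B record onto the shared schema."""
    return {
        "cve_id": record["vulnId"],
        "severity": record["severityLabel"].lower(),
        "asset": record["assetName"],
        "platform": "b",
    }


unified = [normalize_a(r) for r in platform_a] + [normalize_b(r) for r in platform_b]

# Aggregated metrics across every platform.
by_severity = Counter(f["severity"] for f in unified)
platforms_per_cve = defaultdict(set)
for f in unified:
    platforms_per_cve[f["cve_id"]].add(f["platform"])

print(dict(by_severity))
print({cve: sorted(p) for cve, p in platforms_per_cve.items()})
```

Once findings share a schema, the same CVE showing up on two platforms becomes visible as one problem instead of two unrelated rows on separate screens.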
Cloud-native development implications
VM methods developed for traditional applications aren’t well adapted for the unique characteristics of cloud-native applications, including DevOps, containerization, and continuous integration and continuous delivery/deployment (CI/CD). The speed and complexity of modern development practices can leave dangerous gaps, letting vulnerabilities flow unchecked into production.
Cloud-native development may be posing problems for your team if you discover:
Untracked cloud-native vulnerabilities: Cloud-native development introduces a broad universe of additional vulnerability types to track. These include those related to:
APIs
Third-party code, a.k.a. the software supply chain
Third-party integrations
Containers
Serverless functions
Some teams overlook these new risks, focusing solely on the more traditional vulnerabilities of monolithic apps and static infrastructure.
Vulnerabilities reaching production: Teams that haven’t adapted to the speed and modular nature of cloud-native development may miss vulnerabilities at earlier stages in the DevOps cycle, such as in the registry, pipeline, or non-production environments. Similarly, they can fail to remediate vulnerabilities at their source in code before it’s rehydrated.
Ghost vulnerabilities: Remediation validation can be especially challenging for cloud-native applications. This can allow remediated vulnerabilities to keep popping up or persist in VM tool results. This happens when new instances are spun up using old configurations, or when the affected service hasn’t been restarted post-remediation.
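The two causes of ghost vulnerabilities described above can be checked mechanically. The sketch below is a simplified model under stated assumptions: the Instance fields, the image version strings, and the idea of a known set of fixed images are all hypothetical and would need to be mapped onto whatever inventory data a team actually has.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List, Optional, Set


@dataclass
class Instance:
    instance_id: str
    image_version: str            # image the instance was launched from
    service_started_at: datetime  # last (re)start of the affected service
    patched_in_place_at: Optional[datetime] = None  # None if never patched in place


def ghost_candidates(instances: List[Instance], fixed_images: Set[str]) -> Dict[str, str]:
    """Return instance_id -> reason the vulnerability may still be reported."""
    reasons = {}
    for inst in instances:
        if inst.image_version in fixed_images:
            continue  # launched from an image that already contains the fix
        if inst.patched_in_place_at is None:
            reasons[inst.instance_id] = "launched from unfixed image"
        elif inst.service_started_at < inst.patched_in_place_at:
            reasons[inst.instance_id] = "service not restarted after patch"
    return reasons


nine = datetime(2024, 3, 1, 9, 0)
ten = datetime(2024, 3, 1, 10, 0)
eleven = datetime(2024, 3, 1, 11, 0)

fleet = [
    Instance("i-1", "web:1.4", service_started_at=eleven),
    Instance("i-2", "web:1.4", service_started_at=nine, patched_in_place_at=ten),
    Instance("i-3", "web:1.5", service_started_at=eleven),
    Instance("i-4", "web:1.4", service_started_at=eleven, patched_in_place_at=ten),
]
ghosts = ghost_candidates(fleet, fixed_images={"web:1.5"})
print(ghosts)
```

Here i-1 was spun up from the old image after the fix shipped, and i-2 was patched but never restarted. Both would keep appearing in scan results even though the vulnerability is nominally remediated.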
Ownership issues
VM in the cloud requires coordination among multiple groups. SecOps teams deploy vulnerability assessment tools and determine which vulnerabilities to prioritize. IT operations tests and deploys patches. Other groups, such as dedicated VM, risk management, internal audit, or line-of-business teams, may define their own policies and processes and make their own decisions about exceptions. All of this can blur ownership and responsibility. Without consistent or shared goals across teams, it’s hard to optimize VM.
The problem is exacerbated when a lack of proper controls lets anyone spin up their own cloud environment. This shadow IT can make it especially difficult to maintain a full survey of the environment with detailed ownership information.
Ownership issues may be undermining your VM practice if teams are experiencing:
Disagreements about priorities and policies: Security demands consistency. When teams aren’t on the same page, the result is conflict, wasted time, distraction, and inconsistent practices.
Lack of an updated asset inventory: Without clear lines of accountability, it’s too easy for tasks to fall through the cracks. Individual teams can end up tracking the assets they prioritize without maintaining a comprehensive and up-to-date inventory of the entire environment.
Lack of follow-through: Unclear ownership can lead to a lack of visibility and coordination for identified vulnerabilities. They may go unprioritized for remediation, or linger without a risk acceptance beyond the specified SLA. When a risk acceptance is submitted, it may expire without review.
Unexpectedly long delays: With no single owner pressing for action, weeks or longer can pass before a vulnerability is remediated.
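The follow-through gaps above (missing owners, findings past their SLA without a risk acceptance, and acceptances that expire without review) can be surfaced with a simple periodic check. This is a minimal sketch; the TrackedFinding fields and the sample IDs and team names are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class TrackedFinding:
    finding_id: str
    owner: Optional[str]  # None when nobody has claimed the finding
    sla_due: date         # remediation deadline under the SLA
    risk_acceptance_expires: Optional[date] = None  # None if no acceptance on file


def needs_follow_up(f: TrackedFinding, today: date) -> List[str]:
    """List the reasons a finding needs attention; empty means it is on track."""
    reasons = []
    if f.owner is None:
        reasons.append("no owner")
    if f.risk_acceptance_expires is None:
        if today > f.sla_due:
            reasons.append("past SLA without risk acceptance")
    elif today > f.risk_acceptance_expires:
        reasons.append("risk acceptance expired")
    return reasons


today = date(2024, 6, 1)
tracked = [
    TrackedFinding("F-1", "platform", sla_due=date(2024, 5, 1)),
    TrackedFinding("F-2", None, sla_due=date(2024, 7, 1)),
    TrackedFinding("F-3", "payments", sla_due=date(2024, 4, 1),
                   risk_acceptance_expires=date(2024, 5, 15)),
    TrackedFinding("F-4", "data", sla_due=date(2024, 7, 1)),
]
follow_ups = {}
for f in tracked:
    reasons = needs_follow_up(f, today)
    if reasons:
        follow_ups[f.finding_id] = reasons
print(follow_ups)
```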
Why vulnerability prioritization is critical — and difficult
The extreme volume and velocity of vulnerabilities in the typical hybrid, multi-cloud environment make it neither feasible nor effective to aim for complete and timely mitigation. Beyond the massive staff this would require, it’s also inefficient to go after vulnerabilities that don’t pose a significant threat. Instead, organizations need to focus on effectively prioritizing the most important fixes. However, vulnerability prioritization can be a time-consuming and difficult task.
Traditional prioritization approaches rely on conventional risk scoring frameworks like the common vulnerability scoring system (CVSS). But the CVSS lacks the full context needed to make optimal decisions. As generic, one-size-fits-all ratings, CVSS scores aren’t aligned with the strategy, tactics, and operational objectives of a given business. And they’re not analyzed within the context of your business architecture (the systems, dependencies, and data stores powering specific services). Many vulnerabilities rated critical or high may have no significant relevance to your organization.
Similarly, the CVSS focuses on potential threats, not actual threat activity. Criteria such as the use of privileged credentials or remote execution can be good indicators of possible threats. But they don’t indicate how the vulnerability is actually being exploited in the wild, or whether a given asset or organization is a likely target. Prioritization must also consider multiple factors, including active threats, attack feasibility, and other customized security practices and controls.
In some cases, the CVSS can even be misleading about actual risk. Threat actors often target vulnerabilities rated “medium” and “low” as paths of least resistance, since potential victims may not give them the same level of scrutiny. In practice, this means a lower CVSS score can translate into a higher likelihood of exploitation. Also, a long backlog of vulnerabilities is still waiting to be scored. The absence of a “critical” label says little about how likely a vulnerability is to be exploited.
This lack of context isn’t limited to CVSS scores. Most common VM metrics (time-to-detection, vulnerability age, patching rate) aren’t risk-based, and often lead to ineffective, low-value prioritization with negative impacts and high costs. Neither the velocity nor the volume of vulnerabilities patched is an accurate indicator of a VM program’s success.
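A small worked example shows why patch volume can mislead. The risk points here are made-up integers standing in for whatever risk score an organization uses; they are assumptions for illustration only.

```python
def volume_patched(patched):
    """Count of fixes shipped: the traditional, non-risk-based metric."""
    return len(patched)


def risk_removed(patched):
    """Sum of risk points removed: a simple risk-based alternative."""
    return sum(points for _, points in patched)


# Each entry is (description, risk points). Values are illustrative.
team_a = [("low-risk finding", 1)] * 100   # many easy, low-impact fixes
team_b = [("high-risk finding", 40)] * 5   # a few hard, high-impact fixes

print(volume_patched(team_a), risk_removed(team_a))  # 100 100
print(volume_patched(team_b), risk_removed(team_b))  # 5 200
```

Judged by volume, team A outperforms team B twenty to one. Judged by risk removed, team B did twice as much to protect the organization.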
This post was originally published on Expel's blog. Reach out personally, or see the resources below to learn how Expel, the leading cloud MDR, can help:
Chaos to clarity: risk-based prioritization in vulnerability management: A guide on why CVSS isn’t getting it done, vulnerability management ownership and metrics issues, and how risk-based vulnerability prioritization is the foundation of a better approach
Expel Vulnerability Prioritization data sheet: How Expel takes the guesswork out of which vulnerabilities pose the greatest risk, informed by your unique business context
Monthly check-in - vulnerabilities exploited in incidents prevented by Expel: monthly coverage of particular vulnerabilities our SOC analysts have actually encountered. Sometimes we’ll look at something seen in the last month; other times it’ll be a trend we’ve seen over time. And in circumstances involving a widespread vulnerability, we can dive into how attackers leverage it. What better way to highlight this activity than by sharing real stories on real vulnerabilities?