The Security Operations Center Cannot Hold

Hyperautomation, Open Security Data Architecture, and the Future of SIEM

"Things fall apart, the center cannot hold; Mere anarchy is loosed upon the world."
-- W.B. Yeats

Foreword

These documents always have the same lead: "In today's ever-evolving security landscape, the threats are increasing in complexity and velocity. We are experiencing key shortages in people and technology capabilities."

How about we don't do that this time?

How about we start with some observations and work from there?

Today's Security Market

The security market and SIEM market are ready for a change, and it may not be the change we are expecting.

Some Deepwatch Security Market Observations:

  • The SIEM market is built on the consumption and correlation of security-relevant data into a single tool to enable practitioners and security teams to detect and investigate security issues.
  • The traditional definition of "Security Data" no longer fits. Firewall, EDR, authentication, endpoint, IDS/IPS, and DNS block logs are all still relevant, but that data is now spread across many areas and many tools.
  • SIEM vendors built a legacy model (which made sense at the time) on the expectation of continually increasing data volumes and related cost increases. While many have attempted to move to different cost models to try to keep costs stable, the amount of data to be analyzed continues to gallop ahead of expectations.
  • Security budgets, which increased steadily for years (16% year over year in 2021 and 17% in 2022), grew only 6% year over year in 2023.
  • Gartner® Forecasts Global Security and Risk Management Spending to Grow 14% in 2024.1
  • Machine-created logs have been increasing at over 50% in volume year over year.
  • SIEM costs are set to continue to increase, not just due to ingestion but also due to the data transport fees associated with multi-cloud logging.
  • At a minimum, SIEM costs will continue to increase as part of the continued security and risk spending trend which is averaging 13% year-over-year over the last 3 years.2
  • Merger and acquisition activity in the SIEM vendor space is going to shake up the market in the coming months, but in comparison to the other facts, it is more a symptom than a cause.

The Evolution in SOC and SIEM

SIEMs and security operations centers (SOCs) must evolve to address budget decisions driven by cost-benefit analysis and a continuing flood of data and data lakes that new AI/ML tools will only increase. Too much pressure is building from the outside in the form of budgets, and too much information is being pressed into the confined space of skills, people, and time.

All of which raises the question: what can be done?

The answer we're all looking for is more flexible SIEMs that embrace a non-ingestion-based cost model and allow for AI/ML input and output. However, even this is not a long-term answer. Today's SIEMs already offer different pricing options, yet we are still dealing with a lack of data visibility and scale. Data normalization and ongoing detection engineering are still problems.

SOCs are experiencing pressures both internally and externally. Externally, pressure comes from the amount of data that needs to be analyzed and alerted upon, and from the rising costs of collecting that data from multiple locations such as multi-cloud infrastructure, SaaS-based systems, and on-premises and virtualized assets. Internally, pressure continues to rise from skills gaps, from the ongoing detection engineering and normalization needed to keep up with data format and location changes, and from pure alert volume.

If these pressures were equal across the entire surface of the security center balloon, then things might hold. However, we know this is an impossibility, and either internal or external pressure will invariably spike and pop that balloon.

If we do not relieve the pressure, the Security Operations Center cannot hold.

Reducing the external pressure of data overload and ingestion costs

The trend of reducing log volume for SIEM ingestion has shifted from niche to mainstream, with prominent organizations offering solutions aimed at this goal. However, these solutions, along with tools for log accuracy assessment, address immediate needs rather than providing comprehensive fixes.

At Deepwatch, our years of experience with Splunk customers have involved reviewing and optimizing log sources to manage costs effectively. Even with best practices in place to maximize customer value, log costs and log expansion persist as challenges. Trimming or otherwise manipulating log volumes upstream to save money can compromise security: it leads to detection strategy issues and can force full-scale strategy overhauls, neither of which enhances the visibility crucial for effective security operations.

Trimming log sources is not the answer: hidden security costs are being traded for savings on recurring subscription costs. We need more flexible ingestion options for this new era, options that change the paradigm of data ingestion for security and embrace decentralized security data storage and retention.

Taking Cloud-to-Cloud Complexity into Consideration

Another cost factor often not taken into account when selecting a SIEM is the cost associated with moving large amounts of data from one cloud provider to another.

Deepwatch has years of experience hosting SIEMs in, and collecting logs from, major public cloud infrastructure environments. We repeatedly run into hidden costs in these environments, such as the cost of sending logs from a customer's Azure environment to an AWS-hosted SIEM, or the data and log utilization charges in Microsoft Sentinel for logs not included freely in their Microsoft license.

In short, the expectation that all of an organization's security-relevant logs can be collected in a single location is just no longer a workable idea. With the advent of multi-cloud offerings, and the expansion of log data, putting all data in one proverbial basket just isn’t going to be possible for the majority of organizations.

Will there still be companies that can afford to pull this off? Absolutely.

Is that number the majority of the market? Definitely not.

The Impact of AI and ML on Data Ingestion Levels

Though the impact is difficult to quantify, there are ground rules we know will hold true. Forbes puts it this way:

"All aspects of AI-machine learning models, continuous learning, generalization, and predictive and descriptive analytics-are dependent on massive data sets. The more diverse and comprehensive the data, the better AI can perform. This is why data is often referred to as the 'training fuel' for AI."

All ML-based concepts need a wide variety of data to learn and train from, so the natural tendency is to point tools at as much data as they can consume and, in popular parlance, "just send it".

This is categorically a bad idea for a couple of different reasons:

  • The biggest risk to AI is data poisoning: poor-quality data will not only create bad output but can train the model to skew all future computations and predictions.
  • This influx of data, even if it is clean, still needs to be evaluated for volume on a per-project basis. An influx of different data sets into an AI/ML-enabled SIEM can overflow the system from a volume perspective, leading to unexpected storage and search costs at best and, at worst, system damage and loss of operational capabilities.

We are still too early in the development and adoption of AI/ML technologies to fully understand their data and cost impact on security operations groups. We do know that the Gartner Hype Cycle is real, and while these tools and techniques hold promise, we're still figuring out where they'll be most effective.

Reducing the internal pressure of alert overload, time to investigate, and time to remediate

We've discussed how to reduce the external data pressure on the security operations center, now let's turn our attention to reducing the internal pressure from alert overload, skills shortages, and just an overall lack of time.

The answer to this issue is straightforward. The blanket quote from every security analyst who has ever worked a ticket can be summed up as: "I need more time…" Time is the one resource always in short supply.

But we can look at how to maximize the time at hand for making decisions, taking actions, and judging the effectiveness of those steps.

What are the top priorities for saving time? The industry-accepted focal points for security operations are:

  • Optimizing alert fidelity and reducing false positives
  • Accelerating validation, enriching alerts with context, streamlining triage
  • Expediting response and remediations
  • Utilizing feedback for continual improvement

These practices ensure SOC analysts can efficiently identify and address genuine threats, enhancing overall cybersecurity effectiveness.

Alert Fidelity and Reduction of False Positives

Deepwatch MDR customers who ingest EDR content into the SIEM find that these alerts average around 9% of daily ingestion, with an almost 60% signal-to-noise ratio and a Critical-priority true positive ratio of almost 70%.

Data format standardization is crucial, especially regarding monolithic SIEMs' ability to consolidate diverse alert types, formats, metadata, and raw log data into a unified platform. However, this consolidation increases data ingestion pressures, necessitating solutions to manage alert overload without compromising detection capabilities.

Leveraging current platforms and data standards enables decentralized detection, allowing stream-based detections without requiring data format normalization and correlation. Our analysis shows that endpoint-based detection tools are highly effective and generate lower data volumes compared to other detection methods, providing a high volume of actionable “true positive” alerts from a low volume of SIEM ingested data.

However, relying solely on EDR alerts and abandoning other log sources is not advisable. While EDR alerts are valuable for identifying ongoing malicious activity at the target level, ideally, we aim to intercept threats as they enter the environment. By leveraging different detection capabilities based on the volume of data to review, and employing data standardization, we can extract concise alert information.

Further, utilizing hyperautomation and enrichment helps gather additional details, tasking, and internal intelligence. This comprehensive approach assists in accurately discerning true positives from false positives and prioritizing alerts based on risk, exposure, and potential business and operational impact.
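
To make the idea of stream-based, decentralized detection described above concrete, here is a minimal Python sketch. It assumes events already arrive in an agreed-upon schema; the field names (`class`, `severity`, `asset_id`, and so on) are illustrative placeholders rather than any particular standard, and the filtering logic is deliberately simplified.

```python
from typing import Dict, Iterable, Iterator

# Hypothetical severities considered actionable; a real program would tune this.
ACTIONABLE_SEVERITIES = {"high", "critical"}

def stream_detections(events: Iterable[Dict]) -> Iterator[Dict]:
    """Yield concise, actionable alerts from a stream of already-standardized events.

    Detection happens on the stream itself; nothing is copied into a central
    SIEM index, and no per-vendor normalization is needed because producers
    agreed on the schema up front.
    """
    for event in events:
        if event.get("class") != "detection":          # keep only detection-type events
            continue
        if event.get("severity") not in ACTIONABLE_SEVERITIES:
            continue
        yield {
            "source": event.get("product"),            # e.g. an EDR product
            "asset": event.get("asset_id"),
            "rule": event.get("rule_name"),
            "severity": event.get("severity"),
            "time": event.get("time"),
        }

# Example: only the small alert summaries ever leave the original data location.
sample = [
    {"class": "detection", "product": "edr", "severity": "critical",
     "asset_id": "host-42", "rule_name": "credential-dumping",
     "time": "2024-05-01T12:00:00Z"},
    {"class": "network_activity", "product": "fw", "severity": "low"},
]
for alert in stream_detections(sample):
    print(alert)
```

The point of the sketch is that detection logic runs against the data where it lives, and only concise alert information moves on to enrichment, triage, and prioritization.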

Speed of Validation, Enrichment, Triage, and Impact

Accelerating validation, enrichment, triage, and impact assessment hinges on the volume and relevance of available information. Hyperautomation and data standards play a vital role in alleviating this pressure by expediting enrichment searches, enabling enrichment from non-standard OSINT or external sources, and facilitating the use of complex validation and triage playbooks. The flexibility of automation and the readability of returned data empower analysts to execute more sophisticated processes efficiently.
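
As an illustration of what such a validation-and-enrichment playbook can look like under hyperautomation, here is a hedged Python sketch. The step names, the `AlertContext` structure, and the early-exit rule are all hypothetical; a production platform would add scheduling, retries, and audit trails around the same basic shape.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class AlertContext:
    alert: Dict
    enrichments: Dict = field(default_factory=dict)
    disposition: str = "unknown"          # becomes "false_positive", "needs_analyst_review", etc.

# Each playbook step is just a callable over the shared context.
Step = Callable[[AlertContext], AlertContext]

def validate(ctx: AlertContext) -> AlertContext:
    # Hypothetical validation rule: drop alerts for assets tagged as test systems.
    if ctx.alert.get("asset_tag") == "test":
        ctx.disposition = "false_positive"
    return ctx

def enrich_osint(ctx: AlertContext) -> AlertContext:
    # Placeholder for an external/OSINT lookup; a real playbook would call
    # a reputation or threat-intelligence service here.
    ctx.enrichments["ip_reputation"] = "unknown"
    return ctx

def triage(ctx: AlertContext) -> AlertContext:
    if ctx.disposition == "unknown":
        ctx.disposition = "needs_analyst_review"
    return ctx

def run_playbook(ctx: AlertContext, steps: List[Step]) -> AlertContext:
    for step in steps:
        ctx = step(ctx)
        if ctx.disposition == "false_positive":
            break                          # stop early; no analyst time needed
    return ctx

result = run_playbook(AlertContext(alert={"asset_tag": "prod", "rule": "suspicious-login"}),
                      [validate, enrich_osint, triage])
print(result.disposition, result.enrichments)
```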

Effective Response and Remediations

Validation, enrichment, triage, and impact assessment boil down to asking, "Do I care, and how severe is it?" Analysts initially determine true positives/false positives before escalating if needed and examining enrichments. Many organizations now integrate target data and risk assessment into their alert review process, alongside CVE and vulnerability information.

Addressing alert overload is an ongoing industry challenge, with newer approaches focusing on gathering more data to enhance understanding and alert prioritization.

Reducing mean time to investigate, respond, or take action is crucial to alleviate internal pressure on Security Analysts and enhance operational efficiency.

Utilizing Feedback for Continual Improvement

The attempt to regain control over time spent on the analysis and response process is not a simple, one-time exercise. Rather, gaining time back has to be an ongoing effort so that an appropriate focus is maintained on both the quality of analysis and the effectiveness of response. Choosing a quick response that saves time but damages the company's operations is a step backward. Regular reviews of program status and capabilities are imperative to pinpoint areas for improvement and cost savings.

Advancing toward Cyber Resilience and the future of SIEM

So far we've focused on the market changes that pose challenges and problems rather than on the positive developments of the past year or so. Now, let's shift our attention to the positives.

Deepwatch is leading the security operations industry into a new operational concept: cyber resilience. The approach accepts that there is going to be a bad day and works to ensure that customers are prepared for that eventuality. The first step on the journey to cyber resilience is anticipating risk, which requires understanding an organization's threat detection capabilities, its operational and technical risks, and its response capabilities, including both human process responses and automation-based responses.

Cyber resilience is less about detecting each external threat individually and more about delivering the right response at the right time and improving continuously.

Hyperautomation

The quick definition of hyperautomation is the ability to scale and manage the automation workloads at the core of operations (playbooks, analysis, and data interaction) from a centralized location, with capabilities that supersede today's SOAR tooling.

Deepwatch led the push into SOAR with global security operations years ago and has maintained one of the largest and most complex SOAR installations for years. We made the change to hyperautomation in 2023, to move beyond first-generation SOAR capabilities to gain additional capabilities in flexibility, complex logical decision trees, and scalability.

Data Standards

As Deepwatch has started putting the frameworks in place to support the ability to measure and improve security operations for cyber resilience, we have run into one consistent issue across all of our customers, and hyperautomation brought the problem into focus. It's the data. We know it's always about the data, but in this case, it really is all about the data. It's not just about the amount of data or the increasing number of sources of security-relevant data. Moving forward, it's about the organization, availability, and determination of security-relevant data.

Hyperautomation in security operations offers a significant benefit tied to a concept often overlooked: standardized, interchangeable parts. Interchangeable parts were one of the most pivotal inventions in manufacturing history. Without stretching the hyperbole, the parallel holds: standardizing data inputs and outputs to frameworks, or normalizing data at the logical layer, offers operational flexibility and capability reminiscent of the early days of SIEM technology, when it was fresh and tailored for on-premise realities.
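
A minimal sketch of the "interchangeable parts" idea: each producer maps its vendor-specific log layout into one small, agreed-upon set of fields. The vendor names and field names below are invented for illustration; the value is that everything downstream (detections, enrichment, automation) can rely on a single shape.

```python
from typing import Dict

def to_common_schema(vendor: str, raw: Dict) -> Dict:
    """Map a vendor-specific record into an illustrative common schema.

    The target field names here are placeholders, not any particular standard;
    the point is that every producer maps into the same small set of fields.
    """
    if vendor == "example_edr":            # hypothetical EDR log layout
        return {
            "time": raw["event_time"],
            "asset_id": raw["hostname"],
            "identity": raw.get("user"),
            "action": raw["threat_action"],
            "severity": raw["sev"].lower(),
        }
    if vendor == "example_firewall":       # hypothetical firewall log layout
        return {
            "time": raw["ts"],
            "asset_id": raw["src_ip"],
            "identity": None,
            "action": raw["disposition"],
            "severity": "low" if raw["disposition"] == "allow" else "medium",
        }
    raise ValueError(f"no mapping registered for {vendor}")

print(to_common_schema("example_edr",
                       {"event_time": "2024-05-01T12:00:00Z", "hostname": "host-42",
                        "user": "jdoe", "threat_action": "quarantine", "sev": "HIGH"}))
```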

Decentralized Data Locations

While the continuing spread of data across locations is a major cause of the pressure on SIEMs and of rising costs, decentralized data locations and data availability are a net positive for the industry. Multi-cloud log information can sit in affordable, stable storage locations with easy-to-use APIs and access. What we cannot do is continue to pull data from these disparate locations into a secondary, centralized container and consider it cost-effective or efficient.
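
The sketch below illustrates querying data where it lives instead of copying it into a central index. The `LogStore` adapter interface and the in-memory stand-ins are hypothetical; real adapters would call each provider's own query API, and only the answers, not the raw logs, would travel.

```python
from abc import ABC, abstractmethod
from typing import Dict, Iterable, List

class LogStore(ABC):
    """One security data location: a cloud-native lake, object storage, a SIEM index, etc."""
    @abstractmethod
    def search(self, asset_id: str, since: str) -> Iterable[Dict]:
        ...

class InMemoryStore(LogStore):
    # Stand-in for a real adapter that would call the provider's query API.
    def __init__(self, name: str, events: List[Dict]):
        self.name, self.events = name, events

    def search(self, asset_id: str, since: str) -> Iterable[Dict]:
        return (e for e in self.events
                if e["asset_id"] == asset_id and e["time"] >= since)

def federated_search(stores: List[LogStore], asset_id: str, since: str) -> List[Dict]:
    """Fan the same question out to every location; only the answers move."""
    results: List[Dict] = []
    for store in stores:
        results.extend(store.search(asset_id, since))
    return sorted(results, key=lambda e: e["time"])

stores = [
    InMemoryStore("cloud_lake", [{"asset_id": "host-42", "time": "2024-05-01T12:01:00Z",
                                  "action": "login"}]),
    InMemoryStore("object_storage", [{"asset_id": "host-42", "time": "2024-05-01T12:03:00Z",
                                      "action": "process_start"}]),
]
print(federated_search(stores, "host-42", "2024-05-01T00:00:00Z"))
```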

Considerations for Truly Effective Responses

Effective response capabilities are critical for modern SOCs, so how do we chart an effective response and validate a remediation?

Standardized data empowers hyperautomation to gather data from various sources for enrichment, triage, and risk assessment. Additionally, a central data lake, alongside standardization, accumulates the operational history and activities of the SOC, forming the core of a hub-and-spoke data architecture. This central hub houses the records of detections, vulnerabilities, and responses related to the assets and identities under the SOC's monitoring and protection.

Why is this important?

The old adage goes: "Insanity is doing the same thing over and over again, and expecting a different outcome." But for some reason, despite its familiarity, security operations hasn't learned that lesson.

Automation and Elimination of Repetitive Manual Tasks

Security operations teams continue to reset the same user's password over and over, or kill the same identified process on an endpoint over and over again. Teams must review these repetitive actions, determine whether the actions and responses are actually useful, and use that insight to improve cyber resilience.
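
A simple way to surface those repetitive responses is to count how often the same response action recurs against the same target in the SOC's own response history. The record fields and the threshold below are illustrative assumptions; the pattern is the point: repetition is a signal that the root cause has not been addressed.

```python
from collections import Counter
from typing import Dict, List, Tuple

def recurring_responses(history: List[Dict], threshold: int = 3) -> List[Tuple[Tuple[str, str], int]]:
    """Flag (target, action) pairs that keep repeating in the response history.

    If the same password reset or process kill keeps showing up, the response
    is treating a symptom and the underlying issue deserves a closer look.
    """
    counts = Counter((r["target"], r["action"]) for r in history)
    return [(pair, n) for pair, n in counts.most_common() if n >= threshold]

history = [
    {"target": "jdoe", "action": "password_reset"},
    {"target": "jdoe", "action": "password_reset"},
    {"target": "jdoe", "action": "password_reset"},
    {"target": "host-42", "action": "kill_process"},
]
print(recurring_responses(history))   # [(('jdoe', 'password_reset'), 3)]
```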

With this better understanding of assets, identities, common threats, and operational and technical risks, Deepwatch believes we can help the industry make more effective use of existing tooling and improve detection and response capabilities. We also understand that we are one of the few organizations that can not only take an automated response at the right time, but also bring security experts and capabilities in other tools, such as vulnerability management (VM), EDR, and firewalls, to help find the root of problems and end the cycle of "empty" automated responses.

The security center can no longer hold the amount of data it needs to provide for our security future. We must let it evolve and expand past its current constraints and costs. An Open Security Data Architecture is the natural next step toward cyber resilience operations and is overdue in the security market.

Artificial Intelligence

Where does AI fit in?

Let's start by recalling that the current hype centers on only one facet of AI, large language models (LLMs), and that there are many more capabilities in the AI family. In short: AI fits in everywhere. Deepwatch employs artificial intelligence (AI) and machine learning (ML) to identify valuable data patterns. We utilize established data with LLMs or neural mapping to significantly enhance detection and response recommendations. This includes statistically analyzing the effectiveness of response actions against known threats and exploring alternative response options or compensating controls to mitigate future threats.

Open Security Data Architecture

How is the Open Security Data Architecture going to help resolve or reduce the external and internal pressures we discussed?

Reducing the constraints and bottlenecks of a single security data repository, making better use of the native analysis capabilities in a wide variety of already deployed security tools, and migrating away from costly SIEM ingestion and storage models can all be addressed through Open Security Data Architecture, without sacrificing detection and response capabilities.

An Open Security Data Architecture is built upon multiple data sets and locations. As long as the collected data abides by an agreed-upon data organization standard, AWS Security Lake data can stay in the AWS Security Lake, Microsoft Sentinel alerts can stay in Azure, S3 buckets can stay in S3 buckets, and the list goes on from there.
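
One way to picture this is as a registry of data sets that stay in place but declare the standard they abide by and how they can be reached. The entries and fields below are illustrative assumptions, not a prescribed catalog or product configuration.

```python
# Illustrative registry for an open, decentralized security data architecture:
# each data set stays where it already lives and declares the agreed-upon
# schema it conforms to plus how it can be queried. All values are examples.
DATA_SOURCES = {
    "aws_security_lake": {
        "location": "in place (AWS)",
        "schema": "agreed-upon common schema",
        "access": "provider query API",
        "retention_days": 365,
    },
    "sentinel_alerts": {
        "location": "in place (Azure)",
        "schema": "agreed-upon common schema",
        "access": "provider query API",
        "retention_days": 180,
    },
    "s3_archive": {
        "location": "in place (object storage)",
        "schema": "agreed-upon common schema",
        "access": "batch query",
        "retention_days": 730,
    },
}

def sources_covering(min_retention_days: int):
    """Which in-place data sets can answer a question that looks this far back?"""
    return [name for name, meta in DATA_SOURCES.items()
            if meta["retention_days"] >= min_retention_days]

print(sources_covering(365))   # ['aws_security_lake', 's3_archive']
```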

The real benefit of an Open Security Data Architecture is that it allows us to move to a distributed data architecture and away from the single collection and normalization point of a SIEM. Multimodal GenAI and ML are going to open up a plethora of future opportunities for the security industry, and at the heart of it all is a better understanding of data and capabilities.

[Graphic: centralized and decentralized data sources in a customer's security stack and how they tie into the Deepwatch Platform]

Open Security Data Architecture for Cyber Resilience

What this gives a cyber-resilient security operations group is not only a distributed detection and response capability that reduces alert load and ingestion load while allowing recursive searches of live and historical data, but, more importantly, the ability to determine whether its analysis and response actions were effective, allowing it to measurably improve over time.

For Deepwatch customers, all of this is supported by Deepwatch Security Experts who provide 24/7/365 analysis and response, perform tuning and custom detection work within the correlation engine, and can also take action on other programmatic issues such as vulnerability scanning, EDR policy modification and deployment, or firewall management actions.

This capability is mentioned in the Gartner Future of Security Architecture: Cybersecurity Mesh Architecture (CSMA) report. "The key capability of the layer is in its ability to take signals from many different point products and apply a relationship-based risk scoring matrix to feed multiple types of decision points. This layer is an evolution of what SIEM, SOAR, UEBA and XDR vendors are doing today. Currently no vendor has all of the capabilities in this layer available as a product offering. This mesh of dynamic scoring provides the ability for this layer to trigger defensive actions before attacks materialize."

We believe the difference between what Gartner advocates and what Deepwatch will deliver lies in the more flexible consumption of signals from the different point products, and in the depth of understanding that Deepwatch can provide with our patented Security Index in conjunction with a multimodal GenAI-enabled data lake for relationship-based risk scoring.

As the report also notes: "Currently no vendor has all the capabilities in this layer available." We think that still holds true. While different vendors are actively building out their platforms to provide more of these mesh-based response capabilities, we expect organizations to pursue and acquire point products based on best-of-breed evaluations and expected outcomes, not just acquire products from their selected platform. As a trusted security partner, Deepwatch built the Cyber Resilience Platform to achieve the goal described in the Gartner report: a customer-focused platform that understands risks and responses and enables ongoing improvement to an organization's security program and posture. Using a flexible mix of point products, vendor stacks, or platforms, along with hyperautomation and data-lake-based risk scoring to enable the right response at the right time, is, we believe, how that goal can be achieved.
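
To show the general shape of relationship-based risk scoring that folds signals from many point products into a decision point, here is a deliberately simplified Python sketch. The signal types, weights, and scoring function are hypothetical illustrations, not the CSMA model, the Deepwatch Security Index, or any vendor's scoring matrix.

```python
from typing import Dict, List

# Hypothetical weights per signal source; a real scoring matrix would be tuned
# to the organization and would factor in asset and identity relationships.
SIGNAL_WEIGHTS = {"edr_detection": 0.5, "vuln_critical": 0.3, "anomalous_auth": 0.2}

def asset_risk_scores(signals: List[Dict]) -> Dict[str, float]:
    """Fold signals from many point products into one risk score per asset."""
    scores: Dict[str, float] = {}
    for s in signals:
        weight = SIGNAL_WEIGHTS.get(s["type"], 0.1)        # unknown signals still count a little
        scores[s["asset_id"]] = scores.get(s["asset_id"], 0.0) + weight * s.get("confidence", 1.0)
    return scores

signals = [
    {"asset_id": "host-42", "type": "edr_detection", "confidence": 0.9},
    {"asset_id": "host-42", "type": "vuln_critical", "confidence": 1.0},
    {"asset_id": "host-7", "type": "anomalous_auth", "confidence": 0.6},
]
ranked = sorted(asset_risk_scores(signals).items(), key=lambda kv: kv[1], reverse=True)
print(ranked)   # decision points (e.g. respond now vs. watch) would key off these scores
```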

The SOC Cannot Hold Without Change

The legacy centralized security data model is reaching a breaking point. Cost decisions are being driven by outsized data ingestion costs and need to be solved for. Existing "duct tape and baling wire" workarounds are being pushed to the limit, and alerting and response capabilities need to advance to give analysts more time.

In the end, change is coming for security leaders. Change in the way we collect and utilize security data. Change in the process of security data enrichment. Change in response and remediation speed and efficacy. Now the traditional security effort, always an exercise in evolution, is poised to undergo a revolution. Pressure is building between the need for more data visibility, the changing nature of data collection, and shrinking security budgets. If we do not relieve the pressure, the Security Operations Center cannot hold.

1Gartner Press Release, Gartner Forecasts Global Security and Risk Management Spending to Grow 14% in 2024, 28 September 2023, https://www.gartner.com/en/newsroom/press-releases/2023-09-28-gartner-forecasts-global-security-and-risk-management-spending-to-grow-14-percent-in-2024.

2Gartner Press Release, 13 October 2022, https://www.gartner.com/en/newsroom/press-releases/2022-10-13-gartner-identifies-three-factors-influencing-growth-i

GARTNER is a registered trademark and service mark of Gartner, Inc. and/or its affiliates in the U.S. and internationally and is used herein with permission. All rights reserved.
