SOC Metrics that Matter: MTTR, MTTD, MTTI, False Negatives, and more

Grant Oviatt
July 24, 2024

Security Operations Center (SOC) metrics are quantifiable measures of an organization’s security effectiveness. They are jumping-off points for a SOC manager to dive into potential issues, improve operational workflows, and ultimately reduce risk. In a security world full of blinking lights and charts, it can be hard to know which operational metrics are worth the energy to collect and what information they can actually provide. From our perspective, there are three main areas of awareness that metrics should drive for a SOC leadership team.

  1. Threat detection and response effectiveness
  2. Analyst team cognitive load
  3. Business growth preparedness

This blog provides a starting place for managers to build out metrics for these three areas to bolster decision-making.

Threat detection and response effectiveness


When evaluating this area, you’re really trying to understand whether you have the visibility and response time to contain threat actors before they can achieve their objectives. Secureworks, in its latest State of the Threat report, reported that the median time for a ransomware operator to achieve their objectives is just under 24 hours. Meeting that window comes down to finding threats, investigating threats, and responding to threats.

What is Detection Coverage?

Detection coverage measures the percentage of techniques in a known framework, most commonly MITRE ATT&CK, for which your team has implemented and tested detections. The purpose of this measure is to illustrate the likelihood that your operations team would receive some form of signal in the event of an incident.

How do I measure detection coverage?

In ATT&CK’s case, there are 194 techniques or behaviors commonly associated with threat actors prior to data exfiltration and impact. You simply divide the number of techniques uniquely covered by your detections by the total number of techniques.
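As a quick illustration (hypothetical numbers, not pulled from a real environment), the arithmetic looks like this:

```python
# Hypothetical example: detection coverage against MITRE ATT&CK
total_techniques = 194        # techniques in scope for the framework
covered_techniques = 127      # techniques with at least one implemented, tested detection

detection_coverage = covered_techniques / total_techniques
print(f"Detection coverage: {detection_coverage:.1%}")   # -> 65.5%
```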

What does good detection coverage look like?

For most teams, striving for 100% coverage of something like MITRE ATT&CK is both inadvisable and unachievable. The operational energy to build, implement, and review the resulting detections would consume a considerable number of team cycles and require a long tail of tuning and maintenance.

However, the probabilities suggest that teams don’t need perfect coverage to identify threats. Remember, a threat actor performs a limited set of behaviors and has to accomplish every step in the threat lifecycle to achieve their objectives. With the simplifying assumptions that:

  • Every technique or behavior is equally likely to be used by a threat actor (not true in practice)
  • Your current visibility would record some form of monitored telemetry for 7 of the 10 techniques that would occur prior to Impact (assuming a minimum of 1 per lifecycle stage)
  • Your team wants to detect at least 3 of those behaviors for a threat actor without further investigation

That would require roughly 127 techniques to be covered, or ~65% detection coverage, to achieve those outcomes. Teams with less experienced investigators typically want to opt for higher coverage on the detection front to account for slower investigation times.
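One way to sanity-check an estimate like this is a simple binomial model: treat each of the 7 visible techniques as independently covered with probability equal to your coverage percentage, and ask how likely it is that at least 3 of them fire. This is a simplifying model of my own for illustration, not a precise reconstruction of the math above:

```python
from math import comb

coverage = 127 / 194          # ~65% of techniques have a tested detection
visible_techniques = 7        # techniques expected to generate telemetry before Impact
needed_detections = 3         # independent signals you want before escalating

# P(at least `needed_detections` of the visible techniques are covered),
# assuming each technique is covered independently with probability `coverage`
p_at_least = sum(
    comb(visible_techniques, k) * coverage**k * (1 - coverage) ** (visible_techniques - k)
    for k in range(needed_detections, visible_techniques + 1)
)
print(f"Chance of seeing 3+ detections: {p_at_least:.0%}")   # roughly 95%
```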

How should this metric direct action?

  1. Justify and validate that the team has appropriate telemetry visibility into the organization
  2. Direct what detections the team should focus on building. Remember, there isn’t an equal likelihood for every behavior. Make sure to invest in behaviors that are most common (as seen in threat intelligence, threat reports, or even social media).

Additional Notes: In the event your team is light on capacity and needs to suppress detections, the strong candidates will be ones with high false positive ratios that are already covered by another detection. Your effective coverage won’t decrease but your overall capacity will increase.

What is Mean Time to Detect (MTTD)?

Mean time to detect measures the average amount of time it takes for an organization to identify an incident from the point the activity started. This measure evaluates both your detection coverage and your alert ingestion pipeline.

How do I measure MTTD?

Mean Time to Detect is calculated by subtracting the “Activity Started At” time of the earliest event in an incident from the “Alerted At” time, then averaging across incidents.
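A minimal sketch of that calculation, assuming you can export per-incident timestamps (the field names below are hypothetical):

```python
from datetime import datetime, timedelta

# Hypothetical incident records exported from a SIEM or case management tool
incidents = [
    {"activity_started_at": datetime(2024, 7, 1, 9, 0),  "alerted_at": datetime(2024, 7, 1, 9, 42)},
    {"activity_started_at": datetime(2024, 7, 3, 14, 5), "alerted_at": datetime(2024, 7, 3, 16, 20)},
]

# MTTD = average of (alerted_at - activity_started_at), using the earliest event per incident
deltas = [i["alerted_at"] - i["activity_started_at"] for i in incidents]
mttd = sum(deltas, timedelta()) / len(deltas)
print(f"MTTD: {mttd}")
```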

What is a good MTTD?

The closer to zero, the better. Generally, high-performing organizations fall somewhere between 30 minutes and 4 hours. Latency tends to come from events early in the attack lifecycle, like VPN authentications using compromised but valid credentials, which can be harder to identify until the threat actor takes more direct action on a managed device in the environment.

How should this metric direct action?

  1. Validate that where you’re investing in detection coverage is paying off in incident identification
  2. Drive post-incident detection reviews to validate the earliest activity and glean any potential behaviors for new detections

What is Mean Time To Respond (MTTR)?

Mean time to respond measures the average amount of time it takes for your security team to go from security event to containment or resolution. This measure encompasses your entire operational pipeline, from ingesting security telemetry and acknowledging the alert through triage, investigation, and initial containment.

How do I measure MTTR?

Mean Time to Respond is calculated by taking the total time from activity occurrence to stopping the bleeding for each alert, then averaging those durations. Typically this is measured across a 30-day period.
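A minimal sketch over a 30-day window, assuming each alert record carries an activity-start and a containment timestamp (hypothetical field names); the same structure works for the median variant mentioned in the additional note below:

```python
from datetime import datetime, timedelta
from statistics import median

# Hypothetical alert records from the last 30 days
alerts = [
    {"activity_started_at": datetime(2024, 7, 2, 8, 0),   "contained_at": datetime(2024, 7, 2, 9, 30)},
    {"activity_started_at": datetime(2024, 7, 9, 13, 0),  "contained_at": datetime(2024, 7, 9, 17, 45)},
    {"activity_started_at": datetime(2024, 7, 20, 22, 0), "contained_at": datetime(2024, 7, 21, 0, 15)},
]

durations = [a["contained_at"] - a["activity_started_at"] for a in alerts]
mean_ttr = sum(durations, timedelta()) / len(durations)
median_ttr = median(durations)    # more resistant to outliers (see the additional note below)
print(f"Mean TTR: {mean_ttr}, Median TTR: {median_ttr}")
```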

What is a good MTTR?

Two to four hours is a generally acceptable range across all alert severities. The lower the MTTR, the lower the risk of a significant security incident.

Organizations may also choose to split their MTTR by detection severity and assign different levels of priority. In that case, the following would be strong starting points:

Critical: 1 hour
High: 2 hours
Medium: 4 hours
Low: 8 hours

How should this metric direct action?

  1. SOC managers should investigate if there are significant differences in MTTR between alert sources, alert types, and individuals to identify pain points or training opportunities
  2. MTTR should force a SOC manager deeper into their investigative process. Is the majority of the time spent on alert ingestion, alerts waiting to be actioned, triage and investigation, or initial response actions?

Additional Notes: I’m personally a bigger fan of Median Time to Respond over Mean Time to Respond. Medians will naturally have a greater resistance to outliers that may throw off your metrics.

What is False Negative Rate / Investigation Error Rate?

Error rate or false negative rate measures how often an analyst or automation incorrectly dispositions true positive activity as a false positive. This informs security teams of poor documentation, lack of training, or alert fatigue that may result in missed critical detections.

How do I measure false negative rate?

False negative rate is notoriously challenging to measure – after all, if you knew where your errors were, you wouldn’t have made them in the first place. Shameless plug: check out the quality control process laid out by Expel that uses sampling to simplify your false negative learnings.
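As a rough illustration of the sampling idea (a generic sketch, not Expel’s exact process), you can pull a random sample of alerts your team closed as false positives each week and have a senior analyst re-review them:

```python
import random

# Hypothetical list of alert IDs closed as false positives this week
closed_as_false_positive = [f"alert-{i}" for i in range(1, 501)]

# Pull a fixed-size random sample for senior-analyst re-review
sample = random.sample(closed_as_false_positive, k=25)

# After re-review, count how many turned out to be true positives (missed activity)
missed = 1   # example outcome of the re-review
estimated_false_negative_rate = missed / len(sample)
print(f"Estimated false negative rate: {estimated_false_negative_rate:.1%}")   # 4.0% here
```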

What is a good false negative rate?

At or below 1% error.

How should this metric direct action?

  • SOC managers should investigate if there are process improvements that need to be made or training opportunities to limit the number of false negatives.

What is SOC Capacity and Expected Work?

SOC Capacity measures how much total available time your team has to disposition security alerts.

Expected Work is the total amount of alert management work you expect in a given month.

How do I measure SOC capacity and expected work?

For a simple approach, we’re going to set aside some real-world components like arrival rates (how many alerts hit the queue at once) and periodicity (what time of day alerts typically occur) and assume a completely uniform distribution of events.

We’re assuming your analysts work 8-hour days and that 70% of their time (5.6 hours) is available to do work outside of lunch, breaks, meetings, etc. Capacity is always measured in hours.

Total SOC Capacity (20 working days ≈ 1 month) =

[# of security analysts] * 5.6 hours * [% of time spent triaging] * [# of working days]

Example:

  • You have a team of 5 security analysts with 40% of their time spent triaging alerts

5 analysts * 5.6 hours * 0.40 * 20 = 224 hours

Expected Work is calculated as MTTR * Average Alert Volume. If it takes the team 1.1 hours on average to respond to an alert and you receive 200 alerts a month, ideally your team would have a SOC capacity of around 250 hours (1.1 hours * 200 alerts * 1.15 surge-buffer multiplier ≈ 253 hours).
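Putting the two formulas together with the same hypothetical numbers:

```python
# SOC capacity: 5 analysts, 5.6 productive hours/day, 40% of time triaging, 20 working days
analysts = 5
productive_hours_per_day = 5.6
triage_fraction = 0.40
working_days = 20
capacity_hours = analysts * productive_hours_per_day * triage_fraction * working_days   # 224 hours

# Expected work: average response time per alert * monthly alert volume * surge buffer
mttr_hours = 1.1
monthly_alerts = 200
surge_buffer = 1.15
expected_work_hours = mttr_hours * monthly_alerts * surge_buffer                         # ~253 hours

print(f"Capacity: {capacity_hours:.0f} hours, Expected work (with buffer): {expected_work_hours:.0f} hours")
print("Capacity gap: consider tuning, automation, or hiring" if capacity_hours < expected_work_hours else "Covered")
```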

What does good look like?

Good is a relative term here, but generally you want to have capacity available that exceeds your average work volume by 15% in a given month so that your team can handle alert surges.

How should this metric direct action?

  • SOC managers should ensure that they have available capacity to manage their expected work in a given month and be able to account for future growth
  • Expected Work that’s exceeding capacity is ripe for support with automation or appropriate detection tuning
  • When SOC capacity is significantly below Expected Work, you run a significant risk of burnout, which isn’t sustainable
  • If your Alert Latency / Alert Dwell Time is high as a component of MTTR, take a hard look at capacity as a second-order measure to suss out whether resourcing is your constraint

Additional notes: For a deeper understanding of capacity planning beyond the simplified model above, check out Jon Hencinski’s post on the subject here.

Analyst team cognitive load


Just like a well-oiled machine, an efficient SOC isn’t “redlining” at all times. With the expanded responsibilities of SOC teams these days, pushing security teams too hard for too long can lead to attrition and burnout. SOC managers have to stay mindful by talking to their direct reports during 1:1s, but they can also use the following SOC metrics as starting points to identify morale issues.

What is Mean Time to Investigate (MTTI)?

This is the average duration from acknowledging an alert, through investigating the activity, to resolving the alert.

How do I measure MTTI?

Alert Resolved Time - Alert Acknowledged Time

What is a good MTTI?

Top organizations average between 10 minutes and an hour, depending on alert volume and automation. Keep in mind that for organizations with a high volume of false positive alerts, tuning those alerts out may artificially raise your mean time to investigate. That’s not a bad thing.

How should this metric direct action?

  1. Are there alert types with significantly higher MTTI? Is there an opportunity for automation or enrichment to support faster conclusions?
  2. Are there any alert types with high MTTI that are also high volume? These can be mind numbing for analysts. Prioritize for tuning, training, or automation.

Additional note: Like MTTR, I’m a personal fan of using Median instead of Mean for this measure for some outlier resistance.

What is Alert Latency / Alert Wait Time / Alert Dwell Time?

This measure describes the amount of time between suspicious activity occurring that triggered an alert and when an analyst or engineer acknowledges that alert.

How do I measure alert latency?

|Alert Acknowledged Time - Activity Start Time|

What is good alert latency?

Similar to MTTR, Alert Latency is typically managed based on alert severity. The following would be in the range of the top 10% of security organizations:

Critical: 20 minutes
High: 1 hour
Medium: 2 hours
Low: 6 hours

How should this metric direct action?

  1. Are there specific alerts or classes of technology that the team is consistently avoiding? Does it line up with an extended Mean Time to Investigate for those alerts? This might indicate a training or tuning opportunity.
  2. Is your alert volume higher than average? Does the SOC have the capacity to take on the workload? If not, you’ll need to invest in people or automation.

What is False Positive Rate?

This measures how accurate your detections are at finding real threat activity, as opposed to firing on benign behavior.

How do I measure false positive rate?

Number of false positive alerts / total number of alerts in a given period.

What is a good false positive rate?

Unfortunately, there’s a good bit of variation between organizations, depending on risk tolerance for specific false-positive-prone behaviors and overall alert volume.

False positive rates should be considered alongside the severity of the alerts your team generates, since high-severity alerts demand urgent responses. If your team constantly has to stay on high alert for low-priority false positives, you’ll start to see alert fatigue. Any security organization meeting the following thresholds is operating well above average.

Critical = <25% false positive rate
High = <50% false positive rate
Medium = <75% false positive rate
Low = <90% false positive rate

How should this metric direct action?

Are there alerts that require tuning or downgrading for not meeting the severity threshold?

Business growth preparedness


Security operations is often a reactive function in an organization, but anticipating future business growth allows a SOC manager to effectively resource or prioritize spend to reach business objectives.

What are Alerts per Unit of Growth?

The intent of this measure is to have a rough proxy based on historical data to show how internal growth across a single axis (customers, employees, AWS workloads) impacts alert volume as a ratio. Tracking this measure over time can help account for work variability or underscore efficiency changes to detection engineering. 

How do I measure alerts per unit of growth?

Total Monthly Alerts / Internal Growth Unit

As an example, your company has an OKR to grow to 1,000 employees by the end of the year. Your historical trend has been 1.15 alerts per employee per month. If you currently have 500 employees, you can anticipate your total monthly alert volume to grow from 575 alerts to 1,150 alerts. You can then calculate Expected Work with your current MTTR and start capacity planning.
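A quick projection with those numbers (all hypothetical), feeding straight back into the Expected Work calculation:

```python
# Historical ratio: alerts generated per employee per month (hypothetical)
alerts_per_employee = 1.15

current_employees = 500
target_employees = 1000        # end-of-year OKR

current_alert_volume = alerts_per_employee * current_employees     # 575 alerts/month
projected_alert_volume = alerts_per_employee * target_employees    # 1,150 alerts/month

# Feed the projection back into Expected Work for capacity planning
mttr_hours = 1.1
projected_expected_work = projected_alert_volume * mttr_hours      # ~1,265 hours/month
print(f"Projected monthly alerts: {projected_alert_volume:.0f}, "
      f"expected work: {projected_expected_work:.0f} hours")
```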

Additional note: This won’t accurately account for atypical growth like new product launches or newly onboarded security products.

What does good look like?

There’s no right answer here. Your goal should be to directionally forecast what the security team needs to deliver based on the business growth goals. You’ll normally come across these growth goals during Company All-Hands or OKR reviews.

How should this metric direct action?

  1. How do internal growth projections impact our overall capacity modeling? When do we need to budget for additional automation or resources to accommodate?

Additional note: In a more self-serving vein, SOC managers have an incredible opportunity to showcase non-linear growth of their security organization if they can break the Alerts / Unit of Growth trend. This is typically through internal innovation or automation.

Wrapping up

Establishing and maintaining effective SOC metrics is crucial for any security operations team aiming to stay ahead of modern threats. By focusing on key areas such as detection and response capabilities, analyst cognitive load, and preparedness for business growth, SOC managers can gain valuable insights into their operational efficiency and areas for improvement. These metrics not only provide a tangible way to measure performance but also guide strategic decisions to enhance security posture, optimize workflows, and ensure the sustainability of the team. As the security landscape continues to evolve, regularly reviewing and adapting these metrics will be essential for maintaining a robust and resilient SOC.

Further reading

Top SOC Challenges
Key SOC Tools Every Security Operations Needs
Mastering the Cybersecurity Investigation Process
What is MFA fatigue attack?
Investigating geo-impossible travel alert
Top 3 scenarios for auto remediation
Automated incident response: streamlining your SecOps
