Mastering Cybersecurity Alert Triage: Cutting Through Noise

Building better alerts and tuning them frequently are crucial to managing alert fatigue and burnout, improving job satisfaction, and addressing some of the other SOC challenges we’ve talked about before, especially when it comes to effective escalation, incident response, and leveraging automation with your security information and event management (SIEM) solution and overall security systems.

However, alert triage (the process of efficiently sorting and prioritizing incoming cybersecurity alerts) and the investigation that follows play an equally, if not more, important role in both analyst satisfaction and security outcomes.

Alerts are only the starting point in identifying cybersecurity threat activity in your organization. Successfully triaging and investigating alerts demands thoughtful lines of questioning, environmental context, and additional investigative data sources to turn facts into a narrative of benign or malicious activity. However, without a strong process, the wealth of security information coming from various sources, such as a SIEM, firewalls, and intrusion detection systems (IDS), can easily overwhelm an analyst and lead them down rabbit holes that aren’t fruitful, hindering incident response. It can also lead to high-priority alerts being missed and potential threats slipping through the net.

It’s no surprise then that according to an IDC study, it’s taking most security teams around 30 minutes to investigate each false positive alert. When these alerts pile up, the time and effort spent chasing non-issues quickly add up, draining already limited resources. This long investigation cycle slows down response times and affects other critical incident management metrics, such as mean time to detect (MTTD) and mean time to respond (MTTR), ultimately increasing organizational risk.
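To make that cost concrete, here is a back-of-the-envelope sketch. Only the 30-minute figure comes from the study mentioned above; the daily alert volume and false positive rate are hypothetical placeholders you would swap for your own numbers.

```python
def false_positive_hours(alerts_per_day: int, false_positive_rate: float,
                         minutes_per_false_positive: float = 30.0) -> float:
    """Estimate analyst hours per day spent investigating false positives."""
    if not 0.0 <= false_positive_rate <= 1.0:
        raise ValueError("false_positive_rate must be between 0 and 1")
    false_positives = alerts_per_day * false_positive_rate
    return false_positives * minutes_per_false_positive / 60.0


# Hypothetical team: 100 alerts a day, half of them false positives.
hours = false_positive_hours(alerts_per_day=100, false_positive_rate=0.5)
print(f"{hours:.1f} analyst hours per day")
```

Under those assumed inputs the estimate is 25 analyst hours per day, which is more than three full-time analysts doing nothing but closing non-issues.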

In this blog post, I’ll explain the key best practices in alert triage and investigative workflows that alleviate the biggest drivers of alert fatigue while supporting continuous improvement and informed decisions.

1. Understand what triggered your alert in triage

The first question to consider during alert triage may seem obvious, but it's frequently missed: what triggered the alert? As an investigator, lacking this information means you are essentially flying blind, increasing the chance that your investigation will not follow the most effective line of questioning. Moreover, without understanding the cause of the alert, you are more likely to misinterpret the evidence you examine.

To start an investigation on the right note, it's helpful to establish playbooks that outline the initial steps for investigating each type of alert. These playbooks should list the specific indicators of compromise (IOCs) or behaviors the detection looks for, such as signs of malware or phishing, along with references to their sources (whitepapers, tweets, related alerts, and so on). This approach provides analysts with immediate context to begin triage and investigation while reinforcing effective incident response and remediation workflows.
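One way to keep playbooks consistent is to store them as structured data rather than free-form wiki pages. The sketch below is only an illustration: the `Playbook` class, its field names, and the `suspicious_login` entry are all hypothetical, not a standard schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Playbook:
    """Initial triage guidance for one alert type (all fields are illustrative)."""
    alert_type: str
    what_triggers_it: str
    indicators: List[str] = field(default_factory=list)   # IOCs or behaviors in the detection
    references: List[str] = field(default_factory=list)   # whitepapers, posts, related alerts
    first_steps: List[str] = field(default_factory=list)  # ordered initial investigation steps


# Hypothetical playbook library keyed by alert type.
PLAYBOOKS: Dict[str, Playbook] = {
    "suspicious_login": Playbook(
        alert_type="suspicious_login",
        what_triggers_it="Successful logon from a source not seen before for this user",
        indicators=["new source IP or ASN", "new device or user agent"],
        references=["internal detection notes (placeholder)"],
        first_steps=[
            "Confirm whether MFA was used",
            "Check source IP reputation",
            "Compare against the user's historical logons",
        ],
    ),
}


def triage_context(alert_type: str) -> Optional[Playbook]:
    """Return the playbook for an alert type, or None if none has been written yet."""
    return PLAYBOOKS.get(alert_type)
```

A `None` result is useful in itself: it tells you which alert types still have no documented answer to "what triggered this?".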

If you lack a threat intelligence team or your security products don’t adequately explain their alerts, there's a wealth of open-source intelligence available. Often, a simple Google search, a check on VirusTotal, or a query to ChatGPT about the relevant details or alert content can yield deeper insights to steer the investigation forward, enhancing real-time decision-making and incident management.

However, it’s important to note that context should not be confused with conclusions. Use intelligence to steer your investigation, but be cautious about drawing conclusions based solely on your search results.

2. Ask the right investigative questions in alert triage

I often find that inexperienced analysts pull the same sources of evidence, regardless of their investigative lead. Usually, it’s because they lack a defined alert triage process and, as a result, fail to get a complete picture.

Below is a summary of the investigative process I have used effectively in the past. I've distilled it down into a few questions investigators can use to triage an alert, along with the most valuable data collections that can help answer each question.

As you might expect, different investigation types warrant different investigative processes. In this example, we’ll frame our process around investigating identity and access management (IAM) alerts, specifically those from identity provider (IdP) solutions.

- Did the user log on with multifactor authentication (MFA)?

If the user had MFA enabled, a compromise implies that either the user’s session was hijacked or their MFA was compromised. Evaluate the current logon session reflected in the alert and historical logons over the past hour.
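As a minimal sketch of that check, assume your logon events have already been exported as dictionaries. The keys `user`, `timestamp`, and `mfa_used` are hypothetical and will differ depending on your IdP and SIEM.

```python
from datetime import datetime, timedelta
from typing import Dict, List


def recent_logons(events: List[Dict], user: str, alert_time: datetime,
                  window: timedelta = timedelta(hours=1)) -> List[Dict]:
    """Return the user's logon events in the window leading up to the alert."""
    start = alert_time - window
    return [e for e in events
            if e["user"] == user and start <= e["timestamp"] <= alert_time]


def mfa_summary(logons: List[Dict]) -> Dict[str, int]:
    """Count logons with and without MFA so gaps stand out."""
    with_mfa = sum(1 for e in logons if e.get("mfa_used"))
    return {"with_mfa": with_mfa, "without_mfa": len(logons) - with_mfa}


# Hypothetical sample data.
alert_time = datetime(2024, 1, 1, 12, 0)
events = [
    {"user": "alice", "timestamp": datetime(2024, 1, 1, 11, 40), "mfa_used": True},
    {"user": "alice", "timestamp": datetime(2024, 1, 1, 11, 55), "mfa_used": False},
    {"user": "alice", "timestamp": datetime(2024, 1, 1, 9, 0), "mfa_used": True},
    {"user": "bob", "timestamp": datetime(2024, 1, 1, 11, 50), "mfa_used": True},
]
summary = mfa_summary(recent_logons(events, "alice", alert_time))
```

A logon without MFA from a user who normally uses it is a lead worth following, not a verdict.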

- What’s the reputation of the source IP address?

Next, look at the reputation of the source IP address. Generally, you’re expecting malicious activity to originate from an unusual location or IP block that differs from expected behavior. Use an IP lookup solution to gather infrastructure metadata that helps you categorize and prioritize the alert.
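Before spending a lookup on every address, it can help to sort out the ones you already know. This sketch uses Python's standard `ipaddress` module; the network lists are assumptions (the "known egress" block is a documentation range used as a placeholder), and the category names are made up for illustration.

```python
import ipaddress
from typing import List, Union

Network = Union[ipaddress.IPv4Network, ipaddress.IPv6Network]

# RFC 1918 private ranges.
INTERNAL_NETWORKS: List[Network] = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]

# Hypothetical placeholder: replace with your real office and VPN egress blocks.
KNOWN_EGRESS_NETWORKS: List[Network] = [
    ipaddress.ip_network("203.0.113.0/24"),
]


def categorize_source_ip(ip: str) -> str:
    """Bucket a source IP so only unfamiliar addresses go to a reputation lookup."""
    addr = ipaddress.ip_address(ip)  # raises ValueError on malformed input
    if any(addr in net for net in INTERNAL_NETWORKS):
        return "internal"
    if any(addr in net for net in KNOWN_EGRESS_NETWORKS):
        return "known_egress"
    return "needs_reputation_lookup"
```

For example, `categorize_source_ip("10.1.2.3")` lands in the internal bucket, while an address outside both lists is handed to whichever IP lookup service you use.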

- Does the user’s logon activity follow their standard behavior?

If the previous questions haven’t been clear-cut, start evaluating how this authentication compares to standard behavior. This will require historical authentication logs and optionally supplemental data sources like EDR or your asset inventory solution.
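A simple way to approach this comparison is to ask which attributes of the logon have never appeared in the user's history. The field names below (`country`, `asn`, `user_agent`) are hypothetical, and a real baseline would likely weigh frequency and recency rather than a plain "seen before" check.

```python
from typing import Dict, List, Sequence


def unusual_attributes(logon: Dict, history: List[Dict],
                       fields: Sequence[str] = ("country", "asn", "user_agent")) -> List[str]:
    """Return the fields whose value in this logon never appears in the user's history."""
    unusual = []
    for name in fields:
        seen = {h.get(name) for h in history}
        if logon.get(name) not in seen:
            unusual.append(name)
    return unusual


# Hypothetical sample data.
history = [
    {"country": "US", "asn": 64500, "user_agent": "Chrome"},
    {"country": "US", "asn": 64500, "user_agent": "Safari"},
]
logon = {"country": "RO", "asn": 64501, "user_agent": "Chrome"}
flags = unusual_attributes(logon, history)
```

Note that with an empty history every field is reported as unusual, so a brand-new user needs a different baseline, such as peers in the same team or office.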

- What actions occurred during the session?

Did the user elevate their privileges, create an account, establish persistence, or access a sensitive application that they normally don’t? Make sure your SIEM logs capture the entire session's activity (for Okta, this is searchable with the External Session ID value), since that is the data you need to scope the incident and remediate it quickly.
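Once you have every event for the session, a first pass can pull out the actions that matter most. In this sketch the flat `session_id` and `event_type` keys are hypothetical, and the event type names are only modeled on common IdP naming, so treat them as assumptions and check your provider's documentation for the exact values.

```python
from datetime import datetime
from typing import Dict, List

# Illustrative event types covering account creation, privilege changes, and new MFA factors.
SENSITIVE_EVENT_TYPES = {
    "user.lifecycle.create",
    "user.account.privilege.grant",
    "user.mfa.factor.activate",
}


def sensitive_actions(events: List[Dict], session_id: str) -> List[Dict]:
    """Return the session's sensitive events in chronological order."""
    in_session = [e for e in events if e.get("session_id") == session_id]
    in_session.sort(key=lambda e: e["timestamp"])
    return [e for e in in_session if e.get("event_type") in SENSITIVE_EVENT_TYPES]


# Hypothetical sample data.
session_events = [
    {"session_id": "s1", "timestamp": datetime(2024, 1, 1, 12, 9),
     "event_type": "user.lifecycle.create"},
    {"session_id": "s1", "timestamp": datetime(2024, 1, 1, 12, 1),
     "event_type": "user.session.start"},
    {"session_id": "s1", "timestamp": datetime(2024, 1, 1, 12, 5),
     "event_type": "user.account.privilege.grant"},
    {"session_id": "s2", "timestamp": datetime(2024, 1, 1, 12, 3),
     "event_type": "user.mfa.factor.activate"},
]
found = [e["event_type"] for e in sensitive_actions(session_events, "s1")]
```

The filtered list is a starting point for the narrative, and the full session timeline is still worth reading end to end.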

- Is this activity ongoing?

If this activity is considered malicious, does the threat actor still have access to the environment? Gather IdP logs from after the alert and look for the same source and user tuple, logins by any newly created users, or use of the same multi-factor method. This helps you assess the potential impact, identify vulnerabilities and risks, and determine whether escalation is required.
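The "is it ongoing?" question can be sketched as a filter over post-alert events. As before, the event keys (`source_ip`, `user`, `mfa_factor`) and the function name are hypothetical; the three match conditions mirror the ones described above.

```python
from datetime import datetime
from typing import Dict, Iterable, List


def activity_after_alert(events: List[Dict], alert_time: datetime,
                         source_ip: str, user: str,
                         created_users: Iterable[str] = (),
                         watched_factors: Iterable[str] = ()) -> List[Dict]:
    """Return post-alert events tied to the suspicious source/user pair,
    to accounts created during the session, or to a watched MFA factor."""
    created = set(created_users)
    factors = set(watched_factors)
    hits = []
    for e in events:
        if e["timestamp"] <= alert_time:
            continue  # only activity after the alert is relevant here
        same_tuple = e.get("source_ip") == source_ip and e.get("user") == user
        created_login = e.get("user") in created
        factor_used = e.get("mfa_factor") in factors
        if same_tuple or created_login or factor_used:
            hits.append(e)
    return sorted(hits, key=lambda e: e["timestamp"])


# Hypothetical sample data.
t0 = datetime(2024, 1, 1, 12, 0)
later_events = [
    {"timestamp": datetime(2024, 1, 1, 11, 50), "user": "alice", "source_ip": "198.51.100.9"},
    {"timestamp": datetime(2024, 1, 1, 12, 30), "user": "svc-backup", "source_ip": "192.0.2.8"},
    {"timestamp": datetime(2024, 1, 1, 12, 10), "user": "alice", "source_ip": "198.51.100.9"},
    {"timestamp": datetime(2024, 1, 1, 12, 20), "user": "alice", "source_ip": "192.0.2.4"},
]
ongoing = activity_after_alert(later_events, t0, "198.51.100.9", "alice",
                               created_users=["svc-backup"])
```

Any non-empty result suggests the actor may still have access, which is usually the point at which escalation becomes the priority.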

Wrapping up

It’s important to remember that alert triage and investigation are just one part of alert management. Good investigators are separated from great ones by the questions they ask and by their ability to leverage automation, artificial intelligence, and well-defined workflows integrated with ticketing systems. Next time, we’ll look at an example Okta alert and put the guidance outlined above into practice.

Hopefully, this post has encouraged you to think about your own investigation process when you approach alerts. The key is to have a structured methodology that helps your team members avoid alert fatigue and wade through the mountains of telemetry at their disposal to triage accurately and efficiently.

Towards that end, we've developed Prophet AI to emulate how a top-tier analyst triages and investigates security alerts, thereby eliminating the stress and high costs associated with manual investigations.

Are you ready to elevate your security operations and handle alerts with unprecedented efficiency? Get a demo of Prophet AI to learn how you can triage and investigate security alerts 10 times faster.