Alert Triage and Investigation in Cybersecurity: Best Practices

Grant Oviatt
May 20, 2024

Building better alerts and tuning them frequently are crucial to managing alert fatigue and burnout, improving job satisfaction, and addressing some of the other SOC challenges we’ve talked about before. However, alert triage (the process of efficiently sorting and prioritizing cybersecurity alerts) and the investigation that follows play an equally impactful role, if not a greater one, in both analyst satisfaction and security outcomes.

Alerts are only the starting point in identifying cybersecurity threat activity in your organization. Successfully triaging and investigating alerts demands thoughtful lines of questioning, environmental context, and additional investigative data sources to turn facts into a narrative of benign or malicious activity. However, without a strong process, the wealth of security information available can easily overwhelm an analyst or lead them down unproductive rabbit holes.

It’s no surprise, then, that according to an IDC study, most security teams take around 30 minutes to investigate each false positive alert.

In this blog post, I’ll explain the key best practices in alert triage and investigation that alleviate the biggest drivers of alert fatigue.

1. Understand what triggered your alert in triage

The first question to consider during alert triage may seem obvious, but it's frequently missed: what triggered the alert? As an investigator, lacking this information means you are essentially flying blind, increasing the chance that your investigation will not follow the most effective line of questioning. Moreover, without understanding the cause of the alert, you are more likely to misinterpret the evidence you examine.

To start an investigation on the right note, it's helpful to establish playbooks that outline the initial steps for investigating each type of alert. These playbooks should include the specific indicators of compromise (IOCs) or behaviors identified in the detection, along with references to their sources, such as whitepapers, tweets, or other related alerts. This approach gives analysts immediate context to begin triage and investigation.
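As a rough illustration of what such a playbook could capture, here is a minimal sketch in Python. The `Playbook` class, its field names, the `idp_suspicious_logon` alert type, and the `triage_context` helper are all hypothetical, chosen only to mirror the elements described above (trigger, indicators, references, first steps).

```python
from dataclasses import dataclass, field


@dataclass
class Playbook:
    """Triage context for one alert type (all field names are illustrative)."""
    alert_type: str
    trigger_summary: str  # what the detection actually matched on
    indicators: list = field(default_factory=list)   # IOCs or behaviors in the detection
    references: list = field(default_factory=list)   # whitepapers, posts, related alerts
    first_steps: list = field(default_factory=list)  # initial investigative steps


# Hypothetical registry keyed by alert type.
PLAYBOOKS = {
    "idp_suspicious_logon": Playbook(
        alert_type="idp_suspicious_logon",
        trigger_summary="Logon from a source IP not previously seen for this user",
        indicators=["new source IP", "new user agent"],
        references=["internal detection write-up (placeholder)"],
        first_steps=[
            "Confirm whether the logon satisfied MFA",
            "Look up the source IP reputation",
        ],
    ),
}


def triage_context(alert_type: str) -> str:
    """Render the context an analyst should see before starting triage."""
    playbook = PLAYBOOKS.get(alert_type)
    if playbook is None:
        return f"No playbook for {alert_type!r}: first establish what triggered the alert."
    lines = [f"Trigger: {playbook.trigger_summary}"]
    lines += [f"Indicator: {item}" for item in playbook.indicators]
    lines += [f"Reference: {item}" for item in playbook.references]
    lines += [f"Step {n}: {step}" for n, step in enumerate(playbook.first_steps, 1)]
    return "\n".join(lines)


print(triage_context("idp_suspicious_logon"))
```

Keeping the trigger summary as the first line is a deliberate choice here: it puts the answer to "what triggered the alert?" in front of the analyst before any evidence is pulled.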

If you’re lacking a threat intelligence team or your security products don’t explain their alerts adequately, there's a wealth of open-source intelligence available. Often, a simple Google search, a check on VirusTotal, or a query to ChatGPT about the relevant details or alert content can yield deeper insights to steer the investigation forward.

However, it’s important to note that context should not be confused with conclusions. Use intelligence to steer your investigation, but be cautious about drawing conclusions based solely on your search results.

2. Ask the right investigative questions in alert triage

I often find that inexperienced analysts pull back the same sources of evidence, regardless of their investigative lead. Usually, it’s because there’s no process to guide the way they triage an alert and ensure they get a complete picture.

Below is a summary of the investigative process I have used effectively in the past. I've distilled it down into a few questions investigators can use to triage an alert, along with the most valuable data collections that can help answer each question. 

As you might expect, different investigation types warrant different investigative processes. In this example, we’ll frame our process around investigating identity (IAM) based alerts, specifically from identity provider (IdP) solutions.

- Did the user log on with multifactor authentication (MFA)?

If the user had MFA enabled, a compromise implies that either the user’s session was hijacked or their MFA was compromised. Evaluate the current logon session reflected in the alert, as well as historical logons over the past hour.
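One way to review the past hour of logons is sketched below. It assumes the logon events have already been exported from your identity provider and normalized into dictionaries with the hypothetical keys `user`, `time`, and `mfa_factor` (set to `None` when no factor was used); real log schemas will differ.

```python
from datetime import datetime, timedelta


def summarize_mfa(logons, user, alert_time, window=timedelta(hours=1)):
    """Summarize MFA usage for `user` in the window ending at `alert_time`."""
    start = alert_time - window
    recent = [
        e for e in logons
        if e["user"] == user and start <= e["time"] <= alert_time
    ]
    return {
        "total": len(recent),
        "without_mfa": sum(1 for e in recent if not e.get("mfa_factor")),
        "factors_used": sorted({e["mfa_factor"] for e in recent if e.get("mfa_factor")}),
    }


# Made-up sample data for illustration only.
alert_time = datetime(2024, 5, 20, 12, 0)
logons = [
    {"user": "alice", "time": datetime(2024, 5, 20, 11, 50), "mfa_factor": "push"},
    {"user": "alice", "time": datetime(2024, 5, 20, 11, 55), "mfa_factor": None},
    {"user": "alice", "time": datetime(2024, 5, 20, 9, 0), "mfa_factor": "sms"},  # outside window
]
summary = summarize_mfa(logons, "alice", alert_time)
print(summary)
```

A logon without MFA, or a factor the user does not normally use, is a lead to follow rather than a verdict.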

- What’s the reputation of the source IP address?

Next, look at the reputation of the source IP address. Generally, you expect malicious activity to originate from an unusual location or IP block that differs from the user’s normal behavior. Use an IP lookup solution to gather infrastructure metadata.
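The sketch below shows one way to turn lookup results into findings. It uses Python's standard `ipaddress` module for the range check; the expected network and country values are placeholders, and the metadata keys (`country`, `is_hosting`, `is_anonymizer`) are assumptions about how you might normalize the output of whatever IP lookup tool you use.

```python
import ipaddress

# Placeholder values; 203.0.113.0/24 is a documentation range, not a real office network.
EXPECTED_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]
EXPECTED_COUNTRIES = {"US"}


def assess_source_ip(ip, metadata):
    """Return a list of findings about a source IP.

    `metadata` is whatever your IP lookup tool returns, normalized into the
    hypothetical keys "country", "is_hosting", and "is_anonymizer".
    """
    address = ipaddress.ip_address(ip)
    if any(address in network for network in EXPECTED_NETWORKS):
        return ["inside an expected network range"]
    findings = []
    country = metadata.get("country")
    if country is not None and country not in EXPECTED_COUNTRIES:
        findings.append(f"unexpected country: {country}")
    if metadata.get("is_hosting"):
        findings.append("hosting provider address space")
    if metadata.get("is_anonymizer"):
        findings.append("VPN, proxy, or Tor exit")
    return findings or ["no reputation concerns in available metadata"]


print(assess_source_ip("198.51.100.9", {"country": "RO", "is_hosting": True}))
```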

- Does the user’s logon activity follow their standard behavior?

If the previous questions haven’t given a clear-cut answer, start evaluating how this authentication compares to the user’s standard behavior. This will require historical authentication logs, and optionally supplemental data sources like EDR or your asset inventory solution.
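A simple way to compare a logon against history is to list which of its attributes have never appeared before for that user. The sketch below assumes historical logons are available as dictionaries; the attribute names (`country`, `asn`, `user_agent`, `device_id`) are illustrative, and a real baseline would likely also weigh frequency and recency.

```python
def deviations_from_baseline(history, current,
                             keys=("country", "asn", "user_agent", "device_id")):
    """Return the attributes of `current` never seen in the user's `history`."""
    deviations = {}
    for key in keys:
        seen = {event.get(key) for event in history}
        value = current.get(key)
        if value not in seen:
            deviations[key] = value
    return deviations


# Made-up sample data for illustration only.
history = [
    {"country": "US", "asn": 64500, "user_agent": "Chrome/Mac", "device_id": "laptop-1"},
    {"country": "US", "asn": 64501, "user_agent": "Chrome/Mac", "device_id": "laptop-1"},
]
current = {"country": "US", "asn": 64510, "user_agent": "Firefox/Linux", "device_id": "laptop-1"}
print(deviations_from_baseline(history, current))
```

Here the country and device match history, so only the network and user agent would be surfaced for a closer look.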

- What actions occurred during the session?

Did the user elevate their privileges, create an account, establish persistence, or access a sensitive application that’s unusual? You’ll want to have logs that include the entire session activity (for Okta, this is searchable with the External Session ID value).
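To review a full session, one option is to filter exported log events by session ID and flag the event types that map to the actions above. The sketch below assumes events shaped like Okta System Log entries, with the session ID under `authenticationContext.externalSessionId`; the event type names in `NOTABLE_EVENT_TYPES` are assumptions that should be checked against the events your own tenant emits.

```python
# Assumed event type names; verify against your identity provider's log reference.
NOTABLE_EVENT_TYPES = {
    "user.account.privilege.grant": "privilege elevation",
    "user.lifecycle.create": "account creation",
    "user.mfa.factor.activate": "new MFA factor enrolled",
    "system.api_token.create": "API token created",
}


def notable_session_events(events, session_id):
    """Return (published, eventType, label) for notable events in one session."""
    hits = []
    for event in events:
        context = event.get("authenticationContext") or {}
        if context.get("externalSessionId") != session_id:
            continue
        label = NOTABLE_EVENT_TYPES.get(event.get("eventType"))
        if label:
            hits.append((event.get("published", ""), event["eventType"], label))
    return sorted(hits)  # ISO 8601 timestamps sort chronologically as strings


# Made-up sample data for illustration only.
events = [
    {"published": "2024-05-20T12:05:00Z", "eventType": "user.lifecycle.create",
     "authenticationContext": {"externalSessionId": "sess-A"}},
    {"published": "2024-05-20T12:01:00Z", "eventType": "user.session.start",
     "authenticationContext": {"externalSessionId": "sess-A"}},
    {"published": "2024-05-20T12:03:00Z", "eventType": "user.account.privilege.grant",
     "authenticationContext": {"externalSessionId": "sess-A"}},
    {"published": "2024-05-20T12:04:00Z", "eventType": "user.lifecycle.create",
     "authenticationContext": {"externalSessionId": "sess-B"}},
]
for hit in notable_session_events(events, "sess-A"):
    print(hit)
```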

- Is this activity ongoing?

If this activity is considered malicious, does the threat actor still have access to the environment? Gather IdP logs from after the alert and look for the same source entity and user tuple, logons by any newly created users, or use of the same multifactor method.
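The check for ongoing activity can be sketched as a filter over post-alert events. This assumes events normalized into dictionaries with the hypothetical keys `time`, `user`, `source_ip`, and `mfa_method`, and it interprets the multifactor check as reuse of the same MFA method seen in the suspicious logon, which is one reading of the guidance rather than the only one.

```python
from datetime import datetime


def ongoing_activity(events, alert_time, user, source_ip,
                     mfa_method=None, created_users=()):
    """Return post-alert events that suggest the actor still has access."""
    hits = []
    for event in events:
        if event["time"] <= alert_time:
            continue  # only activity after the alert matters here
        reasons = []
        if event["user"] == user and event["source_ip"] == source_ip:
            reasons.append("same source and user")
        if event["user"] in created_users:
            reasons.append("activity by newly created user")
        if mfa_method is not None and event.get("mfa_method") == mfa_method:
            reasons.append("same MFA method")
        if reasons:
            hits.append((event["time"], event["user"], reasons))
    return hits


# Made-up sample data for illustration only.
alert_time = datetime(2024, 5, 20, 12, 0)
events = [
    {"time": datetime(2024, 5, 20, 11, 59), "user": "alice", "source_ip": "198.51.100.9"},
    {"time": datetime(2024, 5, 20, 12, 10), "user": "alice", "source_ip": "198.51.100.9"},
    {"time": datetime(2024, 5, 20, 12, 20), "user": "svc-new", "source_ip": "192.0.2.44"},
    {"time": datetime(2024, 5, 20, 12, 30), "user": "bob", "source_ip": "192.0.2.50"},
]
hits = ongoing_activity(events, alert_time, "alice", "198.51.100.9",
                        created_users={"svc-new"})
for hit in hits:
    print(hit)
```

An empty result does not prove the actor is gone; it only means none of these three leads appear in the logs you collected.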

Wrapping up

It’s important to remember that alert triage and investigation is just one part of alert management. Next time we’ll look at an example Okta alert and put the guidance outlined above into practice. Good investigators are separated from great ones by the questions they ask. Hopefully this post has encouraged you to think about your own investigation process when you approach alerts. The key is to have a structured methodology to wade through the mountains of telemetry at your disposal to accurately and efficiently triage.

Towards that end, we've developed Prophet AI for Security Operations to emulate how a top-tier analyst triages and investigates security alerts, thereby eliminating the stress and high costs associated with manual investigations.

Are you ready to elevate your security operations and handle alerts with unprecedented efficiency? Get a demo of Prophet AI to learn how you can triage and investigate security alerts 10 times faster.

Further reading

What is MFA fatigue attack?
Investigating geo-impossible travel alert
Top 3 scenarios for auto remediation
Automated incident response: streamlining your SecOps
SOC metrics that matter
Key SOC tools every security operations needs
Demystifying SOC automation
SOC analyst challenges vs SOC manager challenges
Alert tuning best practices: keys to reducing false positives
How to investigate Okta alerts
