AI-Powered Security Operations: What Actually Works in 2026

Security operations and SRE share more DNA than either community usually acknowledges. Both involve monitoring large volumes of signals to detect anomalies, both require rapid triage and investigation when something goes wrong, and both are fighting the same fundamental scaling problem: the volume of events requiring attention grows faster than the human capacity to handle them. The AI tooling that's transforming SRE incident response is also transforming security operations — with some important differences in how you apply it.

This is what AI-powered security operations looks like in practice, where it works, and where the risks are.

The Signal Problem in Security

Security operations has a signal problem that's worse than SRE's equivalent. A Prometheus alert for high error rate fires when the error rate is actually high. A SIEM alert for "unusual access pattern" fires constantly — most "unusual" access is legitimate user behavior that happens to deviate from the historical baseline.

The false positive rate in traditional SIEM deployments is notoriously bad. Security teams that tune their rules for low false positives miss real incidents. Teams that tune for low false negatives (don't miss anything) are buried in alerts that are 99% noise. The result is alert fatigue, desensitization, and the psychological reality that analysts stop treating alerts as meaningful signals.

LLM-powered security analysis changes the economics of this problem. Rather than tuning rules to produce a manageable volume of alerts, you can route a higher volume of alerts through an AI triage layer that handles the preliminary investigation, correlates across sources, and surfaces only the alerts that warrant human attention.

The AI Security Analyst Pattern

The pattern mirrors the SRE incident response agent but adapted for security context:

SECURITY_TRIAGE_PROMPT = """
You are a security operations analyst. Your job is to investigate security alerts
and determine which warrant immediate human attention.

For each alert, you will:
1. Gather additional context using available tools
2. Assess the likelihood this represents a real threat vs. a false positive
3. Identify the potential impact if it is a real threat
4. Recommend an action: ESCALATE (human review needed), MONITOR (watch for recurrence),
   or CLOSE (confirmed false positive)

ESCALATE immediately for:
- Any evidence of credential compromise or unauthorized access
- Lateral movement indicators (accessing systems the account doesn't normally touch)
- Data exfiltration patterns (large outbound transfers to unusual destinations)
- Signs of persistence (new IAM roles, unusual scheduled tasks, modified startup scripts)
- Anything involving production databases or secrets stores

DO NOT:
- Take any blocking or remediation action without human approval
- Mark as false positive if you cannot conclusively explain the behavior
- Dismiss alerts involving privileged accounts or production systems without escalation

When uncertain: ESCALATE. A false escalation is less costly than a missed incident.
"""

security_tools = [
    {
        "name": "get_cloudtrail_events",
        "description": "Retrieve CloudTrail events for a specific principal or resource",
        "input_schema": {
            "type": "object",
            "properties": {
                "principal_arn": {"type": "string"},
                "resource_arn": {"type": "string"},
                "hours_back": {"type": "integer", "default": 24}
            }
        }
    },
    {
        "name": "get_iam_principal_history",
        "description": "Get IAM access history, last activity, and permission summary for a principal",
        "input_schema": {
            "type": "object",
            "properties": {
                "principal_arn": {"type": "string"}
            },
            "required": ["principal_arn"]
        }
    },
    {
        "name": "check_guardduty_findings",
        "description": "Get recent GuardDuty findings for an account or resource",
        "input_schema": {
            "type": "object",
            "properties": {
                "account_id": {"type": "string"},
                "resource_id": {"type": "string"},
                "severity_min": {"type": "number", "default": 4.0}
            }
        }
    },
    {
        "name": "get_vpc_flow_logs",
        "description": "Query VPC Flow Logs for traffic patterns involving specific IPs or resources",
        "input_schema": {
            "type": "object",
            "properties": {
                "source_ip": {"type": "string"},
                "destination_ip": {"type": "string"},
                "hours_back": {"type": "integer", "default": 1}
            }
        }
    },
    {
        "name": "lookup_ip_reputation",
        "description": "Check threat intelligence for an IP address",
        "input_schema": {
            "type": "object",
            "properties": {
                "ip_address": {"type": "string"}
            },
            "required": ["ip_address"]
        }
    }
]

CloudTrail Analysis: The High-Value Use Case

CloudTrail records every API call in your AWS account. This is an extraordinarily rich audit log — and an extraordinarily noisy one. A medium-sized AWS account generates millions of CloudTrail events per day. Finding the meaningful ones manually is impractical.

LLM analysis of CloudTrail patterns is one of the highest-value AI security applications because:

The data is structured (JSON events with consistent schema)
The anomalies are semantic ("this principal doesn't usually call CreateRole") not just statistical
The context LLMs bring — knowing that IAM permission escalation is suspicious, understanding the significance of accessing a secrets store — is exactly what's needed to distinguish signal from noise

The query pattern:

-- Athena query: Unusual IAM activity in the last 24 hours
SELECT 
    userIdentity.arn as principal,
    eventName,
    COUNT(*) as call_count,
    MIN(eventTime) as first_seen,
    MAX(eventTime) as last_seen,
    ARBITRARY(sourceIPAddress) as source_ip
FROM cloudtrail_logs
WHERE 
    eventTime > NOW() - INTERVAL '24' HOUR
    AND eventSource = 'iam.amazonaws.com'
    AND eventName IN (
        'CreateRole', 'AttachRolePolicy', 'CreateUser',
        'CreateAccessKey', 'UpdateAssumeRolePolicy',
        'PutRolePolicy', 'AddUserToGroup'
    )
GROUP BY userIdentity.arn, eventName
ORDER BY call_count DESC;

Feed the results to an LLM with context about which principals normally perform which operations, and you get semantic analysis that threshold-based alerting cannot provide.

Log Analysis for Threat Detection

Beyond CloudTrail, application and infrastructure logs contain security-relevant signals: authentication failures, privilege escalation attempts, unusual data access patterns, injection attack signatures. LLM-based log analysis can handle the semantic understanding that regex-based SIEM rules miss.

The key pattern: don't send raw logs to the LLM. Pre-filter to the relevant window and relevant signals, then send structured summaries:

def analyze_auth_logs_for_threats(service: str, hours_back: int = 1) -> SecurityAssessment:
    # 1. Pre-filter with cheap queries
    failed_auths = query_failed_authentications(service, hours_back)
    high_volume_users = [u for u in failed_auths if u['count'] > 10]
    suspicious_ips = [u for u in failed_auths if is_unusual_geo(u['source_ip'])]
    
    if not high_volume_users and not suspicious_ips:
        return SecurityAssessment(severity="low", requires_review=False)
    
    # 2. Enrich the suspicious events
    enriched = []
    for event in high_volume_users + suspicious_ips:
        enriched.append({
            "user": event['user'],
            "source_ip": event['source_ip'],
            "failure_count": event['count'],
            "failure_types": event['error_codes'],
            "ip_reputation": lookup_ip_reputation(event['source_ip']),
            "user_last_successful_auth": get_last_success(event['user']),
            "user_normal_locations": get_usual_geos(event['user'])
        })
    
    # 3. LLM assessment — focused, not raw logs
    response = claude.messages.create(
        model="claude-haiku-4-5-20251001",  # Fast model for triage
        system=SECURITY_TRIAGE_PROMPT,
        messages=[{
            "role": "user",
            "content": f"Assess these authentication events: {json.dumps(enriched)}"
        }]
    )
    
    return parse_assessment(response)

What AI Security Operations Cannot Replace

Human judgment on escalation paths. When the AI identifies a potential compromise, a human security engineer needs to make the containment decision: disable the credential, block the IP, quarantine the instance. Automated containment without human review will occasionally block legitimate activity. The blast radius of a containment mistake is high enough that human approval is required.

Novel attack patterns. AI security analysis is strongest when the attack patterns resemble historical patterns in the training data. Novel attack techniques — zero-days, supply chain compromises, sophisticated social engineering that hasn't appeared in security training data — are exactly what AI is worst at detecting. Skilled human analysts and adversarial thinking remain essential.

Legal and compliance judgment. "What should we preserve for potential legal proceedings?" "Do teams have a notification obligation to affected users?" "How do we communicate this to the board?" These are not AI questions.

Red team creativity. Adversarial thinking — imagining novel attack paths before an attacker does — requires human creativity. AI can assist with threat modeling by synthesizing known attack patterns, but the creative generation of novel attack scenarios is still a human discipline.

The Honest Risk: AI in Security Systems

Deploying AI in security systems introduces a risk that deserves explicit acknowledgment: adversaries will eventually attempt to manipulate the AI. Prompt injection — embedding instructions in log data or user input that attempt to influence the LLM's analysis — is a real attack vector against AI security tooling.

Mitigations: treat the LLM's output as advisory (human in the loop for consequential actions), sanitize inputs before passing to the LLM, monitor for unusual LLM output patterns that might indicate manipulation, and maintain non-AI detection paths as a fallback.

AI security tooling that operates autonomously without human oversight is a significant risk. AI security tooling that augments human analysts — handling the volume, surfacing the signal, preparing the context — is a significant capability improvement. The distinction is important.

*Zak Hassan is a Staff SRE specializing in security operations automation, AI-powered infrastructure, and data platform reliability. Find him at zakhassan.com or on LinkedIn.*

Topic Paths

SRE and Reliability Kubernetes and Platform Engineering Observability and Incident Learning AI Infrastructure and Operations Identity Reliability Cloud Cost and Capacity

About the Author

Zak Hassan writes about reliability engineering under real scale constraints.

Staff-level SRE and platform engineer focused on identity reliability, Kubernetes, observability, cloud architecture, AI infrastructure, and reducing operational uncertainty.

Connect on LinkedIn