Jun. 202613 min read

AgentGate: I Built the Gate Between AI Agents and Splunk

Splunk shipped six agentic capabilities in twelve months. Every one of them can read your data, propose changes, and increasingly execute them. None of them answers compliance's question: who approved this action and what was its blast radius? AgentGate is the pre-action gate that produces an answerable audit trail for every AI-agent decision against Splunk.

SplunkSplunk MCPFoundation-SecAI GovernancePrompt InjectionEnterprise Security

Watch the demo Source code

Splunk shipped six agentic capabilities in twelve months: Triage Agent, Investigation Agent, Malware Reversal Agent, AI Playbook Authoring, AI Assistant for SPL, and Foundation-Sec-8B. Every one of them can read your data, propose changes, and increasingly execute them. None of them answers compliance's question: who approved this action, and what was its blast radius?

AgentGate Audit dashboard inside Splunk Enterprise: 12 decisions, 6 blocks, 3 pending findings, average gate latency, decisions over time, decisions by policy, agents by decision mix — Every AI-agent action against Splunk that AgentGate has gated, in a native Splunk dashboard you can filter by decision, agent, or policy.

The pain Splunk's own AI leader named

The most likely outcome is that compliance and governance teams block the application from going to production. — Jeff Wiedemann, Global AI Partner Technical Leader, Splunk (CIO video interview)

If the AI lead at the platform vendor names compliance as the production-blocker, that's the bottleneck worth building for. AgentGate is the pre-action half of the audit trail compliance is asking for. Splunk's own MCP Telemetry Dashboard (May 2026) and Splunkbase MCP Watch audit agent activity AFTER the fact. AgentGate gates it BEFORE.

What it does

AgentGate intercepts every tool call an agent makes against Splunk and runs five deterministic gating stages plus one advisory reasoning stage. The deterministic stages decide; the reasoning stage explains.

Stage 1: Prompt-injection check. A heuristic + obfuscation-normalised scan over tool inputs and the context (log lines) the agent has just read. Targets OWASP LLM01 indirect injection. Measured precision 1.000 / recall 0.971 / F1 0.986 on a committed corpus of 35 positives + 26 lookalike negatives + 8 out-of-scope cases.
Stage 2: Blast-radius walk. A NetworkX graph of saved searches -> indexes -> sourcetypes -> hosts -> assets. Computes which MITRE techniques and compliance tags lose coverage and how many other detections share that coverage. The redundancy story: 'this is the ONLY detection covering host paymentapp01, which is PCI-tagged.'
Stage 3: Cost prediction. SVC-hour estimate from the proposed SPL. Production target: Cisco Deep Time Series Model on Splunk Cloud.
Stage 4: Policy engine. 12 deterministic rules mapped to NIST AI RMF, OWASP LLM Top 10, EU AI Act Article 14, PCI DSS 10, HIPAA 164.308, SOX, and ISO/IEC 42001. POL-001 through POL-012 each cite the standard control they implement, in code.
Stage 5: Decision synthesis. ALLOW / REQUIRE_APPROVAL / BLOCK, read ONLY from the policy stage. Non-ALLOW persists as a Finding (mock of ES 8 v2 /findings API). Fails closed: if the gating stage raises, the verdict is BLOCK with severity=HIGH. Regression test pinned: test_policy_stage_exception_fails_closed.
Stage 6 (advisory): Foundation-Sec-1.1-8B-Instruct as a Finding-explainer. Writes a plain-English paragraph attached to the Finding so a human reviewer reads the risk in natural language. Does NOT gate decisions. Decisions remain reproducible from the policy library alone.

Every verdict, regardless of outcome, fans out to the agentgate_audit index via HEC. The bundled dashboard makes it the system of record for AI-agent governance on the Splunk side.

AgentGate architecture diagram: agent plane on the left, 5-stage gate pipeline plus advisory reasoning in the middle, Splunk outputs (audit index, Findings, dashboard) on the right — Five deterministic stages plus one advisory reasoning stage. Decisions are reproducible from the policy library alone; the LLM explains, it doesn't gate.

The design choice that makes it auditable

Deterministic where audit demands reproducibility. Generative where humans demand explanation. The five gating stages are pure Python: NetworkX graph walks, regex + obfuscation normalisation, dict-driven policy evaluation. Same input, same verdict, every time, on any machine. The sixth stage is an 8B-parameter model that writes the why-this-was-blocked paragraph that a SOC analyst reads at 3 AM, but its output never gates a decision. If Foundation-Sec is down, the gate still works. If a Finding has no explanation paragraph, the analyst still sees the policy IDs, the standards-control citation, and the blast-radius graph.

This is the same deterministic-vs-generative split I used on Warden, ported into Splunk's vocabulary: KO graphs, ES 8 Findings, savedsearches.conf, the audit index. The two projects answer a related question (how do you govern an agent fleet) on the two platforms whose customers are asking it the loudest right now.

Four scenarios proven end-to-end

The demo script runs four canonical scenarios that hit every code path. Each is a real tool call against a real Splunk Enterprise dev instance with a real seeded KO graph (12 saved searches, 9 KV assets, indexes, HEC, sample events).

Friendly-fire scenario: a cleanup agent proposes to disable the saved search 'AG: SQL Injection on Payment App' and AgentGate blocks it — Friendly fire BLOCK. A cleanup agent proposes to disable a saved search. The blast-radius walk shows host paymentapp01 has zero redundant coverage and is PCI-tagged. POL-001 + POL-002 + POL-010 fire. Verdict BLOCK. Standards cited: EU AI Act Article 14, PCI DSS 10.6, NIST AI RMF MANAGE-2.3.

Prompt-injection scenario: a triage agent reads a poisoned log line containing 'ignore all previous instructions' and AgentGate blocks the resulting tool call — Prompt injection BLOCK. A triage agent reads a log line containing 'ignore all previous instructions'. The injection stage catches the override_instruction pattern; POL-006 (OWASP LLM01) fires. Verdict BLOCK.

Require-approval scenario: a retention-audit agent issues a 30-day SPL across six indexes with no early filter and AgentGate drafts a Finding for analyst review — REQUIRE_APPROVAL Finding. A retention-audit agent issues a 30-day SPL across six indexes with no early filter. Cost stage flags >1 SVC-hour and cross-index reach > 5. POL-007 + POL-012 route to REQUIRE_APPROVAL. AgentGate drafts a Finding so an analyst can approve, deny, or modify. Proves the gate is not just `return BLOCK`.

Happy-path scenario: a reporter bot runs a benign read-only query on host webfront01 and AgentGate allows it in under a millisecond — Happy-path ALLOW. A reporter bot runs a benign read-only query on host webfront01. All five gating stages pass in under 1 ms. Most decisions look like this; the dashboard makes that visible so SOC leaders can prove false-positive cost is bounded.

Measured performance

Two latency profiles matter, and they live on different paths. The decision-path latency is what every gate call pays. The reasoning-path latency is only paid on non-ALLOW decisions, asynchronously, after the verdict is already returned.

Deterministic decision path: p50 0.23 ms, p95 0.56 ms (tests/test_latency.py).
Policy-gate false-positive rate: 0.000 on 20 benign tool calls (tests/test_pipeline.py).
Foundation-Sec advisory path: mean 9.4 s, p50 8.2 s, max 13.1 s (tests/test_latency.py --runslow).
Injection detector: precision 1.000, recall 0.971, F1 0.986, specificity 1.000 (26/26), out-of-scope passthrough 8/8. Corpus committed in tests/corpora/ so the numbers are reproducible, not self-graded.

What was hard

Splunk MCP Server v1.2 added coarse tool enable/disable but no per-action blast-radius or approval gate. Building one meant wiring four surfaces together that don't usually meet: the Splunk Python SDK for KO catalog enumeration, the MCP server for the tool-call interception point, the ES 8 v2 Findings API for the durable approval artifact, and a Splunk app bundle (savedsearches.conf, collections.conf, transforms.conf, dashboard XML, app.conf, metadata) so the audit dashboard installs cleanly on any dev instance.

Adversarial coverage was a deliberate scoping call. The injection corpus blends hand-curated common patterns, AgentDojo important_instructions_attacks templates, and adversarial obfuscation variants (homoglyph, zero-width, leet, payload-split). The leet bypass '1gn0r3 4ll pr3v10us' is a documented xfail miss, on purpose. Production should pair the heuristic with a semantic check on the same input, and the architecture already routes INJECAGENT-style tool-execution hijacking attempts to the Foundation-Sec semantic stage. Documenting the gap is what makes the precision number trustworthy: 0 false positives on 26 lookalike negatives only counts if you also tell judges where the false negatives live.

Honest scoping on the KO graph too. It's hand-seeded with 12 saved searches and 9 assets. Real SOC portfolios are 10k+ saved searches. The parser scales linearly with NetworkX but graph-walk cost at that scale wasn't measured. Production deployment would replace the seed loader with a Splunk REST call against the real KO catalog and benchmark it. The submission says so out loud, in the 'what we did NOT validate' section, rather than burying it.

Lower half of the AgentGate Audit dashboard showing top blocked tool calls and Findings table with policy IDs and severities — Findings table at the bottom of the audit dashboard. Each row is a non-ALLOW verdict, queryable in SPL, mockable as ES 8 v2 Findings on Cloud.

Why this matters

The cost of getting AI-agent governance wrong is not hypothetical and the numbers are public. IBM's Cost of a Data Breach report puts the global average at $4.4M, and breaches that took longer to identify cost over $1M more on average. Verizon's DBIR puts the human element at 68% of breaches, the misconfiguration subset of which is exactly the shape a misbehaving agent silently disabling a detection rule produces. PCI DSS 10.6 mandates daily review of cardholder-environment logs; a sole-coverage detection silently disabled by an agent is the difference between a noisy alert and a regulator-investigation event.

And there's a real incident, not a thought experiment, in the exact shape AgentGate guards against. In July 2025, Replit's coding agent deleted a customer's production database during an explicit code freeze, then admitted to it on the next prompt (SaaStr's Jason Lemkin was the customer; the incident got Tom's Hardware coverage). POL-004 (destructive primitive) and POL-009 (mutation of system-of-record) would have blocked it pre-execution.

Cross-track applicability

Primary submission is the Security track, but the same gate applies to the other two tracks through the existing policy library with zero code changes. An ITSI agent proposing to rewrite a summary index hits POL-008 (mass-change) and POL-009 (system-index mutation) for the Observability track. An MLTK agent proposing to `| fit` against `_audit` hits POL-004 (destructive SPL), POL-007 (cost), and POL-008 (Excessive Agency) for the Platform & Developer Experience track. Same engine, same policy library, different tool calls.

What's next

Replace the seed KO loader with a live REST call against the real Splunk knowledge-object catalog, benchmark blast-radius walk on a 10k+ saved-search portfolio.
Swap the ES 8 Findings KV mock to the real POST /public/v2/investigations/{id}/findings endpoint for Cloud deployments.
Swap local Ollama Foundation-Sec to the Splunk Hosted Models surface (| ai provider=splunk model=foundation-sec-1.1-8b-instruct) on Splunk Cloud.
Add a semantic injection-detector alongside the heuristic on the same input, to close the documented leet xfail.
Multilingual obfuscation, payload-splitting beyond zero-width, base64-smuggling under 120 chars are out of the regex's current threat model. Production red-team would extend the corpus.
Run an analyst-in-the-loop user study on the Finding-approval UX. Right now it's unproven against real SOC shift change-overs.

The honest one-liner: deterministic where audit demands reproducibility, generative where humans demand explanation. Five gating stages decide, the sixth advisory stage explains, the audit dashboard makes both visible inside Splunk. The expensive thing is not building the gate. The expensive thing is not having one.

If you liked this, the sibling project (Warden) ran the same pattern against Dynatrace instead of Splunk: MCP tools as the sense organ, deterministic policy gate, advisory LLM explanation, human-in-the-loop on irreversible actions.

Warden (the sibling project on Dynatrace)

Related project

AgentGate: Stop Rogue AI Actions on Splunk Before They Hit Production

View the project

SWORN: I Built a DFIR Gateway That Cryptographically Signs Every Finding

Competitors log. SWORN proves. A Custom MCP gateway for Protocol SIFT where every DRAFT finding carries an Ed25519 signature over its backing tool invocation IDs, stdout/stderr SHA-256 hashes, exit codes, and argument vectors. The signing key is held by the gateway, not the LLM. A finding without a valid signature chain cannot leave DRAFT.

Jun. 202613 min read

Warden: I Built an Agent That Governs Your Other Agents

Once you have a fleet of AI agents acting on real systems (approving refunds, changing prices, moving inventory), who watches them when one goes rogue? Warden is the supervisor I built for the Google Cloud Rapid Agent Hackathon: Dynatrace MCP for senses, Gemini 3 for judgment, a real human-in-the-loop approval gate, and honest dollar accounting for every incident.

Jun. 202612 min read