Primary pillar

AI Security

I build and evaluate AI security systems with one practical question in mind: can the system safely do useful security work, show its evidence, and handle untrusted inputs without turning a confident summary into a false sense of safety?

What I Work On

My AI security work sits where cloud security, application security, incident response, and agentic systems meet.

Security agents

Investigation over summarization

I work on AI systems that help security teams investigate incidents, gather evidence, and build timelines. The important part is not whether an agent can sound right. It is whether it can ask the right questions, use the right tools, and produce findings that a human responder can verify.

Prompt injection and RAG

Untrusted context is the threat model

Many AI risks show up when user content, retrieved documents, repository context, and tool permissions are mixed together. I focus on practical controls for instruction hierarchy, retrieval boundaries, tool use, secret handling, and reviewable output.

Developer workflows

Security for AI-assisted building

AI makes it easier to move fast, but it also makes old mistakes easier to repeat at scale: hardcoded secrets, vulnerable dependencies, unsafe infrastructure defaults, and code nobody fully reviewed. The goal is to put checks close to where builders already work.

Current focus

Production AI security investigation agent

My recent work includes AI-powered investigation capabilities for security incident response. The job is not to replace responders. It is to reduce the time spent collecting and correlating evidence, while keeping the investigation transparent, auditable, and reviewable by security teams.

Evaluation Matters

Security agents need to be measured on evidence, not just fluent answers.

Research

SIR-Bench

SIR-Bench is a benchmark for evaluating investigation depth in security incident response agents. It looks beyond whether an agent reaches the right triage decision and asks whether the agent actually discovered new evidence, used tools appropriately, and performed real investigative work instead of repeating the alert back in cleaner language.

Field note

Accuracy is not enough

A security agent can produce a correct-looking answer for the wrong reason. In incident response, that gap matters. Good evaluations should reward concrete findings, sourceable evidence, useful uncertainty, and escalation paths when the agent does not have enough information.

Projects

A few public projects that reflect how I approach AI security: build, measure, test, and keep the claims grounded.

Secret exposure

AI Leak Watch

A public dashboard tracking potentially exposed AI provider keys on GitHub. It started from a simple question: if AI agents are getting access to more tools and data, why are we still treating leaked model provider keys like a small billing problem?

Threat modeling

ThreatForest

An agentic threat modeling and attack tree generator that analyzes architecture and repository context, maps attack steps to MITRE ATT&CK techniques, and produces mitigation ideas for review. The output is a starting point for better security conversations, not a substitute for judgment.

Developer security

Automated Security Helper

ASH brings security scanning closer to local development and CI/CD workflows. For AI-assisted development, that matters: the faster code is produced, the more important it becomes to catch secrets, vulnerable dependencies, and risky infrastructure patterns before they spread.

How I Think About AI Security

The useful work is usually less glamorous than the headlines, and that is fine.

1

Define the job before judging the model

For a security agent, “good” depends on the task. Triage, evidence collection, exploitability review, code scanning, threat modeling, and executive summarization all need different success criteria.

2

Make evidence visible

Security work should leave a trail. If an agent claims a credential was used, a role was created, or a repository is affected, the next question is simple: where is the evidence?

3

Treat context as hostile until proven otherwise

Prompts, documents, tickets, READMEs, logs, and retrieved snippets can all contain instructions the system should not follow. RAG and agent security starts with separating data from authority.

4

Design for human handoff

The best security automation gives responders a clearer starting point. It should preserve uncertainty, explain limitations, and make it easy for a human to take over when the case requires judgment.

Related Reading

Start here if you want the broader context behind the projects and research.