Securing AI-generated code: a checklist for security leaders
Your engineers adopted coding agents months ago. The volume of code entering your repos went up; the review capacity didn’t. Meanwhile, prompts are carrying sensitive data out of the building, and agents that read external content and call tools have opened an injection surface your AppSec program was never designed for.
None of this means saying no to AI. It means putting controls around it that hold up. Here’s the checklist we work through with security leaders.
1. Treat AI-generated code as untrusted input
The agent is fast and confident, which is exactly the problem. It will invent APIs that don’t exist, skip tests for the code it wrote, and reach for the first pattern that compiles — including insecure ones.
- Gate it in CI, not in someone’s head. Static analysis, dependency scanning, and secret detection should run on every PR regardless of who — or what — wrote it.
- Require tests for AI-written code. A diff that adds logic but no tests is a review smell. Make it a policy your pipeline can enforce.
- Flag hallucinated dependencies. Agents suggest packages that don’t exist (or worse, that an attacker has since registered). Validate that every new dependency resolves to a known-good source.
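The test and dependency checks above can be sketched as two small CI helpers. This is a minimal illustration, not a complete policy engine: the allowlist contents, the test-file naming convention, and the function names are all assumptions made for the example, and a real pipeline would more likely resolve packages against an internal registry mirror.

```python
import re
from pathlib import PurePosixPath
from typing import Iterable, List, Optional, Set

# Hypothetical allowlist for illustration only.
APPROVED_PACKAGES = {"requests", "numpy", "flask"}

_NAME = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*")


def is_test_file(path: str) -> bool:
    """Assumed convention: test files are named test_*.py or *_test.py."""
    name = PurePosixPath(path).name
    return name.startswith("test_") or name.endswith("_test.py")


def missing_tests(changed_files: Iterable[str]) -> bool:
    """True when a diff touches Python source files but no test file."""
    files = list(changed_files)
    source = [f for f in files if f.endswith(".py") and not is_test_file(f)]
    tests = [f for f in files if is_test_file(f)]
    return bool(source) and not tests


def requirement_name(line: str) -> Optional[str]:
    """Extract a normalized package name from one requirements.txt line."""
    line = line.split("#", 1)[0].strip()
    if not line or line.startswith("-"):  # blank, comment, or pip option
        return None
    match = _NAME.match(line)
    if match is None:
        return None
    # Lowercase and collapse runs of -, _ and . (PEP 503 style normalization).
    return re.sub(r"[-_.]+", "-", match.group(0)).lower()


def unapproved_dependencies(old_reqs: str, new_reqs: str,
                            approved: Set[str]) -> List[str]:
    """Names added between two requirements files that are not approved."""
    old = {requirement_name(l) for l in old_reqs.splitlines()}
    new = {requirement_name(l) for l in new_reqs.splitlines()}
    added = new - old
    added.discard(None)
    return sorted(name for name in added if name not in approved)
```

A CI step could fail the build when `missing_tests(...)` returns True or when `unapproved_dependencies(...)` returns a non-empty list, which keeps the decision out of the reviewer’s head.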
2. Stop sensitive data from entering prompts
Every prompt is an egress channel. Secrets, PII, customer data, and proprietary source can all leave one request at a time.
- Inventory where prompts are sent — which tools, which providers, under which data agreements.
- Add redaction at the boundary. Strip secrets and PII before requests leave your network, the same way you’d scrub them from logs.
- Set a data-residency policy and enforce it in tooling, not in a wiki page nobody reads.
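As a rough sketch of redaction at the boundary, the function below replaces a few recognizable patterns before a prompt is forwarded. The pattern set is an assumption chosen for illustration and is far from exhaustive; production redaction usually combines a dedicated secret scanner with PII detection rather than a handful of regular expressions.

```python
import re
from typing import Dict, Tuple

# Illustrative patterns only; a real deployment would need a much wider set.
PATTERNS = {
    "AWS_ACCESS_KEY_ID": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "EMAIL": re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"),
    "US_SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def redact(text: str) -> Tuple[str, Dict[str, int]]:
    """Replace matches with placeholders and report how many of each were found."""
    counts: Dict[str, int] = {}
    for label, pattern in PATTERNS.items():
        text, n = pattern.subn(f"[REDACTED_{label}]", text)
        if n:
            counts[label] = n
    return text, counts
```

Returning the counts alongside the redacted text lets the proxy log that something was stripped without logging the sensitive value itself.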
3. Map the prompt-injection surface
This is the genuinely new risk. An agent that reads a web page, a ticket, or a file and then calls tools can be steered by content it ingests. Classic input validation doesn’t cover “the document told the agent to exfiltrate the repo.”
- Constrain tool permissions. An agent should hold the narrowest set of capabilities its task requires — and no standing access to anything destructive.
- Put untrusted content behind a trust boundary. Treat anything the agent reads from outside as adversarial.
- Log and audit every tool call. When agents reach internal systems, route them through audited paths. That is what a well-built MCP (Model Context Protocol) layer gives you: auth, scoped access, and a full audit trail.
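To make the permission and audit points above concrete, here is a minimal sketch of a gateway that sits between an agent and its tools. It is not an MCP implementation; `ToolGateway`, `ToolDenied`, and the tool names used with it are hypothetical, and the audit sink is just any callable that accepts a string.

```python
import json
import time
from typing import Any, Callable, Dict, Iterable


class ToolDenied(Exception):
    """Raised when an agent requests a tool outside its allowlist."""


class ToolGateway:
    def __init__(self, tools: Dict[str, Callable[..., Any]],
                 allowed: Iterable[str],
                 audit_sink: Callable[[str], None]) -> None:
        self._tools = tools
        self._allowed = set(allowed)   # narrowest set the task requires
        self._audit = audit_sink       # e.g. an append-only log writer

    def call(self, agent_id: str, tool_name: str, **kwargs: Any) -> Any:
        permitted = tool_name in self._allowed and tool_name in self._tools
        record = {
            "ts": time.time(),
            "agent": agent_id,
            "tool": tool_name,
            "args": kwargs,            # consider redacting sensitive arguments
            "allowed": permitted,
        }
        # Write the audit record before acting, so denied calls are logged too.
        self._audit(json.dumps(record, default=str))
        if not permitted:
            raise ToolDenied(f"{agent_id} may not call {tool_name}")
        return self._tools[tool_name](**kwargs)
```

Because every call goes through `call`, a denied attempt leaves the same kind of record as an allowed one, which is often the more interesting entry when investigating a suspected injection.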
4. Close the hosted-API path where the law requires it
For regulated workloads (healthcare, finance, defense, EU data residency), sending data to a third-party hosted API is often not an option. The answer isn’t to ban AI; it’s to bring the model in-house.
Self-hosted inference keeps sensitive data inside your environment while still giving engineers the agent workflows everyone else has. Done right, it’s a compliance win, not a tax.
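One way to keep that guarantee enforceable in tooling is an endpoint check in whatever client or proxy sends prompts. The sketch below assumes a hypothetical internal hostname, `inference.internal.example`, and only shows the allowlist decision, not the request itself.

```python
from urllib.parse import urlsplit

# Hypothetical self-hosted inference host; replace with your own.
ALLOWED_INFERENCE_HOSTS = frozenset({"inference.internal.example"})


def is_allowed_endpoint(url: str,
                        allowed_hosts: frozenset = ALLOWED_INFERENCE_HOSTS) -> bool:
    """Allow only HTTPS requests whose exact hostname is on the allowlist."""
    parts = urlsplit(url)
    return parts.scheme == "https" and parts.hostname in allowed_hosts
```

Matching on the exact hostname rather than a substring avoids accepting lookalike domains that merely contain the internal name.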
5. Make “good” measurable
You can’t secure what you can’t observe. The same eval and monitoring infrastructure that keeps AI features high-quality doubles as a security control: it catches when an agent’s behavior drifts, surfaces unsafe outputs, and gives you evidence for an audit instead of assurances.
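A very simple form of drift detection compares how often outputs are flagged unsafe in a recent window against a baseline. The sketch below assumes an upstream evaluator already labels each output as safe or unsafe; the function names and the 5-point tolerance are illustrative choices, not recommended values.

```python
from typing import Sequence


def unsafe_rate(flags: Sequence[bool]) -> float:
    """Fraction of outputs flagged unsafe; 0.0 for an empty window."""
    return sum(flags) / len(flags) if flags else 0.0


def has_drifted(baseline: Sequence[bool], recent: Sequence[bool],
                tolerance: float = 0.05) -> bool:
    """True when the recent unsafe rate exceeds the baseline by more than tolerance."""
    return unsafe_rate(recent) - unsafe_rate(baseline) > tolerance
```

Storing the windows and the resulting decision gives you a record you can hand to an auditor, in addition to an alert for the team.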
The throughline
Security leaders who win at this don’t slow the team down — they make the safe path the easy path. Audited tool access beats a policy banning tools. Redaction at the boundary beats a memo asking people to be careful. Scanning in CI beats hoping reviewers catch everything.
If you want a read on how AI-generated code and prompts actually flow through your org today — and where the cheapest, highest-impact controls are — that’s the audit we run. Compliance plus AI gets budget in any environment; the trick is spending it where it matters.