OpenAI has introduced Aardvark, an autonomous agentic security researcher powered by its latest GPT-5 large language model (LLM). Designed to function like a human cybersecurity expert, Aardvark can scan, understand, and patch code at scale.
Currently in private beta, the tool aims to assist developers and security teams in identifying and fixing vulnerabilities throughout the software development lifecycle.
According to OpenAI, Aardvark operates by embedding itself into a project’s development pipeline, continuously monitoring code commits and repository changes. The agent identifies potential weaknesses, assesses their exploitability and severity, and then generates targeted patches using LLM-based reasoning and integrated tools.
It also builds a contextual threat model for each project, ensuring its security recommendations align with the system’s architecture and design objectives.
When a possible flaw is detected, Aardvark attempts to validate the vulnerability by recreating the exploit in a sandboxed environment. If confirmed, it leverages OpenAI Codex, the company’s coding agent, to automatically produce a patch for human review.
OpenAI said the system has already been deployed internally and with select external partners, helping to uncover at least ten new CVEs in open-source projects.
The initiative showcases GPT-5's enhanced capabilities in deep reasoning and adaptive tool selection, enabled by what OpenAI calls "GPT-5 thinking" and its real-time routing system, a mechanism that dynamically selects the most appropriate model for each task based on context and complexity.
OpenAI is not alone in pursuing autonomous cybersecurity solutions. Earlier this month, Google introduced CodeMender, an AI agent designed to detect, rewrite, and patch insecure code while preventing future vulnerabilities. Like Aardvark, CodeMender aims to collaborate with maintainers of critical open-source projects to integrate its AI-generated security fixes.
Together with initiatives like XBOW and gpt-oss-safeguard, these tools represent a new generation of AI-driven code security systems focused on continuous analysis, exploit validation, and proactive patch generation.
“Aardvark represents a defender-first approach to cybersecurity,” OpenAI explained. “By catching vulnerabilities early, validating real-world exploitability, and delivering clear, reviewable fixes, it helps teams strengthen security without slowing innovation. We believe this brings expert-level protection within reach for all developers.”
