
 
            Today, we are announcing Aardvark, an agentic security researcher powered by GPT-5. Software security is one of the most critical and challenging frontiers in technology: tens of thousands of new vulnerabilities are discovered each year across enterprise and open-source codebases, and defenders must find and patch them before adversaries can exploit them. At OpenAI, we are committed to shifting this balance in favor of defenders, and Aardvark represents a significant step forward in AI and security research.
Aardvark is an autonomous agent designed to assist developers and security teams in discovering and fixing security vulnerabilities at scale. Currently, Aardvark is available in private beta, where it will be validated and refined in real-world settings. This agent continuously analyzes source code repositories to identify vulnerabilities, assess their exploitability, prioritize their severity, and suggest targeted patches.
Unlike traditional program analysis techniques such as fuzzing or software composition analysis, Aardvark employs LLM-powered reasoning and tool use to understand code behavior and pinpoint vulnerabilities. Aardvark mimics the approach of a human security researcher by reading code, analyzing its structure, writing and executing tests, and utilizing various tools to enhance its findings.
Aardvark operates through a multi-stage pipeline that includes:
1. Analysis: Aardvark begins by analyzing the entire repository to generate a threat model that reflects its understanding of the project's security objectives and design.
2. Commit scanning: It inspects commit-level changes against the repository and threat model, scanning for vulnerabilities as new code is added. When a repository is first connected, Aardvark will also examine its history to identify existing issues.
3. Validation: Upon identifying a potential vulnerability, Aardvark attempts to trigger it in a sandboxed environment to confirm exploitability. This step ensures that the insights returned to users are accurate and of high quality, with minimal false positives.
4. Patching: By integrating with OpenAI Codex, Aardvark provides generated patches for identified vulnerabilities, which are attached for human review and efficient one-click patching.

Aardvark integrates seamlessly with GitHub, Codex, and existing workflows, delivering clear, actionable insights to engineers without hindering development. While its primary focus is on security, Aardvark has also shown the capability to uncover other bugs, such as logic flaws, incomplete fixes, and privacy issues.
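Aardvark's actual interfaces are not public, but as a rough illustration of the four-stage shape described above, the sketch below wires together a commit-triggered pipeline. Every name in it (Finding, ThreatModel, scan_commit, and so on) is a hypothetical placeholder standing in for Aardvark's LLM-powered reasoning, not a real Aardvark API.

```python
# Hypothetical sketch of a commit-triggered scanning pipeline with the four
# stages described above. All names are illustrative placeholders.
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Finding:
    title: str
    severity: str              # e.g. "high", "medium", "low"
    validated: bool = False
    patch: Optional[str] = None


@dataclass
class ThreatModel:
    # Stage 1 (Analysis): a repository-wide summary of security objectives and design.
    repo: str
    notes: list = field(default_factory=list)


def build_threat_model(repo: str) -> ThreatModel:
    # Placeholder: in practice this would come from whole-repository analysis.
    return ThreatModel(repo=repo, notes=["API layer handles untrusted user input"])


def scan_commit(diff: str, model: ThreatModel) -> list:
    # Stage 2 (Commit scanning): inspect the diff against the threat model.
    # A trivial heuristic stands in here for LLM-powered reasoning.
    findings = []
    if "eval(" in diff:
        findings.append(Finding(title="possible code injection via eval", severity="high"))
    return findings


def validate_in_sandbox(finding: Finding) -> Finding:
    # Stage 3 (Validation): try to trigger the issue in isolation to weed out false positives.
    finding.validated = True  # placeholder result
    return finding


def propose_patch(finding: Finding) -> Finding:
    # Stage 4 (Patching): attach a suggested fix for human review.
    finding.patch = "replace eval() with ast.literal_eval()"
    return finding


def run_pipeline(repo: str, diff: str) -> list:
    model = build_threat_model(repo)
    validated = [validate_in_sandbox(f) for f in scan_commit(diff, model)]
    return [propose_patch(f) for f in validated if f.validated]


if __name__ == "__main__":
    demo_diff = "+    result = eval(user_input)"
    for finding in run_pipeline("example/repo", demo_diff):
        print(finding)
```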
Aardvark has been operational for several months, continuously monitoring OpenAI's internal codebases as well as those of select external alpha partners. Within OpenAI, Aardvark has successfully surfaced meaningful vulnerabilities, enhancing the organization's defensive posture. External partners have praised the depth of its analysis, with Aardvark identifying issues that arise under complex conditions.
In benchmark testing on “golden” repositories, Aardvark identified 92% of known and synthetically introduced vulnerabilities, demonstrating high recall and real-world effectiveness.
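For context on what that figure measures: recall is simply the fraction of known vulnerabilities (including the synthetically introduced ones) that the scanner surfaces. The counts below are invented purely to illustrate the calculation.

```python
# Recall = detected known vulnerabilities / total known vulnerabilities.
# The numbers here are made up solely to show the arithmetic.
known_vulnerabilities = 50   # seeded plus previously known issues in a "golden" repo
detected = 46                # issues the scanner surfaced from that set

recall = detected / known_vulnerabilities
print(f"recall = {recall:.0%}")  # -> recall = 92%
```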
Aardvark has also been applied to open-source projects, where it has discovered and responsibly disclosed numerous vulnerabilities—ten of which have received Common Vulnerabilities and Exposures (CVE) identifiers. As a beneficiary of decades of open research and responsible disclosure, we are committed to giving back to the community by contributing tools and findings that enhance the security of the digital ecosystem.
To further support open-source software security, we plan to offer pro-bono scanning to select non-commercial open-source repositories. We have also updated our outbound coordinated disclosure policy, adopting a developer-friendly approach that focuses on collaboration and scalable impact, rather than rigid timelines that can pressure developers.
In a world where software has become the backbone of every industry, vulnerabilities pose a systemic risk to businesses, infrastructure, and society at large. In 2024 alone, over 40,000 CVEs were reported, with our testing indicating that approximately 1.2% of commits introduce bugs—small changes that can lead to significant consequences. Aardvark represents a new defender-first model in software security, acting as an agentic security researcher that partners with teams to deliver continuous protection as code evolves.
By catching vulnerabilities early, validating their real-world exploitability, and providing clear fixes, Aardvark enhances security without impeding innovation. We firmly believe in expanding access to security expertise and, starting with our private beta, we will gradually broaden access as we gather insights and feedback.
We invite select partners to participate in the Aardvark private beta. Participants will gain early access and collaborate directly with our team to refine detection accuracy, validation workflows, and the overall reporting experience. Together, we can forge a stronger future for software security.
