June 2025 — San Francisco. In a breakthrough for AI-assisted security research, researcher Sean Heelan has revealed that OpenAI’s o3 large language model independently discovered a zero-day vulnerability (CVE-2025-37899) in the Linux kernel. The flaw affects ksmbd, the in-kernel SMB3 server responsible for network file sharing on Linux systems.

What makes this discovery extraordinary is that the model identified the vulnerability without any auxiliary tools, symbolic execution, or static analysis frameworks—just by reading and reasoning over raw code via API calls to o3.


A Zero-Day Found With a Prompt

The vulnerability, CVE-2025-37899, is a use-after-free triggered by a race condition in the smb2_session_logoff handler. It arises when two threads share access to a session: one frees a pointer (sess->user) during a logoff, while the other continues to access it—resulting in potential kernel memory corruption or arbitrary code execution.

OpenAI’s o3 model successfully identified this flaw in code spanning over 12,000 lines. It even produced a structured vulnerability report, outlining the issue, affected paths, and exploitation conditions—something typically expected from experienced security researchers.

“LLMs have made a real leap in code reasoning. They won’t replace experts, but they can make us dramatically more efficient,” Heelan wrote in his blog post.


Not Just Luck: Outperforms Other Models

To benchmark o3, Heelan tested it against a previously known vulnerability (CVE-2025-37778), also a use-after-free in the Kerberos session setup path. Out of 100 test runs:

  • o3 detected the bug 8 times
  • Claude Sonnet 3.7 found it 3 times
  • Claude Sonnet 3.5 failed to detect it at all

This roughly 2–3× performance advantage suggests o3 is the most capable LLM yet tested for real-world vulnerability research.


Key Advantages of o3 in Vulnerability Research

True code reasoning — o3 correctly identifies concurrency bugs requiring understanding of thread interleaving and shared state.

Human-like reporting — its outputs resemble a concise, well-written vulnerability disclosure.

No need for toolchains — it operates solely on textual prompts and raw code.

Accelerates review — helps validate existing bugs, evaluate patch completeness, and spot overlooked edge cases.


Key Limitations

⚠️ High false positive rate — In some tests, the signal-to-noise ratio was 1:50.

⚠️ False negatives still common — Complex or large codebases reduce detection rates.

⚠️ Prompt engineering still critical — Crafting the right code context is essential for success.

⚠️ Does not understand runtime behavior — Unlike dynamic tools (e.g., fuzzers), o3 lacks real-time insight into system execution.


Why This Zero-Day Matters

The most striking detail is a subtle insight: while hunting for the logoff vulnerability, o3 also flagged a flaw in the fix previously suggested for CVE-2025-37778. The original patch nulled the pointer after freeing it, on the assumption that this would prevent misuse. o3 correctly reasoned that in multi-threaded scenarios another thread could load the pointer after the free but before the NULL store, so the fix was insufficient.

In some of its outputs, o3 pointed this out—demonstrating not just pattern matching but genuine multi-threaded reasoning.


Implications for the Security Community

This discovery marks a paradigm shift. For what appears to be the first time, a general-purpose language model has:

  • Identified a real, critical kernel vulnerability before it was publicly known
  • Outperformed competing models in security-focused benchmarks
  • Demonstrated real value in code auditing and patch validation workflows

The implications are clear: LLMs are now viable assistants for vulnerability research. Not replacements—but accelerators. They should be integrated into toolchains, IDEs, and CI pipelines to amplify analyst efficiency and provide a second set of eyes in security-critical codebases.


Conclusion

With o3, OpenAI has shown that large language models can move beyond theoretical promise to practical impact in offensive and defensive security. While they remain imperfect, their capability is now well above the noise floor and worthy of integration into serious security workflows.

CVE-2025-37899 has since been patched. Administrators running Linux distributions with ksmbd support are urged to apply the latest kernel updates immediately.

Source: sean.heelan.io
