Black Hat USA 2025 | LLM-Driven Reasoning for Automated Vulnerability Discovery Behind Hall-of-Fame

This video is a Black Hat USA 2025 talk titled “BinWhisper: LLM-Driven Reasoning for Automated Vulnerability Discovery Behind Hall-of-Fame” by Qinrun Dai and Yifei Xie. The core idea is that vulnerability research still depends heavily on either manual auditing or fuzzing, and the speakers argue that LLMs are most useful not as fully autonomous hackers, but as structured reasoning helpers inside a guided workflow. The talk starts with a manual reverse-engineering walkthrough of CVE-2024-34587, using a Samsung video/RTCP parsing path as the example. They show that the actual…

Black Hat USA 2025 | AI Agents for Offsec with Zero False Positives

Summary Using Large Language Models (LLMs) for offensive security (vulnerability discovery) currently results in an overwhelming number of false positives. To solve this, Dolan-Gavitt proposes shifting away from asking AI to "grade its own homework." Instead, security teams must use Non-AI Deterministic Validation—forcing the AI agent to provide undeniable, mathematically verifiable proof that an exploit works. The Problem: The Specter of False Positives When LLMs are fed source code and asked to find vulnerabilities, they confidently hallucinate bugs. This is a mathematical inevitability due to the Bayesian Base Rate Fallacy.…

Black Hat USA 2025 | Racing for Privilege

The main point is that Intel’s modern Spectre v2 defenses, especially eIBRS, can fail because branch predictor updates happen asynchronously. The researchers show that this timing creates “Branch Predictor Race Conditions” (BPRC), where branch predictions can be learned or applied with the wrong privilege context. In practice, that breaks intended isolation boundaries such as user-to-kernel, guest-to-hypervisor, and even barriers meant to flush unsafe predictions. The talk’s key attack is called Branch Privilege Injection (BPI). In plain English: an unprivileged process can trick the CPU into treating attacker-controlled branch…

Black Hat USA 2025 | Breaking Control Flow Integrity by Abusing Modern C++

"Coroutine Frame-Oriented Programming: Breaking Control Flow Integrity by Abusing Modern C++" by Marcos Bajo: OverviewThe presentation introduces a novel exploitation technique called Coroutine Frame-Oriented Programming (CFOP). It demonstrates how attackers can leverage C++20 coroutines to completely bypass modern Control Flow Integrity (CFI) defenses (such as Intel CET and Microsoft CFG) that are designed to prevent code-reuse attacks like ROP (Return-Oriented Programming). Key Concepts & Background Control Flow Integrity (CFI): A defense mechanism that prevents attackers from redirecting a program's execution flow by enforcing valid transition paths for indirect jumps and calls.…

Black Hat USA 2025 | Clue-Driven Reverse Engineering by LLM in Real-World Malware Analysis

Here is a comprehensive summary of the Black Hat USA 2025 presentation "Pay Attention to the Clue: Clue-driven Reverse Engineering by LLM in Real-world Malware Analysis" by Tien-Chih Lin and Wei-Chieh Chao from CyCraft Technology. Summary The presentation explores how to effectively use Large Language Models (LLMs) for malware reverse engineering while overcoming their biggest flaw: hallucinations. The speakers introduce Celebi, an automated, context-aware system that uses the internal mechanics of LLMs (attention heads and token probabilities) to verify if the AI is telling the truth, ultimately resulting in faster, more accurate…

Black Hat USA 2025 | Hack to the Future: Owning AI-Powered Tools with Old School Vulns

"Hack To The Future: Owning AI-Powered Tools With Old School Vulns" by Nils Amiet and Nathan Hamiel at Black Hat USA 2025: Core Thesis The integration of generative AI into developer productivity tools (like AI code reviewers and data analytics assistants) is creating massive new attack surfaces. While the underlying Large Language Models (LLMs) are not being "hacked," the applications wrapping them are poorly designed, overly permissive, and riddled with classic, "old-school" vulnerabilities like Remote Code Execution (RCE), Prompt Injection, and Insecure Direct Object Reference (IDOR). Because these AI…

Black Hat USA 2025 | Reinventing Agentic AI Security With Architectural Controls

"When Guardrails Aren't Enough: Reinventing Agentic AI Security With Architectural Controls," delivered by David Brauchler III from NCC Group. The Core Thesis The central argument of the presentation is that guardrails are not security boundaries. Much like Web Application Firewalls (WAFs) in the early days of the internet, AI guardrails are merely statistical heuristics. They reduce risk but do not provide "hard" security guarantees and can always be bypassed by a determined attacker. As AI evolves into "agentic" systems—where models can execute tool calls, read databases, and take actions—relying solely…

Black Hat USA 2025 | Invoking Gemini for Workspace Agents with a Simple Google Calendar Invite

"Invitation is All You Need! TARA for Targeted Promptware Attack against Gemini-Powered Assistants," presented by Ben Nassi, Or Yair, and Stav Cohen. Core Premise The presentation highlights a new, highly practical class of cyberattack called "Promptware," specifically targeting Large Language Model (LLM) powered personal assistants like Google's Gemini for Workspace and Android. The researchers demonstrate how an attacker can completely compromise a user's AI assistant simply by sending them a Google Calendar invitation containing hidden, malicious instructions. The Attack Mechanism: Indirect Prompt Injection Unlike traditional hacking that targets memory corruption or…

Black Hat USA 2025 | Training Specialist Models: Automating Malware Development

"Training Specialist Models: Automating Malware Development" explores how small, specialized Large Language Models (LLMs) can be trained to outperform massive generalist models in specific, highly technical tasks—specifically, the creation of evasive malware. Here is a summary of the key points: The Problem with Current ModelsAvery identifies a gap in the current AI landscape for offensive security professionals: Large Generalists (OpenAI, Anthropic): These models are highly capable but come with privacy concerns, high costs, and strict safety filters (refusals) that make them difficult to automate for red teaming. Small Local Models…

Black Hat USA 2025 | Watching the Watchers: Exploring and Testing Defenses of Anti-Cheat Systems

Introduction to the Anti-Cheat Ecosystem The World of Game Cheats: The speakers explore the fast-paced, high-stakes battleground between cheat developers (attackers) and anti-cheat systems (defenders) in modern competitive shooter games []. The Cheat Economy: Cheating is a massive industry. Cheats are often sold via subscription models by well-run, sometimes legally registered companies, with some cheats costing upwards of $200 a month []. Because it is so lucrative, the attack-defense cycle is incredibly rapid. The Shift to the Kernel: Historically, cheats operated in user mode. As anti-cheats adapted, the…