{"id":346,"date":"2026-04-13T02:56:23","date_gmt":"2026-04-13T02:56:23","guid":{"rendered":"https:\/\/haco.club\/?p=346"},"modified":"2026-04-13T02:56:23","modified_gmt":"2026-04-13T02:56:23","slug":"black-hat-usa-2025-llm-driven-reasoning-for-automated-vulnerability-discovery-behind-hall-of-fame","status":"publish","type":"post","link":"https:\/\/haco.club\/?p=346","title":{"rendered":"Black Hat USA 2025 | LLM-Driven Reasoning for Automated Vulnerability Discovery Behind Hall-of-Fame"},"content":{"rendered":"\n<figure class=\"wp-block-embed is-type-video is-provider-youtube wp-block-embed-youtube wp-embed-aspect-16-9 wp-has-aspect-ratio\"><div class=\"wp-block-embed__wrapper\">\n<iframe loading=\"lazy\" title=\"Black Hat USA 2025 | LLM-Driven Reasoning for Automated Vulnerability Discovery Behind Hall-of-Fame\" width=\"640\" height=\"360\" src=\"https:\/\/www.youtube.com\/embed\/WVjnipkKp4U?feature=oembed\" frameborder=\"0\" allow=\"accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share\" referrerpolicy=\"strict-origin-when-cross-origin\" allowfullscreen><\/iframe>\n<\/div><\/figure>\n\n\n\n<p>This video is a Black Hat USA 2025 talk titled <strong>\u201cBinWhisper: LLM-Driven Reasoning for Automated Vulnerability Discovery Behind Hall-of-Fame\u201d<\/strong> by Qinrun Dai and Yifei Xie. The core idea is that vulnerability research still depends heavily on either manual auditing or fuzzing, and the speakers argue that LLMs are most useful not as fully autonomous hackers, but as structured reasoning helpers inside a guided workflow.<\/p>\n\n\n\n<p>The talk starts with a manual reverse-engineering walkthrough of <strong>CVE-2024-34587<\/strong>, using a Samsung video\/RTCP parsing path as the example. 
They show that the actual bug is relatively straightforward once the right code path and buffer relationships are understood, but that the hard part is reconstructing the call chain, data flow, and data structures that make the bug obvious.<\/p>\n\n\n\n<p>From there, the speakers explain what LLMs are good at and where they fall short. Their point is that a na\u00efve prompt like \u201cis this function vulnerable?\u201d produces unreliable answers because the model has to invent too many missing assumptions. So BinWhisper feeds the model much richer context: decompiled code, argument descriptions, parent-function context, reconstructed data structures, and predefined memory-corruption bug patterns.<\/p>\n\n\n\n<p>The system they present is <strong>hybrid<\/strong>, not fully automatic. Humans choose the target and verify the final results; static analysis builds the global call graph; then AI agents locate packet-receiving and parsing functions, simplify the call graph, reconstruct relevant structures, analyze for bugs, and generate a report. The slides explicitly label the workflow as a mix of <strong>[Human]<\/strong>, <strong>[Static Analysis]<\/strong>, and <strong>[AI]<\/strong> steps.<\/p>\n\n\n\n<p>Their real-world target is <strong>SecVideoEngineService<\/strong>, which they describe as high-privilege, remotely reachable, installed by default on mobile phones, and therefore attractive from an attacker\u2019s perspective. 
The pitch is that this kind of target is too large and messy for a single prompt, but workable if the reasoning process is broken into smaller specialist stages.<\/p>\n\n\n\n<p>The practical outcome is that BinWhisper reportedly uncovered multiple Samsung issues, including <strong>five bugs<\/strong> labeled <strong>SVE-2024-1490, 1492, 1494, 1495, and 1496<\/strong>, in media\/RTP frame-handling functions such as <code>rtp_dep_h264_put_frm<\/code>, <code>rtp_dep_h265_put_frm<\/code>, <code>rtp_dep_h263_put_frm<\/code>, and <code>rtp_dep_h263plus_put_frm<\/code>. One slide describes the first bug as an unchecked write pattern that can overflow fixed-size arrays and potentially overwrite adjacent memory.<\/p>\n\n\n\n<p>The benchmarking section compares several models and emphasizes tradeoffs rather than naming one universal winner. The reported confidence ranges from about <strong>55% to 92%<\/strong>, with runtimes from about <strong>1 hour 23 minutes to 21 hours 18 minutes<\/strong>, and estimated cost from about <strong>$1 to $45<\/strong> depending on the model. 
The closing message is blunt: <strong>LLMs can help find real bugs, but they still need humans to frame the problem and validate the answers<\/strong>.<\/p>\n\n\n\n<p>So, in one sentence: this video argues that <strong>LLMs are most effective in vulnerability research when used as tightly guided reasoning components inside a human-led reverse-engineering pipeline, and the authors claim that approach found real Samsung bugs<\/strong>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>This video is a Black Hat USA 2025 talk titled [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[3],"tags":[35,5],"class_list":["post-346","post","type-post","status-publish","format-standard","hentry","category-black-hat","tag-llm","tag-security"],"_links":{"self":[{"href":"https:\/\/haco.club\/index.php?rest_route=\/wp\/v2\/posts\/346","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/haco.club\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/haco.club\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/haco.club\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/haco.club\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=346"}],"version-history":[{"count":1,"href":"https:\/\/haco.club\/index.php?rest_route=\/wp\/v2\/posts\/346\/revisions"}],"predecessor-version":[{"id":347,"href":"https:\/\/haco.club\/index.php?rest_route=\/wp\/v2\/posts\/346\/revisions\/347"}],"wp:attachment":[{"href":"https:\/\/haco.club\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=346"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/haco.club\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=346"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/haco.club\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=346"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}