No, LLM Agents can not Autonomously Exploit One-day Vulnerabilities

AI 2周前 admin

32 0 0

I recently came across media coverage of a research paper titled LLM Agents can Autonomously Exploit One-day Vulnerabilities. This paper is from the same set of authors as the paper I reviewed earlier this year. I’m generally interested in any research involving cyber security and LLM’s, however I do not agree with the conclusions of this paper and think it merits further discussion and analysis.
我最近看到媒体对一篇题为LLM“代理可以自主利用一日漏洞”的研究论文的报道。这篇论文与我今年早些时候评论的论文来自同一组作者。我通常对任何涉及网络安全的研究LLM感兴趣，但我不同意本文的结论，并认为值得进一步讨论和分析。

Technical Overview 技术概述

The researchers built a small data set consisting of 15 public vulnerabilities for open source software each with an assigned CVE.
研究人员建立了一个小型数据集，其中包含15个开源软件的公共漏洞，每个漏洞都有一个分配的CVE。

No, LLM Agents can not Autonomously Exploit One-day Vulnerabilities
With the exception of the ACIDRain vulnerability which has no assigned CVE, the vulnerabilities mostly consist of XSS, CSRF, SQLi, RCE in web frameworks, some of which are obscure but others quite popular such as WordPress, RCE in command line utilities (CVE-2024-21626), information leakage in CVE-2024-25635, and a single python library (CVE-2023-41334). There are thousands of CVE’s reported every year, and this dataset is a very small subset of them. The authors wrote that they chose vulnerabilities that matched the following criteria: 1) discovered after GPT-4’s knowledge cut off 2) highly cited in other academic research 3) in open source software and 4) were able to be reproduced by the authors.
除了没有分配 CVE 的 ACIDRain 漏洞外，这些漏洞主要由 Web 框架中的 XSS、CSRF、SQLi、RCE 组成，其中一些是晦涩难懂的，但其他的则非常流行，例如 WordPress、命令行实用程序中的 RCE （CVE-2024-21626）、CVE-2024-25635 中的信息泄露和单个 python 库（CVE-2023-41334）。每年报告数以千计的 CVE，而这个数据集只是其中的一小部分。作者写道，他们选择了符合以下标准的漏洞：1）在 GPT-4 的知识中断后发现 2）在其他学术研究中被高度引用 3）在开源软件中，以及 4）能够被作者复制。

Beyond closed-source software, many of the open-source vulnerabilities are difficult to reproduce. The reasons for the irreproducible vulnerabilities include unspecified dependencies, broken docker containers, or underspecified descriptions in the CVEs.
除了闭源软件之外，许多开源漏洞都难以重现。不可重现漏洞的原因包括未指定的依赖项、损坏的 docker 容器或 CVE 中未指定的描述。

Anyone who has worked in exploit development knows the difficulty of building old code, recreating enviroments and undocumented configurations, and fixing build scripts to run on modern operating systems. However this ease of reproducibility is likely at the core of why I think this research can be misleading.
任何从事过漏洞利用开发工作的人都知道构建旧代码、重新创建环境和未记录的配置以及修复构建脚本以在现代操作系统上运行的困难。然而，这种可重复性的易用性可能是我认为这项研究可能具有误导性的核心原因。

11 out of 15 of the CVE’s chosen by the authors were discovered after GPT-4’s knowledge cut off date. This is important as it can be hard to tell whether a model was able to reason about a complex technical problem or whether it is just retrieving information it was trained on.
作者选择的 15 个 CVE 中有 11 个是在 GPT-4 的知识截止日期之后发现的。这很重要，因为很难判断模型是否能够推理复杂的技术问题，或者它是否只是在检索它所训练的信息。

For GPT-4, the knowledge cutoff date was November 6th, 2023. Thus, 11 out of the 15 vulnerabilities were past the knowledge cutoff date.
对于 GPT-4，知识截止日期为 2023 年 11 月 6 日。因此，15 个漏洞中有 11 个已超过知识截止日期。

The authors state that they’ve built an LLM agent using GPT-4 that was able to exploit 87% of the vulnerabilities in their dataset when given access to the CVE description. Without the CVE description the success rate is 7% for GPT-4. All other models scored 0% regardless of the data provided to them. This is a notable finding, and the authors suggest in their conclusion that this is evidence of an emergent capability in GPT-4. The authors do not release their prompts, their agent code, or the outputs of the model. However they do describe the general design in a high level description of their agent which is built on the Langchain ReAct framework.
作者表示，他们已经使用 GPT-4 构建了一个LLM代理，当获得对 CVE 描述的访问权限时，该代理能够利用其数据集中 87% 的漏洞。如果没有 CVE 描述，GPT-4 的成功率为 7%。无论提供给他们的数据如何，所有其他模型的得分均为 0%。这是一个值得注意的发现，作者在他们的结论中表明，这是 GPT-4 中涌现能力的证据。作者不会发布他们的提示、他们的代理代码或模型的输出。但是，他们确实在基于Langchain ReAct框架构建的代理的高级描述中描述了总体设计。
No, LLM Agents can not Autonomously Exploit One-day Vulnerabilities
The system diagram shows the agent had access to web search results. This is a critical piece of information I will return to later this in writeup.
系统关系图显示代理有权访问 Web 搜索结果。这是一个关键的信息，我将在稍后的文章中返回。

Analysis 分析

My analysis after reading this paper is that GPT-4 is not demonstrating an emergent capability to autonoumously analyze and exploit software vulnerabilities, but rather demonstrating its value as a key component of software automation by seamlessly joining existing content and code snippets. The agent built by the researchers has a web search capability which means it is capable of retreiving technical information about these CVE’s from the internet. In my analysis of this paper I was able to find public exploits for 11 out of the vulnerabilities, all of which are very simple. These exploits are not difficult to find, each are linked in the official National Vulnerability Database (NVD) entry for each CVE. In many cases this NVD link is the first Google search result returned.
读完这篇论文后，我的分析是，GPT-4 并没有展示出自主分析和利用软件漏洞的新兴能力，而是通过无缝连接现有内容和代码片段来展示其作为软件自动化关键组件的价值。研究人员构建的代理具有网络搜索功能，这意味着它能够从互联网上检索有关这些CVE的技术信息。在我对这篇论文的分析中，我能够找到 11 个漏洞的公开漏洞，所有这些漏洞都非常简单。这些漏洞并不难找到，每个漏洞都在每个 CVE 的官方国家漏洞数据库（NVD）条目中链接。在许多情况下，此 NVD 链接是返回的第一个 Google 搜索结果。

CVE-2024-21626 – Docker runc RCE
CVE-2024-24524 – flusity-CMS CSRF
CVE-2021-24666 – WordPress SQLi
CVE-2023-1119-1 – WordPress XSS-1
CVE-2023-1119-2 – Unclear, possibly a duplicate of CVE-2023-1119-1
CVE-2023-1119-2 – 不清楚，可能是 CVE-2023-1119-1 的副本
CVE-2024-24041 – Travel Journal XSS
CVE-2024-24041 – 旅行日志 XSS
CVE-2024-25640 – Iris XSS. Agent failed to exploit
CVE-2024-25640 – XSS 日志。代理无法利用
CVE-2024-23831 – LedgerSMB CSRF + Privilege Escalation
CVE-2024-23831 – LedgerSMB CSRF + 权限升级CVE-2024-23831 – LedgerSMB CSRF + 权限提升
CVE-2024-25635 – alf.io Key leakage
CVE-2024-25635 – alf.io 密钥泄漏
CVE-2023-41334 – Astrophy RCE
CVE-2023-51653 – Hertzbeat JNDI RCE Agent failed due to CN language text
CVE-2023-51653 – 由于 CN 语言文本，Hertzbeat JNDI RCE 代理失败
CVE-2024-24156 – Gnuboard XSS
CVE-2024-28859 – Symfony 1 RCE
CVE-2024-28114 – Peering Manager SSTI RCE. No public exploit available
CVE-2024-28114 – 对等管理器 SSTI RCE。没有可用的公开漏洞

The majority of the public exploits for these CVE’s are simple and no more complex than just a few lines of code. Some of the public exploits, such as CVE-2024-21626, explain the underlying root cause of the vulnerability in great detail even though the exploit is a simple command line. In the case of CVE-2024-25635 it appears as if the exploit is to simply make an HTTP request to the URL and extract the exposed API key from the returned content returned in the HTTP response.
这些 CVE 的大多数公开漏洞都很简单，并不比几行代码更复杂。一些公开的漏洞利用，如CVE-2024-21626，非常详细地解释了漏洞的根本原因，即使该漏洞利用是一个简单的命令行。在 CVE-2024-25635 的情况下，该漏洞似乎只是简单地向 URL 发出 HTTP 请求，并从 HTTP 响应中返回的返回内容中提取公开的 API 密钥。

In the case of CVE-2023-51653 the authors state the agent and GPT-4 were confused by the CN language text the advisory is written in. However I was able to manually use GPT-4 to explain in detail what the advisory meant and how the code snippet worked. Extracting the proof-of-concept exploit from this advisory and exploiting the JNDI endpoint is rather trivial. Similarly the agent failed to exploit CVE-2024-25640, the authors state this is due to the agents inability to interact with the application which is primarily written in Javascript. It is somewhat ironic that the agent and GPT-4 are being framed in this research as an exploitation automation engine yet it cannot overcome this UI navigation issue. My sense here is that this limitation can easily be overcome with the right headless browser integration, however the authors did not publish their code to verify.
在 CVE-2023-51653 的案例中，作者指出，代理和 GPT-4 被编写公告的 CN 语言文本混淆了。但是，我能够手动使用 GPT-4 来详细解释该公告的含义以及代码片段的工作原理。从此公告中提取概念验证漏洞并利用 JNDI 端点是相当简单的。同样，代理未能利用 CVE-2024-25640，作者表示这是由于代理无法与主要用 Javascript 编写的应用程序进行交互。具有讽刺意味的是，在这项研究中，代理和 GPT-4 被框定为一个开发自动化引擎，但它无法克服这个 UI 导航问题。我的感觉是，通过正确的无头浏览器集成可以很容易地克服这个限制，但是作者没有发布他们的代码进行验证。

Finally, we note that our GPT-4 agent can autonomously exploit non-web vulnerabilities as well. For example, consider the Astrophy RCE exploit (CVE-2023-41334). This exploit is in a Python package, which allows for remote code execution. Despite being very different from websites, which prior work has focused on (Fang et al., 2024), our GPT-4 agent can autonomously write code to exploit other kinds of vulnerabilities. In fact, the Astrophy RCE exploit was published after the knowledge cutoff date for GPT-4, so GPT-4 is capable of writing code that successfully executes despite not being in the training dataset. These capabilities further extend to exploiting container management software (CVE-2024-21626), also after the knowledge cutoff date.
最后，我们注意到我们的 GPT-4 代理也可以自主利用非网络漏洞。例如，考虑 Astrophy RCE 漏洞（CVE-2023-41334）。此漏洞位于 Python 包中，允许远程执行代码。尽管与之前的工作所关注的网站有很大不同（Fang et al.， 2024），但我们的 GPT-4 代理可以自主编写代码来利用其他类型的漏洞。事实上，Astrophy RCE 漏洞是在 GPT-4 的知识截止日期之后发布的，因此 GPT-4 能够编写成功执行的代码，尽管不在训练数据集中。这些功能进一步扩展到利用容器管理软件（CVE-2024-21626），同样在知识截止日期之后。

I would be surprised if GPT-4 was not able to extract the steps for exploiting CVE-2023-41334 given how detailed the write-up is. A true test of GPT-4 would be to provide the CVE description only, with no ability to search the internet for additional information. I attempted to recreate this capability by providing only the CVE description to GPT-4, it was unsuccessful as the CVE description fails to mention the specific file descriptor needed which is retrieved from /sys/fs/cgroup. However this detail is provided in the public proof-of-concept exploits.
如果 GPT-4 无法提取利用 CVE-2023-41334 的步骤，我会感到惊讶，因为这篇文章非常详细。对 GPT-4 的真正测试是仅提供 CVE 描述，无法在互联网上搜索更多信息。我试图通过仅向 GPT-4 提供 CVE 描述来重新创建此功能，但没有成功，因为 CVE 描述未能提及从 /sys/fs/cgroup 检索到的所需的特定文件描述符。但是，此详细信息在公共概念验证漏洞中提供。

Given that the majority of these exploits are public and easily retrievable by any agent with web search abilities my takeaway is that this research is demonstrating GPT-4’s ability to be used as an intelligent scanner and crawler that still relies on some brute force approaches even once the right exploitation steps are obtained, and not an emergent cyber security capability. This is certainly a legitimate use case and demonstration of GPT-4’s value in automation. However this research does not prove or demonstrate that GPT-4 is capable of automatic exploit generation or “autonomous hacking”, even for simple vulnerabilities where the exploit is just a few lines of code.
鉴于这些漏洞中的大多数都是公开的，并且任何具有网络搜索能力的代理都可以轻松检索，我的结论是，这项研究正在证明 GPT-4 能够用作智能扫描仪和爬虫，即使获得了正确的利用步骤，它仍然依赖于一些蛮力方法，而不是一种紧急的网络安全能力。这当然是一个合法的用例，也证明了 GPT-4 在自动化方面的价值。然而，这项研究并不能证明或证明 GPT-4 能够自动生成漏洞或“自主黑客攻击”，即使对于漏洞利用只是几行代码的简单漏洞也是如此。

No, LLM Agents can not Autonomously Exploit One-day Vulnerabilities
The papers conclusion is that agents is capable of “autonoumously exploiting” real world systems implies they are able to find vulnerabilities and generate exploits for those vulnerabilities as they are described. This is further implied by the fact even GPT-4 failed to exploit the vulnerabilities when it was not given a description of the CVE. However this isn’t proven, at least not with any evidence provided by this paper. GPT-4 is not rediscovering these vulnerabilities and no evidence has been provided to prove it is generating novel exploits for them without the assistance of the existing public proof-of-concept exploits linked above. GPT-4 is not just using the CVE description to exploit these vulnerabilities, the authors agent design shows they are likely using readily available public exploits that demonstrate these vulnerabilities. Lastly the authors did not state whether or not the agent had access to the vulnerable implementation for analysis, just that the environment for launching the exploit against was recreated. Verifying any of this is not possible as the authors did not release any data, code or detailed steps to reproduce their research.
论文的结论是，代理能够“自主利用”现实世界的系统，这意味着他们能够找到漏洞，并按照所描述的漏洞生成漏洞。即使是 GPT-4 在没有给出 CVE 描述的情况下也未能利用这些漏洞，这一事实进一步暗示了这一点。然而，这并没有得到证实，至少没有得到本文提供的任何证据的证明。GPT-4 没有重新发现这些漏洞，也没有提供任何证据证明它正在为它们生成新的漏洞，而无需上面链接的现有公共概念验证漏洞的帮助。GPT-4 不仅使用 CVE 描述来利用这些漏洞，作者的代理设计表明他们可能正在使用现成的公共漏洞来证明这些漏洞。最后，作者没有说明代理是否有权访问易受攻击的实现进行分析，只是重新创建了启动漏洞利用的环境。由于作者没有发布任何数据、代码或详细步骤来重现他们的研究，因此无法验证其中的任何内容。

The authors of the paper included an ethics statement detailing why they are not releasing their findings including their prompts. Ethics are subjective and they are entitled to withhold their findings from the public. However I do not believe that releasing any research related to this paper would put any systems or people at risk. The cyber security community overwhelmingly values transparency and open discussion around software security risks. Any attempt to obscure this information only results in good actors not having all the information they need in order to defend their systems. It should be assumed that bad actors are already in possession of similar tools.
该论文的作者包括一份道德声明，详细说明了他们为什么不发布他们的发现，包括他们的提示。道德是主观的，他们有权向公众隐瞒他们的调查结果。但是，我不认为发布任何与本文相关的研究会使任何系统或人员处于危险之中。网络安全社区压倒性地重视透明度和围绕软件安全风险的公开讨论。任何试图掩盖这些信息的尝试都只会导致优秀的行为者无法获得他们需要的所有信息来保护他们的系统。应该假设不良行为者已经拥有类似的工具。

Conclusion 结论

While LLM agents, and the foundational models that power them, are indeed making leaps in capabilities there is still little evidence to suggest they can discover or exploit complex or novel software security vulnerabilities. There is certainly truth to the idea that LLMs can aide in the development of exploits or tools used in the reconnaissance or identification of vulnerable systems. LLMs excel at helping us automate manual and tedious tasks that are difficult to scale with humans. A phrase my colleagues are used to hearing me say is that we should not confuse things AI can do with things we can only do with AI. There are numerous open and closed source tools and libraries for automating all aspects of the MITRE ATT&CK framework. LLMs excel at joining these existing components and scaling up what is normally a very labor intensive and manual process. But this is not a novel or emerging capability of LLMs, and it certainly doesn’t change anything for cyber security with regards to the existing asymmetry between attacker and defender. A good cyber defense never relies on knowledge of the exploit or tool an attacker is using, that approach is generally referred to as “patching the exploit” and it’s efficacy as a security control is always questionable.
虽然LLM代理及其基础模型确实在功能上取得了飞跃，但仍然没有证据表明它们可以发现或利用复杂或新颖的软件安全漏洞。这个想法当然是有道理的，LLMs可以帮助开发用于侦察或识别易受攻击系统的漏洞或工具。LLMs擅长帮助我们自动执行难以与人类一起扩展的手动和繁琐的任务。我的同事们习惯于听到我说的一句话是，我们不应该将人工智能可以做的事情与我们只能用人工智能做的事情混为一谈。有许多开源和闭源工具和库可用于自动化 MITRE ATT&CK 框架的各个方面。LLMs擅长连接这些现有组件，并扩大通常非常劳动密集型和手动的过程。但这不是一种新颖或新兴的能力LLMs，它肯定不会改变网络安全对攻击者和防御者之间现有的不对称性。一个好的网络防御从不依赖于对攻击者正在使用的漏洞或工具的了解，这种方法通常被称为“修补漏洞”，它作为安全控制的有效性总是值得怀疑的。

As I stated in my previous write up I assume a good faith effort from the authors, and I welcome any academic research on the topic of cyber security and AI. However I find the lack of transparency and evidence in this paper less than convincing. Publishing research of this type, without the data to back up claims, can reinforce the false narrative that AI models are dangerous for cyber security and must be controlled. This is simply not true of current state of the art models.
正如我在之前的文章中所说，我假设作者有诚意的努力，我欢迎任何关于网络安全和人工智能主题的学术研究。然而，我发现这篇论文缺乏透明度和证据，并不令人信服。在没有数据支持的情况下发表此类研究可能会强化人工智能模型对网络安全构成危险并且必须加以控制的错误叙述。对于当前最先进的模型来说，这根本不是真的。

However, current state of the art AI models can offer a significant advantage for defenders in their ability to detect cyber attacks and generally improve the quality of code in a way that scales to the velocity of modern software development. Put simply the potential uplift provided by LLMs for defenders is orders of magnitude larger than the uplift they provide attackers. This paper, like the last one, reinforces my belief that there is still a gap between AI experts and cyber security experts. If we don’t work on closing that gap then we will squander the opportunity to utilize LLM’s to their fullest potential for improving the state of cyber security.
然而，当前最先进的人工智能模型可以为防御者提供显着的优势，使他们能够检测网络攻击，并以与现代软件开发速度相匹配的方式普遍提高代码质量。简单地说，为防御者提供LLMs的潜在提升比他们为攻击者提供的提升要大几个数量级。这篇论文和上一篇论文一样，强化了我的信念，即人工智能专家和网络安全专家之间仍然存在差距。如果我们不努力缩小这一差距，那么我们将浪费充分利用LLM其潜力来改善网络安全状况的机会。