How hackers can read your chats with ChatGPT or Microsoft Copilot


What information can be extracted from intercepted AI chatbot messages?

Naturally, chatbots send messages in encrypted form. All the same, the implementation of large language models (LLMs) and the chatbots built on them harbors a number of features that seriously weaken the encryption. Combined, these features make it possible to carry out a side-channel attack when the content of a message is restored from fragments of leaked information.

To understand what happens during this attack, we need to dive a little into the details of LLM and chatbot mechanics. The first thing to know is that LLMs operate not on individual characters or words as such, but on tokens, which can be described as semantic units of text. The Tokenizer page on the OpenAI website offers a glimpse into the inner workings.


This example demonstrates how message tokenization works with the GPT-3.5 and GPT-4 models. Source
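To see this for yourself, tokenization is easy to reproduce locally. Here's a minimal sketch in Python, assuming OpenAI's open-source tiktoken package is installed (it provides the same encodings that power the Tokenizer page):

```python
# Minimal sketch: inspect how a message splits into tokens.
# Requires: pip install tiktoken
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # encoding used by GPT-3.5/GPT-4

text = "Naturally, chatbots send messages in encrypted form."
token_ids = enc.encode(text)

# Print each token as the text fragment it represents, plus its length
for tid in token_ids:
    fragment = enc.decode_single_token_bytes(tid).decode("utf-8", errors="replace")
    print(f"{tid:>6}  {fragment!r}  (length {len(fragment)})")
```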

The second feature that facilitates this attack you'll already know about if you've interacted with AI chatbots yourself: they don't send responses in large chunks but gradually, almost as if a person were typing them. But unlike a person, LLMs write in tokens, not individual characters. As such, chatbots send generated tokens in real time, one after another; or rather, most chatbots do. The exception is Google Gemini, which doesn't stream token by token and is therefore invulnerable to this attack.

The third peculiarity is the following: at the time of publication of the paper, the majority of chatbots didn’t use compression, encoding or padding (appending garbage data to meaningful text to reduce predictability and increase cryptographic strength) before encrypting a message.

Side-channel attacks exploit all three of these peculiarities. Although intercepted chatbot messages can't be decrypted, attackers can extract useful data from them: specifically, the length of each token sent by the chatbot. The result is similar to a Wheel of Fortune puzzle: you can't see what exactly is encrypted, but the length of the individual tokens is revealed.


While it’s impossible to decrypt the message, the attackers can extract the length of the tokens sent by the chatbot; the resulting sequence is similar to a hidden phrase in the Wheel of Fortune show. Source
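To illustrate the idea, here's a toy sketch of the length side channel. It assumes, purely for illustration, that each token travels in its own encrypted record and that the cipher adds a constant per-record overhead; the record sizes below are invented numbers, not captured traffic:

```python
# Toy sketch of the length side channel (illustrative assumptions only):
# each token is sent in its own encrypted record, and the AEAD cipher adds
# a constant per-record overhead, so plaintext length = record size - overhead.

OVERHEAD = 21  # hypothetical constant: auth tag + record header bytes

# Hypothetical sizes of intercepted records, one per streamed token
record_sizes = [23, 28, 25, 31, 24]

token_lengths = [size - OVERHEAD for size in record_sizes]
print(token_lengths)  # [2, 7, 4, 10, 3] -- the "Wheel of Fortune" pattern
```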

Using extracted information to restore message text

All that remains is to guess what words are hiding behind the tokens. And you’ll never believe who’s good at guessing games: that’s right — LLMs. In fact, this is their primary purpose in life: to guess the right words in the given context. So, to restore the text of the original message from the resulting sequence of token lengths, the researchers turned to an LLM…

Two LLMs, to be precise, since the researchers observed that the opening exchanges in conversations with chatbots are almost always formulaic, and thus readily guessable by a model specially trained on an array of introductory messages generated by popular language models. So the first model is used to restore the introductory messages and pass them to the second model, which handles the rest of the conversation.


General scheme of the attack. Source
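The researchers' actual pipeline uses fine-tuned LLMs; as a deliberately crude stand-in, the sketch below only shows the shape of the guessing problem: for each leaked token length, enumerate dictionary words of that length (the wordlist and lengths here are made up). A real attack ranks the candidates with a language model instead of enumerating them all:

```python
# Deliberately crude stand-in for the researchers' LLM-based guesser:
# list every dictionary word matching each leaked token length.
from itertools import product

vocabulary = ["I", "am", "we", "are", "feeling", "thinking", "sad", "mad",
              "very", "today", "happy"]

leaked_lengths = [1, 2, 7, 3]  # hypothetical token lengths from the wiretap

candidates = [[w for w in vocabulary if len(w) == n] for n in leaked_lengths]
for phrase in product(*candidates):
    print(" ".join(phrase))  # e.g. "I am feeling sad", "I am feeling mad", ...
```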

This produces a text in which the token lengths correspond to those in the original message, but the specific words are guessed with varying degrees of success. Note that a perfect match between the restored message and the original is rare; usually, part of the text is guessed wrong. Sometimes the result is satisfactory:


In this example, the text was restored quite close to the original. Source

But in an unsuccessful case, the reconstructed text may have little, or even nothing, in common with the original. For example, the result might be this:


Here the guesswork leaves much to be desired. Source

Or even this:


As Alice once said, “those are not the right words.” Source

In total, the researchers examined over a dozen AI chatbots, and found most of them vulnerable to this attack — the exceptions being Google Gemini (née Bard) and GitHub Copilot (not to be confused with Microsoft Copilot).


At the time of publication of the paper, many chatbots were vulnerable to the attack. Source

Should I be worried?

It should be noted that this attack is retrospective. Suppose someone took the trouble to intercept and save your conversations with ChatGPT (not that easy, but possible), in which you revealed some awful secrets. In this case, using the above-described method, that someone would theoretically be able to read the messages.

Thankfully, the interceptor’s chances are not too high: as the researchers note, even the general topic of the conversation was determined only 55% of the time. As for successful reconstruction, the figure was a mere 29%. It’s worth mentioning that the researchers’ criteria for a fully successful reconstruction were satisfied, for example, by the following:


Example of a text reconstruction that the researchers considered fully successful. Source

How important such semantic nuances are — decide for yourself. Note, however, that this method will most likely not extract any actual specifics (names, numerical values, dates, addresses, contact details, other vital information) with any degree of reliability.

And the attack has one other limitation that the researchers fail to mention: the success of text restoration depends greatly on the language the intercepted messages are written in, because tokenization differs substantially from language to language. This paper was focused on English, which is characterized by very long tokens, generally equivalent to an entire word. Hence, tokenized English text shows distinct patterns that make reconstruction relatively straightforward.

No other language comes close. Even for those languages in the Germanic and Romance groups, which are the most akin to English, the average token length is 1.5–2 times shorter; and for Russian, 2.5 times: a typical Russian token is only a couple of characters long, which will likely reduce the effectiveness of this attack down to zero.
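The difference is easy to measure. A quick sketch using tiktoken (the sample sentences are arbitrary, and exact figures will vary with the text):

```python
# Sketch: compare average token length across languages with tiktoken.
# Requires: pip install tiktoken
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

samples = {
    "English": "The weather is wonderful today.",
    "Russian": "Сегодня прекрасная погода.",
}

for lang, text in samples.items():
    n_tokens = len(enc.encode(text))
    print(f"{lang}: {len(text)} chars / {n_tokens} tokens "
          f"= {len(text) / n_tokens:.1f} chars per token")
```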

At least two AI chatbot developers — Cloudflare and OpenAI — have already reacted to the paper by adding the padding method mentioned above, which was designed specifically with this type of threat in mind. Other AI chatbot developers are set to follow suit, and future communication with chatbots will, fingers crossed, be safeguarded against this attack.
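For illustration, here's a generic sketch of the padding idea, not any vendor's actual implementation: every streamed chunk is padded to a fixed size with random bytes before encryption, so all records look identical in length on the wire:

```python
# Generic sketch of the padding countermeasure (not a vendor implementation):
# pad every streamed chunk to a fixed size, hiding per-token lengths.
import os

BUCKET = 32  # hypothetical fixed record size in bytes

def pad_chunk(token_bytes: bytes) -> bytes:
    # Length byte first, then the token, then random filler up to BUCKET.
    assert len(token_bytes) < BUCKET
    filler = os.urandom(BUCKET - 1 - len(token_bytes))
    return bytes([len(token_bytes)]) + token_bytes + filler

def unpad_chunk(padded: bytes) -> bytes:
    return padded[1 : 1 + padded[0]]

assert unpad_chunk(pad_chunk(b" feeling")) == b" feeling"
print(len(pad_chunk(b"I")), len(pad_chunk(b" feeling")))  # both print 32
```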

Originally published by Alanna Titterington: How hackers can read your chats with ChatGPT or Microsoft Copilot
