Key findings

We analyzed the security of cloud-based pinyin keyboard apps from nine vendors — Baidu, Honor, Huawei, iFlytek, OPPO, Samsung, Tencent, Vivo, and Xiaomi — and examined their transmission of users’ keystrokes for vulnerabilities.
Our analysis revealed critical vulnerabilities in keyboard apps from eight of the nine vendors that we could exploit to completely reveal the contents of users’ keystrokes in transit. Most of the vulnerable apps can be exploited by an entirely passive network eavesdropper.
Combining the vulnerabilities discovered in this and our previous report analyzing Sogou’s keyboard apps, we estimate that up to one billion users are affected by these vulnerabilities. Given the scope of these vulnerabilities, the sensitivity of what users type on their devices, the ease with which these vulnerabilities may have been discovered, and that the Five Eyes have previously exploited similar vulnerabilities in Chinese apps for surveillance, it is possible that such users’ keystrokes may have also been under mass surveillance.
We reported these vulnerabilities to all nine vendors. Most vendors responded, took the issue seriously, and fixed the reported vulnerabilities, although some keyboard apps remain vulnerable.
We conclude our report by summarizing our recommendations to various stakeholders to attempt to reduce future harm from apps which might feature similar vulnerabilities.

Introduction

Typing logographic languages such as Chinese is more difficult than typing alphabetic languages, where each letter can be represented by one key. There is no way to fit the tens of thousands of Chinese characters that exist onto a single keyboard. Despite this obvious challenge, technologies have developed which make typing in Chinese possible. To enable the input of Chinese characters, a writer will generally use a keyboard app with an “Input Method Editor” (IME). IMEs offer a variety of approaches to inputting Chinese characters, including via handwriting, voice, and optical character recognition (OCR). One popular phonetic input method is Zhuyin, and shape or stroke-based input methods such as Cangjie or Wubi are commonly used as well. However, used by nearly 76% of mainland Chinese keyboard users, the most popular way of typing in Chinese is the pinyin method, which is based on the pinyin romanization of Chinese characters.

All of the keyboard apps we analyze in this report fall into the category of input method editors (IMEs) that offer pinyin input. These keyboard apps are particularly interesting because they have grown to accommodate the challenge of allowing users to type Chinese characters quickly and easily. While many keyboard apps operate locally, solely within a user’s device, IME-based keyboard apps often have cloud features which enhance their functionality. Because of the complexities of predicting which characters a user may want to type next, especially in logographic languages like Chinese, IMEs often offer “cloud-based” prediction services which reach out over the network. Enabling “cloud-based” features in these apps means that longer strings of syllables that users type will be transmitted to servers elsewhere. As many have previously pointed out, “cloud-based” keyboards and input methods can function as vectors for surveillance and essentially behave as keyloggers. While the content of what users type is traveling from their device to the cloud, it is additionally vulnerable to network attackers if not properly secured. This report is not about how operators of cloud-based IMEs read users’ keystrokes, which is a phenomenon that has already been extensively studied and documented. This report is primarily concerned with the issue of protecting this sensitive data from network eavesdroppers.

In this report, we analyze the security of cloud-based pinyin keyboard apps from nine vendors: Baidu, Honor, Huawei, iFlytek, OPPO, Samsung, Tencent, Vivo, and Xiaomi. We examined these apps’ transmission of users’ keystrokes for vulnerabilities. Our analysis revealed critical vulnerabilities in keyboard apps from eight of the nine vendors (all but Huawei) that we could exploit to completely reveal the contents of users’ keystrokes in transit.

Between this report and our Sogou report, we estimate that close to one billion users are affected by this class of vulnerabilities. Sogou, Baidu, and iFlytek IMEs alone comprise over 95% of the market share for third-party IMEs in China, which are used by around a billion people. In addition to the users of third party keyboard apps, we found that the default keyboards on devices from three manufacturers (Honor, OPPO, and Xiaomi) were also vulnerable to our attacks. Devices from Samsung and Vivo also bundled a vulnerable keyboard, but it was not used by default. In 2023, Honor, OPPO, and Xiaomi alone comprised nearly 50% of the smartphone market in China.

Having the capability to read what users type on their devices is of interest to a number of actors — including government intelligence agencies that operate globally — because it may encompass exceptionally sensitive information about users and their contacts including financial information, login credentials such as usernames or passwords, and messages that are otherwise end-to-end encrypted. Given the known capabilities of state actors, and that Five Eyes agencies have previously exploited similar vulnerabilities in Chinese apps for the express purpose of mass surveillance, it is possible that we were not the first to discover these vulnerabilities and that they have previously been exploited on a mass scale for surveillance purposes.

We reported these issues to all eight of the vendors in whose keyboards we found vulnerabilities. Most vendors responded, took the issue seriously, and fixed the reported vulnerabilities, although some keyboard apps remain vulnerable. Users should keep their apps and operating systems up to date. We recommend that they consider switching from a cloud-based keyboard app to one that operates entirely on-device if they are concerned about these privacy issues.

The remainder of this report is structured as follows. In the “Related work” section, we outline previous security and privacy research that has been conducted on IME apps and past research which relates to issues of encryption in the Chinese app ecosystem. In “Methodology”, we describe the reverse engineering tools and techniques we used to analyze the above apps. In the “Findings” section, we explain the vulnerabilities we discovered in each app and (where applicable) how we exploited these vulnerabilities. In “Coordinated disclosure”, we discuss how we reported the vulnerabilities we found to the companies and their responses to our outreach. Finally, in “Discussion”, we reflect on the impact of the vulnerabilities we discovered, how they came to be, and ways that we can avoid similar problems in the future. We provide recommendations to all stakeholders in this systemic privacy and security failure, including users, IME and keyboard developers, operating systems, mobile device manufacturers, app store operators, international standards bodies, and security researchers.

Related work

There has been much work analyzing East Asian apps for their security and privacy properties. As examples from outside of China, researchers studied LINE, a Japanese-developed app, and KakaoTalk, a South Korean-developed app, finding that they have faults in their end-to-end encryption implementations. When it comes to Chinese software, the Citizen Lab has previously revealed privacy and security issues in several Chinese web browsers, and identified vulnerabilities in the Zoom video conferencing platform and the MY2022 Olympics app. Unfortunately, even developers of extremely popular apps often overlook implementing proper security measures and protecting user privacy.

Some work has been concerned specifically with the privacy issues with cloud-based keyboard apps. As the technology powering keyboard apps became more popular and sophisticated, awareness of the potential security risks associated with these apps grew. Two main areas of concern have received the most attention from security researchers when it comes to cloud-based keyboard apps: whether user data is secure in the cloud servers and whether it is secure in transit as it moves from the user’s device to a cloud server.

Some researchers have expressed concern over companies handling sensitive keystroke data and have made attempts to ameliorate the risk of the cloud server being able to record what you typed. In 2013, the Japanese government published concerns it had with privacy regarding the Baidu IME, particularly the cloud input function. Researchers have also been concerned with surveillance via other “cloud-based” IMEs, like iFlytek’s voice input. While there has been a push to develop privacy-aware cloud-based IMEs that would keep user data secret, they are not widely used. While it is concerning what companies might do with user keystroke data, our research pertains to the security of user keystroke data before it even reaches cloud servers and who else other than the cloud operator may be able to read it.

Other research has studied the leakage of sensitive information when user keystroke data is in transit between a user’s device and a remote cloud server. If not properly encrypted, data can be intercepted and collected by network eavesdroppers. In 2015, security researchers proposed and evaluated a system to identify keystroke leakages in IME traffic, revealing that at least one IME was transmitting sensitive data without encrypting it at all. Another investigation in the same year showed that the most popular IME, Sogou, was sending users’ device identifiers in the clear. In our 2023 report we exposed Sogou falling short once more, finding that Sogou allowed network eavesdroppers to read what users were typing, as they typed, in any application. All of these discoveries point to developers of these applications overlooking the importance of transport security to protect user data from network attackers.

While previous work studying the security of keystroke network data in transit investigates single keyboard apps at a time, our report is the first to holistically evaluate the network security of the cloud-based keyboard app landscape in China.

Methodology

We analyzed the Android and, if present, the iOS and Windows versions of keyboard apps from the following keyboard app vendors: Tencent, Baidu, iFlytek, Samsung, Huawei, Xiaomi, OPPO, Vivo, and Honor. The first three (Tencent, Baidu, and iFlytek) are software developers of keyboard apps, whereas the remaining six (Samsung, Huawei, Xiaomi, OPPO, Vivo, and Honor) are mobile device manufacturers who either developed their own keyboard apps or include one or more of the other three developers’ keyboard apps preinstalled on their devices. We selected these nine vendors because we identified them as having integrated cloud recommendation functionality into their products and because they are popularly used. Between August and November 2023, we obtained the latest versions of these apps from their product websites or the Apple App Store or, in the case of apps developed or bundled by mobile device manufacturers, by procuring a mobile device that had the app preinstalled on the ROM. When we obtained an app preinstalled on a mobile device, we ensured that the device’s apps and operating system were fully updated before beginning analysis of its apps. The devices we obtained were intended for the mainland Chinese market, and, when device manufacturers had two editions of their device, a Chinese edition and a global edition, we analyzed the Chinese edition.

To better understand whether these vendors’ keyboard apps securely implemented their cloud recommendation functionality, we analyzed them to determine whether they sufficiently encrypted users’ typed keystrokes. To do so, we used both static and dynamic analysis methods. We used jadx to decompile and statically analyze Dalvik bytecode and IDA Pro to decompile and statically analyze native machine code. We used frida to dynamically analyze the Android and iOS versions and IDA Pro to dynamically analyze the Windows version. Finally, we used Wireshark and mitmproxy to perform network traffic capture and analysis.

To prepare for our dynamic analysis of each keyboard app, after installing it, we enabled the pinyin input if it was not already enabled. The keyboards we analyzed generally prompted users to enable cloud functionality after installation or on first use. In such cases, we answered such prompts in the affirmative or otherwise enabled cloud functionality through the mobile device’s or app’s settings.

In our analysis, we assume a fairly conservative threat model. For most of our attacks, we assume a passive network eavesdropper that monitors network packets that are sent from a user’s keyboard app to a keyboard app’s cloud server. In one of our attacks, specifically against apps using Tencent’s Sogou API, we allow the adversary to be active in a limited way in that the adversary may additionally transmit network traffic to the cloud server but does not necessarily have to be a machine-in-the-middle (MITM) or spoof messages from the user in a layer 3 sense. In all of our attacks, the adversary also has access to a copy of the client software, but the server is a black box.

We note that, as neither Apple’s nor Google’s keyboard apps have a feature to transmit keystrokes to cloud servers for cloud-based recommendations, we did not (and could not) analyze these keyboards for the security of this feature. However, we observed that none of the mobile devices that we analyzed included Google’s keyboard, Gboard, preinstalled, either. This finding likely results from Google’s exit from China reportedly due to the company’s failure to comply with China’s pervasive censorship requirements.

Findings

Among the nine vendors whose apps we analyzed, we found that there was only one vendor, Huawei, in whose apps we could not find any security issues regarding the transmission of users’ keystrokes. For each of the remaining eight vendors, in at least one of their apps, we discovered a vulnerability in which keystrokes could be completely revealed by a passive network eavesdropper (see Table 1 for details).

Vulnerabilities across keyboard apps reveal keystrokes to network eavesdroppers

Legend
✘✘	working exploit created to decrypt transmitted keystrokes for both active and passive eavesdroppers
✘	working exploit created to decrypt transmitted keystrokes for an active eavesdropper
!	weaknesses present in cryptography implementation
(blank)	no known issues
N/A	product not offered or not present on device analyzed

Keyboard developer	Android	iOS	Windows
Tencent^†	✘	N/A	✘
Baidu	!	!	✘✘
iFlytek	✘✘

Pre-installed keyboard developer

Device manufacturer	Own	Sogou	Baidu	iFlytek	iOS	Windows
Samsung	✘✘	*	✘✘	N/A	N/A	N/A
Huawei	*		N/A	N/A	N/A	N/A
Xiaomi	N/A	✘*	✘✘	✘✘	N/A	N/A
OPPO	N/A	✘	✘✘*	N/A	N/A	N/A
Vivo	*	✘	N/A	N/A	N/A	N/A
Honor	N/A	N/A	✘✘*	N/A	N/A	N/A

Table 1: Summary of vulnerabilities discovered in popular keyboards and in keyboards pre-installed on popular phones.
* Default keyboard app on our test device.
^† Both QQ Pinyin and Sogou IME are developed by Tencent; in this report we analyzed QQ Pinyin and found the same issues as we had in Sogou IME.

The ease with which the keystrokes in these apps could be revealed varied. In one app, Samsung Keyboard, we found that the app performed no encryption whatsoever. Some apps appeared to internally use Sogou’s cloud functionality and were vulnerable to an attack which we previously published. Most vulnerable apps failed to use asymmetric cryptography and mistakenly relied solely on home-rolled symmetric encryption to protect users’ keystrokes.

The remainder of this section details our analysis of the apps from each vendor and, where present, their vulnerabilities.

Tencent

We analyzed one Tencent keyboard app, Sogou, in a previous report. Motivated by those findings, we analyzed another Tencent keyboard app, QQ Pinyin, on Android and Windows. We found that the Android version (8.6.3) and Windows version (6.6.6304.400) of this software communicated with cloud servers similar to Sogou’s and contained the same vulnerabilities as those we previously reported in Sogou IME (see Table 2 for details).

Platform	File/Package Name	Version analyzed	Secure?
Android	com.tencent.qqpinyin	8.6.3	✘
Windows	QQPinyin_Setup_6.6.6304.400.exe	6.6.6304.400	✘

Table 2: The versions of QQ Pinyin that we analyzed.

Baidu

We analyzed Baidu IME for Windows, Android, and iOS. We found that Baidu IME for Windows includes a vulnerability which allows network eavesdroppers to decrypt network transmissions. This means third parties can obtain sensitive personal information including what users have typed. We also found privacy and security weaknesses in the encryption used by the Android and iOS versions of Baidu IME (see Table 3 for details).

Platform	File/Package Name	Version analyzed	Secure?	Protocol
Windows	BaiduPinyinSetup_6.0.3.44.exe	6.0.3.44	✘✘	BAIDUv3.1
Android	com.baidu.input	11.7.19.9	!	BAIDUv4.0
iOS	com.baidu.inputMethod	11.7.20	!	BAIDUv4.0

Table 3: The versions of Baidu IME that we analyzed.

We found that the Android version transmitted keystroke information via UDP packets to udpolimeok.baidu.com and that the Windows and iOS versions transmitted keystrokes to udpolimenew.baidu.com. The two mobile versions that we analyzed, namely the Android and iOS versions, transmitted these keystrokes according to a stronger protocol, whose UDP payload begins with the bytes 0x04 0x00. The Windows version transmitted these keystrokes according to a weaker protocol, whose UDP payload begins with the bytes 0x03 0x01. We henceforth refer to these protocols as the BAIDUv4.0 and BAIDUv3.1 protocols, respectively. In the remainder of this section we detail multiple weaknesses in the BAIDUv4.0 protocol used by the Android and iOS versions and explain how a network eavesdropper can decrypt the contents of keystrokes transmitted by the BAIDUv3.1 protocol.
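The two protocols can be told apart from the first two bytes of a captured UDP payload. The following is a minimal sketch of that check, assuming the payload is already available as bytes; the function name `classify_baidu_payload` is hypothetical and not taken from Baidu's code.

```python
def classify_baidu_payload(udp_payload: bytes) -> str:
    """Label a UDP payload by the two-byte prefix described in the text."""
    prefix = udp_payload[:2]
    if prefix == b"\x04\x00":
        return "BAIDUv4.0"  # stronger protocol (Android and iOS versions)
    if prefix == b"\x03\x01":
        return "BAIDUv3.1"  # weaker protocol (Windows version)
    return "unknown"


# Made-up payload bodies; only the first two bytes matter here.
print(classify_baidu_payload(b"\x04\x00" + b"\xaa" * 8))
print(classify_baidu_payload(b"\x03\x01" + b"\xbb" * 8))
```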

Weaknesses in BAIDUv4.0 protocol

To encrypt keystroke information, the BAIDUv4.0 protocol uses elliptic-curve Diffie-Hellman and a pinned server public key (pk_s) to establish a shared secret key for use in a modified version of AES.

Upon opening the keyboard, before the first outgoing BAIDUv4.0 protocol message is sent, the application randomly generates a client Curve25519 public-private key pair, which we will call (pk_c, sk_c). Then, a Diffie-Hellman shared secret k is generated using sk_c and a pinned public key pk_s. To send a message with plaintext P, the application reuses the first 16 bytes of pk_c as the initialization vector (IV) for symmetric encryption, and k is used as the symmetric encryption key. The resulting symmetric encryption of P is then sent along with pk_c to the server. The server can then obtain the same Diffie-Hellman shared secret k from pk_c and sk_s, the private key corresponding to pk_s, to decrypt the ciphertext.
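The exchange described above can be sketched as follows. This is an illustration under stated assumptions, not Baidu's implementation: the `x25519` function is a plain, non-constant-time Python rendering of the RFC 7748 Montgomery ladder, the server key pair is generated locally as a stand-in for the real pinned pk_s, the raw shared secret is used directly as k, and all names are ours.

```python
import os

P = 2**255 - 19
A24 = 121665
BASE_POINT = (9).to_bytes(32, "little")


def x25519(k: bytes, u: bytes) -> bytes:
    """X25519 scalar multiplication (RFC 7748 ladder); not constant-time."""
    kb = bytearray(k)
    kb[0] &= 248
    kb[31] &= 127
    kb[31] |= 64
    scalar = int.from_bytes(kb, "little")
    ub = bytearray(u)
    ub[31] &= 127
    x1 = int.from_bytes(ub, "little") % P
    x2, z2, x3, z3, swap = 1, 0, x1, 1, 0
    for t in range(254, -1, -1):
        bit = (scalar >> t) & 1
        swap ^= bit
        if swap:
            x2, x3 = x3, x2
            z2, z3 = z3, z2
        swap = bit
        a = (x2 + z2) % P
        aa = a * a % P
        b = (x2 - z2) % P
        bb = b * b % P
        e = (aa - bb) % P
        c = (x3 + z3) % P
        d = (x3 - z3) % P
        da = d * a % P
        cb = c * b % P
        x3 = (da + cb) ** 2 % P
        z3 = x1 * (da - cb) ** 2 % P
        x2 = aa * bb % P
        z2 = e * (aa + A24 * e) % P
    if swap:
        x2, x3 = x3, x2
        z2, z3 = z3, z2
    return (x2 * pow(z2, P - 2, P) % P).to_bytes(32, "little")


# Stand-in server key pair; the real sk_s is known only to the server.
sk_s = os.urandom(32)
pk_s = x25519(sk_s, BASE_POINT)

# Client side: generated once when the keyboard is opened.
sk_c = os.urandom(32)
pk_c = x25519(sk_c, BASE_POINT)
k_client = x25519(sk_c, pk_s)  # shared secret k, used as the symmetric key
iv = pk_c[:16]                 # IV reuses the first 16 bytes of pk_c

# Server side: recovers the same k from pk_c, which is sent with each message.
k_server = x25519(sk_s, pk_c)
```

Because both k and the IV are functions of the client key pair alone, they stay fixed for as long as that key pair lives.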

The BAIDUv4.0 protocol symmetrically encrypts data using a modified version of AES, which symbols in the code indicate Baidu has called AESv3. Compared to ordinary AES, AESv3 has a built-in cipher mode and padding. AESv3’s built-in cipher mode mixes bytes differently and uses a modified counter (CTR) mode which we call Baidu CTR (BCTR) mode, illustrated in Figure 1.

Generally speaking, any CTR cipher mode involves combining an initialization vector v with the value i of some counter, whose combination we shall notate as v + i. Most commonly, the counter value used for block i is simply i, i.e., it begins at zero and increments for each subsequent block, and AESv3’s implementation follows this convention. There is no standard way to compute v + i in CTR mode, but the way that BCTR combines v and i is by adding i to the left-most 32-bits of v, interpreting this portion of v and i in little-endian byte order. If the sum overflows, then no carrying is performed on bytes to the right of this 32-bit value. The implementation details we have thus far described do not significantly deviate from a typical CTR implementation. However, where BCTR mode differs from ordinary CTR mode is in how the value v + i is used during encryption. In ordinary CTR mode, to encrypt block i with key k, you would compute

plain_i XOR encrypt(v + i, k).

In BCTR mode, to encrypt block i, you compute

encrypt(plain_i XOR (v + i), k).

As we will see later, this deviation will have implications for the security of the algorithm.
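The difference between the two modes can be made concrete with a short sketch. It is illustrative only: `toy_block_encrypt` is a hash-based stand-in that we introduce in place of the block cipher (it is neither AES nor AESv3, and it is not invertible), and the helper names are ours. Only the way v + i is formed and where it enters the computation follow the description above.

```python
import hashlib

BLOCK = 16


def toy_block_encrypt(block: bytes, key: bytes) -> bytes:
    """Stand-in keyed function, NOT AES or AESv3; for illustration only."""
    return hashlib.sha256(key + block).digest()[:BLOCK]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def add_counter(v: bytes, i: int) -> bytes:
    """v + i: add i to the left-most 32 bits of v, read as little-endian.

    Overflow wraps within those 32 bits; nothing carries into the rest of v.
    """
    head = (int.from_bytes(v[:4], "little") + i) & 0xFFFFFFFF
    return head.to_bytes(4, "little") + v[4:]


def ctr_encrypt_block(plain_i: bytes, v: bytes, i: int, k: bytes) -> bytes:
    # Ordinary CTR: plain_i XOR encrypt(v + i, k)
    return xor(plain_i, toy_block_encrypt(add_counter(v, i), k))


def bctr_encrypt_block(plain_i: bytes, v: bytes, i: int, k: bytes) -> bytes:
    # BCTR: encrypt(plain_i XOR (v + i), k)
    return toy_block_encrypt(xor(plain_i, add_counter(v, i)), k)
```

In ordinary CTR mode the plaintext never passes through the block cipher, so applying `ctr_encrypt_block` twice returns the original block. In BCTR mode the plaintext is the cipher input, which is why decryption needs the inverse cipher and why equal cipher inputs yield equal ciphertext blocks.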

While ordinarily CTR mode does not require the final block length to be a multiple of the cipher’s block size (in the case of AES, 16 bytes), due to Baidu’s modifications, BCTR mode no longer automatically possesses this property but rather achieves it by employing ciphertext stealing. If the final block length n is less than 16, AESv3’s implementation encrypts the final 16 byte block by taking the last (16 – n) bytes of the penultimate ciphertext block and prepending them to the n bytes of the ultimate plaintext block. The encryption of the resultant block fills the last (16 – n) bytes of the penultimate ciphertext block and the n bytes of the final ciphertext block. Note, however, that this practice only works when the plaintext consists of at least two blocks. Therefore, if there exists only one plaintext block, then AESv3 right-zero-pads that block to be 16 bytes.
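The byte layout of the ciphertext stealing step can be sketched as below. Several things here are assumptions on our part: the block function is a hash-based stand-in rather than AESv3, the final stolen block is assumed to be processed with the next counter value (the text does not say), and empty input is simply rejected. The sketch only shows which bytes end up where and that the output length matches the input length for inputs of at least two blocks.

```python
import hashlib

BLOCK = 16


def toy_bctr_block(plain_i: bytes, v: bytes, i: int, k: bytes) -> bytes:
    """Stand-in for one BCTR block encryption; NOT AESv3."""
    head = (int.from_bytes(v[:4], "little") + i) & 0xFFFFFFFF
    counter = head.to_bytes(4, "little") + v[4:]
    mixed = bytes(x ^ y for x, y in zip(plain_i, counter))
    return hashlib.sha256(k + mixed).digest()[:BLOCK]


def bctr_encrypt_layout(plaintext: bytes, v: bytes, k: bytes) -> bytes:
    if not plaintext:
        raise ValueError("empty input is not covered by this sketch")
    blocks = [plaintext[o:o + BLOCK] for o in range(0, len(plaintext), BLOCK)]
    n = len(blocks[-1])
    full = blocks if n == BLOCK else blocks[:-1]
    out = [toy_bctr_block(b, v, i, k) for i, b in enumerate(full)]
    if n < BLOCK:
        if not out:
            # Only one (short) block: right-zero-pad it to 16 bytes.
            out.append(toy_bctr_block(blocks[-1].ljust(BLOCK, b"\x00"), v, 0, k))
        else:
            # Steal the last (16 - n) bytes of the penultimate ciphertext block
            # and prepend them to the n bytes of the final plaintext block.
            stolen = out[-1][n:]
            last = toy_bctr_block(stolen + blocks[-1], v, len(out), k)
            # The result overwrites the stolen bytes and supplies the final n.
            out[-1] = out[-1][:n]
            out.append(last)
    return b"".join(out)
```

For example, a 37-byte input produces 37 bytes of output, while a 3-byte input is padded and produces a full 16-byte block.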

Privacy issues with key and IV re-use

Since the IV and key are both directly derived from the client key pair, the IV and key are reused until the application generates a new key pair. This only happens when the application restarts, such as when the user restarts the mobile device, the user switches to a different keyboard and back, or the keyboard app is evicted from memory. From our testing, we have observed the same key and IV in use for over 24 hours. There are various issues that arise from key and IV reuse.

Re-using the same IV and key means that the same inputs will encrypt to the same ciphertext. Additionally, due to the way the block cipher is constructed, if blocks in the same positions of the plaintexts are the same, they will encrypt to the same ciphertext blocks. As an example, if the second blocks of two plaintexts are the same, the second blocks of the corresponding ciphertexts will be the same.

Weakness in cipher mode

The electronic codebook (ECB) cipher mode is notorious for having the undesirable property that equivalent plaintext blocks encrypt to equivalent ciphertext blocks, allowing patterns in the plaintext to be revealed in the ciphertext (see Figure 2 for an illustration).

While the BCTR mode used by Baidu does not reveal patterns as flagrantly as ECB mode, there are circumstances in which patterns in the plaintext can still be revealed in the ciphertext. Specifically, a counter-like pattern in the plaintext can be revealed by the ciphertext (see Figure 3 for an example). This is possible because (IV + i) is XORed with each plaintext block i and then encrypted, unlike ordinary CTR mode, which encrypts (IV + i) and XORs the result with the plaintext. Thus, when using BCTR mode, if the plaintext exhibits counting patterns similar to those of (IV + i), then for multiple blocks the value ((IV + i) XOR plaintext block i) may be equal and thus encrypt to the same ciphertext block.
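A small experiment illustrates how a counting pattern in the plaintext can cancel the counter. This is a sketch under assumptions of our own: the block function is a hash-based stand-in rather than AESv3, and the IV and plaintext are contrived so that the low byte of the IV's counter word is zero and each plaintext block begins with its own little-endian index.

```python
import hashlib


def toy_block_encrypt(block: bytes, key: bytes) -> bytes:
    """Stand-in keyed function, NOT AES or AESv3."""
    return hashlib.sha256(key + block).digest()[:16]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def add_counter(v: bytes, i: int) -> bytes:
    head = (int.from_bytes(v[:4], "little") + i) & 0xFFFFFFFF
    return head.to_bytes(4, "little") + v[4:]


key = b"demo key, not a real one"
v = bytes.fromhex("00a1b2c3") + bytes(range(12))  # hypothetical IV
tail = b"row-data-xyz"                            # 12 constant bytes
blocks = [i.to_bytes(4, "little") + tail for i in range(4)]

# BCTR: the counting pattern in the plaintext cancels the counter in v + i,
# so every block presents the same input to the cipher.
bctr_blocks = [toy_block_encrypt(xor(b, add_counter(v, i)), key)
               for i, b in enumerate(blocks)]

# Ordinary CTR: the counter goes through the cipher, so nothing cancels.
ctr_blocks = [xor(b, toy_block_encrypt(add_counter(v, i), key))
              for i, b in enumerate(blocks)]

print(len(set(bctr_blocks)), len(set(ctr_blocks)))
```

Under BCTR, the four distinct plaintext blocks collapse to a single repeated ciphertext block, whereas the CTR ciphertext blocks all differ.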

More generally, BCTR mode fails to provide the cryptographic property of diffusion. Specifically, if an algorithm provides diffusion, then, when we change a single bit of the plaintext, we expect half of the bits of the ciphertext to change. However, the example in Figure 3 illustrates a case where changing a single bit of the plaintext caused zero bits of the ciphertext to change, a clear violation of the expectations of this property. The property of diffusion is vital in secure cryptographic algorithms so that patterns in the plaintext are not visible as patterns in the ciphertext.

Other privacy and security weaknesses

There are other weaknesses in the custom encryption protocol designed by Baidu IME that are not consistent with the expected standards for a modern encryption protocol used by hundreds of millions of devices.

Forward secrecy issues with static Diffie-Hellman

The use of a pinned static server key means that the protocol does not provide forward secrecy, a property offered by other modern network encryption protocols like TLS. If the server’s private key is ever revealed, any past message whose shared secret was generated with that key can be successfully decrypted.

Lack of message integrity

There are no cryptographically secure message integrity checks, which means that a network attacker may freely modify the ciphertext. There is a CRC32 checksum calculated and included with the plaintext data, but a CRC32 checksum does not provide cryptographic integrity, as it is easy to generate CRC32 checksum collisions. Therefore, modifying the ciphertext may be possible. In combination with the issue concerning key and IV reuse, this protocol may be vulnerable to a swapped block attack.
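To illustrate why a CRC32 checksum is not a cryptographic integrity check, the sketch below uses the fact that CRC32 is an affine function of its input: the checksum of a modified message can be predicted from the original checksum and the modification alone, without knowing the message. This demonstrates a general property of CRC32 using Python's `zlib.crc32`; it is not an attack on the BAIDUv4.0 protocol itself, where the checksum sits inside the encrypted plaintext. The helper name `crc_after_flip` is ours.

```python
import zlib


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def crc_after_flip(crc_of_message: int, delta: bytes) -> int:
    """Predict CRC32(message XOR delta) from CRC32(message) and delta only."""
    zeros = bytes(len(delta))
    return crc_of_message ^ zlib.crc32(delta) ^ zlib.crc32(zeros)


message = b"example keystroke payload"
delta = bytes(len(message) - 1) + b"\x01"  # flip the lowest bit of the last byte

predicted = crc_after_flip(zlib.crc32(message), delta)
actual = zlib.crc32(xor(message, delta))
print(predicted == actual)
```

A keyed message authentication code does not have this property, since computing or adjusting the tag requires the secret key.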
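One way to see why CRC32 offers no protection against deliberate modification is that it is unkeyed and affine over XOR: for three equal-length messages, the checksum of their XOR equals the XOR of their checksums. The sketch below uses Python’s standard `zlib.crc32` with made-up example messages to check this property and contrasts it with a keyed HMAC. The HMAC key is purely illustrative; the protocol described here is not reported to use one.

```python
import hashlib
import hmac
import zlib


def xor_bytes(*chunks: bytes) -> bytes:
    """XOR several equal-length byte strings together."""
    out = bytearray(len(chunks[0]))
    for chunk in chunks:
        for i, value in enumerate(chunk):
            out[i] ^= value
    return bytes(out)


# Three arbitrary 16-byte example messages (not captured traffic).
a = b"nihaocanyouread1"
b = b"A" * 16
c = bytes(range(16))

# CRC32 is affine: crc(a ^ b ^ c) == crc(a) ^ crc(b) ^ crc(c).
crc_is_affine = (
    zlib.crc32(xor_bytes(a, b, c))
    == zlib.crc32(a) ^ zlib.crc32(b) ^ zlib.crc32(c)
)


def tag(message: bytes, key: bytes = b"illustrative-key") -> bytes:
    # A keyed MAC cannot be recomputed without the key.
    return hmac.new(key, message, hashlib.sha256).digest()


# The same relation does not hold for HMAC-SHA256 tags.
hmac_is_affine = tag(xor_bytes(a, b, c)) == xor_bytes(tag(a), tag(b), tag(c))
```

An attacker who can predict how a change to the data changes the checksum, or who can simply recompute the checksum, is not stopped by it; a keyed MAC is designed to rule out both.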

Vulnerability in BAIDUv3.1 protocol

The BAIDUv3.1 protocol is weaker than the BAIDUv4.0 protocol and contains a critical vulnerability that allows an eavesdropper to decrypt any messages encrypted with it. The protocol in the versions of Baidu’s keyboard apps that we analyzed encrypts keystrokes using a modified version of AES which we call AESv2, as we believe it to be the predecessor cipher to Baidu’s AESv3. When a keyboard app uses the BAIDUv3.1 protocol with the AESv2 cipher, we say that it uses the BAIDUv3.1+AESv2 scheme. Normally, AES when used with a 128-bit key performs 10 rounds of encryption on each block. However, we found that AESv2 uses only 9 rounds but is otherwise equivalent to AES encryption with a 128-bit key.

The BAIDUv3.1+AESv2 scheme encrypts keystrokes using AESv2 in the following manner. First, a key is derived according to a fixed function (see Figure 4). Note that the function takes no input and references no external state, and thus it always generates the same static key k_f = “\xff\x9e\xd5H\x07Z\x10\xe4\xef\x06\xc7.\xa7\xa2\xf26”.

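def derive_fixed_key():
    # No inputs and no external state: the result is the same on every call.
    key = []
    x = 0
    for i in range(16):
        # Mix the loop index with the running accumulator; keep the low byte.
        key.append((~i ^ ((i + 11) * (x >> (i & 3)))) & 0xff)
        x += 1937
    return bytes(key)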

Figure 4:

Python code equivalent to the code that the BAIDUv3.1 protocol uses to derive its fixed key. The function takes no input and derives the same key on every invocation.

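As a sanity check on Figure 4, the derivation can also be written in closed form, since the accumulator x simply equals 1937 · i at iteration i. The short sketch below (our own restatement, not code taken from the app) confirms that this form reproduces the key k_f quoted in the text.

```python
def derive_fixed_key() -> bytes:
    # At iteration i the accumulator in Figure 4 holds 1937 * i,
    # so each key byte depends on the loop index alone.
    return bytes(
        (~i ^ ((i + 11) * ((1937 * i) >> (i & 3)))) & 0xFF
        for i in range(16)
    )


# The static key k_f as quoted in the text.
K_F = b"\xff\x9e\xd5H\x07Z\x10\xe4\xef\x06\xc7.\xa7\xa2\xf26"

key_matches = derive_fixed_key() == K_F
```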

To encrypt a protobuf-serialized message, the BAIDUv3.1 protocol first snappy-compresses it, forming a compressed buffer. The 32-bit, little-endian length of this compressed message is then prepended to the compressed buffer, forming the plaintext. A randomly generated 128-bit key k_m is used to encrypt the plaintext using AESv2 in ECB mode. The resulting ciphertext is stored in bytes 44 until the end of the final UDP payload. Key k_f is used to encrypt k_m using AESv2 in ECB mode. The resulting ciphertext is stored in bytes 28 until 44 of the final UDP payload.
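The plaintext framing described above can be sketched in a few lines of Python. This is a minimal illustration assuming only what the text states (a 32-bit little-endian length followed by the compressed buffer). The function names are ours, the sample bytes are a placeholder rather than real snappy output, and the zero padding is a hypothetical choice, since the text does not say how the final block is padded.

```python
import struct

# 32-bit, little-endian length prefix, as described in the text.
LENGTH_PREFIX = struct.Struct("<I")


def frame_plaintext(compressed: bytes) -> bytes:
    """Prepend the compressed buffer's length to form the plaintext."""
    return LENGTH_PREFIX.pack(len(compressed)) + compressed


def unframe_plaintext(plaintext: bytes) -> bytes:
    """Recover the compressed buffer, ignoring any trailing padding."""
    (length,) = LENGTH_PREFIX.unpack_from(plaintext, 0)
    start = LENGTH_PREFIX.size
    body = plaintext[start:start + length]
    if len(body) != length:
        raise ValueError("plaintext is shorter than its length prefix claims")
    return body


compressed = b"\x01\x02\x03\x04\x05"  # placeholder, not real snappy output
plaintext = frame_plaintext(compressed)

# Hypothetical padding up to the 16-byte block size used by ECB mode.
padded = plaintext + b"\x00" * (-len(plaintext) % 16)
recovered = unframe_plaintext(padded)
```

A length prefix of this kind lets a receiver discard whatever padding the block cipher required before handing the buffer to the decompressor.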

We found that these encrypted protobuf serializations include our typed keystrokes as well as the name of the application into which we were typing them (see Figure 5).

[...]
2 {
    1: "nihaocanyoureadthis"
    5: 3407918
  }
3 {
    1: 107
    2: 10
    5: 1
  }
4 {
    1: "1133d4c64afbf1feda85d3c497dd6164|0"
    2: "wn1||0"
    3: "6.0.3.44"
    4: "notepad.exe"
  }
  
[...]

Figure 5:

Excerpt of decrypted information, including what we had typed (“nihaocanyoureadthis”) and the app into which it was typed (“notepad.exe”).

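The excerpts shown in this report’s figures resemble a schema-less dump of protobuf wire data. As a rough illustration of how such a dump can be produced once a message has been decrypted and decompressed, here is a minimal raw decoder for the protobuf wire format. The function names are ours, the example bytes are hand-built rather than captured traffic, and nested messages are returned as undecoded bytes.

```python
def read_varint(buf: bytes, pos: int):
    """Decode a base-128 varint starting at pos; return (value, new_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def decode_raw(buf: bytes):
    """Return a list of (field_number, wire_type, value) tuples."""
    fields = []
    pos = 0
    while pos < len(buf):
        tag, pos = read_varint(buf, pos)
        field_number, wire_type = tag >> 3, tag & 7
        if wire_type == 0:      # varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 1:    # fixed 64-bit
            value, pos = buf[pos:pos + 8], pos + 8
        elif wire_type == 2:    # length-delimited (string, bytes, submessage)
            length, pos = read_varint(buf, pos)
            value, pos = buf[pos:pos + length], pos + length
        elif wire_type == 5:    # fixed 32-bit
            value, pos = buf[pos:pos + 4], pos + 4
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
        fields.append((field_number, wire_type, value))
    return fields


# Hand-encoded example: field 1 is a string, field 5 is a varint.
example = b"\x0a\x13nihaocanyoureadthis" + b"\x28\xae\x80\xd0\x01"
decoded = decode_raw(example)
```

No schema is needed to recover string fields this way, which is why typed text and application names are readable as soon as the encryption is removed.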

A vulnerability exists in the BAIDUv3.1+AESv2 scheme that allows a network eavesdropper to decrypt the contents of these messages. Since AES is a symmetric encryption algorithm, the same key used to encrypt a message can also be used to decrypt it. Since k_f is fixed, any network eavesdropper with knowledge of k_f, such as from performing the same analysis of the app as we performed, can decrypt k_m and thus can decrypt the plaintext contents of each message encrypted in the manner described above. As we found that users’ keystrokes and the names of the applications they were using were sent in these messages, a network eavesdropper who is eavesdropping on a user’s network traffic can observe what that user is typing and into which application they are typing it by taking advantage of this vulnerability.

iFlytek

We analyzed iFlytek (also called xùnfēi from the pinyin of 讯飞) IME on Android, iOS, and Windows. We found that iFlytek IME for Android includes a vulnerability which allows network eavesdroppers to recover the plaintext of insufficiently encrypted network transmissions, revealing sensitive information including what users have typed (see Table 4 for details).

Platform	File/Package Name	Version analyzed	Secure?
Android	com.iflytek.inputmethod	12.1.10	✘✘
iOS	com.iflytek.inputime	12.1.3338
Windows	iFlyIME_Setup_3.0.1734.exe	3.0.1734

Table 4: The versions of iFlytek IME analyzed.

The Android version of iFlytek IME encrypts the payload of each HTTP request sent to pinyin.voicecloud.cn with the following algorithm. Let s be the current time in seconds since the Unix epoch at the time of the request. For each request, an 8-byte encryption key is then derived by first performing the following computation:

x = (s % 0x5F5E100) ^ 0x1001111

The 8-byte key k is then derived from x as the lowest 8 ASCII-encoded digits of x, left-padded with leading zeroes if necessary, in big-endian order. In Python, the above can be summarized by the following expression:

k = b'%08u' % ((s % 0x5F5E100) ^ 0x1001111)

The payload of the request is then padded with PKCS#7 padding and then encrypted with DES using key k in ECB mode. The value s is transmitted in the HTTP request in the clear as a GET parameter named “time”.

Since DES is a symmetric encryption algorithm, the same key used to encrypt a message can also be used to decrypt it. Since k can be easily derived from s and since s is transmitted in the clear in every HTTP request encrypted by k, any network eavesdropper can easily decrypt the contents of each HTTP request encrypted in the manner described above. (Since s is simply the time in single second resolution, it also stands to reason that a network eavesdropper would have general knowledge of s in any case.)
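The key derivation and padding steps described above can be sketched as follows. This is a minimal illustration, not iFlytek’s code: the function names are ours, and truncating to the last eight digits is our reading of “the lowest 8 ASCII-encoded digits”, since x can occasionally exceed eight decimal digits. DES itself is not in the Python standard library, so the cipher call is omitted; any DES-ECB implementation keyed with the derived value would complete the picture.

```python
MODULUS = 0x5F5E100  # 100,000,000
MASK = 0x1001111


def derive_key(s: int) -> bytes:
    """Derive the 8-byte DES key from the cleartext "time" parameter s."""
    x = (s % MODULUS) ^ MASK
    # Zero-pad to eight digits, then keep only the lowest eight
    # (assumption: this matches the truncation described in the text).
    return (b"%08d" % x)[-8:]


def pkcs7_pad(data: bytes, block_size: int = 8) -> bytes:
    """Pad data to a multiple of block_size (8 bytes for DES)."""
    n = block_size - len(data) % block_size
    return data + bytes([n]) * n


def pkcs7_unpad(data: bytes, block_size: int = 8) -> bytes:
    """Strip PKCS#7 padding, rejecting malformed input."""
    n = data[-1]
    if not 1 <= n <= block_size or data[-n:] != bytes([n]) * n:
        raise ValueError("invalid PKCS#7 padding")
    return data[:-n]


# An eavesdropper reads s from the request URL and derives the same key.
observed_time = 0  # hypothetical value of the "time" GET parameter
key = derive_key(observed_time)
```

Nothing in this derivation is secret: every input is either a constant or a value sent in the clear alongside the ciphertext.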

We found that users’ keystrokes were transmitted in a protobuf serialization and encrypted in this manner (see Figure 6). Therefore, a network eavesdropper who is eavesdropping on a user’s network traffic can observe what that user is typing by taking advantage of this vulnerability.

1: 0
2: 0
3: 49
4: "xxxxx"
5: 0
7 {
    1: "app_id"
    2: "100IME"
}
7 {
    1: "uid"
    2: "230817031752396418"
}
7 {
    1: "cli_ver"
    2: "12.1.14983"
}
7 {
    1: "net_type"
    2: "wifi"
}
7 {
    1: "OS"
    2: "android"
}
8: 8

Figure 6:

Decrypted information revealing what we had typed (“xxxxx”).


Finally, the DES encryption algorithm is an older encryption algorithm with known weaknesses, and the ECB block cipher mode is a simplistic and problematic cipher mode. The use of each of these technologies is problematic in itself and opens the Android version of iFlytek IME’s communications to additional attacks.

Samsung

We analyzed Samsung Keyboard on Android as well as the versions of Sogou IME and Baidu IME that Samsung bundled with our test device, an SM-T220 tablet running ROM version T220CHN4CWF4. We found that Samsung Keyboard for Android and Samsung’s bundled version of Baidu IME each include a vulnerability that allows network eavesdroppers to recover the plaintext of insufficiently encrypted network transmissions, revealing sensitive information including what users have typed (see Table 5 for details).

Platform	Application name	Package name	Version analyzed	Secure?
OneUI 5.1	Samsung Keyboard	com.samsung.android.honeyboard	5.6.10.26	✘✘
OneUI 5.1	百度输入法 (Baidu IME)	com.baidu.input	8.5.20.4	✘✘
OneUI 5.1	搜狗输入法三星版 (Sogou IME Samsung Version)	com.sohu.inputmethod.sogou.samsung	10.32.38.202307281642

Table 5: The keyboards analyzed on our Samsung test device.

Samsung Keyboard (com.samsung.android.honeyboard)

We found that when using Samsung Keyboard on the Chinese edition of a Samsung device and when Pinyin is chosen as Samsung Keyboard’s input language, Samsung Keyboard transmits keystroke data to the following URL in the clear via HTTP POST:

http://shouji.sogou.com/web_ime/mobile_pb.php?durtot=339&h=8f2bc112-bbec-3f96-86ca-652e98316ad8&r=android_oem_samsung_open&v=8.13.10038.413173&s=&e=&i=&fc=0&base=dW5rbm93biswLjArMC4w&ext_ver=0

The keystroke data is contained in the request’s HTTP payload in a protobuf serialization (see Figure 7 below).

1 {
    1: "8f2bc112-bbec-3f96-86ca-652e98316ad8"
    2: "android_oem_samsung_open"
    3: "8.13.10038.413173"
    4: "999"
    5: 1
    7: 2
}
2 {
    1: "\351\000"
    2: "\372\213"
}
4: "com.tencent.mobileqq"
7: "nihaocanyoureadthis"
16: 10
17 {
    3 {
        1: 1
        2: 5
    }
    5: 1
    9: 1
}
18: ""
19 {
    1: "0"
    4: "339"
}

Figure 7:

Protobuf message transmitted after typing “nihaocanyoureadthis”.


The device on which we were testing was fully updated on the date of testing (October 7, 2023) in that it had all OS updates applied and had all updates from the Samsung Galaxy Store applied.

Since Samsung Keyboard transmits keystroke data via plain, unencrypted HTTP and since there is no encryption applied at any other layer, a network eavesdropper who is monitoring a Samsung Keyboard user’s network traffic can easily observe that user’s keystrokes if that user is using the Chinese edition of the ROM with the Pinyin input language selected.
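Because the request travels over plain HTTP, even the URL is readable to an eavesdropper without any cryptographic work. The following sketch parses the URL quoted above with Python’s standard library. What each parameter represents is not stated in the text, so the code only shows that the values are directly recoverable, including the Base64-encoded `base` parameter.

```python
import base64
from urllib.parse import parse_qs, urlsplit

# The request URL as observed in the clear.
url = (
    "http://shouji.sogou.com/web_ime/mobile_pb.php"
    "?durtot=339&h=8f2bc112-bbec-3f96-86ca-652e98316ad8"
    "&r=android_oem_samsung_open&v=8.13.10038.413173"
    "&s=&e=&i=&fc=0&base=dW5rbm93biswLjArMC4w&ext_ver=0"
)

parts = urlsplit(url)
# keep_blank_values preserves the empty s, e, and i parameters.
params = parse_qs(parts.query, keep_blank_values=True)

# The "base" parameter is ordinary Base64, not encryption.
decoded_base = base64.b64decode(params["base"][0])
```

The `h` value in the URL matches the identifier in field 1 of the protobuf payload shown in Figure 7, so the same identifier is exposed twice in each request.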

When using the global edition of the ROM or when using a non-Pinyin input language, we did not observe the Samsung keyboard communicating with cloud servers.

百度输入法 (“Baidu IME”, com.baidu.input)

We found that the version of Baidu IME bundled with our Samsung test device transmitted keystroke information via UDP packets to udpolimenew.baidu.com. This version of Baidu IME used the BAIDUv3.1 protocol that we describe in the Baidu section earlier but with a different cipher and compression algorithm as indicated in each transmission’s header. In the remainder of this section we explain how a network eavesdropper can, just like with AESv2, decrypt the contents of messages encrypted using a scheme we call BAIDUv3.1+AESv1 (see Table 6).

Protocol	Scheme	Cipher	Mode	Comparison of cipher to AES
BAIDUv3.1	BAIDUv3.1+AESv1	AESv1	ECB	Additional permutations
BAIDUv3.1	BAIDUv3.1+AESv2	AESv2	ECB	Missing round
BAIDUv4.0	BAIDUv4.0+AESv3	AESv3	BCTR	Uses home-rolled cipher mode

Table 6: Summary of ciphers used across different Baidu protocols.

Samsung’s bundled version of Baidu IME encrypts keystrokes using a modified version of AES which we name AESv1, as we believe it to be the predecessor to Baidu’s AESv2. When encrypting, AESv1’s key expansion is like that of standard AES, except that, on each subkey but the first, the order of the subkey’s bytes is additionally permuted. Furthermore, on the encryption of each block, the bytes of the block are additionally permuted in two locations, once near the beginning of the block’s encryption immediately after the block has been XOR’d by the first subkey and again near the end of the block’s encryption immediately before S-box substitution. Aside from complicating our analysis, we are not aware of these modifications altering the security properties of AES, and we have developed an implementation of this algorithm to both encrypt and decrypt messages given a plaintext or ciphertext and a key.

Samsung’s bundled version of Baidu IME encrypts keystrokes by applying AESv1 in electronic codebook (ECB) mode in the following manner. First, the app uses the fixed 128-bit key, k_f = “\xff\x9e\xd5H\x07Z\x10\xe4\xef\x06\xc7.\xa7\xa2\xf26”, to encrypt another, generated, key, k_m. The fixed key k_f is the same key the BAIDUv3.1 protocol uses for AESv2 (see Figure 4). The encryption of k_m is stored in bytes 64 until 80 of each UDP packet’s payload. The key k_m is then used to encrypt the remainder of a zlib-compressed message payload, which is stored at byte 80 until the end of the UDP payload. We found that the encrypted payload included, in a binary container format which we did not recognize, our typed keystrokes as well as the name of the application into which we were typing them (see Figure 8).
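The two BAIDUv3.1 schemes differ in where the wrapped key sits in the UDP payload and in which compression is applied. The sketch below records the byte offsets given in the text in a small lookup table and splits a payload accordingly. The type and function names are ours, the sample payload is synthetic, and decrypting the two parts would additionally require implementations of AESv1 or AESv2, which are not shown.

```python
from typing import NamedTuple, Tuple


class PayloadLayout(NamedTuple):
    key_start: int     # first byte of the encrypted k_m
    key_end: int       # ciphertext runs from here to the end of the payload
    compression: str   # compression applied before encryption


# Offsets and compression as described in the text for each scheme.
LAYOUTS = {
    "BAIDUv3.1+AESv1": PayloadLayout(64, 80, "zlib"),
    "BAIDUv3.1+AESv2": PayloadLayout(28, 44, "snappy"),
}


def split_payload(scheme: str, payload: bytes) -> Tuple[bytes, bytes]:
    """Return (encrypted k_m, encrypted message body) for a UDP payload."""
    layout = LAYOUTS[scheme]
    if len(payload) < layout.key_end:
        raise ValueError("payload too short for this scheme")
    wrapped_key = payload[layout.key_start:layout.key_end]
    body = payload[layout.key_end:]
    return wrapped_key, body


# Synthetic payload whose byte values equal their offsets.
sample = bytes(range(100))
v1_key, v1_body = split_payload("BAIDUv3.1+AESv1", sample)
v2_key, v2_body = split_payload("BAIDUv3.1+AESv2", sample)
```

In both layouts the wrapped key is exactly 16 bytes, a single cipher block, which is consistent with a 128-bit k_m encrypted under k_f in ECB mode.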

0: [800,
1276,
10,
0,
"92F8EE78F1DDCBE74CFEB1166F70883D%7C0",
"a1|SM-T220-gta7litewifi|320",
"8.5.20.4",
"com.android.settings.intelligence",
"1012497q",
"",
"2你好惨又热大腿",
""],
1: [0, "", "nihaocanyoureadthis"]

Figure 8:

The decrypted and decompressed payload, revealing what we had typed (“nihaocanyoureadthis”, highlighted) and the app into which it was typed (“com.android.settings.intelligence”); on top is a hex dump of, when decrypted and decompressed, the resulting proprietary binary blob, and below it is our understanding of how to parse it.


A vulnerability exists in the BAIDUv3.1+AESv1 scheme that allows a network eavesdropper to decrypt the contents of these messages. Since AES, including AESv1, is a symmetric encryption algorithm, the same key used to encrypt a message can also be used to decrypt it. Since k_f is hard-coded, any network eavesdropper with knowledge of k_f can decrypt k_m and thus decrypt the plaintext contents of each message encrypted in the manner described above. As we found that users’ keystrokes and the names of the applications they were using were sent in these messages, a network eavesdropper who is eavesdropping on a user’s network traffic can observe what that user is typing and into which application they are typing it by taking advantage of this vulnerability.

Additionally, in the version of Baidu Input Method distributed by Samsung, we found that key k_m was not generated using a cryptographically secure pseudorandom number generator (secure PRNG). Instead, it was generated using a custom-designed PRNG that we believe to have poor security properties, and, instead of using a high-entropy seed, the PRNG generating k_m was seeded using the message plaintext. However, even without these weaknesses in the generation of k_m, the protocol is already completely insecure against network eavesdroppers as described in the previous paragraphs.
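To illustrate why seeding a key generator with the message plaintext is a problem in its own right, the sketch below uses Python’s `random.Random` as a hypothetical stand-in for the custom PRNG (whose design the text does not describe) and compares it with a key drawn from the operating system’s secure source.

```python
import random
import secrets


def toy_key_from_plaintext(plaintext: bytes) -> bytes:
    # Stand-in only: NOT the PRNG used by the app. Seeding with the
    # plaintext makes the "random" key a pure function of the message.
    rng = random.Random(plaintext)
    return bytes(rng.getrandbits(8) for _ in range(16))


def secure_key() -> bytes:
    # A 128-bit key from the OS's cryptographically secure generator.
    return secrets.token_bytes(16)


message = b"nihaocanyoureadthis"
same_key_twice = toy_key_from_plaintext(message) == toy_key_from_plaintext(message)
```

With a plaintext-derived seed, identical messages always produce identical keys, and anyone who can guess a message can test the guess by regenerating the key.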

Huawei

We analyzed the keyboards preinstalled on our Huawei Mate 50 Pro test device. We found no vulnerabilities in the manner of transmission of users’ keystrokes in the versions of Huawei’s keyboard apps that we analyzed (see Table 7 for details). Specifically, Huawei used TLS to encrypt keystrokes in each version that we analyzed.

Platform	Application name	Package Name	Version analyzed	Secure?
HarmonyOS 4.0.0	搜狗输入法 (Sogou IME)	com.sohu.inputmethod.sogou	11.31
HarmonyOS 4.0.0	小艺输入法 (Celia IME)	com.huawei.ohos.inputmethod	1.0.19.333

Table 7: The versions of the Huawei keyboard apps analyzed.

Xiaomi

We analyzed the keyboards preinstalled on our Xiaomi Mi 11 test device. We found that they all include vulnerabilities that allow network eavesdroppers to decrypt network transmissions from the keyboards (see Table 8 for details). This means that network eavesdroppers can obtain sensitive personal information, including what users have typed.

Platform	Application name	Package Name	Version analyzed	Secure?
MIUI 14.0.31	百度输入法小米版 (Baidu IME Xiaomi Version)	com.baidu.input_mi	10.6.120.480	✘✘
MIUI 14.0.31	搜狗输入法小米版 (Sogou IME Xiaomi Version)	com.sohu.inputmethod.sogou.xiaomi	10.32.21.202210221903	✘
MIUI 14.0.31	讯飞输入法小米版 (iFlytek IME Xiaomi Version)	com.iflytek.inputmethod.miui	8.1.8014	✘✘

Table 8: The versions of the Xiaomi keyboard apps analyzed.

In this section we detail vulnerabilities in three different keyboard apps included with MIUI 14.0.31 in which users’ keystrokes can be read by network eavesdroppers, after decryption where necessary.

百度输入法小米版 (“Baidu IME Xiaomi Version”, com.baidu.input_mi)

We found that Xiaomi’s Baidu-based keyboard app encrypts keystrokes using the BAIDUv3.1+AESv2 scheme which we detailed previously. When the app’s messages are decrypted and deserialized, we found that they include our typed keystrokes as well as the name of the application into which we were typing them (see Figure 9).

[...]
2 {
    1: "nihaonihaoqqwerty"
}
3 {
    1: 53
    2: 10
    3: 1080
    4: 2166
    5: 5
}
4 {
    1: "DC0F75E6809F0FAAB46EDE2F2D6302ED%7CVAPBN4NOH"
    2: "p-a1-3-66|2211133C|720"
    3: "10.6.120.480"
    4: "com.miui.notes"
    5: "1000228c"
    6: "\346\242\205\345\267\236"
}
[...]

Figure 9:

Excerpt of decrypted information, including what we had typed (“nihaonihaoqqwerty”) and the application into which it was typed (“com.miui.notes”).


As we explained previously, a vulnerability exists in the BAIDUv3.1+AESv2 scheme that allows a network eavesdropper to decrypt the contents of these messages. As we found that users’ keystrokes and the names of the applications they were using were sent in these messages, a network eavesdropper who is eavesdropping on a user’s network traffic can observe what that user is typing and into which application they are typing it by taking advantage of this vulnerability.

搜狗输入法小米版 (“Sogou IME Xiaomi Version”, com.sohu.inputmethod.sogou.xiaomi)

The Sogou-based keyboard app is subject to a vulnerability which we have already publicly disclosed in Sogou IME (搜狗输入法) in which a network eavesdropper can decrypt and recover users’ transmitted keystrokes. Please see our previous report on Sogou IME for full details. Tencent responded by securing Sogou IME transmissions using TLS, but we found that Xiaomi’s Sogou-based keyboard had not been fixed.

讯飞输入法小米版 (“iFlytek IME Xiaomi Version”, com.iflytek.inputmethod.miui)

We found that Xiaomi’s iFlytek keyboard app used the same faulty encryption as iFlytek’s own IME for Android. Users’ keystrokes were sent to pinyin.voicecloud.cn encrypted in this manner.

{"p":{"m":53,"f":0,"l":0},"i":"nihaoniba"}

Figure 10:

Excerpt of decrypted information, including what we had typed (“nihaoniba”).


Therefore, a network eavesdropper who is eavesdropping on a user’s network traffic can observe what that user is typing by taking advantage of this vulnerability (see Figure 10).

OPPO

We analyzed the keyboard apps preinstalled on our OPPO OnePlus Ace test device. We found that they all include vulnerabilities that allow network eavesdroppers to decrypt network transmissions from the keyboards (see Table 9 for details). This means that network eavesdroppers can obtain sensitive personal information, including what users have typed.

Platform	Application name	Package Name	Version analyzed	Secure?
ColorOS 13.1	百度输入法定制版 (Baidu IME Custom Version)	com.baidu.input_oppo	8.5.30.503	✘✘
ColorOS 13.1	搜狗输入法定制版 (Sogou IME Custom Version)	com.sohu.inputmethod.sogouoem	8.32.0322.2305171502	✘

Table 9: The versions of the OPPO keyboard apps analyzed.

In this section we detail vulnerabilities in two different keyboard apps included with ColorOS 13.1 in which users’ keystrokes can be read by network eavesdroppers, after decryption where necessary.

百度输入法定制版 (“Baidu IME Custom Version”, com.baidu.input_oppo)

We found that OPPO’s Baidu-based keyboard app encrypts keystrokes using the BAIDUv3.1+AESv2 scheme which we detailed previously. When the app’s messages are decrypted and deserialized, we found that they include our typed keystrokes as well as the name of the application into which we were typing them (see Figure 11).

[...]
2 {
    1: "nihaonihao"
}
3 {
    1: 28
    2: 10
    3: 1240
    4: 2662
    5: 5
}
4 {
    1: "47148455BDAEBA8A253ACBCC1CA40B1B%7CV7JTLNPID"
    2: "p-a1-5-105|PHK110|720"
    3: "8.5.30.503"
    4: "com.android.mms"
    5: "1021078a"
}
[...]

Figure 11:

Excerpt of decrypted information, including what we had typed (“nihaonihao”) and the application into which it was typed (“com.android.mms”).

搜狗输入法定制版 (“Sogou IME Custom Version”, com.sohu.inputmethod.sogouoem)

Vivo

We analyzed the keyboard apps preinstalled on our Vivo Y78+ test device. We found that the Sogou-based one includes vulnerabilities that allow network eavesdroppers to decrypt network transmissions from the keyboards (see Table 10 for details). This means that network eavesdroppers can obtain sensitive personal information, including what users have typed.

Platform	Keyboard name	Package Name	Version analyzed	Secure?
OriginOS 3	搜狗输入法定制版 (Sogou IME Custom Version)	com.sohu.inputmethod.sogou.vivo	10.32.13023.2305191843	✘
OriginOS 3	Jovi输入法 (Jovi IME)	com.vivo.ai.ime	2.6.1.2305231

Table 10: The versions of the Vivo keyboard apps analyzed.

Honor

We analyzed the keyboard apps preinstalled on our Honor Play7T test device. We found that the Baidu-based one includes vulnerabilities that allow network eavesdroppers to decrypt network transmissions from the keyboards (see Table 11 for details). This means that network eavesdroppers can obtain sensitive personal information, including what users have typed.

Platform	Application name	Package Name	Version analyzed	Secure?
Magic UI 6.1.0	百度输入法荣耀版 (Baidu IME Honor Version)	com.baidu.input_hihonor	8.2.501.1	✘✘

Table 11: The versions of the Honor keyboard apps analyzed.

We found that Honor’s Baidu-based keyboard app encrypts keystrokes using the BAIDUv3.1+AESv2 scheme which we detailed previously. When the app’s messages are decrypted and deserialized, we found that they include our typed keystrokes as well as the name of the application into which we were typing them (see Figure 12).

[...]
2 {
    1: "nihaonihaonihaoq"
    5: 6422639
}
3 {
    1: 91
    2: 10
    3: 720
    4: 1552
    5: 5
}
4 {
    1: "A49AD3D3789A136975C2B28201753F03%7C0"
    2: "p-a1-5-115|RKY-AN10|720"
    3: "8.2.501.1"
    4: "com.hihonor.mms"
    5: "1023233d"
    7: "A00-TWGTFEV5OFZ7WZ2AFN5TCDE4BPNO7XRZ-BVEZBI4D"
}
[...]

Figure 12:

Excerpt of decrypted information, including what we had typed (“nihaonihaonihaoq”) and the application into which it was typed (“com.hihonor.mms”).

As of April 1, 2024, “Baidu IME Honor Version”, the default IME on the Honor device we tested, is still vulnerable to passive decryption. We also discovered that on our Play7T device, there was no way to update “Baidu IME Honor Version” through the device’s app store. In responding to our disclosures, Honor asked us to disclose to Baidu and said that it was Baidu’s responsibility to patch this issue.

Other affected keyboard apps

Given our limited resources to analyze apps, we were not able to analyze every cloud-based keyboard app available. Nevertheless, given that these vulnerabilities appeared to affect APIs that were used by multiple apps, we wanted to approximate the total number of apps affected by these vulnerabilities.

We began by searching VirusTotal, a database of software and other files that have been uploaded for automated virus scanning, for Android apps which reference the string “get.sogou.com”, the API endpoint used by Sogou IME, as these apps may require additional investigation to determine whether they are vulnerable. Excluding apps that we analyzed above, this search yielded the following apps:

com.sohu.sohuvideo
com.tencent.docs
com.sogou.reader.free
com.sohu.inputmethod.sogou.samsung
com.sogou.text
com.sogou.novel
com.sogo.appmall
com.blank_app
com.sohu.inputmethod.sogou.nubia
com.sogou.androidtool
com.sohu.inputmethod.sogou.meizu
com.sohu.inputmethod.sogou.zte
sogou.mobile.explorer.hmct
sogou.mobile.explorer
com.sogou.translatorpen
com.sec.android.inputmethod.beta
com.sohu.inputmethod.sogou.meitu
com.sec.android.inputmethod
sogou.mobile.explorer.online
com.sohu.sohuvideo.meizu
com.sohu.inputmethod.sogou.oem
com.sogou.map.android.maps
sogou.llq.online
com.sohu.inputmethod.sogou.coolpad
com.sohu.inputmethod.sogou.chuizi
com.sogou.toptennews
com.sogou.recmaster
com.meizu.flyme.input

We have not analyzed these apps and thus cannot conclude that they are necessarily vulnerable, or even keyboard apps, but we provide this list to help reveal the possible scope of the vulnerabilities that we discovered. When we disclosed this list to Tencent, Tencent requested an additional three months to fix the vulnerabilities before we publicly disclosed this list, lending credence to the idea that apps in this list are largely vulnerable. Similarly, after excluding apps that we had already analyzed, the following are other Android apps which reference the strings “udpolimenew.baidu.com” or “udpolimeok.baidu.com”, the API endpoints used by Baidu Input Method:

com.adamrocker.android.input.simeji
com.facemoji.lite.xiaomi.gp
com.facemoji.lite.xiaomi
com.preff.kb.xm
com.facemoji.lite.transsion
com.txthinking.brook
com.facemoji.lite.vivo
com.baidu.input_huawei
com.baidu.input_vivo
com.baidu.input_oem
com.preff.kb.op
com.txthinking.shiliew
mark.via.gp
com.qinggan.app.windlink
com.baidu.mapauto

These findings suggest that a large ecosystem of apps may be affected by the vulnerabilities that we discovered in this report.
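A comparable check can be run locally on an individual APK. The sketch below is a rough approximation under our own assumptions, not a description of how VirusTotal’s search works: it treats the APK as a zip archive and looks for the endpoint hostnames as raw byte strings in every member. Apps that obfuscate or assemble hostnames at runtime would be missed, and a match only indicates that an app merits further analysis.

```python
import io
import zipfile

# Endpoint hostnames named in the text.
ENDPOINTS = (
    b"get.sogou.com",
    b"udpolimenew.baidu.com",
    b"udpolimeok.baidu.com",
)


def endpoints_in_apk(apk) -> set:
    """Return the endpoint hostnames found anywhere inside an APK.

    `apk` may be a filesystem path or a binary file-like object.
    """
    found = set()
    with zipfile.ZipFile(apk) as archive:
        for name in archive.namelist():
            data = archive.read(name)
            for endpoint in ENDPOINTS:
                if endpoint in data:
                    found.add(endpoint.decode("ascii"))
    return found


# Demonstration on a synthetic in-memory archive (not a real app).
demo_apk = io.BytesIO()
with zipfile.ZipFile(demo_apk, "w") as archive:
    archive.writestr("classes.dex", b"\x00http://get.sogou.com/q\x00")
    archive.writestr("res/raw/readme.txt", b"nothing to see here")
demo_apk.seek(0)

demo_hits = endpoints_in_apk(demo_apk)
```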

Coordinated disclosure

We reported the vulnerabilities that we discovered to each vendor in accordance with our vulnerability disclosure policy. All companies except Baidu, Vivo, and Xiaomi responded to our disclosures. Baidu fixed the most serious issues we reported to them shortly after our disclosure, but Baidu has yet to fix all issues that we reported to them. The mobile device manufacturers whose preinstalled keyboard apps we analyzed fixed issues in their apps except for their Baidu apps, which either had only the most serious issues addressed or, in the case of Honor, had no issues addressed (see Table 12 for details). Regarding QQ Pinyin, Tencent indicated that “with the exception of end-of-life products, we aim to finalize the upgrade for all active products to transmit EncryptWall requests via HTTPS by the conclusion of Q1 [2024]”, but, as of April 1, 2024, we have not seen any fixes to this product. Tencent may consider QQ Pinyin end-of-life as it has not received updates since 2020, although we note that it is still available for download. For timelines and full correspondence of our disclosures to each vendor, please see the Appendix.

Legend
✘✘	working exploit created to decrypt transmitted keystrokes for both active and passive eavesdroppers
✘	working exploit created to decrypt transmitted keystrokes for an active eavesdropper
!	weaknesses present in cryptography implementation
(blank)	no known issues or all known issues fixed
N/A	product not offered or not present on device analyzed

Keyboard developer	Android	iOS	Windows
Tencent^†	✘	N/A	✘
Baidu	!	!	!
iFlytek

Pre-installed keyboard developer

Device manufacturer	Own	Sogou	Baidu	iFlytek	iOS	Windows
Samsung		*	!	N/A	N/A	N/A
Huawei	*		N/A	N/A	N/A	N/A
Xiaomi	N/A	*	!		N/A	N/A
OPPO	N/A		!*	N/A	N/A	N/A
Vivo	*		N/A	N/A	N/A	N/A
Honor	N/A	N/A	✘✘*	N/A	N/A	N/A

* Default keyboard app on our test device.
^† Both QQ Pinyin and Sogou IME are developed by Tencent; in this report we analyzed QQ Pinyin and found the same issues as we had in Sogou IME.

Table 12: Status of vulnerabilities after disclosure as of April 1, 2024.

To summarize, we no longer have working exploits against any products except Honor’s keyboard app and Tencent’s QQ Pinyin. Baidu’s keyboard apps on other devices continue to contain weaknesses in their cryptography which we are unable to exploit at this time to fully decrypt users’ keystrokes in transit.

Barriers to users receiving security updates

Users can receive updates to their keyboard apps on their phones’ app stores, and such updates typically install in the background without user intervention. In our testing, updating keyboard apps was typically performed without friction. However, in some cases, a user may need to also ensure that they have fully updated their operating system before they will receive the fixes to our reported vulnerabilities for their keyboard app through the app store. In the case of the Honor device we tested, there was no update mechanism for the default keyboard used by the operating system through the app store. Honor devices bundled with a vulnerable version of the keyboard will remain vulnerable to passive decryption. In the case of the Samsung Galaxy Store, we found that on our device a user must sign in with a Samsung account before receiving security updates to their keyboard app. In the case the user does not have a Samsung account, then they must create one. We believe that installing important security updates should be frictionless, and we recommend that Samsung and app stores in general not require the registration of a user account before receiving important security updates.

We also learned from communication with Samsung’s security team that our test device had been artificially stuck on an older version of Baidu IME (version 8.5.20.4) compared to the one in the Samsung Galaxy Store. This is because, although the test device was using a Chinese ROM, we were prevented from receiving updates to Baidu IME because the app was geographically unavailable in Canada, where we were testing from. Samsung addressed this issue by adding Baidu’s keyboard app to the global market. Generally speaking, we recommend that Samsung and other app stores do not geoblock security updates to apps that are already installed.

Language barriers in responsible disclosures

We suspect that a language barrier may have prevented iFlytek from responding to our initial disclosure in English. After we did not receive a response for one month, we re-sent the same disclosure email, but with a subject line and one-sentence summary in simplified Chinese. iFlytek responded within three days of this second email and promptly fixed the issues we noted. All future disclosure emails to the Chinese mobile device manufacturers were then written with Chinese subject lines and a short summary in Chinese. Though it is obvious in hindsight, we encourage security researchers to consider whether the company to which they are disclosing works in a different language than their own. We suggest that vulnerability disclosures include, at the very least, a short summary and an email subject line in the official language of the company’s jurisdiction, to avoid delays in disclosure timelines like the one we may have encountered.

Limitations

In this report we detail vulnerabilities relating to the security of the transmission of users’ keystrokes in multiple keyboard apps. In this work we did not perform a full audit of any app or make any attempt to exhaustively find every security vulnerability in any software. Our report concerns analyzing keyboard apps for a class of vulnerabilities that we discovered, and the absence of our reporting of other vulnerabilities should not be considered evidence of their absence.

Discussion

In this section we discuss the impact of the vulnerabilities that we found, speculate as to the factors that gave rise to them, and conclude by introducing possible ways to systemically prevent such vulnerabilities from arising in the future.

Impact of these vulnerabilities

The scope of these severe vulnerabilities cannot be overstated: until this and our previous Sogou report, the majority of Chinese mobile users’ keystrokes were decryptable by network adversaries. The keyboards we studied comprise over 95% of the third-party IME market, which marketing agencies estimate at over 780 million users. In addition, the three phone manufacturers which pre-installed and by default used vulnerable keyboard apps comprise nearly 50% of China’s smartphone market.

The vulnerabilities that we discovered would inevitably be discovered by anyone who thinks to look for them. Furthermore, the vulnerabilities do not require technological sophistication to exploit. With the exception of the vulnerability affecting many Sogou-based keyboard apps that we previously discovered, all of the vulnerabilities that we covered in this report can be exploited entirely passively without sending any additional network traffic. This also means any existing logs of network data sent by these keyboards can be decrypted in the future. As such, we might wonder, are these vulnerabilities actively under mass exploitation?

While many governments may possess sophisticated mass surveillance capabilities, the Snowden revelations gave us unique insight into the capabilities of the United States National Security Agency (NSA) and more broadly the Five Eyes. The revelations disclosed, among other programs, an NSA program called XKEYSCORE for collecting and searching Internet data in realtime across the globe (see Figure 13). Leaked slides describing the program specifically reveal only a few examples of XKEYSCORE plugins. However, one was a plugin that was written by a Five Eyes team to take advantage of vulnerabilities in the cryptography of Chinese-developed UC Browser to enable the Five Eyes to collect device identifiers, SIM card identifiers, and account information pertaining to UC Browser users (see Figure 14 for an illustration).

The similarity between the vulnerability exploited by this XKEYSCORE plugin and the vulnerabilities described in this report is uncanny, as they are all vulnerabilities in the encryption of sensitive data transmissions in software predominantly used by Chinese users. Given the known capabilities of XKEYSCORE, we surmise that the Five Eyes would have the capability to globally surveil the keystrokes of all of the keyboard apps that we analyzed with the exception of Sogou and the apps licensing its software. This single exception exists because Sogou cannot be monitored passively and would require sending packets to Sogou servers. Such communications would be measurable at Sogou’s servers and at other vantage points, potentially revealing the Five Eyes’ target(s) of surveillance to Sogou or Chinese network operators. Therefore, users of outdated Sogou software would be undesirable targets of mass surveillance, even if such non-passive measurements were within the known capabilities of XKEYSCORE or other Five Eyes programs.

Given the enormous intelligence value of knowing what users are typing, we can conclude that the NSA and more broadly the Five Eyes have not only the capability to mass exploit the vulnerabilities we found but also a strong motivation to do so. If the Five Eyes’ capabilities are an accurate reflection of the capabilities and motivations of other governments, then we can assume that many other governments are also capable and motivated to mass exploit these vulnerabilities. The only remaining question is whether any government had knowledge of these vulnerabilities. If they did not have such knowledge before our original report analyzing Sogou, they may have acquired it afterward, in the same way that our original research inspired us to look at similar keyboard apps for analogous vulnerabilities. Unfortunately, short of future government leaks, we may never know if or to what extent any state actors mass exploited these vulnerabilities.

Even though we disclosed the vulnerabilities to vendors, some vendors failed to fix the issues that we reported. Moreover, users of devices which are out of support or that otherwise no longer receive updates may continue to be vulnerable. As such, many users of these apps may continue to be under mass surveillance for the foreseeable future.

How did these vulnerabilities arise?

We analyzed a broad sample of Chinese keyboard apps, finding that they are almost universally vulnerable to having their users’ keystrokes decrypted by network eavesdroppers. Yet there is no common library or single implementation flaw responsible for these vulnerabilities. While some of the keyboard apps did license their code from other companies, our overall findings can only be explained by a large number of developers independently making the same kind of mistake. As such, we might ask, how could such a large number of independent developers almost universally make such a critical mistake?

One attempt to answer this question is to suggest that these were not mistakes at all but deliberate backdoors introduced by the Chinese government. However, this hypothesis is rather weak. First, user keystroke data is already being sent to servers within Chinese legal jurisdiction, and so the Chinese government would have access to such data anyway. Second, the vulnerabilities that we found give the ability to decrypt transmitted keystrokes not just to the Chinese government but to any other actor as well. In an ideal backdoor, the Chinese government would want the property that only they have access to the backdoor. Finally, the Chinese government has made strides to study and improve the data security of apps developed and used in China, attempting to prevent and fix the very sort of vulnerabilities which we discovered. For instance, a 2020 report from CNCERT/CC found that 60 percent of the 50 banking applications that they investigated did not encrypt any user data transmitted over the network, among a litany of other common security issues.

Were Chinese app developers skeptical of using cryptographic standards perceived as “Western”? Countries such as China and Russia have their own encryption standards and ciphers. To our knowledge none of the faulty encryption implementations that we analyzed adhered to any sort of known standard in any country, and each appeared to be a home-rolled cipher. However, it is possible that Asian developers are less inclined to use encryption standards that they fear may contain backdoors such as the potential Dual_EC_DRBG backdoor.

Perhaps Chinese app developers could be skeptical of standards such as SSL/TLS as well. The TLS ecosystem has also only become nearly-universal in the past decade. Especially before broad oversight of certificate authorities became commonplace, there were many valid criticisms of the SSL/TLS ecosystem. In 2011, digital rights organizations EFF and Access Now were both concerned about the certificate authority (CA) infrastructure underpinning SSL/TLS transport encryption. Even today, the vast majority of root certificates trusted by major OSes and browsers are operated by certificate authorities based in the Global North. We also note that all of the IMEs containing vulnerabilities were first released before 2013 and likely had a need for secure network transmission before SSL/TLS became the de-facto standard for strong transport encryption.

Still, it has been a decade since the 2013 Snowden leaks demonstrated the global, urgent, and practical need for strong encryption of data in transit, and the TLS ecosystem has largely stabilized, with CA root lists of many major browsers and OSes controlled by voting bodies and certificate transparency deployed. As of 2024, almost 95% of web traffic from users of Firefox in the United States is traveling over HTTPS. In addition, the speed with which both iFlytek and Sogou switched to TLS demonstrates that making the change to standard TLS is not necessarily a time or resource issue. Even if skepticism towards SSL/TLS explains the reluctance to adopt it in the early 2010s, we are not sure why there remains so much inertia in the Chinese Internet ecosystem against making the switch to TLS.

Finally, mobile devices and other operating systems are still incapable of guaranteeing the security of data under transmission, despite iOS and Android having introduced restrictions into their APIs. For instance, iOS 9 implemented App Transport Security, a policy placing restrictions on the ability to transmit data without TLS. However, there are two limitations of this technology. First, an app can specify exceptions to this policy in its Info.plist resource. Second, the policy affects high level APIs and leaves communications over lower level socket-based APIs unregulated. Similar to iOS, Android 9 disables cleartext traffic using certain high level APIs by default, but an app may exclude specific domains or avoid the policy by using lower level APIs.
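The first of these limitations lends itself to a simple static check. The following is a minimal Python sketch, not a tool used in this research, that parses an app’s Info.plist and lists the App Transport Security exceptions it declares. The function name ats_exceptions and the sample plist (including the domain ime.example.com) are hypothetical and exist only for illustration; NSAppTransportSecurity, NSAllowsArbitraryLoads, NSExceptionDomains, and NSExceptionAllowsInsecureHTTPLoads are the keys Apple documents for this policy.

```python
import plistlib


def ats_exceptions(info_plist: bytes) -> list[str]:
    """Return human-readable ATS exceptions declared in an Info.plist."""
    plist = plistlib.loads(info_plist)
    ats = plist.get("NSAppTransportSecurity", {})
    findings = []
    # A global opt-out disables the policy for every domain.
    if ats.get("NSAllowsArbitraryLoads", False):
        findings.append("all domains: arbitrary (non-TLS) loads allowed")
    # Per-domain exceptions re-enable cleartext HTTP for specific hosts.
    for domain, settings in sorted(ats.get("NSExceptionDomains", {}).items()):
        if settings.get("NSExceptionAllowsInsecureHTTPLoads", False):
            findings.append(f"{domain}: insecure HTTP loads allowed")
    return findings


# Hypothetical Info.plist, for illustration only.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
  <key>NSAppTransportSecurity</key>
  <dict>
    <key>NSExceptionDomains</key>
    <dict>
      <key>ime.example.com</key>
      <dict>
        <key>NSExceptionAllowsInsecureHTTPLoads</key>
        <true/>
      </dict>
    </dict>
  </dict>
</dict>
</plist>"""

if __name__ == "__main__":
    for finding in ats_exceptions(SAMPLE):
        print(finding)
```

A check like this says nothing about the second limitation: traffic sent through lower level socket APIs falls outside the policy entirely, so an app with no declared exceptions may still transmit data without TLS.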

Can we systemically address these vulnerabilities?

Individually analyzing apps for this class of vulnerabilities and individually reporting issues discovered is limited in the scale of apps that it can fix. First, while we can attempt to manually analyze some of the most popular keyboard apps, we will never be able to analyze every app at large. Second, we might not be able to predict which apps to look at in the first place. For instance, before we analyzed Sogou and the keyboard apps featured in this report, we never would have expected that their network transmissions would be so easily vulnerable to interception. In light of the limitations of the methods that we employed in this report, in the remainder of this section we discuss possibilities for how we might systematically or wholesale address apps which transmit sensitive data over networks without sufficient encryption.

By security researchers paying more attention to the Chinese Internet

There appears to be a general failure of researchers to analyze Chinese apps and the Chinese Internet ecosystem at large, despite its size and influence. The Google Play Store and Apple App Store ecosystems, for instance, are commonly studied by privacy researchers, but many Chinese app stores are overlooked, even though many popular Chinese apps have more users than their counterparts on the Google Play Store. While the vulnerabilities that we discovered were not all trivial to find and many took substantial analysis to attack, most would inevitably have been discovered by any researcher analyzing these apps for data security. A researcher studying network traffic from users of Chinese devices could also have identified strange, non-standard traffic.

By using app store enforcement

One might call on app stores to enforce the use of sufficient encryption to protect sensitive data in transit. App stores already have a number of rules that they enforce through a combination of automated and manual review. Calling on app stores to enforce sufficient encryption of in-transit sensitive data is tempting given the resources of the companies operating the app stores. However, failing any other innovation, the same scaling issues that apply to other researchers studying these apps will apply to those working for these companies.

By using device permission models

On Android devices, installing any keyboard, regardless of whether or how it communicates with servers over the Internet, brings up a pop-up with the following text:

This input method may be able to collect all the text you type, including personal data like passwords and credit card numbers.

The wording of these warning messages is overbroad and does not necessarily help users distinguish between keyboards that transmit keystrokes over the network, keyboards that transmit keystrokes insecurely (using something other than standard TLS) over the network, and keyboards that do not transmit any data at all.

iOS devices, on the other hand, sandbox their keyboards by default. There is a “Full Access” or “open access” permission that must be explicitly granted to keyboards before they have network access, among other privileges. Without this permission, third-party keyboards cannot transmit network data. We recommend Android also adopt a more fine-grained permission model for keyboards.

Furthermore, the vulnerable apps that we studied transmit data using low level socket APIs versus higher level APIs that require the usage of TLS or HTTPS. One might desire that separate system calls be designed for TLS or HTTPS traffic in addition to the lower level socket system calls so that devices could implement an UNSAFE_INTERNET permission that would be required for apps to use the lower level system calls while still allowing TLS-encrypted traffic for apps that do not have this permission.

While this approach may have some merit, it also has certain drawbacks. It makes sense for situations where apps are untrustworthy and the operating system is completely trustworthy, but there are common situations where the operating system is no more trustworthy, or is even less trustworthy, than the apps that it is running. One common case would be a user who is running an up-to-date app on an out-of-date operating system, possibly because the user’s device is no longer receiving operating system updates. In such a case, the app’s implementation of TLS is more likely to be secure than that of the operating system. Furthermore, a user’s operating system may be compromised by malware or otherwise be untrustworthy in itself. Introducing a TLS system call would centralize the encryption of all sensitive data and grant the operating system easy visibility into all unencrypted data. In any case, innovating in areas of encryption is an important right of application developers, and it may not make sense to stifle apps like Signal because of their use of end-to-end or other novel encryption by requiring them to obtain an UNSAFE_INTERNET permission.

One might alternatively desire for apps at large to not be able to access the Internet at all. Instead of an UNSAFE_INTERNET permission, what about introducing an INTERNET permission to govern all Internet socket access, similar to the “Full Access” permission which iOS already applies to keyboard apps? Android devices in fact already have such a permission that apps must request to use Internet (AF_INET) sockets, but it is not a permission that is exposed to ordinary users either in the Google Play Store or through any stock Android user interface, and it is automatically granted when installing an app. Unfortunately, given all of the interprocess communication (IPC) vehicles on modern smart devices, restricting Internet socket access may not guarantee that the app could not communicate over the Internet (e.g., through Google Play services). GrapheneOS, an open source Android-based operating system, implements a NETWORK permission. However, denying this permission can lead to surprising results where apps can still communicate with the Internet via IPC with other apps. As such, we recommend that both the developers of Android and iOS work toward a meaningful INTERNET permission that would adequately inform users of whether an app communicates over the Internet.

By international standards bodies better engaging with Chinese developers

We encourage international standards bodies like the IETF to continue to engage with and reach out to Chinese Internet companies and engineers in good faith to further reduce friction in cross-linguistic knowledge transfer. The presence of these similar but independent vulnerabilities demonstrates that there is friction in the transfer and implementation of knowledge between the English-speaking cryptography community and the Chinese cryptography community. For instance, Schneier’s Law or the oft-repeated mantra “don’t roll your own crypto” may be common knowledge to cryptographers trained in English but may be lost in translation. A lag across linguistic boundaries means that general information like the recent stabilization of TLS and Web PKI infrastructure may travel more slowly, and updating encryption software to reflect new information may lag even further behind. One other possible example of this phenomenon is that, according to Firefox Telemetry, up until 2020, the Japanese Internet ecosystem also significantly lagged behind the global average in HTTPS adoption.

Although protocols put out by the IETF and other international standards bodies can be far from bulletproof, these bodies can still help facilitate international communication about the current state-of-the-art in protocol encryption. The burden of cross-linguistic and cross-cultural exchange on technical standards falls on global standards bodies. Western media outlets and researchers tend to uniformly attribute the actions and participation of private Chinese companies within standards bodies to government actors seeking sovereignty over Internet standards. While skepticism may be warranted in certain cases, there is also research that challenges a simplistic and overbroad narrative. As a single data point, we note that we did not find these issues in the keyboards of Huawei, whose employees are often noted as especially active participants in IETF standard-setting.

By using automated static or dynamic analysis

There has been a failure of automated tools to detect insecure traffic at large. Longitudinal TLS telemetry has largely been focused on web-based perspectives (i.e., how many domains support TLS or how many web connections are encrypted by TLS?), and the mobile perspective is often overlooked, despite the increasing dominance of mobile traffic globally. Although there are some research projects that survey TLS usage in Android mobile apps at scale, there is no public longitudinal data from these projects (i.e., they are run as one-off studies), and many focus on Google Play’s Android ecosystem, thereby excluding the Chinese mobile Internet. There is perhaps a need for public longitudinal TLS telemetry for popular mobile applications globally, via automated static or dynamic analysis at scale.
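As one illustration of what dynamic analysis at scale might look for, the following Python sketch applies a simple heuristic to the first bytes that a client sends on a TCP connection: does the payload begin with a well-formed TLS handshake record carrying a ClientHello? This is a sketch under stated assumptions rather than a method used in this report. The function name and sample payloads are hypothetical, and a real pipeline would also need TCP reassembly as well as handling for QUIC and for protocols that upgrade to TLS after a plaintext preamble.

```python
import struct

TLS_HANDSHAKE = 0x16    # TLS record content type: handshake
CLIENT_HELLO = 0x01     # handshake message type: ClientHello
MAX_RECORD_LEN = 2**14  # maximum length of a plaintext TLS record


def looks_like_tls_client_hello(payload: bytes) -> bool:
    """Heuristically decide whether a flow's first bytes start a TLS handshake."""
    if len(payload) < 6:
        return False
    # Record header: content type (1 byte), version (2 bytes), length (2 bytes).
    content_type, major, minor, length = struct.unpack("!BBBH", payload[:5])
    return (
        content_type == TLS_HANDSHAKE
        and major == 0x03 and minor <= 0x04  # SSL 3.0 through TLS 1.3 record versions
        and 0 < length <= MAX_RECORD_LEN
        and payload[5] == CLIENT_HELLO
    )


# Hypothetical first payloads of three flows, for illustration only.
TLS_START = bytes.fromhex("160301020001") + bytes(16)
CUSTOM_BINARY = b"\x00\x00\x01\x2a" + b"opaque-payload"
PLAIN_HTTP = b"POST /ime HTTP/1.1\r\n"

if __name__ == "__main__":
    for name, data in [("tls", TLS_START), ("custom", CUSTOM_BINARY), ("http", PLAIN_HTTP)]:
        print(name, looks_like_tls_client_hello(data))
```

Flows that fail this check are not necessarily insecure, and flows that pass are not necessarily configured safely, but counting the former per app over time would give a rough longitudinal signal of the kind described above.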

By using attestations in app stores

Another way for users to gain visibility into the security and privacy properties of their apps is through the use of developer attestations, such as the ones that appear in data safety sections in many popular app stores. Both the Apple App Store and the Google Play Store collect and display such attestations to varying extents, including attestations as to what data an app collects (if any) and with whom it is shared (if anyone). Additionally, the Play Store allows developers the opportunity to attest to performing “encryption in transit” (see Figure 15 for an example). These attestations allow users to clearly see what security and privacy properties an app’s developer claims it to have and, like privacy policies, they provide means of redress if violated.

We wanted to evaluate whether the apps that we analyzed lived up to their attestations concerning their encryption in the app stores in which they are available. Among the apps that we analyzed, only Baidu IME was available in the Play Store. At the time of this writing, it does not attest to its data being encrypted in transit. Although other apps that we analyzed were available in Apple’s App Store, to our knowledge, this store does not display an attestation for whether the app encrypts data in transit. As such, across both the Google Play and the Apple App stores, attestations were insufficient either for compelling the keyboard apps’ developers to implement proper encryption or for providing users any opportunity for redress.

In light of the above findings, we believe that users would benefit from the following recommendations: (1) that app store operators require developers to attest to whether or not an app encrypts data in transit, (2) that app store operators display not only when developers attest to all data being encrypted in transit but also display a warning when they fail to, and (3) that app store operators require apps in certain sensitive categories, such as keyboard apps, to either positively attest to encrypting all data in transit or to attest to not transmitting any data at all.

Since most of the apps that we analyzed perform some type of encryption, even if it was wholly inadequate, one might wonder whether attesting that data is merely “encrypted” is enough, since the data arguably did have some manner of encryption applied to it during transit. The Play Store provides some guidance on this topic. Under the question “How should I encrypt data in transit?”, the documentation notes: “You should follow best industry standards to safely encrypt your app’s data in transit. Common encryption protocols include TLS (Transport Layer Security) and HTTPS.”

Another issue with attestations is that they provide no guarantee that an app behaves as its developers attest, as developers can, after all, make false attestations. While we wish that attestations could guarantee that an app sufficiently implements proper cryptography to the same extent that a permission system can guarantee an app does not use a microphone, false attestations at least provide an opportunity for redress. For instance, apps which are found to violate attestations could be subject to removal from app stores. Furthermore, apps which violate attestations could be subject to fines by regulatory bodies such as the FTC. Finally, apps which violate the attestation could be liable to civil suits.

While the apps we analyzed were predominantly available from Chinese app stores, we equally recommend that Chinese app stores adopt these recommendations in addition to the Apple App Store and the Google Play Store. Moreover, while this report focuses on the problem of poor encryption practices as it applies to Chinese apps, the problem to varying extents applies to apps of all other provenances.

Summary of recommendations

We conclude our report by summarizing our recommendations to multiple stakeholders.

Recommendations to security researchers

Researchers should analyze more apps from the East Asian app ecosystem and from other popular ecosystems which may be outside of their own locale.
Researchers should develop better static and dynamic analysis techniques to recognize the types of vulnerabilities that we discovered in this report at scale.
Researchers submitting vulnerability disclosures to a company should include short summaries and email subject lines in the official language of the company’s jurisdiction.

Recommendations to international standards bodies

International standards bodies should continue to engage with security engineers from Chinese Internet companies.

Recommendations to app store operators

App stores should not require account registration as a condition to receive security updates.
App stores should not geoblock security updates.
App stores should allow developers to attest to all data being transmitted with encryption, similar to the ability in the Google Play Store.
App stores should display not only when developers attest to all data being encrypted in transit but also display a warning when they fail to.
App stores should require apps in certain sensitive categories, such as keyboard apps, to either positively attest to encrypting all data in transit or to attest to not transmitting any data at all.

Recommendations to keyboard app developers

Use well-tested and standard encryption protocols, like TLS or QUIC.
Make every attempt to provide features on-device without requiring transmitting sensitive data to cloud servers.
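As a hedged illustration of the first recommendation, the following Python sketch sends data using the standard library’s ssl module with its default client settings, which verify the server’s certificate chain and hostname. The function names are hypothetical, and keyboard apps are typically written in other languages, where the platform’s standard TLS or HTTPS client plays the same role.

```python
import socket
import ssl


def make_tls_context() -> ssl.SSLContext:
    """Build a client context that verifies certificates and hostnames."""
    ctx = ssl.create_default_context()            # uses the system trust store
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2  # refuse older protocol versions
    return ctx


def send_over_tls(host: str, port: int, data: bytes) -> bytes:
    """Send data over a verified TLS connection and return the first response bytes."""
    ctx = make_tls_context()
    with socket.create_connection((host, port), timeout=10) as raw:
        # server_hostname enables SNI and hostname verification.
        with ctx.wrap_socket(raw, server_hostname=host) as tls:
            tls.sendall(data)
            return tls.recv(4096)
```

What matters in this sketch is what is absent: there is no custom key exchange, cipher, or message framing for the app developer to get wrong.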

Recommendations to mobile operating system developers

Android should implement sandboxing by default for keyboard apps, similar to iOS, that prevents a keyboard from transmitting network traffic among other activities until a user grants the app full access.
The developers of Android and iOS should work toward a meaningful INTERNET permission that would adequately inform users of whether any app communicates over the Internet.

Recommendations to device manufacturers

Conduct security audits of third-party keyboards that you intend to pre-install by default on your operating systems.

Recommendations to users

Users of Honor’s pre-installed keyboard or users of QQ Pinyin should switch keyboards immediately.
Users of any Sogou, Baidu, or iFlytek keyboard, including the versions that are bundled or pre-installed on operating systems, should ensure their keyboards and operating systems are up-to-date.
Users of any Baidu IME keyboard should consider switching to a different keyboard or disabling the “cloud-based” feature.
Users with privacy concerns should not enable “cloud-based” features on their keyboards or IMEs or should switch to a keyboard that does not offer “cloud-based” prediction.
iOS users with privacy concerns should not enable “Full Access” for their keyboards or IMEs.

Acknowledgments

We would like to thank Jedidiah Crandall, Jakub Dalek, Pellaeon Lin, and Sarah Scheffler for their guidance and review of this report. Research for this project was supervised by Ron Deibert.

Appendix

Known affected software

We recommend that all users keep their operating systems and apps, including keyboard apps, up to date. If you use any of the following software, we especially recommend you update to the most recent version of your OS and application. As of April 1, 2024, the following software has fixes available:

Separately installed, third-party keyboards

Sogou IME / 搜狗输入法 for Android and Windows
Baidu IME / 百度输入法 for Windows (this software has only been partially fixed, see below)
iFlytek IME / 讯飞输入法 for Android

Pre-installed on Samsung devices with Chinese edition ROM

Samsung Keyboard
Baidu IME / 百度输入法

Pre-installed on Xiaomi devices with Chinese edition ROM

Sogou IME Xiaomi Version / 搜狗输入法小米版
iFlytek IME Xiaomi Version / 讯飞输入法小米版

Pre-installed on OPPO devices with Chinese edition ROM

Sogou IME Custom Version / 搜狗输入法定制版

Pre-installed on Vivo devices with Chinese edition ROM

Sogou IME Custom Version / 搜狗输入法定制版

The following software does not use TLS and may still contain weaknesses:

Separately installed, third-party keyboards

Baidu IME / 百度输入法 for Android, Windows, and iOS

Pre-installed on Xiaomi devices with Chinese edition ROM

Baidu IME Xiaomi Version / 百度输入法小米版

Pre-installed on OPPO devices with Chinese edition ROM

Baidu IME Custom Version / 百度输入法定制版

The following software has not been fixed and is easily exploitable, and we suggest that users switch to another keyboard entirely:

Separately installed, third-party keyboards

QQ Pinyin IME / QQ拼音输入法 for Android and Windows

Pre-installed on Honor devices with Chinese edition ROM

Baidu IME Honor Version / 百度输入法荣耀版

Disclosure timelines

Baidu

The Citizen Lab to Baidu — October 3, 2023

We sent the following via email:

To: [email protected], [email protected]

Subject: Security issue in Baidu Input Method

To Whom It May Concern:

The Citizen Lab is an academic research group based at the Munk School of Global Affairs & Public Policy at the University of Toronto in Toronto, Canada.

We analyzed Baidu Input Method as part of our ongoing work analyzing popular mobile and desktop apps for security and privacy issues. We found that Baidu Input Method for Windows includes a vulnerability which allows network eavesdroppers to decrypt network transmissions. This means third parties can obtain sensitive personal information including what users have typed. We also found privacy and security weaknesses in the encryption used by the Android and iOS versions of Baidu Input Method. To address these issues, we suggest using HTTPS or TLS rather than custom-designed network protocols. For further details, please see the attached document.

Background

All communications associated with this disclosure may be included in the Citizen Lab’s public disclosure of this vulnerability.

Next steps

Please communicate what steps you will take to address the vulnerability that we have described, and please provide the timeline you decide upon for the implementation of fixes.

Finally, upon implementation of any fixes, we ask that you communicate the full extent of the vulnerability to the Citizen Lab.

Should you have any questions about our findings please let us know. We can be reached at this email address: [email protected].

Sincerely,

The Citizen Lab

The Citizen Lab to Baidu — November 22, 2023

Honor

The Citizen Lab to Honor — November 22, 2023

We sent the following via email:

To: [email protected]

Subject: Security issues in Honor keyboard / 荣耀百度输入法高危漏洞

总结：多伦多大学的研究人员发现荣耀预装的百度输入法使用的加密协议有高危漏洞，让网路攻击者可以直接看到用户输入的内容。本文用英文解释了研究人员发现高危漏洞的细节。

To Whom It May Concern:

The Citizen Lab is an academic research group based at the Munk School of Global Affairs & Public Policy at the University of Toronto in Toronto, Canada.

We analyzed Honor pre-installed keyboard apps as part of our ongoing work analyzing popular mobile and desktop apps for security and privacy issues. We found that the Baidu-based one includes vulnerabilities that allow network eavesdroppers to decrypt network transmissions. This means third parties can obtain sensitive personal information including what users have typed. To address these issues, we suggest using HTTPS or TLS rather than custom-designed network protocols. For further details, please see the attached document.

Background

All communications associated with this disclosure may be included in the Citizen Lab’s public disclosure of this vulnerability.

Next steps

Please communicate what steps you will take to address the vulnerability that we have described, and please provide the timeline you decide upon for the implementation of fixes.

Finally, upon implementation of any fixes, we ask that you communicate the full extent of the vulnerability to the Citizen Lab.

Should you have any questions about our findings please let us know. We can be reached at this email address: [email protected].

Sincerely,

The Citizen Lab

Honor to The Citizen Lab — November 23, 2023

Honor to The Citizen Lab — December 5, 2023

The Citizen Lab to Honor — March 7, 2024

iFlytek

The Citizen Lab to iFlytek — September 8, 2023

We sent the following via email:

To: [email protected]

Subject: Security issue in Xunfei Input Method

To Whom It May Concern:

The Citizen Lab is an academic research group based at the Munk School of Global Affairs & Public Policy at the University of Toronto in Toronto, Canada.

We analyzed Xunfei Input Method on Android as part of our ongoing work analyzing popular mobile and desktop apps for security and privacy issues. We found that Xunfei Input Method for Android includes a vulnerability which allows network eavesdroppers to recover the plaintext of insufficiently encrypted network transmissions, revealing sensitive information including what users have typed.

For further details, please see the attached document.

Background

All communications associated with this disclosure may be included in the Citizen Lab’s public disclosure of this vulnerability.

Next steps

Please communicate what steps you will take to address the vulnerability that we have described, and please provide the timeline you decide upon for the implementation of fixes.

Finally, upon implementation of any fixes, we ask that you communicate the full extent of the vulnerability to the Citizen Lab.

Should you have any questions about our findings please let us know. We can be reached at this email address: [email protected].

Sincerely,

The Citizen Lab

The Citizen Lab to iFlytek — September 25, 2023

The Citizen Lab to iFlytek — November 3, 2023

iFlytek to The Citizen Lab — November 6, 2023

OPPO

The Citizen Lab to OPPO — November 22, 2023

We sent the following via email:

To: [email protected]

Subject: Security issues in OPPO keyboards / OPPO预装的输入法高危漏洞

总结：多伦多大学的研究人员发现OPPO所有预装的中文输入法使用的加密协议有高危漏洞，让网路攻击者可以直接看到用户输入的内容。本文用英文解释了研究人员发现高危漏洞的细节。

To Whom It May Concern:

The Citizen Lab is an academic research group based at the Munk School of Global Affairs & Public Policy at the University of Toronto in Toronto, Canada.

We analyzed OPPO pre-installed keyboard apps as part of our ongoing work analyzing popular mobile and desktop apps for security and privacy issues. We found two that include vulnerabilities that allow network eavesdroppers to decrypt network transmissions. This means third parties can obtain sensitive personal information including what users have typed. To address these issues, we suggest using HTTPS or TLS rather than custom-designed network protocols. For further details, please see the attached document.

Background

All communications associated with this disclosure may be included in the Citizen Lab’s public disclosure of this vulnerability.

Next steps

Please communicate what steps you will take to address the vulnerability that we have described, and please provide the timeline you decide upon for the implementation of fixes.

Finally, upon implementation of any fixes, we ask that you communicate the full extent of the vulnerability to the Citizen Lab.

Should you have any questions about our findings please let us know. We can be reached at this email address: [email protected].

Sincerely,

The Citizen Lab

OPPO to The Citizen Lab — November 23, 2023

OPPO to The Citizen Lab — December 3, 2023

OPPO to The Citizen Lab — December 13, 2023

OPPO to The Citizen Lab — December 18, 2023

Samsung

The Citizen Lab to Samsung — October 16, 2023

We sent the following via email:

To: [email protected]

Subject: Security issue in Samsung Keyboard

To Whom It May Concern:

The Citizen Lab is an academic research group based at the Munk School of Global Affairs & Public Policy at the University of Toronto in Toronto, Canada.

We analyzed Samsung Keyboard on Android as part of our ongoing work analyzing popular mobile and desktop apps for security and privacy issues. We found that Samsung Keyboard for Android includes a vulnerability which allows network eavesdroppers to recover the plaintext of insufficiently encrypted network transmissions, revealing sensitive information including what users have typed. For further details, please see the attached document.

Background

All communications associated with this disclosure may be included in the Citizen Lab’s public disclosure of this vulnerability.

Next steps

Please communicate what steps you will take to address the vulnerability that we have described, and please provide the timeline you decide upon for the implementation of fixes.

Finally, upon implementation of any fixes, we ask that you communicate the full extent of the vulnerability to the Citizen Lab.

Should you have any questions about our findings please let us know. We can be reached at this email address: [email protected].

Sincerely,

The Citizen Lab

Samsung to The Citizen Lab — October 16, 2023

Samsung to The Citizen Lab — October 23, 2023

Samsung to The Citizen Lab — November 27, 2023

The Citizen Lab to Samsung — December 5, 2023

Samsung to The Citizen Lab — December 6, 2023

The Citizen Lab to Samsung — December 12, 2023

Samsung to The Citizen Lab — December 12, 2023

Samsung to The Citizen Lab — January 2, 2024

The Citizen Lab to Samsung — January 3, 2024

Samsung to The Citizen Lab — January 25, 2024

The Citizen Lab to Samsung — January 31, 2024

Samsung to The Citizen Lab — February 1, 2024

The Citizen Lab to Samsung — February 2, 2024

Samsung to The Citizen Lab — February 6, 2024

Tencent

The Citizen Lab to Tencent — November 23, 2023

We submitted the following through the Tencent Security Portal:

To Whom It May Concern:

The Citizen Lab is an academic research group based at the Munk School of Global Affairs & Public Policy at the University of Toronto in Toronto, Canada.

As part of our ongoing work analyzing popular mobile and desktop apps for security and privacy issues, we previously reported vulnerabilities in Sogou Input Method which enabled network eavesdroppers to decrypt transmitted keystroke data. See here for the details of the previous report: https://en.security.tencent.com/index.php/report/detail/73788 . We have since found similar vulnerabilities in related products which also transmit keystroke data to Sogou servers, which we detail below.

# QQ Pinyin

We analyzed QQ Pinyin on Android and Windows. We found that the Windows version (6.6.6304.400) and Android version (8.6.3) of this software contain similar vulnerabilities to those which we previously reported in Sogou Input Method.

# Samsung Keyboard (com.samsung.android.honeyboard)

We analyzed Samsung Keyboard (com.samsung.android.honeyboard) version 5.6.10.26 for Android and found that it transmits keystroke data to http://shouji.sogou.com completely in the clear without any encryption. We have also reported this issue to Samsung, who indicated that they are already working with the Sogou team on patching this issue.

# 搜狗输入法小米版 (com.sohu.inputmethod.sogou.xiaomi)

We analyzed 搜狗输入法小米版 (com.sohu.inputmethod.sogou.xiaomi) version 10.32.21.202210221903 for Android and found that it contains similar vulnerabilities to those which we previously reported in Sogou Input Method. We are also in the process of disclosing this issue to Xiaomi.

# 搜狗输入法定制版 (com.sohu.inputmethod.sogouoem)

We analyzed 搜狗输入法定制版 (com.sohu.inputmethod.sogouoem) version 8.32.0322.2305171502 for Android and found that it contains similar vulnerabilities to those which we previously reported in Sogou Input Method. We are also in the process of disclosing this issue to Oppo.

# 搜狗输入法定制版 (com.sohu.inputmethod.sogou.vivo)

We analyzed 搜狗输入法定制版 (com.sohu.inputmethod.sogou.vivo) version 10.32.13023.2305191843 for Android and found that it contains similar vulnerabilities to those which we previously reported in Sogou Input Method. We are also in the process of disclosing this issue to Vivo.

# Other apps

The following are other Android apps which reference the string “get.sogou.com”, the API endpoint used by Sogou Input Method, which may require additional investigation:

com.sohu.sohuvideo

com.tencent.docs

com.sogou.reader.free

com.sohu.inputmethod.sogou.samsung

com.sogou.text

com.sogou.novel

com.sogo.appmall

com.blank_app

com.sohu.inputmethod.sogou.nubia

com.sogou.androidtool

com.sohu.inputmethod.sogou.meizu

com.sohu.inputmethod.sogou.zte

sogou.mobile.explorer.hmct

sogou.mobile.explorer

com.sogou.translatorpen

com.sec.android.inputmethod.beta

com.sohu.inputmethod.sogou.meitu

com.sec.android.inputmethod

sogou.mobile.explorer.online

com.sohu.sohuvideo.meizu

com.sohu.inputmethod.sogou.oem

com.sogou.map.android.maps

sogou.llq.online

com.sohu.inputmethod.sogou.coolpad

com.sohu.inputmethod.sogou.chuizi

com.sogou.toptennews

com.sogou.recmaster

com.meizu.flyme.input

Note that we are not reporting that we have discovered vulnerabilities in the above list of apps. We are merely providing this list for your convenience so that you may more easily investigate and fix issues in other apps which may be using the Sogou Input Method API in an insecure manner.

# Background

If no response is received to this disclosure, the Citizen Lab will publish details regarding the security vulnerability on its website after 15 calendar days from the date of this communication. In other words, where there is no response from you, Citizen Lab will publish details regarding the vulnerability after December 7 2023.

If a substantive response is received (which excludes, for example, an auto reply) to this disclosure within 15 calendar days from the date of this communication, the Citizen Lab will provide you with 45 calendar days from the date of this communication to fix (whether in whole or in part) the vulnerability before publicly disclosing the issue. In other words, where we do receive a substantive response from you, the Citizen Lab will publish details regarding the vulnerability after January 6 2024.

All communications associated with this disclosure may be included in the Citizen Lab’s public disclosure of this vulnerability.

# Next steps

Please communicate what steps you will take to address the vulnerability that we have described, and please provide the timeline you decide upon for the implementation of fixes. Finally, upon implementation of any fixes, we ask that you communicate the full extent of the vulnerability to the Citizen Lab. Should you have any questions about our findings please let us know. We can also be reached at this email address: [email protected].

Sincerely,

The Citizen Lab

Tencent to The Citizen Lab — December 6, 2023

Tencent to The Citizen Lab — January 5, 2024

The Citizen Lab to Tencent — January 9, 2024

Vivo

The Citizen Lab to Vivo — November 22, 2023

We sent the following email:

To: [email protected]

Subject: Security issues in Vivo keyboard / 维沃搜狗输入法高危漏洞

总结：多伦多大学的研究人员发现维沃预装的搜狗输入法使用的加密协议有高危漏洞，让网路攻击者可以直接看到用户输入的内容。本文用英文解释了研究人员发现高危漏洞的细节。

To Whom It May Concern:

The Citizen Lab is an academic research group based at the Munk School of Global Affairs & Public Policy at the University of Toronto in Toronto, Canada.

We analyzed Vivo pre-installed keyboard apps as part of our ongoing work analyzing popular mobile and desktop apps for security and privacy issues. We found that the Sogou-based one includes vulnerabilities that allow network eavesdroppers to decrypt network transmissions. This means third parties can obtain sensitive personal information including what users have typed. To address these issues, we suggest using HTTPS or TLS rather than custom-designed network protocols. For further details, please see the attached document.

Background

The Citizen Lab is committed to research transparency and will publish details regarding the security vulnerabilities it discovers in the context of its research activities, absent exceptional circumstances, on its website: https://citizenlab.ca/.

If no response is received to this disclosure, the Citizen Lab will publish details regarding the security vulnerability on its website after 15 calendar days from the date of this communication. In other words, where there is no response from you, Citizen Lab will publish details regarding the vulnerability after December 7 2023.

If a substantive response is received (which excludes, for example, an auto reply) to this disclosure within 15 calendar days from the date of this communication, the Citizen Lab will provide you with 45 calendar days from the date of this communication to fix (whether in whole or in part) the vulnerability before publicly disclosing the issue. In other words, where we do receive a substantive response from you, the Citizen Lab will publish details regarding the vulnerability after January 6 2024.

We reserve the right to publish details regarding the vulnerability to the general public before the expiry of the 45 calendar days set out above in the following situations: (1) you have disclosed the vulnerability to the general public, (2) you have patched the vulnerability, (3) you have taken the position that there is no security vulnerability, or (4) the Citizen Lab observes the vulnerability is under active exploitation.

All communications associated with this disclosure may be included in the Citizen Lab’s public disclosure of this vulnerability.

Next steps

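Please communicate what steps you will take to address the vulnerability that we have described, and please provide the timeline you decide upon for the implementation of fixes.

Finally, upon implementation of any fixes, we ask that you communicate the full extent of the vulnerability to the Citizen Lab.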

Should you have any questions about our findings please let us know. We can be reached at this email address: [email protected].

Sincerely,

The Citizen Lab
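
The two publication dates quoted in the notice above follow from adding 15 and 45 calendar days to the date the notice was sent. As a minimal sketch of that arithmetic (the function name `disclosure_deadlines` is hypothetical and is not part of the Citizen Lab's process), the dates can be checked as follows:

```python
from datetime import date, timedelta


def disclosure_deadlines(sent: date) -> tuple[date, date]:
    """Hypothetical helper: return the two dates named in a notice sent on `sent`.

    The first is 15 calendar days out (no response received), the second is
    45 calendar days out (substantive response received).
    """
    return sent + timedelta(days=15), sent + timedelta(days=45)


# The notice to Vivo was sent on November 22, 2023.
no_response, with_response = disclosure_deadlines(date(2023, 11, 22))
print(no_response.isoformat())    # 2023-12-07
print(with_response.isoformat())  # 2024-01-06
```

The same calculation applied to a notice sent on November 3, 2023 gives November 18, 2023 and December 18, 2023.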

Xiaomi

The Citizen Lab to Xiaomi — November 3, 2023

We sent the following email:

To: [email protected]

Subject: Security issues in Xiaomi keyboards / 小米预装的输入法高危漏洞

总结：多伦多大学的研究人员发现小米所有预装的中文输入法使用的加密协议有高危漏洞，让网路攻击者可以直接看到用户输入的内容。本文用英文解释了研究人员发现高危漏洞的细节。

To Whom It May Concern:

We analyzed three Xiaomi keyboard apps as part of our ongoing work analyzing popular mobile and desktop apps for security and privacy issues. We found that they all include vulnerabilities that allow network eavesdroppers to decrypt network transmissions. This means third parties can obtain sensitive personal information including what users have typed. To address these issues, we suggest using HTTPS or TLS rather than custom-designed network protocols. For further details, please see the attached document.

Background

If no response is received to this disclosure, the Citizen Lab will publish details regarding the security vulnerability on its website after 15 calendar days from the date of this communication. In other words, where there is no response from you, Citizen Lab will publish details regarding the vulnerability after November 18 2023.

If a substantive response is received (which excludes, for example, an auto reply) to this disclosure within 15 calendar days from the date of this communication, the Citizen Lab will provide you with 45 calendar days from the date of this communication to fix (whether in whole or in part) the vulnerability before publicly disclosing the issue. In other words, where we do receive a substantive response from you, the Citizen Lab will publish details regarding the vulnerability after December 18 2023.

Next steps

