Passbolt: a bold use of HaveIBeenPwned

Passbolt, an Open Source Password Manager, is using the Pwned Passwords service from HaveIBeenPwned to alert users if their password is present in a previous data breach. Pwned Passwords API is based on a mathematical property known as k-Anonymity guaranteeing that it never gains enough information about a non-breached password hash to be able to breach it later. Sounds good, right?
开源密码管理器 Passbolt 正在使用 HaveIBeenPwned 的 Pwned Passwords 服务来提醒用户,如果他们的密码存在于以前的数据泄露中。Pwned Passwords API 基于称为 k-Anonymity 的数学属性,保证它永远不会获得有关未泄露密码哈希的足够信息,以便以后能够破坏它。听起来不错,对吧?

Introduction 介绍

In 2017, Troy Hunt introduced in a blog post a service to allow people to check if a password is known to be already present among the 306 million of leaked passwords from various breaches.1 Of course, it was not recommended to submit your real password or even a hash of it.
2017年,特洛伊·亨特(Troy Hunt)在一篇博客文章中介绍了一项服务,允许人们检查在各种违规行为泄露的3.06亿个密码中是否已经存在密码。 1 当然,不建议提交您的真实密码,甚至不建议提交密码的哈希值。

The following year, Junade Ali from Cloudflare proposed a very elegant solution to be able to query the service without revealing too much information.2
第二年,来自 Cloudflare 的 Junade Ali 提出了一个非常优雅的解决方案,能够在不透露太多信息的情况下查询服务。 2

[…] our approach adds an additional layer of security by utilising a mathematical property known as k-Anonymity and applying it to password hashes in the form of range queries. As such, the Pwned Passwords API service never gains enough information about a non-breached password hash to be able to breach it later.
[…]我们的方法通过利用称为 k-Anonymity 的数学属性并将其应用于范围查询形式的密码哈希,增加了额外的安全层。因此,Pwned Passwords API 服务永远不会获得有关未泄露的密码哈希的足够信息,以便以后能够破坏它。

The principle is very simple: compute the SHA1 hash of the password you want to assess and send the first 5 nibbles (i.e. the first 20 bits) of the SHA1 to the service API. In return, the API gives you all the known SHA1 hashes starting with these 5 nibbles. Then you check if your SHA1 appears in the returned list or not.
原理非常简单:计算要评估的密码的 SHA1 哈希值,并将 SHA1 的前 5 个半字节(即前 20 位)发送到服务 API。作为回报,API 会为您提供从这 5 个半字节开始的所有已知 SHA1 哈希值。然后检查您的 SHA1 是否出现在返回的列表中。

If yes, it means your password is known to be part of previous breaches and you should refrain from using it.
如果是,则意味着您的密码已知是以前泄露的一部分,您应该避免使用它。

If not, clearly knowing only 5 nibbles of a SHA1 makes any bruteforce attempt impossible due to the huge number of collisions among password candidates.
如果不是这样,显然只知道 SHA1 的 5 个小块,由于密码候选者之间存在大量冲突,因此任何暴力尝试都是不可能的。

Consequently, Troy Hunt announced the deployment of a k-Anonymity enhanced version of the Pwned Passwords service with its API v2 and its huge gain on privacy, enabling its use with real passwords.34
因此,Troy Hunt 宣布部署 Pwned Passwords 服务的 k-Anonymity 增强版本,其 API v2 及其在隐私方面的巨大收益,使其能够与真实密码一起使用。 3 4

[…] even if you don’t trust the service or you think logs may be leaked and abused (and incidentally, nothing is explicitly logged, they’re transient system logs at most), the range search goes a very long way to protecting the source.
[…]即使您不信任该服务,或者您认为日志可能被泄露和滥用(顺便说一句,没有明确记录任何内容,它们最多是暂时的系统日志),范围搜索对于保护源代码也有很长的路要走。

Pwned Passwords API v2 & v3
Pwned Passwords API v2 和 v3

The API is very easy to use. Send your request to https://api.pwnedpasswords.com/range/xxxxx with the 5 top nibbles of SHA1(your_password) and see if the remaining part of your SHA1 is present in the results.
该 API 非常易于使用。将您的请求发送给 https://api.pwnedpasswords.com/range/xxxxx 并附上 5 个顶级小点 SHA1(your_password) ,看看结果中是否存在 SHA1 的剩余部分。

Here is an example.
下面是一个示例。

$ echo -n p@ssword | sha1sum
36e618512a68721f032470bb0891adef3362cfa9
# => send 36e61 and watch for 8512a68721f032470bb0891adef3362cfa9
$ wget -q -O - https://api.pwnedpasswords.com/range/36e61|grep -i 8512a68721f032470bb0891adef3362cfa9 || echo "Not found"
8512A68721F032470BB0891ADEF3362CFA9:21804

Hmm, p@ssword is not a very good password, it has been seen 21804 times in previous breaches…
嗯, p@ssword 这不是一个很好的密码,在以前的违规行为中已经看到了 21804 次……

$ echo -n p@sszort | sha1sum
d452118ad7a65b151ac323e8abab71737896fc46
# => send d4521 and watch for 18ad7a65b151ac323e8abab71737896fc46
$ wget -q -O - https://api.pwnedpasswords.com/range/d4521|grep -i 18ad7a65b151ac323e8abab71737896fc46 || echo "Not found"
Not found

On the other side, p@sszort has never been seen so far (which is quite surprising…)
另一方面, p@sszort 到目前为止从未见过(这很令人惊讶……

Since then, an API v3 was released, adding the possibility to request padded answers so the size of the answer, even over TLS, does not become a source of leakage.5 But as far as we are concerned, the API is identical to the v2.
从那时起,发布了 API v3,增加了请求填充答案的可能性,因此即使通过 TLS,答案的大小也不会成为泄漏源。 5 但就我们而言,API 与 v2 相同。

By the way, checking a password against previous breach corpora is even a NIST recommendation for any service asking you to create an account or change your password on their platform, and historically the reason why Troy Hunt offered the Pwned Passwords service to the community in the first place.6
顺便说一句,针对以前的违规语料库检查密码甚至是NIST对任何要求您在其平台上创建帐户或更改密码的服务的建议,从历史上看,这也是Troy Hunt首先向社区提供Pwned Passwords服务的原因。 6

Here comes Passbolt Passbolt 来了

Passbolt is a Security-first, open source password manager specialized in the sharing of passwords across teams and businesses. It is based on a website you can host yourself (or use their cloud solution) and on a browser plugin which takes care of the cryptography, securing the storage on the website and providing end-to-end encryption across people sharing passwords.
Passbolt 是一个安全第一的开源密码管理器,专门用于团队和企业之间的密码共享。它基于您可以自己托管的网站(或使用他们的云解决方案)和一个浏览器插件,该插件负责加密,保护网站上的存储并在共享密码的人之间提供端到端加密。

It integrates the Pwned Passwords feature since 2018 (in the v2.0.10) and is very convenient: when typing in a new password, it immediately alerts you if “the password is part of an exposed data breach” and adjusts its verdict while you are typing in more chars for a better password.
自 2018 年以来(在 v2.0.10 中),它集成了 Pwned Passwords 功能,非常方便:输入新密码时,它会立即提醒您“密码是暴露数据泄露的一部分”,并在您输入更多字符以获得更好的密码时调整其判断。

Passbolt: a bold use of HaveIBeenPwned

Could it be a problem?
会不会有问题?

The first naive thought when we saw this feature was that if it emits queries as soon as you type, you could easily guess which first char was typed in. You have about 92 possible chars including alphanumeric and special chars and 5 nibbles of SHA1 is largely enough to discriminate which char was typed in. Then typing the second char would trigger another call to the API and another partial SHA1 would be emitted, again making trivial the recovery of the second char of your password. Etc. no matter how many chars your password ends up to.
当我们看到这个功能时,第一个天真的想法是,如果它在你输入时立即发出查询,你可以很容易地猜到输入的第一个字符。您有大约 92 个可能的字符,包括字母数字和特殊字符,而 SHA1 的 5 个半字节在很大程度上足以区分输入了哪个字符。然后,键入第二个字符将触发对 API 的另一次调用,并发出另一个部分 SHA1,再次使恢复密码的第二个字符变得微不足道。等等,无论您的密码最终有多少个字符。

Observations 观察

At this stage, we want to validate our hypotheses and observe the calls to the API by the Passbolt browser plugin.
在这个阶段,我们想验证我们的假设,并观察 Passbolt 浏览器插件对 API 的调用。

Sniffing the calls to the API on a Chrome-based browser looks like this.
在基于 Chrome 的浏览器上嗅探对 API 的调用如下所示。

  • Go to your Passbolt website instance
    转到您的 Passbolt 网站实例
  • Go to brave://extensions/
    转到 brave://extensions/
  • Activate the Developer mode
    激活开发人员模式
  • Search for the Passbolt extension / inspect views / index.html
    搜索 Passbolt 扩展 / 检查视图 / index.html
  • Clicking on index.html will raise a popup with the DevTools
    单击“index.html”将弹出一个带有 DevTools 的弹出窗口
  • Watch its Network tab while you are creating a password in your Passbolt website instance
    在 Passbolt 网站实例中创建密码时,请观看其“网络”选项卡

Passbolt: a bold use of HaveIBeenPwned

We observe that it invariably tells you that the password is exposed for any password up to 7 chars. Then it starts querying api.pwnedpasswords.com only for longer passwords.
我们观察到,它总是告诉您,对于最多 7 个字符的任何密码,密码都会公开。然后,它开始仅查询较长密码 api.pwnedpasswords.com。

1 to 1234567 -> "breached" (without actually checking)
12345678     -> API query with 7C222 (SHA1[0:5] of 12345678)
123456789    -> API query with F7C3B (SHA1[0:5] of 123456789)
123456789A   -> API query with BE472 (SHA1[0:5] of 123456789A)
123456789AB  -> API query with 4A3C4 (SHA1[0:5] of 123456789AB)

The requests are done 300 ms after last keystroke. So, if someone types at most 3 chars per second, we get all the intermediate queries. For faster typing, we will miss some intermediate queries. This small delay is due to the use of a debounce function (from npm debounce-promise package).
请求在最后一次击键后 300 毫秒完成。因此,如果有人每秒最多键入 3 个字符,我们将获得所有中间查询。为了加快键入速度,我们将错过一些中间查询。这种小延迟是由于使用了去抖动函数(来自 npm debounce-promise 包)。

Attack feasibility 攻击可行性

At first, an attacker would only see the first partial hash of a 8-char password. There are still quite a number of candidates. But then when seeing the second request very close to the first one, she should just look a which 8-char candidates, extended by a single char, match the second partial hash. And so forth.
起初,攻击者只能看到 8 个字符密码的第一个部分哈希值。仍然有相当多的候选人。但是,当看到第二个请求非常接近第一个请求时,她应该只看一个 8 个字符的候选者,由单个字符扩展,与第二个部分哈希匹配。等等。

We can estimate the complexity of recovering a password from such partial hashes.
我们可以估计从这种部分哈希中恢复密码的复杂性。

Assuming that the user is typing a new password generated perfectly randomly from a charset of 92 numbers, letters and symbol (a task virtually impossible for a human BTW).
假设用户正在输入一个由 92 个数字、字母和符号组成的字符集完全随机生成的新密码(对于人类来说,这是一项几乎不可能完成的任务)。

  • After 8 chars, the password entropy is ���2(928)=52.2 bits and we learned ���2(165)=20 bits.
    8 个字符后,密码熵是 ���2(928)=52.2 位,我们学习 ���2(165)=20 了位。

    The entropy is reduced to 32.2 bits.
    熵被简化为 32.2 比特。
  • After 9 chars, the password entropy is ���2(929)=58.7 bits and we learned 2 ���2(165)=40 bits.
    在 9 个字符之后,密码熵是 ���2(929)=58.7 位,我们学习 2 ���2(165)=40 了位。

    The entropy is reduced to 18.7 bits.
    熵被简化为 18.7 比特。
  • After 10 chars, the password entropy is ���2(9210)=65.2 bits and we learned 3 ���2(165)=60 bits.
    在 10 个字符之后,密码熵是 ���2(9210)=65.2 位,我们学习 3 ���2(165)=60 了位。

    The entropy is reduced to 5.2 bits.
    熵被简化为 5.2 比特。
  • After 11 chars, the password entropy is ���2(9211)=71.8 bits and we learned 4 ���2(165)=80 bits.
    在 11 个字符之后,密码熵是 ���2(9211)=71.8 位,我们学习 4 ���2(165)=80 了位。

    The entropy is null.
    熵为空。

    The password can be fully recovered.
    密码可以完全恢复。

Big-Endians prefer to break their egg from the bigger end. Not sure it is a good idea here. Breaking the 11-char password based on the knowledge of 4 partial hashes requires to generate all candidates and filter them based on these hashes. Which means… generating 3 996 373 778 857 415 671 808 candidates.
大端人更喜欢从更大的一端打破他们的鸡蛋。不确定这是个好主意。根据 4 个部分哈希的知识破解 11 个字符的密码需要生成所有候选者并根据这些哈希过滤它们。这意味着…生成 3 996 373 778 857 415 671 808 候选者。

From the smaller end, it seems slightly more doable as we would
从较小的一端来看,它似乎比我们更可行

  • first generate a mere 5 132 188 731 375 616 8-char candidates;
    首先生成仅 5 132 188 731 375 616 8 个字符的候选字符;
  • keep about one millionth of them (the ones matching the first partial hash);
    保留大约百万分之一(与第一个部分哈希匹配的那些);
  • extend these candidates to 9 chars;
    将这些候选字符扩展到 9 个字符;
  • keep about one millionth of them (the ones matching the second partial hash);
    保留大约百万分之一(与第二个部分哈希匹配的那些);
  • extend these candidates to 10 chars, etc.
    将这些候选字符扩展到 10 个字符,依此类推。

But is that practically feasible?
但这实际上可行吗?

By the rules of PoC||GTFO, it is time for writing a Proof-of-Concept! Ok, this was also an excuse to delve into Hashcat…
根据 PoC 规则||走开,是时候写一个概念验证了!好吧,这也是深入研究 Hashcat 的借口……

Writing a Hashcat module and its kernels

Hashcat is the world’s fastest and most advanced password recovery utility, supporting five unique modes of attack for over 300 highly-optimized hashing algorithms.

Let’s build a Hashcat module that will

  • take 4 partial hashes;
    取 4 个部分哈希值;
  • assume they correspond to the API calls while typing the 8th, 9th, 10th and 11th chars of our password;
    假设它们在键入密码的第 8、9、10 和 11 个字符时对应于 API 调用;
  • crack the 11-char password.
    破解 11 个字符的密码。

A very interesting resource for writing a Hashcat module and its kernels (a term referring to the OpenCL code snippets being compiled for the GPU) is the Hashcat Plugin Development Guide7.
编写 Hashcat 模块及其内核(指为 GPU 编译的 OpenCL 代码片段的术语)的一个非常有趣的资源是 Hashcat 插件开发指南 7 。

We will use the SHA1 module and its kernels as a basis. There is no need to change any source file of Hashcat to develop a module, we just need to add a few files. We start by copying the SHA1 module src/modules/module_00100.c to src/modules/module_90100.c (90000-99999 is free for personal uses) and changing a few parameters: a new HASH_NAME, its reference number, called KERN_TYPE, we disable the self-test capabilities by adding OPTS_TYPE_SELF_TEST_DISABLE and we modify the expected hash structure from 40 to 32 bytes as follows.
我们将使用 SHA1 模块及其内核作为基础。开发模块不需要更改 Hashcat 的任何源文件,我们只需要添加几个文件即可。我们首先将 SHA1 模块 src/modules/module_00100.c 复制到 src/modules/module_90100.c (90000-99999 免费供个人使用)并更改一些参数:一个新的 HASH_NAME ,它的参考编号,称为 KERN_TYPE ,我们通过添加 OPTS_TYPE_SELF_TEST_DISABLE 和修改预期的哈希结构来禁用自检功能从 40 到 32 字节,如下所示。

000aaaaa000bbbbb000ccccc000ddddd with  000aaaaa000bbbbb000ccccc000ddddd 跟

  • aaaaa the 5 top nibbles of the SHA1 of the first 8 chars of pwd
    aaaaa PWD 前 8 个字符的 SHA1 的 5 个顶级小啃食
  • bbbbb the 5 top nibbles of the SHA1 of the first 9 chars of pwd
    bbbbb PWD 前 9 个字符的 SHA1 的 5 个顶级小块
  • ccccc the 5 top nibbles of the SHA1 of the first 10 chars of pwd
    ccccc PWD 前 10 个字符的 SHA1 的 5 个顶级咬合
  • ddddd the 5 top nibbles of the SHA1 of the first 11 chars of pwd
    ddddd PWD 前 11 个字符的 SHA1 的 5 个顶级小块

So for the previous example 123456789AB we get the “hash” 0007C222000F7C3B000BE4720004A3C4. The alignment of each 5-nibble hash to 4 bytes is just to make our life easier when we will need to compare them with candidate hashes. Actually, Hashcat does not transmit full hashes to the GPU kernels anyway, but 4 snippets of 4 bytes each. The SHA1 module is using snippets 3, 4, 2, 1 in that order (and never using snippet 0) but we will reorder them more simply as 0, 1, 2 and 3 by changing DGST_POS0 to DGST_POS3 constants.
因此,对于前面的示例 123456789AB ,我们得到“哈希” 0007C222000F7C3B000BE4720004A3C4 。将每个 5 字节哈希与 4 字节对齐只是为了在我们需要将它们与候选哈希进行比较时让我们的生活更轻松。实际上,Hashcat 无论如何都不会将完整的哈希传输到 GPU 内核,而是传输 4 个片段,每个片段 4 个字节。SHA1 模块按此顺序使用代码段 3、4、2、1(从不使用代码段 0),但我们将通过更改 DGST_POS0 为 DGST_POS3 常量将它们更简单地重新排序为 0、1、2 和 3。

That’s it for the CPU side, we can compile with make -j.
CPU端就是这样,我们可以用 make -j .

The SHA1 module is using the fast-hash mode which means candidates are not all produced by the CPU and transmitted to the GPU, because the GPU is si fast it would be always waiting for the PCIe bus to get new candidates. Instead, base candidates are given to the GPU and the GPU generates a few derived passwords autonomously.
SHA1 模块使用快速哈希模式,这意味着候选者并非全部由 CPU 生成并传输到 GPU,因为 GPU 速度很快,因此它将始终等待 PCIe 总线获得新的候选者。取而代之的是,将基本候选密码提供给 GPU,并且 GPU 会自动生成一些派生密码。

It features several kernel attack modes (i.e. how password candidates are generated) but we will only implement the a3 mode, which is the brute-force mode. Each kernel implements several function versions: S and M for cracking single or multiple hashes. Indeed as there is no diversification seed, from one candidate you just compute its hash once and then you can compare it to as many hashes as you want at very little cost. The M version implements some smart bloom filter and binary tree search.
它具有多种内核攻击模式(即如何生成候选密码),但我们只会实现 a3 模式,即暴力破解模式。每个内核都实现了几个函数版本:S 和 M 用于破解单个或多个哈希值。事实上,由于没有多样化种子,您只需从一个候选者计算一次其哈希值,然后您可以以非常低的成本将其与任意数量的哈希值进行比较。M 版本实现了一些智能绽放过滤器和二叉树搜索。

Moreover, each kernel comes in two implementations: a pure one, easy to develop and understand, and an optimized one, slightly more hardcore with some unrolled SHA1 code inline… The optimized kernels have some password size limitation but we do not really care as we will only break 8-char (extended to 11-char) passwords. One can call the optimized kernel by using the -O option of the Hashcat client.
此外,每个内核都有两种实现:一种是纯的,易于开发和理解的,另一种是优化的,稍微硬核一些,有一些展开的 SHA1 代码内联……优化的内核有一些密码大小限制,但我们并不真正关心,因为我们只会破解 8 个字符(扩展到 11 个字符)密码。可以使用 Hashcat 客户端 -O 的选项调用优化的内核。

So we will start our PoC development by copying OpenCL/m00100_a3-pure.cl to OpenCL/m90100_a3-pure.cl and removing the multi-hash M functions, to keep things simpler. We keep the SHA1 computation but when the time comes to compare a candidate hash to the target hash, we remove the comparison logic.

     sha1_final_vector (&ctx);
-    const u32x r0 = ctx.h[DGST_R0];
-    const u32x r1 = ctx.h[DGST_R1];
-    const u32x r2 = ctx.h[DGST_R2];
-    const u32x r3 = ctx.h[DGST_R3];
-    COMPARE_S_SIMD (r0, r1, r2, r3);

And we replace it by ours.

    if (ctx.h[0] >> 12 == search[0])             // shift to keep first 5 nibbles and compare
    {
      for (u32 p9 = 20; p9 < 128; p9++) {        // guess 9th char:
        w[2] = p9 << 24;                         // append chr to current pwd
        sha1_ctx_vector_t ctx;                   // compute SHA1
        sha1_init_vector (&ctx);
        sha1_update_vector (&ctx, w, pw_len + 1);
        sha1_final_vector (&ctx);
        if (ctx.h[0] >> 12 == search[1])         // shift and compare with second partial hash
        {
          for (u32 p10 = 20; p10 < 128; p10++) { // guess 10th char...
            w[2] = (p9 << 24) | (p10 << 16);
            sha1_ctx_vector_t ctx;
            sha1_init_vector (&ctx);
            sha1_update_vector (&ctx, w, pw_len + 2);
            sha1_final_vector (&ctx);
            if (ctx.h[0] >> 12 == search[2])
            {
              for (u32 p11 = 20; p11 < 128; p11++) {
                w[2] = (p9 << 24) | (p10 << 16) | (p11 << 8);
                sha1_ctx_vector_t ctx;
                sha1_init_vector (&ctx);
                sha1_update_vector (&ctx, w, pw_len + 3);
                sha1_final_vector (&ctx);
                if (ctx.h[0] >> 12 == search[3])
                {
                  // initial hashcat logic once a hash is cracked
                  const u32 final_hash_pos = DIGESTS_OFFSET_HOST + 0;
                  if (hc_atomic_inc (&hashes_shown[final_hash_pos]) == 0)
                  {
                    mark_hash (plains_buf, d_return_buf, SALT_POS_HOST, DIGESTS_CNT, 0, final_hash_pos, gid, il_pos, 0, 0);
                  }
                }
              }
            }
          }
        }
      }
    }

Kernels are compiled on-the-fly when you use them, which is very convenient. But always remember to delete the compiled version from the cache when you are developing and modifying it! (rm kernels/m90100_a3*).
内核在使用时是即时编译的,非常方便。但是,在开发和修改编译版本时,请始终记住从缓存中删除它!( rm kernels/m90100_a3* ).

Once a match is found given a 8-char candidate, we recovered the first 11 chars of the password, but there is no mechanism to return a modified password from the GPU to the host. We could just print it with printf from the kernel itself but it does not cost much to recompute these 3 chars later.
一旦找到匹配项,给定 8 个字符的候选字符,我们就会恢复密码的前 11 个字符,但没有机制将修改后的密码从 GPU 返回到主机。我们可以从内核本身打印 printf 它,但稍后重新计算这 3 个字符的成本并不高。

Performances 性能

On our laptop equipped with a RTX 2000, this first pure and single-hash version runs at 3440 MH/s (millions of hashes per second).
在我们配备 RTX 2000 的笔记本电脑上,第一个纯单哈希版本以 3440 MH/s(每秒数百万个哈希)的速度运行。

We can also create an optimized version based on OpenCL/m00100_a3-optimized.cl and use the same comparison logic as above. But to get it working, we also need to revert a couple of optimizations incompatible with our partial hashes:
我们还可以基于 OpenCL/m00100_a3-optimized.cl 并使用与上述相同的比较逻辑创建优化版本。但为了让它工作,我们还需要恢复一些与我们的部分哈希不兼容的优化:

  • we remove the computation of e_rev and the early abort condition using it if (MATCHES_NONE_VS (e, e_rev)) continue;
    我们使用它 if (MATCHES_NONE_VS (e, e_rev)) continue; 删除了 的 e_rev 计算和早期中止条件
  • we remove a precomputation step which was done in the SHA1 module (subtracting a constant SHA1M_A from the first 4-byte hash segment to find) and we change our first test condition into if (((a + SHA1M_A) >> 12) == search[0]) to compensate it.
    我们删除了在 SHA1 模块中完成的预计算步骤(从第一个 4 字节哈希段中减去一个常量 SHA1M_A 以查找),并将第一个测试条件更改为 if (((a + SHA1M_A) >> 12) == search[0]) 对其进行补偿。

Now this optimized single-hash version runs at 4066 MH/s.
现在,这个优化的单哈希版本以 4066 MH/s 的速度运行。

Is it fast enough? In the worst-case scenario where we need to test all possible combinations out of a 92-symbol charset, the attack will take maximum 15 days. Knowing that Hashcat SHA1 runs at 5659 MH/s on a RTX 2000 and 50 GH/s on a RTX 4090, we can extrapolate that these 15 days can be squeezed into a mere 5 hours on a 8x RTX 4090 that you can rent for $4/h.
够快吗?在最坏的情况下,我们需要测试 92 个符号字符集中的所有可能组合,攻击最多需要 15 天。知道 Hashcat SHA1 在 RTX 2000 上以 5659 MH/s 的速度运行,在 RTX 4090 上以 50 GH/s 的速度运行,我们可以推断,这 15 天在 8x RTX 4090 上可以压缩到短短的 5 小时,您可以以 4 美元/小时的价格租用。

This is really an upper bound as we are not targeting passwords generated by a random generator but passwords chosen and typed in by a human. On another extreme, if we assume the first 8 chars of the password were just lowercase letters, the attack takes less than a minute on our laptop, no matter how complex and long the full password might be.
这确实是一个上限,因为我们的目标不是随机生成器生成的密码,而是由人类选择和输入的密码。在另一个极端,如果我们假设密码的前 8 个字符只是小写字母,那么无论完整密码多么复杂和冗长,攻击在我们的笔记本电脑上只需不到一分钟。

Demo 演示

Let’s simulate the typing of the password iwashere$&@!2=[#).
让我们模拟密码的输入 iwashere$&@!2=[#) 。

#!/usr/bin/env python3
import hashlib
import sys

PASS = sys.argv[1]
assert len(PASS) >= 11
lh = [hashlib.sha1(PASS[:l].encode()).hexdigest()[:5] for l in range(8, len(PASS) + 1)]
print(' '.join(lh))
./passbolt_generate_partial_hashes.py 'iwashere$&@!2=[#)'
bb0c6 814f8 a9c98 9d4ef 35310 78a67 e7cfe a32d2 37735 0540a

Knowing the partial hashes, we can call our Hashcat module against the corresponding “hash” 000bb0c6000814f8000a9c980009d4ef.
知道了部分哈希值,我们可以针对相应的“哈希值” 000bb0c6000814f8000a9c980009d4ef 调用我们的 Hashcat 模块。

echo 000bb0c6000814f8000a9c980009d4ef > hash
time ./hashcat -m 90100 hash -a 3 --potfile-disable --quiet '?l?l?l?l?l?l?l?l'

000bb0c6000814f8000a9c980009d4ef:iwashere

real    0m6,226s

The first 8 chars were recovered in 6 seconds.
前 8 个字符在 6 秒内恢复。

Breaking the next chars is straightforward as we just have to try all of the possibilities for each next char and check it against the next partial hash.
破坏下一个字符很简单,因为我们只需要尝试每个下一个字符的所有可能性,并根据下一个部分哈希进行检查。

#!/usr/bin/env python3

import hashlib
import sys

PASS = sys.argv[1]
args = sys.argv[2:]

assert args.pop(0) == hashlib.sha1((PASS).encode()).hexdigest()[:5]
while args:
    H = args.pop(0)
    for i in range(20, 128):
        if hashlib.sha1((PASS + chr(i)).encode()).hexdigest()[:5] == H:
            PASS += chr(i)
            break
print(PASS)
time ./passbolt_recovery_step2.py 'iwashere' bb0c6 814f8 a9c98 9d4ef 35310 78a67 e7cfe a32d2 37735 0540a

iwashere$&@!2=[#)

real    0m0,033s

So the full password got recovered with an extra 33 milliseconds…
因此,完整密码在额外的 33 毫秒内恢复……

Variants 变种

As we can only get all the partial hashes if the password is typed moderately slowly (3 char/s), we may occasionally miss a few partial hashes in other scenarios.
由于如果密码键入速度适中(3 char/s),我们只能获得所有部分哈希值,因此在其他情况下,我们可能偶尔会错过一些部分哈希值。

It is easy to design variants of the module tolerating some missing intermediate hashes. Missing the initial 8-char hash would be quite costly to overcome but missing  intermediate hashes would just cost e.g. 92�+1 instead of 92 trials, still quite affordable compared to the massive brute-force of the initial 8-char hash. We will still need 3 or 4 partial hashes in total to make sure we can isolate a few or a single candidate.
很容易设计模块的变体,容忍一些缺失的中间哈希值。缺少初始 8 个字符的哈希值将非常昂贵,但缺少  中间哈希值只会花费例如 92�+1 而不是 92 试验,与初始 8 个字符哈希值的巨大暴力破解相比,仍然相当实惠。我们仍然需要总共 3 或 4 个部分哈希值,以确保我们可以隔离几个或单个候选者。

Another type of variation to consider is to be able to catch cases when the user does not type linearly the password but edits it to insert or mutate some of the chars to make a better password in reaction to the breach notification returned by Passbolt. These cases are also pretty easy to take into consideration and will add only marginal complexity to the recovery process.
要考虑的另一种变体是能够捕获用户不线性键入密码的情况,而是编辑密码以插入或更改某些字符以生成更好的密码,以响应 Passbolt 返回的违规通知。这些情况也很容易考虑,只会给恢复过程增加边际复杂性。

The problems documented here are quite specific to Passbolt’s eagerness to assess the user password while being typed, but similar issues may arise as soon as related passwords are assessed in a small time frame. Actually, while documenting our findings, we stumbled upon a very interesting article by Jack Cable.8 He details the related passwords problem which may occur when a user changes a compromised password or reuses a robust password with only some minor changes on several websites. The principle is that given e.g. N partial hashes of N related passwords, one may theoretically enumerate all candidates of each partial hash and look for candidates quite similar across the N sets. If you are interested in more math and probabilities, we warmly recommend you to read the article.
这里记录的问题非常具体,因为 Passbolt 急于在键入时评估用户密码,但是一旦在短时间内评估相关密码,就会出现类似的问题。实际上,在记录我们的发现时,我们偶然发现了杰克·凯布尔(Jack Cable)的一篇非常有趣的文章。 8 他详细介绍了相关的密码问题,当用户更改泄露的密码或重复使用强大的密码时,只需在几个网站上进行一些小的更改,就会发生这种情况。其原理是,给定 N 个相关密码的 N 个部分哈希,理论上可以枚举每个部分哈希的所有候选者,并在 N 个集合中寻找非常相似的候选者。如果您对更多数学和概率感兴趣,我们热忱推荐您阅读本文。

Jack Cable reported the potential problem to Troy Hunt and to the developers of two password managers: 1Password, Bitwarden. It is very unfortunate that Passbolt was not alerted of the risks at that time, especially given the fact that Passbolt chose a usage of Pwned Passwords that makes the attack highly practical. Not even mentioning that the current implementation of Passbolt doubles its queries to the API for no reason (as the most sagacious of you might have noted in the animated snapshot), a perfect signature for the server to spot Passbolt queries among its traffic.
Jack Cable 向 Troy Hunt 和两个密码管理器的开发人员报告了这一潜在问题:1Password、Bitwarden。非常不幸的是,Passbolt 当时没有收到风险警报,特别是考虑到 Passbolt 选择了使用 Pwned Passwords 使攻击非常实用的事实。甚至没有提到 Passbolt 的当前实现无缘无故地将其对 API 的查询加倍(正如你们中最聪明的人可能在动画快照中注意到的那样),这是服务器在其流量中发现 Passbolt 查询的完美签名。

Who are the potential attackers?
谁是潜在的攻击者?

The only technical requirement is to be able to observe Passbolt plugin queries to the Pwned Password API.
唯一的技术要求是能够观察 Passbolt 插件对 Pwned Password API 的查询。

Obviously, Troy Hunt and Cloudflare (which is massively caching Pwned Passwords queries in its nodes) are in such position. We are not questioning their good faith, just stating facts.
显然,Troy Hunt 和 Cloudflare(在其节点中大量缓存 Pwned Passwords 查询)处于这样的位置。我们不是在质疑他们的诚意,只是在陈述事实。

But also anyone deploying the service based on the downloadable hashes, e.g. within a corporate network.9 Or simply anyone in position to intercept and observe the queries, such as a TLS proxy. In these two examples, we assume a corporate certificate authority has been deployed on the laptops.
还有任何基于可下载哈希值部署服务的人,例如在公司网络内。 9 或者只是任何能够拦截和观察查询的人,例如 TLS 代理。在这两个示例中,我们假设已在笔记本电脑上部署了公司证书颁发机构。

While we only implemented the simple hash functions in our Hashcat PoC, an attacker collecting many partial hashes would add the multiple hashes functions. Remember that by construction, there is no seed and each computed hash can be compared to many intercepted hashes at once. At scale it may even be interesting to build dedicated rainbow tables.
虽然我们在 Hashcat PoC 中只实现了简单的哈希函数,但收集许多部分哈希的攻击者会添加多个哈希函数。请记住,通过构造,没有种子,每个计算的哈希值可以一次与许多截获的哈希值进行比较。在规模上,建造专用的彩虹桌甚至可能很有趣。

1Password and Bitwarden 1Password 和 Bitwarden

1Password uses Pwned Passwords as well, but with the following differences:
1Password 也使用 Pwned Passwords,但有以下区别:

  • opt-in: it needs to be activated in the application settings, cf. Privacy  Check for vulnerable passwords;
    选择加入:需要在应用程序设置中激活它,参见易受攻击的密码的隐私  检查;
  • the API is called only when saving the password, so once it has been typed entirely by the user.
    只有在保存密码时才会调用 API,因此一旦用户完全键入密码。

Bitwarden usage of Pwned Passwords is as follows.
Bitwarden 对 Pwned Passwords 的用法如下。

  • opt-out while creating or changing the master password: it presents an already selected checkbox “Check known data breaches for this password”;
    在创建或更改主密码时选择退出:它会显示一个已选中的复选框“检查此密码的已知数据泄露”;
  • opt-in for every single account password: the user needs to press a button to check known data breaches for this password.
    选择加入每个帐户密码:用户需要按下一个按钮来检查此密码的已知数据泄露。

Alternative protocols 替代协议

Inspired by Pwned Passwords, Google and Stanford published in 2019 another protocol mixing k-anonymity and private set intersection methods.10 The protocol has been implemented in the Google Password Checkup.11 Theoretically it could potentially be misused as well: a partial 2-byte hash is sent to the server. Imagine a client implementing the same logic as Passbolt: after typing 12 chars of a strong password (entropy of ���2(9212)=78.3 bits), it could be recovered from 5 intercepted 2-byte hashes (5 ���2(2562)=80 bits). But two factors temper this risk: here the hash is computed over a concatenation of the username and the password, and the hash algorithm is Argon2, which is intentionally slow to compute. So an actual brute-force would be way less practical. Note that the 2019 paper hinted at a plan to migrate to a zero-password leakage variant of the protocol, avoiding completely the k-anonymity leakage. It is unclear if it has been done as of today.12
受 Pwned Passwords 的启发,谷歌和斯坦福大学于 2019 年发布了另一种混合了 k-anonyity 和私有集交集方法的协议。 10 该协议已在 Google 密码检查中实施。 11 从理论上讲,它也可能被滥用:将部分 2 字节哈希发送到服务器。想象一下,一个客户端实现了与 Passbolt 相同的逻辑:在输入 12 个字符的强密码( ���2(9212)=78.3 位熵)后,可以从 5 个截获的 2 字节哈希( 5 ���2(2562)=80 位)中恢复它。但有两个因素缓和了这种风险:这里的哈希是通过用户名和密码的串联计算的,而哈希算法是 Argon2,它故意计算得很慢。因此,实际的蛮力将不那么实用。请注意,2019 年的论文暗示了迁移到协议的零密码泄漏变体的计划,完全避免了 k 匿名泄漏。目前尚不清楚截至今天是否已经完成。 12

Let’s also mention the paper of Cloudflare and Cornell University, which is exploring more advanced protocols (FSB and IDB) to further reduce information leakage in the APIs, and the subsequent paper of Cloudflare, Cornell University and University of Wisconsin-Madison, aiming at proactively warning users if similar passwords have been leaked and could jeopardize a user password by credential tweaking attack.13,14 Their MIGP protocol (Might I Get Pwned, a nod to Troy’s Have I Been Pwned) is extensively described on Cloudflare blog.15
我们还要提到 Cloudflare 和康奈尔大学的论文,该论文正在探索更高级的协议(FSB 和 IDB)以进一步减少 API 中的信息泄露,以及随后 Cloudflare、康奈尔大学和威斯康星大学麦迪逊分校的论文,旨在主动警告用户,如果类似的密码被泄露,并可能通过凭据调整攻击危及用户密码。 13 14 他们的 MIGP 协议(Might I Get Pwned,对 Troy 的 Have I Been Pwned 的致敬)在 Cloudflare 博客上得到了广泛的描述。 15

Conclusion 结论

Pwned Passwords is an interesting example of the dangers of designing a cryptographic protocol with a deemed acceptable risk, being used in an unforeseen way such that the risk becomes suddenly unacceptable.
Pwned Passwords 是一个有趣的例子,说明设计一个被认为具有可接受风险的加密协议的危险,以不可预见的方式使用,使得风险突然变得不可接受。

We demonstrated that for an entity able to spy on Pwned Passwords API usage by Passbolt while a user types in a new password of 11 chars or more, it is surprisingly doable to recover the password.
我们证明,当用户输入 11 个字符或更多字符的新密码时,对于能够监视 Passbolt 使用 Pwned Passwords API 的实体,恢复密码是令人惊讶的可行。

That said, such a service, when used properly, is still very valuable for the password managers users to get alerted of compromised passwords.
也就是说,如果使用得当,这样的服务对于密码管理器用户收到密码泄露的警报仍然非常有价值。

Passbolt chose to fix the issue in the browser extensions v4.6.2 by querying the Pwned Passwords API
Passbolt 选择通过查询 Pwned Passwords API 来修复浏览器扩展 v4.6.2 中的问题

  • only when saving a new or updated password;
    仅在保存新密码或更新密码时;
  • and only if the password estimated entropy is larger than 60 bits (which is about 10 chars if using mixed symbols, or 20 chars if only using numbers, for example).
    并且仅当密码估计的熵大于 60 位时(例如,如果使用混合符号,则约为 10 个字符,如果仅使用数字,则约为 20 个字符)。

At the time Jack Cable alerted Troy Hunt about some risk of information leakage via queries about related passwords, adding warnings to the service documentation was discarded as part of an assumed trade-off.8
当杰克·凯布尔(Jack Cable)通过查询相关密码向特洛伊·亨特(Troy Hunt)发出有关信息泄露风险的警报时,作为假设权衡的一部分,在服务文档中添加警告被丢弃了。 8

Let us hope that this time the risk is tangible enough for Troy Hunt to document bad usages and associated risks of his API.
让我们希望这一次的风险足够明显,让 Troy Hunt 能够记录他的 API 的不良用法和相关风险。

Update – 2023-04-20 更新 – 2023-04-20

After the initial publication of this blog post on April 17, 2024, we notified Troy Hunt, and as a result, the Pwned Passwords APIv3 documentation has been updated to include a warning about the risks of incremental searches.16
在 2024 年 4 月 17 日首次发布这篇博文后,我们通知了 Troy Hunt,因此,Pwned Passwords APIv3 文档已更新,以包含有关增量搜索风险的警告。 16

Acknowledgements 确认

Communication with Passbolt was quite a model of vendor reactivity and openness. We thank them for their prompt reaction to all our inquiries. They provided clear communication throughout the process, and actively collaborated to ensure a smooth resolution.
与 Passbolt 的沟通是供应商反应性和开放性的典范。我们感谢他们对我们的所有询问作出的迅速反应。他们在整个过程中提供了清晰的沟通,并积极合作以确保顺利解决。

We also thank Troy for his quick update of the API documentation!
我们还要感谢 Troy 对 API 文档的快速更新!

And finally, thanks to our colleagues at Quarkslab for proofreading this article!
最后,感谢 Quarkslab 的同事对本文的校对!


Disclosure timeline 披露时间表

  • 2024/03/22 – Vulnerability reported to Passbolt
    2024/03/22 – 向 Passbolt 报告漏洞
  • 2024/03/23 – Acknowledgement of the vulnerability
    2024/03/23 – 确认漏洞
  • 2024/03/27 – Informal and open discussion IRL with Passbolt CEO at InCyber 2024, Lille. Thanks for the hoodie 😉
    2024/03/27 – 在里尔举行的 InCyber 2024 上与 Passbolt 首席执行官进行非正式和公开讨论。感谢您的连帽衫;)
  • 2024/03/28 – Passbolt communicated their action plan, reaching out for our feedback on their remediation strategy
    2024/03/28 – Passbolt 传达了他们的行动计划,并征求了我们对他们补救策略的反馈
  • 2024/03/28 – Release of passbolt_styleguide v4.6.3 implementing the actual fixes
    2024/03/28 – 发布 passbolt_styleguide v4.6.3 实现实际修复
  • 2024/03/29 – Release of passbolt-browser-extension v4.6.2 mentioning the 15 implemented fixes in the changelog
    2024/03/29 – 发布 passbolt-browser-extension v4.6.2,在更新日志中提到了 15 个已实现的修复
  • 2024/03/30 – Chrome extension v4.6.2 published
    2024/03/30 – Chrome 扩展 v4.6.2 发布
  • 2024/04/03 – Firefox extension v4.6.2 published
    2024/04/03 – Firefox 扩展 v4.6.2 发布
  • 2024/04/04 – Edge extension v4.6.2 published
    2024/04/04 – Edge 扩展 v4.6.2 发布
  • 2024/04/04 – Quarkslab sends some comments on the proposed strategy, calling for opt-out option and clearer messages to the user when the password is evaluated as very weak (< 60 bits) but not sent to Pwned Passwords API
    2024/04/04 – Quarkslab 对提议的策略发送了一些评论,当密码被评估为非常弱(< 60 位)但未发送到 Pwned Passwords API 时,要求用户选择退出选项和更清晰的消息
  • 2024/04/04 – Passbolt indicates opt-out is available to admins on the Pro/Cloud versions and will be extended to the Community Edition as well. User message will be made clearer as well.
    2024/04/04 – Passbolt 表示 Pro/Cloud 版本的管理员可以选择退出,并将扩展到社区版。用户消息也将更加清晰。
  • 2024/04/11 – Windows application v1.0 pushed
    2024/04/11 – Windows 应用 v1.0 推送
  • 2024/04/11 – Release of passbolt_api v4.6.2This version is a targeted security release of both the API and the browser extension focusing on fixing security issues reported by security researchers.
    2024/04/11 – 发布 passbolt_api v4.6.2,此版本是 API 和浏览器扩展的有针对性的安全版本,专注于修复安全研究人员报告的安全问题。
  • 2024/04/16 – CVE requested by Passbolt
    2024/04/16 – Passbolt 请求的 CVE
  • 2024/04/17 – Synchronized publication of this post and the Passbolt Incident Report
    2024/04/17 – 此帖子和 Passbolt 事件报告同步发布
  • 2024/04/17 – Quarkslab brings this work to the attention of Troy Hunt and suggests a warning about such type of misuse in the APIv3 documentation
    2024/04/17 – Quarkslab 提请 Troy Hunt 注意这项工作,并在 APIv3 文档中提出了有关此类滥用的警告
  • 2024/04/20 – APIv3 documentation has been updated to include such warning, and this blog post has been updated accordingly 16
    2024/04/20 – APIv3 文档已更新以包含此类警告,此博文也已相应 16 更新

  1. Introducing 306 Million Freely Downloadable Pwned Passwords 
    推出 3.06 亿个可免费下载的密码 ↩

  2. Validating Leaked Passwords with k-Anonymity 
    使用 k-Anonymity ↩ 验证泄露的密码

  3. I’ve Just Launched “Pwned Passwords” V2 With Half a Billion Passwords for Download 
    我刚刚推出了“Pwned Passwords”v2,其中包含五亿个密码可供下载 ↩

  4. API v2 – Pwned Passwords overview – Searching by range 
    API v2 – Pwned Passwords 概述 – 按范围↩搜索

  5. API v3 – Pwned Passwords overview – Searching by range 
    API v3 – Pwned Passwords 概述 – 按范围↩搜索

  6. NIST Special Publication 800-63B – Digital Identity Guidelines – Authentication and Lifecycle Management 
    NIST 特别出版物 800-63B – 数字身份准则 – 身份验证和生命周期管理 ↩

  7. Hashcat Plugin Development Guide 
    Hashcat 插件开发指南 ↩

  8. Attacks on applications of k-anonymity for password retrieval 
    对用于密码检索↩ ↩的 k-anonymity 应用程序的攻击

  9. How to create an offline self-hosted haveibeenpwned API service 
    如何创建离线自承载 haveibeenpwned API 服务 ↩

  10. Protecting accounts from credential stuffing with password breach alerting 
    使用密码泄露警报↩保护帐户免受撞库攻击

  11. Protect your accounts from data breaches with Password Checkup 
    通过密码检查↩保护您的帐户免遭数据泄露

  12. Why password managers are your safety net during a data breach 
    为什么密码管理器是您在数据泄露↩期间的安全网

  13. Protocols for Checking Compromised Credentials 
    用于检查泄露凭据↩的协议

  14. Might I Get Pwned: A Second Generation Compromised Credential Checking Service 
    我可能会被 Pwned:第二代受损的凭据检查服务 ↩

  15. Privacy-Preserving Compromised Credential Checking 
    保护隐私的泄露凭据检查 ↩

  16. API v3 – Incremental Searching 
    API v3 – 增量搜索 ↩ ↩

原文始发于Quarkslab’s Website:Passbolt: a bold use of HaveIBeenPwned

版权声明:admin 发表于 2024年4月21日 下午2:44。
转载请注明:Passbolt: a bold use of HaveIBeenPwned | CTF导航

相关文章