AN INTRODUCTION TO REVERSE ENGINEERING .NET AOT APPLICATIONS

About a month ago, we started seeing reports of activity from DuckTail, a cybercrime outfit reportedly based in Vietnam. Detonating one of the samples, we observed that a new account was being created on the analysis machine, followed by an RDP connection from an operator who downloaded additional tools, stole cookies, etc.

The attack itself had few unique qualities, save for the use of a little-known feature of .NET: AOT compilation. We decided to leave the cybercriminals aside and look into the inner workings of such programs in more detail.

WHAT IS AOT?

Background on .NET

Most reverse engineers have a good opinion of .NET programs. State-of-the-art decompilers handle them very well, reconstructing the original source code with a high degree of accuracy in most cases. Sure, they can always be obfuscated, but as far as reverse engineering goes, .NET has remained on the pleasant end of the spectrum.

Managed platforms such as .NET tend to leave a lot more information in the produced binaries than their native counterparts, and that information is extremely helpful to the analyst. Source code written by the program author is converted to “Microsoft Intermediate Language” (MSIL) during the compilation phase, and later executed by a dedicated program (the “runtime”), which translates it to machine code on the fly. The advantage of this approach is that (in theory) a program written in such a language can work on any OS and CPU architecture, as long as a runtime is available for them.

Even though the “just-in-time” (JIT) compiler can leverage context-specific information to perform optimizations during execution, it is generally accepted that this “free” portability comes at the cost of slightly reduced performance compared to programs compiled directly to architecture-specific machine code.

Introducing “ahead of time” (AOT) compilation

But what if the multiplatform aspect is of no interest to the developers? What if they know in advance which systems their application will be deployed to? In that case, there is little reason to live with the overhead caused by the intermediate language. If the development toolchain supports it, they might as well produce native binaries directly, as they would if they were writing C or C++ code. This is called AOT (“ahead of time”) compilation.

This is bad news for reverse-engineers, as the disappearance of MSIL in the chain means we have no alternative but to start our analysis at the assembly level.

A quick survey on VirusTotal shows that as of today, there are 544 “clean” AOT binaries on the platform (as in, detected by no antivirus program) against 1667 malicious ones (detected by 3 engines or more). VirusTotal’s selection bias aside, an unscientific estimate is that around 75% of current .NET AOT samples are malicious (1667 out of 544 + 1667 = 2211), which makes AOT compilation a decent indicator of malicious behavior. Other researchers have also noted the presence of AOT malware in the wild.

HOW TO RECOGNIZE .NET AOT BINARIES?

As far as I can tell, PE files generated this way have a couple of recognizable characteristics:

  • Only one export (DotNetRuntimeDebugHeader)
  • A section named .managed

Such files can be looked up easily on VirusTotal. In addition, you can find the exact .NET version which was used to compile a sample by looking for a string matching the following regular expression in the binary: ([.\-a-z0-9]*?)\d\+[a-f0-9]{40}\b

For instance: 8.0.0+5535e31a712343a63f5d7d796cd874e563e5ac14, where the last part is the hash of the corresponding commit in the .NET runtime source code, which is always useful for detecting tampered compilation timestamps.
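
The indicators above lend themselves to a quick triage script. The following is a minimal Python sketch, not the tooling used for this post. The helper names (section_names, triage) are made up for the illustration, the version pattern is a stricter variant of the expression above rather than the original, the string is assumed to be stored as plain ASCII, and the export is only checked through a substring search, which is a heuristic (a proper check would parse the export directory, as the Yara rule at the end of this post does).

```python
import re
import struct

# Assumption: stricter variant of the article's expression that captures
# the version and the 40-hex-digit commit hash separately.
VERSION_RE = re.compile(
    rb"(\d+\.\d+\.\d+[.\-a-z0-9]*)\+([a-f0-9]{40})(?![a-f0-9])"
)


def section_names(data: bytes) -> list[str]:
    """Return the section names of a PE image, or [] if it is not a PE."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return []
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)  # e_lfanew
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        return []
    if len(data) < pe_offset + 24:
        return []
    # COFF header follows the 4-byte signature:
    # NumberOfSections at +6, SizeOfOptionalHeader at +20.
    (num_sections,) = struct.unpack_from("<H", data, pe_offset + 6)
    (opt_size,) = struct.unpack_from("<H", data, pe_offset + 20)
    table = pe_offset + 24 + opt_size
    names = []
    for i in range(num_sections):
        raw = data[table + 40 * i: table + 40 * i + 8]  # 40-byte headers
        if len(raw) < 8:
            break  # truncated file
        names.append(raw.rstrip(b"\0").decode("ascii", "replace"))
    return names


def triage(data: bytes) -> dict:
    """Collect the .NET AOT indicators described in this section."""
    match = VERSION_RE.search(data)
    return {
        "has_managed_section": ".managed" in section_names(data),
        # Heuristic only: the export name appears somewhere in the file.
        "mentions_debug_header": b"DotNetRuntimeDebugHeader" in data,
        "runtime_version": match.group(1).decode() if match else None,
        "runtime_commit": match.group(2).decode() if match else None,
    }
```

A typical use would be triage(open(path, "rb").read()) on each sample of a collection, keeping the files for which both indicators are present.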

SETTING UP AOT FOR TEST PROJECTS

Setting up a blank project to get acquainted with AOT code is not completely trivial, especially if you’re unfamiliar with Visual Studio. There are a few prerequisites:

  • Visual Studio 2022 (older versions offer no support for AOT)
  • Selecting the “Desktop development with C++” component during setup
  • .NET SDK (at least version 7).

Create a new C# “Console Application” project and make sure to check “Enable native AOT publish” (this option is only available in .NET 8 and later; for version 7 you must manually add the property <PublishAot>true</PublishAot> to your .csproj file).

If you haven’t touched .NET for a while, you’ll notice that the “Hello World” template is now a one-liner: the Program class declaration is now implicit. Feel free to restore the explicit code, as it makes things a little clearer when we look at the corresponding assembly code. In order to generate the AOT binary, right-click on your project and select “Publish”. For our purposes, the easiest option seems to be publishing to a folder on disk. There are a few more options to specify for the build, mainly the target CPU architecture (x64 in this case).

Once this is taken care of, clicking the “Publish” button should generate a self-contained binary in the specified folder. Our one-line “Hello World” project weighs around 1.2 MB (3 MB when optimizations are disabled), which is terrible news because it means the program embeds a lot of library code.

IDA PRO AND AOT IN PRACTICE

Looking at sample programs leads us to a few observations:

  • .NET AOT code looks a lot like C++ code, which is no surprise since Visual Studio requires the C++ component to generate AOT files.
  • The calling convention for x64 appears to be pretty standard, using the registers RCX, RDX, R8 and R9 and then the stack to pass arguments.
  • IDA doesn’t have signatures for .NET runtime functions compiled ahead of time, so nothing is recognized.
  • Pointers to strings lead to uninitialized memory in a section called hydrated 😱

The latter phenomenon can be traced back to an optimization which aims at reducing the binary size during the compilation step. What it means for us is that we will be forced to use a debugger to figure out which strings are being manipulated at runtime.

Identifying library functions

Let’s assume we’re now working on an unknown .NET AOT sample. Our first order of business is to obtain names for the library functions which represent over 95% of the binary. Without those, it’s safe to say that we’re not going to make it.

To achieve this, we follow an approach similar to what Hex-Rays has been doing with Golang and their go2pat utility: generate a .NET AOT binary which contains all possible imports, then extract all symbols and generate byte patterns for them. Our approach doesn’t need to be as generic, however: we know which .NET version we’re targeting thanks to a string present in the target binary (see the previous section). What we need next is to create a binary containing as many library functions as possible so we can make signatures for them. Unfortunately for us, the AOT compiler trims all unused code and there is no way to turn this off.

We can work around this issue, because even though we need to use the functions we’re importing, there’s no need to do anything meaningful with them to prevent the compiler from optimizing them out. But how to generate such code? I have zero experience writing .NET, but you know who does? Language models. After a few incremental queries, I generated a source file we can compile into a 10MB AOT binary – along with its PDB file containing all associated symbols. There’s probably a lot missing still, but this should be plenty of library code to begin with.

Creating IDA Pro signatures

The next step is to create signatures for known .NET AOT functions. For this, we can use FLARE’s idb2pat.py script to generate a .pat file for the ~31000 functions contained in my test program. The .pat file then needs to be converted to a .sig file that IDA can load; I’ll detail the steps here because I know this isn’t something reverse engineers do on a daily basis (more detailed explanations can be found here).

First, you have to download the FLAIR utilities from the Hex-Rays download center. At this point, we have a .pat file which contains entries like this:

488D05........488D0D........837908017501C3488BD0E9.............. 00 0000 001D :00000000 __GetNonGCStaticBase_System_Collections_System_SR :00000015 loc_140001015 ^00000003 ?__NONGCSTATICS@System_Collections_System_SR@@ ^0000000A off_14094E4D0 ^00000019 S_P_CoreLib_System_Runtime_CompilerServices_ClassConstructorRunner__CheckStaticClassConstructionReturnNonGCStaticBase

Each line is essentially a byte pattern like those you would find today in Yara rules, followed by lengths and checksums, the name of the function as well as referenced names. The full documentation for this format can be found in pat.txt in the FLAIR folder. The following command compiles those human-readable-ish patterns into binary signatures:

sigmake.exe input.pat output.sig -n "[description]"
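
As an aside, the .pat layout described above can be made more concrete with a rough Python sketch that splits an entry into its fields and checks a byte pattern against raw code. It is based on a reading of pat.txt (leading bytes, CRC length, CRC16, total length, then names introduced by ':' and references introduced by '^'). The PatEntry, parse_pat_line and matches names are illustrative, and less common fields such as trailing bytes are simply ignored.

```python
from dataclasses import dataclass, field


@dataclass
class PatEntry:
    pattern: str       # hex digits of the leading bytes, ".." = wildcard byte
    crc_length: int    # number of bytes covered by the CRC16
    crc16: int
    total_length: int  # total function length in bytes
    names: list = field(default_factory=list)  # (offset, name) defined here
    refs: list = field(default_factory=list)   # (offset, name) referenced


def parse_pat_line(line: str) -> PatEntry:
    """Split one .pat entry into its fields (sketch, ignores tail bytes)."""
    fields = line.split()
    entry = PatEntry(
        fields[0], int(fields[1], 16), int(fields[2], 16), int(fields[3], 16)
    )
    rest = iter(fields[4:])
    for token in rest:
        if token[0] in ":^":
            # A trailing '@' after the offset marks a local name.
            offset = int(token[1:].rstrip("@"), 16)
            name = next(rest)
            target = entry.names if token[0] == ":" else entry.refs
            target.append((offset, name))
    return entry


def matches(pattern: str, code: bytes) -> bool:
    """Check whether code starts with the pattern, honoring '..' wildcards."""
    pairs = [pattern[i:i + 2] for i in range(0, len(pattern), 2)]
    if len(code) < len(pairs):
        return False
    return all(p == ".." or int(p, 16) == b for p, b in zip(pairs, code))
```

Applied to the entry shown earlier, parse_pat_line reports a total length of 0x1D bytes, two names defined at offsets 0 and 0x15, and three references at offsets 3, 0xA and 0x19. The wildcards cover the relocated operands, which is why the same pattern keeps matching when the function is linked at a different address.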

However, odds are that sigmake will not succeed right away: idb2pat created the patterns automatically, and collisions are likely to happen (~800 in my case). The process for resolving them involves editing the .exc file created by sigmake in the same folder, and manually indicating which function to keep:

;--------- (delete these lines to allow sigmake to read this file)
; add '+' at the start of a line to select a module
; add '-' if you are not sure about the selection
; do nothing if you want to exclude all modules

memcpy_CopyDown_Intel                               BD 6EF1 [...]
memcpy_CopyDown_amd                                 BD 6EF1 [...]

Bool__System_IConvertible_ToByte                    00 0000 [...]
Bool__System_IConvertible_ToInt16                   00 0000 [...]
Bool__System_IConvertible_ToInt32                   00 0000 [...]
Bool__System_IConvertible_ToSByte                   00 0000 [...]
Bool__System_IConvertible_ToUInt16                  00 0000 [...]
Bool__System_IConvertible_ToUInt32                  00 0000 [...]
Bool__System_IConvertible_ToUInt64                  00 0000 [...]

[...]

As you will see, in most cases the colliding functions are similar enough that losing the name due to ambiguity would be a shame. At this stage, I just use the search and replace function of a text editor to add a “+” character at the start of each line following a blank one (replace ^$\r\n by \r\n\+). Do not forget to delete the first lines of the file as instructed, and run sigmake again.
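
The same edit can be scripted. Below is a small Python sketch, assuming that collision groups in the .exc file are separated by blank lines and that comment lines start with a semicolon, as in the excerpt above. The function name is hypothetical, and the output uses plain \n line endings, which may need adjusting. It blindly keeps the first function of every group, which is exactly as crude as the search and replace.

```python
def select_first_of_each_group(exc_text: str) -> str:
    """Drop the comment header and mark the first entry of each group."""
    out = []
    at_group_start = True
    for line in exc_text.splitlines():
        if line.startswith(";"):
            continue  # header comments must go before sigmake reads the file
        if not line.strip():
            at_group_start = True  # a blank line opens a new collision group
            out.append("")
            continue
        if at_group_start and line[0] not in "+-":
            line = "+" + line  # '+' selects this function for the group
        at_group_start = False
        out.append(line)
    return "\n".join(out).strip("\n") + "\n"
```

Writing the result back over the .exc file and running sigmake a second time should then produce the .sig file, provided no other error is reported.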

We can now place the obtained .sig file in the $IDAUSR/sig/pc folder and apply it manually via File > Load File > FLIRT Signature File.

The results are really satisfying, as the navigator bar at the top of the screen goes from this:

[Image: IDA navigator bar before applying the signature file]

…to this:

[Image: IDA navigator bar after applying the signature file]

You can find my .sig here, but I would only recommend it for binaries built with .NET 7.0.

What next?

Now that the library functions are identified properly, the .NET runtime source repository can serve as documentation. Let’s start with RhpNewFast, a crucial function tasked with allocating objects. It receives a pointer to a MethodTable structure which represents the object to allocate. We need to figure out what is being instantiated, but obviously we do not have names for static structures. The decompiled code is littered with calls like:

v4 = RhpNewFast(&stru_140261200);

MethodTable structures (or EETypes) contain basic information about the corresponding object in their header (such as its size, related types, and flags controlling the garbage collector behavior), followed by a function table reminiscent of C++ vtables.
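
To give an idea of what walking such a structure could look like, here is a hedged Python sketch. The field layout is an assumption taken from a reading of the NativeAOT runtime sources for x64, not something established in this post, and it should be checked against the commit identified in the target binary. The first 32 bits are kept as one raw value because their split between flags and component size is believed to differ between .NET versions. All names (MethodTableHeader, parse_method_table) are illustrative.

```python
import struct
from typing import NamedTuple


class MethodTableHeader(NamedTuple):
    raw_flags: int         # first 32 bits, left uninterpreted on purpose
    base_size: int         # size of an instance, in bytes
    related_type: int      # pointer to a related MethodTable (e.g. the base)
    num_vtable_slots: int
    num_interfaces: int
    hash_code: int
    vtable: tuple          # function pointers following the header


# Assumed x64 layout: uint32, uint32, 8-byte pointer, uint16, uint16, uint32,
# immediately followed by num_vtable_slots 8-byte function pointers.
_HEADER = struct.Struct("<IIQHHI")


def parse_method_table(blob: bytes, offset: int = 0) -> MethodTableHeader:
    """Decode a MethodTable header and its vtable from raw bytes."""
    fields = _HEADER.unpack_from(blob, offset)
    slots = struct.unpack_from(f"<{fields[3]}Q", blob, offset + _HEADER.size)
    return MethodTableHeader(*fields, slots)
```

Fed with the bytes found at an address such as stru_140261200, the vtable entries can then be looked up among the function names recovered through the signature file.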

[Image: a MethodTable structure and its function table as displayed in IDA]

Thankfully these methods are identified properly with our signature file, and we can deduce that the object being instantiated is some sort of Exception (specifically, OutOfMemoryException even though this is not visible here).

Can we do better? .NET actually has RTTI capabilities. I’m sure there’s a way to write a Python script that recovers all type information (I haven’t dug into how such information is stored), but in the meantime we can score quick wins using a debugger. Intercept calls to S_P_CoreLib_System_Type__GetTypeFromEETypePtr (or its wrapper, Object__GetType), put a pointer to the desired MethodTable in RCX, and wait for the call to NativeFormatRuntimeNamedTypeInfo__ToString to get your answer in the return value (just follow RAX in the heap).

WRAP-UP: A GENERIC APPROACH FOR UNKNOWN FRAMEWORKS

There’s no doubt that .NET AOT programs will give reversers a hard time, especially when compared to their MSIL counterparts. However, using the techniques described above, we were able to recover symbols as well as type information. This brings us back to a situation close to that of analyzing programs in Go (but with a functional decompiler). From here on I’d recommend following the approach I use for Golang, i.e. using a debugger to inspect calls to standard library functions and trying to reconstruct the original code in this way.

This blog post also demonstrates a generic approach that can be followed when attempting to analyze programs written in a new language (which .NET AOT is, from our reversing perspective). I always tend to go for quick and dirty approaches at first, but if you’d like to know more about AOT internals, I recommend checking out Michal Strehovsky’s blog post on the subject.

As a closing note, I will share my opinion that AOT malware will become more commonplace in the future, because of two main factors:

  • Microsoft is making it easier with new .NET versions to publish AOT applications;
  • It’s a free “make my app annoying to reverse” button, and word is simply going to spread.

Feel free to reach out on Twitter if you have other useful techniques when analyzing these binaries!

INDICATORS OF COMPROMISE

9ba3b2ce74d60e0960be0e2544f7497339f1f115db93afb94e5512a8c990f63f BotGetKeyChromium
268aec06d44359b21bfe1c0c13abb75d1e37add2c8512acb6e0a0835b939b9b9 BotGetKeyChromium
9b8a1424cd299629e8dccdb1c7c4f3caad78fecec083c9e27b6a3dc281d5b1ca BotGetKeyChromium
6d689bfc12d18a6e4dae9309e3260f71d93de1fb9864f8545cbc30a24e181b1f BotGetKeyChromium
7fd054a810f5d942bc18d91d8e31285b484982bf5c8ace0c12c8ad64b0f183d4 BotGetKeyChromium
1e082ed9733b033a0c9b27a0d1146397771b350b013ea3e9fba228e1400a263f ResetMainBot
ab8d86ac204d9c9ae689d87b9d2f7319b38125f7659ff2ba7cbfed13cbf0a13d ResetMainBot

Yara rules

import "pe"
rule NET_AOT {
    meta:
        description = "Detects .NET binaries compiled ahead of time (AOT)"
        author = "HarfangLab"
        distribution = "TLP:WHITE"
        version = "1.0"
        last_modified = "2024-01-08"
        hash = "5b922fc7e8d308a15631650549bdb00c"

    condition:
        pe.is_pe and pe.number_of_exports == 1 and
        pe.exports("DotNetRuntimeDebugHeader") and
        pe.section_index(".managed") >= 0
}

Originally published by IVAN KWIATKOWSKI: AN INTRODUCTION TO REVERSE ENGINEERING .NET AOT APPLICATIONS

Copyright notice: posted by admin on January 22, 2024 at 6:56 PM.
When reposting, please credit: AN INTRODUCTION TO REVERSE ENGINEERING .NET AOT APPLICATIONS | CTF导航
