Ghidra nanoMIPS ISA module


Introduction

In late 2023 and early 2024, the NCC Group Hardware and Embedded Systems practice undertook an engagement to reverse engineer baseband firmware on several smartphones. This included MediaTek 5G baseband firmware based on the nanoMIPS architecture. While we were aware of some nanoMIPS modules for Ghidra having been developed in private, there was no publicly available reliable option for us to use at the time, which led us to develop our own nanoMIPS disassembler and decompiler module for Ghidra.

In the interest of time, we focused on implementing the features and instructions that we encountered on actual baseband firmware, and left complex P-Code instruction emulation unimplemented where it was not yet needed. Though the module is a work in progress, it still decompiles the majority of the baseband firmware we’ve analyzed. Combined with debug symbol information included with some MediaTek firmware, it has been very helpful in the reverse engineering process.

Here we will demonstrate how to load a MediaTek baseband firmware into Ghidra for analysis with our nanoMIPS ISA module.

Target firmware

For an example firmware to analyze, we looked up phones likely to include a MediaTek SoC with 5G support. Some relatively recent Motorola models were good candidates. (These devices were not part of our client engagement.)

We found many Android firmware images on https://mirrors.lolinet.com/firmware/lenomola/, including an image for the Motorola Moto Edge 2022, codename Tesla: https://mirrors.lolinet.com/firmware/lenomola/tesla/official/. This model is based on a MediaTek Dimensity 1050 (MT6879) SoC.

There are some carrier-specific variations of the firmware. We’ll randomly choose XT2205-1_TESLA_TMO_12_S2ST32.71-118-4-2-6_subsidy-TMO_UNI_RSU_QCOM_regulatory-DEFAULT_cid50_R1_CFC.xml.zip.

Extracting nanoMIPS firmware

The actual nanoMIPS firmware is in the md1img.img file from the Zip package.

To extract the contents of md1img.img, we wrote Kaitai Struct definitions along with simple Python wrapper scripts that run the parser and write each section to its own file. The .ksy definitions can also be used to explore these files interactively in the Kaitai IDE.

Running md1_extract.py with an --outdir option will extract the files contained within md1img.img:

$ ./md1_extract.py ../XT2205-1_TESLA_TMO_12_S2STS32.71-118-4-2-6-3_subsidy-TMO_UNI_RSU_QCOM_regulatory-DEFAULT_cid50_CFC/md1img.img --outdir ./md1img_out/
extracting files to: ./md1img_out
md1rom: addr=0x00000000, size=43084864
        extracted to 000_md1rom
cert1md: addr=0x12345678, size=1781
        extracted to 001_cert1md
cert2: addr=0x12345678, size=988
        extracted to 002_cert2
md1drdi: addr=0x00000000, size=12289536
        extracted to 003_md1drdi
cert1md: addr=0x12345678, size=1781
        extracted to 004_cert1md
cert2: addr=0x12345678, size=988
        extracted to 005_cert2
md1dsp: addr=0x00000000, size=6776460
        extracted to 006_md1dsp
cert1md: addr=0x12345678, size=1781
        extracted to 007_cert1md
cert2: addr=0x12345678, size=988
        extracted to 008_cert2
md1_filter: addr=0xffffffff, size=300
        extracted to 009_md1_filter
md1_filter_PLS_PS_ONLY: addr=0xffffffff, size=300
        extracted to 010_md1_filter_PLS_PS_ONLY
md1_filter_1_Moderate: addr=0xffffffff, size=300
        extracted to 011_md1_filter_1_Moderate
md1_filter_2_Standard: addr=0xffffffff, size=300
        extracted to 012_md1_filter_2_Standard
md1_filter_3_Slim: addr=0xffffffff, size=300
        extracted to 013_md1_filter_3_Slim
md1_filter_4_UltraSlim: addr=0xffffffff, size=300
        extracted to 014_md1_filter_4_UltraSlim
md1_filter_LowPowerMonitor: addr=0xffffffff, size=300
        extracted to 015_md1_filter_LowPowerMonitor
md1_emfilter: addr=0xffffffff, size=2252
        extracted to 016_md1_emfilter
md1_dbginfodsp: addr=0xffffffff, size=1635062
        extracted to 017_md1_dbginfodsp
md1_dbginfo: addr=0xffffffff, size=1332720
        extracted to 018_md1_dbginfo
md1_mddbmeta: addr=0xffffffff, size=899538
        extracted to 019_md1_mddbmeta
md1_mddbmetaodb: addr=0xffffffff, size=562654
        extracted to 020_md1_mddbmetaodb
md1_mddb: addr=0xffffffff, size=12280622
        extracted to 021_md1_mddb
md1_mdmlayout: addr=0xffffffff, size=8341403
        extracted to 022_md1_mdmlayout
md1_file_map: addr=0xffffffff, size=889
        extracted to 023_md1_file_map

The most relevant files are:

  • md1rom is the nanoMIPS firmware image
  • md1_file_map provides slightly more context on the md1_dbginfo file: its original filename is DbgInfo_NR16.R2.MT6879.TC2.PR1.SP_LENOVO_S0MP1_K6879V1_64_MT6879_NR16_TC2_PR1_SP_V17_P38_03_24_03R_2023_05_19_22_31.xz
  • md1_dbginfo is an XZ compressed binary file containing debug information for md1rom, including symbols

Extracting debug symbols

md1_dbginfo is another binary file format containing symbols and filenames with associated addresses. We’ll rename it and decompress it based on the filename from md1_file_map:

$ cp 018_md1_dbginfo DbgInfo_NR16.R2.MT6879.TC2.PR1.SP_LENOVO_S0MP1_K6879V1_64_MT6879_NR16_TC2_PR1_SP_V17_P38_03_24_03R_2023_05_19_22_31.xz
$ unxz DbgInfo_NR16.R2.MT6879.TC2.PR1.SP_LENOVO_S0MP1_K6879V1_64_MT6879_NR16_TC2_PR1_SP_V17_P38_03_24_03R_2023_05_19_22_31.xz
$ hexdump DbgInfo_NR16.R2.MT6879.TC2.PR1.SP_LENOVO_S0MP1_K6879V1_64_MT6879_NR16_TC2_PR1_SP_V17_P38_03_24_03R_2023_05_19_22_31 | head
00000000  43 41 54 49 43 54 4e 52  01 00 00 00 98 34 56 00  |CATICTNR.....4V.|
00000010  43 41 54 49 01 00 00 00  00 00 00 00 4e 52 31 36  |CATI........NR16|
00000020  2e 52 32 2e 4d 54 36 38  37 39 2e 54 43 32 2e 50  |.R2.MT6879.TC2.P|
00000030  52 31 2e 53 50 00 4d 54  36 38 37 39 5f 53 30 30  |R1.SP.MT6879_S00|
00000040  00 4d 54 36 38 37 39 5f  4e 52 31 36 2e 54 43 32  |.MT6879_NR16.TC2|
00000050  2e 50 52 31 2e 53 50 2e  56 31 37 2e 50 33 38 2e  |.PR1.SP.V17.P38.|
00000060  30 33 2e 32 34 2e 30 33  52 00 32 30 32 33 2f 30  |03.24.03R.2023/0|
00000070  35 2f 31 39 20 32 32 3a  33 31 00 73 00 00 00 2b  |5/19 22:31.s...+|
00000080  ed 53 00 49 4e 54 5f 56  65 63 74 6f 72 73 00 4c  |.S.INT_Vectors.L|
00000090  08 00 00 54 08 00 00 62  72 6f 6d 5f 65 78 74 5f  |...T...brom_ext_|
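
The same check can be done without renaming the file, since Python's lzma module does not care about the file extension. The sketch below decompresses the start of the XZ stream and compares it with the eight bytes visible at offset 0 of the hexdump ("CATICTNR"). Treating those bytes as a fixed magic value is an assumption drawn from this one file, and the function names are illustrative.

```python
import lzma
import os
import tempfile

# First eight bytes of the decompressed file, as seen in the hexdump above.
# Assumption: other debug info files start with the same bytes.
EXPECTED_MAGIC = b"CATICTNR"


def read_dbginfo_magic(xz_path: str) -> bytes:
    """Return the first eight bytes of the decompressed debug info file."""
    with lzma.open(xz_path, "rb") as stream:
        return stream.read(len(EXPECTED_MAGIC))


def looks_like_dbginfo(xz_path: str) -> bool:
    """True if the decompressed data starts with the expected magic bytes."""
    return read_dbginfo_magic(xz_path) == EXPECTED_MAGIC


if __name__ == "__main__":
    # Self-contained demo on a synthetic XZ file; with real data, pass the
    # path to 018_md1_dbginfo instead.
    path = os.path.join(tempfile.mkdtemp(), "sample_dbginfo")
    with open(path, "wb") as handle:
        handle.write(lzma.compress(EXPECTED_MAGIC + b"\x01\x00\x00\x00"))
    print(looks_like_dbginfo(path))
```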

To extract information from the debug info file, we made another Kaitai definition and wrapper script that extracts symbols and outputs them in a text format compatible with Ghidra’s ImportSymbolsScript.py script:

$ ./mtk_dbg_extract.py md1img_out/DbgInfo_NR16.R2.MT6879.TC2.PR1.SP_LENOVO_S0MP1_K6879V1_64_MT6879_NR16_TC2_PR1_SP_V17_P38_03_24_03R_2023_05_19_22_31 | tee dbg_symbols.txt
INT_Vectors 0x0000084c l
brom_ext_main 0x00000860 l
INT_SetPLL_Gen98 0x00000866 l
PLL_Set_CLK_To_26M 0x000009a2 l
PLL_MD_Pll_Init 0x000009da l
INT_SetPLL 0x000009dc l
INT_Initialize_Phase1 0x027b5c80 l
INT_Initialize_Phase2 0x027b617c l
init_cm 0x027b6384 l
init_cm_wt 0x027b641e l
...

(Currently the script is set to only output label definitions rather than function definitions, as it was unknown if all of the symbols were for functions.)
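
Before importing the symbol file into Ghidra, it can be useful to sanity-check it. The sketch below parses the three-column text shown above (name, hexadecimal address, and a type letter, where "l" marks a label) into tuples. It is a hypothetical helper, not part of mtk_dbg_extract.py, and it simply skips any line that does not have exactly three fields.

```python
from typing import Iterable, Iterator, NamedTuple


class Symbol(NamedTuple):
    name: str
    address: int
    kind: str  # "l" for a label, as in the output above


def parse_symbols(lines: Iterable[str]) -> Iterator[Symbol]:
    """Yield a Symbol for every well-formed 'name 0xADDR kind' line."""
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # blank lines, truncation markers, anything unexpected
        name, addr_text, kind = parts
        try:
            address = int(addr_text, 16)  # accepts the leading "0x"
        except ValueError:
            continue
        yield Symbol(name, address, kind)


if __name__ == "__main__":
    sample = [
        "INT_Vectors 0x0000084c l",
        "brom_ext_main 0x00000860 l",
        "INT_Initialize_Phase1 0x027b5c80 l",
    ]
    symbols = list(parse_symbols(sample))
    print(len(symbols), "symbols")
    print("lowest address:", hex(min(s.address for s in symbols)))
```

With a real file, the same function can be fed an open file handle, for example `parse_symbols(open("dbg_symbols.txt"))`.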

Loading nanoMIPS firmware into Ghidra

Install the extension

First, we’ll have to install the nanoMIPS module for Ghidra. In the main Ghidra window, go to “File > Install Extensions”, click the “Add Extension” plus button, and select the module Zip file (e.g., ghidra_11.0.3_PUBLIC_20240424_nanomips.zip). Then restart Ghidra.

Initial loading

Load md1rom as a raw binary image. Select 000_md1rom from the md1img.img extract directory and keep “Raw Binary” as the format. For Language, click the “Browse” ellipsis and find the little endian 32-bit nanoMIPS option (nanomips:LE:32:default) using the filter, then click OK.

We’ll load the image at offset 0 so no further options are necessary. Click OK again to load the raw binary.

When Ghidra asks if you want to do an initial auto-analysis, select No. We have to set up a mirrored memory address space at 0x90000000 first.

Memory mapping

Open the “Memory Map” window and click plus for “Add Memory Block”.

We’ll configure the new block as follows:

  • Name: “mirror”
  • Starting address: ram:90000000
  • Length: 0x2916c40, matching the length of the base image “ram” block
  • Permissions: read and execute
  • Block Type: “Byte Mapped”, with a source address of 0 and a mapping ratio of 1:1

Also change the permissions for the original “ram” block to just read and execute. Save the memory map changes and close the “Memory Map” window.

Note that this memory map is incomplete; it’s just the minimal setup required to get disassembly working.
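
To make the effect of the byte-mapped block concrete, the following Python sketch models the 1:1 translation it performs: an address in the mirror resolves to the same offset in the base “ram” block. The constants come from the settings above. The function name and the example address are hypothetical, and this is a model of the mapping, not Ghidra API code.

```python
MIRROR_BASE = 0x90000000  # start of the "mirror" block
RAM_BASE = 0x00000000     # the image was loaded at offset 0
RAM_LENGTH = 0x2916C40    # length of the base "ram" block


def mirror_to_ram(address: int) -> int:
    """Translate a mirror address to its address in the base block (1:1)."""
    offset = address - MIRROR_BASE
    if not 0 <= offset < RAM_LENGTH:
        raise ValueError(f"0x{address:08x} is outside the mirror block")
    return RAM_BASE + offset


if __name__ == "__main__":
    # Hypothetical mirror address, chosen only to illustrate the arithmetic.
    print(hex(mirror_to_ram(0x9000084C)))
```

Addresses at or beyond 0x90000000 + 0x2916c40 fall outside the block and are rejected, which matches the note that this memory map is only a minimal setup.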

Debug symbols

Next, we’ll load up the debug symbols. Open the Script Manager window and search for ImportSymbolsScript.py. Run the script and select the text file generated by mtk_dbg_extract.py earlier (dbg_symbols.txt). This will create a bunch of labels, most of them in the mirrored address space.

Disassembly

Now we can begin disassembly. There is a jump instruction at address 0 that will get us started, so just select the byte at address 0 and press “d” or right-click and choose “Disassemble”. Thanks to the debug symbols, you may notice this instruction jumps to the INT_Initialize_Phase1 function.

Flow-based disassembly will now start to discover a bunch of code. The initial disassembly can take several minutes to complete.

Then we can run the normal auto-analysis with “Analysis > Auto Analyze…”. This should also discover more code and spend several minutes in disassembly and decompilation. We’ve found that the “Non-Returning Functions” analyzer creates many false positives with the default configuration in these firmware images, which disrupts the code flow, so we recommend disabling it for initial analysis.

The one-shot “Decompiler Parameter ID” analyzer is a good option to run next for better detection of function input types.

Conclusion

Although the module is still a work in progress, the results are already quite usable for analysis and allowed us to reverse engineer some critical features in baseband processors.

The nanoMIPS Ghidra module and MediaTek binary file unpackers can be found on our GitHub at:

Originally published by James Chambers: Ghidra nanoMIPS ISA module

Posted by admin on May 9, 2024, at 8:54 AM.