Originally published by James Chambers: Ghidra nanoMIPS ISA module
Introduction
In late 2023 and early 2024, the NCC Group Hardware and Embedded Systems practice undertook an engagement to reverse engineer baseband firmware on several smartphones. This included MediaTek 5G baseband firmware based on the nanoMIPS architecture. While we were aware of some nanoMIPS modules for Ghidra having been developed in private, there was no publicly available reliable option for us to use at the time, which led us to develop our own nanoMIPS disassembler and decompiler module for Ghidra.
In the interest of time, we focused on implementing the features and instructions that we encountered on actual baseband firmware, and left complex P-Code instruction emulation unimplemented where it was not yet needed. Though the module is a work in progress, it still decompiles the majority of the baseband firmware we’ve analyzed. Combined with debug symbol information included with some MediaTek firmware, it has been very helpful in the reverse engineering process.
Here we will demonstrate how to load a MediaTek baseband firmware into Ghidra for analysis with our nanoMIPS ISA module.
Target firmware
For an example firmware to analyze, we looked up phones likely to include a MediaTek SoC with 5G support. Some relatively recent Motorola models were good candidates. (These devices were not part of our client engagement.)
We found many Android firmware images on https://mirrors.lolinet.com/firmware/lenomola/, including an image for the Motorola Moto Edge 2022, codename Tesla: https://mirrors.lolinet.com/firmware/lenomola/tesla/official/. This model is based on a MediaTek Dimensity 1050 (MT6879) SoC.
There are some carrier-specific variations of the firmware. We’ll randomly choose XT2205-1_TESLA_TMO_12_S2ST32.71-118-4-2-6_subsidy-TMO_UNI_RSU_QCOM_regulatory-DEFAULT_cid50_R1_CFC.xml.zip.
Extracting nanoMIPS firmware
The actual nanoMIPS firmware is in the md1img.img file from the Zip package.
To extract the content of the md1img file, we also wrote some Kaitai Struct definitions with simple Python wrapper scripts that run the structure parsing and output the different sections to individual files. The ksy Kaitai definitions can also be used to interactively explore these files with the Kaitai IDE.
Running md1_extract.py with an --outdir option will extract the files contained within md1img.img:
$ ./md1_extract.py ../XT2205-1_TESLA_TMO_12_S2STS32.71-118-4-2-6-3_subsidy-TMO_UNI_RSU_QCOM_regulatory-DEFAULT_cid50_CFC/md1img.img --outdir ./md1img_out/
extracting files to: ./md1img_out
md1rom: addr=0x00000000, size=43084864
  extracted to 000_md1rom
cert1md: addr=0x12345678, size=1781
  extracted to 001_cert1md
cert2: addr=0x12345678, size=988
  extracted to 002_cert2
md1drdi: addr=0x00000000, size=12289536
  extracted to 003_md1drdi
cert1md: addr=0x12345678, size=1781
  extracted to 004_cert1md
cert2: addr=0x12345678, size=988
  extracted to 005_cert2
md1dsp: addr=0x00000000, size=6776460
  extracted to 006_md1dsp
cert1md: addr=0x12345678, size=1781
  extracted to 007_cert1md
cert2: addr=0x12345678, size=988
  extracted to 008_cert2
md1_filter: addr=0xffffffff, size=300
  extracted to 009_md1_filter
md1_filter_PLS_PS_ONLY: addr=0xffffffff, size=300
  extracted to 010_md1_filter_PLS_PS_ONLY
md1_filter_1_Moderate: addr=0xffffffff, size=300
  extracted to 011_md1_filter_1_Moderate
md1_filter_2_Standard: addr=0xffffffff, size=300
  extracted to 012_md1_filter_2_Standard
md1_filter_3_Slim: addr=0xffffffff, size=300
  extracted to 013_md1_filter_3_Slim
md1_filter_4_UltraSlim: addr=0xffffffff, size=300
  extracted to 014_md1_filter_4_UltraSlim
md1_filter_LowPowerMonitor: addr=0xffffffff, size=300
  extracted to 015_md1_filter_LowPowerMonitor
md1_emfilter: addr=0xffffffff, size=2252
  extracted to 016_md1_emfilter
md1_dbginfodsp: addr=0xffffffff, size=1635062
  extracted to 017_md1_dbginfodsp
md1_dbginfo: addr=0xffffffff, size=1332720
  extracted to 018_md1_dbginfo
md1_mddbmeta: addr=0xffffffff, size=899538
  extracted to 019_md1_mddbmeta
md1_mddbmetaodb: addr=0xffffffff, size=562654
  extracted to 020_md1_mddbmetaodb
md1_mddb: addr=0xffffffff, size=12280622
  extracted to 021_md1_mddb
md1_mdmlayout: addr=0xffffffff, size=8341403
  extracted to 022_md1_mdmlayout
md1_file_map: addr=0xffffffff, size=889
  extracted to 023_md1_file_map
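If you want to work with the extraction log programmatically (for example, to look up the size of md1rom later on), the entries can be pulled out with a regular expression. This is only a minimal sketch based on the output format shown above; the function name parse_extract_log and the SAMPLE text are made up for illustration and are not part of md1_extract.py.

```python
import re

# One log entry, as printed in the output above:
#   "<name>: addr=<hex>, size=<decimal> extracted to <output file>"
# \s+ lets the "extracted to" part sit on the same line or on the next one.
ENTRY_RE = re.compile(
    r"(?P<name>\w+): addr=(?P<addr>0x[0-9a-fA-F]+), size=(?P<size>\d+)\s+"
    r"extracted to (?P<path>\S+)"
)


def parse_extract_log(text):
    """Return a list of (name, addr, size, path) tuples found in the log text."""
    return [
        (m["name"], int(m["addr"], 16), int(m["size"]), m["path"])
        for m in ENTRY_RE.finditer(text)
    ]


# Hypothetical sample: three entries copied from the log above.
SAMPLE = """\
md1rom: addr=0x00000000, size=43084864
  extracted to 000_md1rom
cert1md: addr=0x12345678, size=1781
  extracted to 001_cert1md
md1_dbginfo: addr=0xffffffff, size=1332720
  extracted to 018_md1_dbginfo
"""

for name, addr, size, path in parse_extract_log(SAMPLE):
    print(f"{path}: {name} addr={addr:#010x} size={size:#x}")
```

Printing the size in hexadecimal is handy because Ghidra's Memory Map dialog shows block lengths in hex.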
The most relevant files are:
- md1rom is the nanoMIPS firmware image
- md1_file_map provides slightly more context on the md1_dbginfo file: its original filename is DbgInfo_NR16.R2.MT6879.TC2.PR1.SP_LENOVO_S0MP1_K6879V1_64_MT6879_NR16_TC2_PR1_SP_V17_P38_03_24_03R_2023_05_19_22_31.xz
- md1_dbginfo is an XZ compressed binary file containing debug information for md1rom, including symbols
Extracting debug symbols
md1_dbginfo is another binary file format containing symbols and filenames with associated addresses. We’ll rename it and decompress it based on the filename from md1_file_map:
$ cp 018_md1_dbginfo DbgInfo_NR16.R2.MT6879.TC2.PR1.SP_LENOVO_S0MP1_K6879V1_64_MT6879_NR16_TC2_PR1_SP_V17_P38_03_24_03R_2023_05_19_22_31.xz
$ unxz DbgInfo_NR16.R2.MT6879.TC2.PR1.SP_LENOVO_S0MP1_K6879V1_64_MT6879_NR16_TC2_PR1_SP_V17_P38_03_24_03R_2023_05_19_22_31.xz
$ hexdump DbgInfo_NR16.R2.MT6879.TC2.PR1.SP_LENOVO_S0MP1_K6879V1_64_MT6879_NR16_TC2_PR1_SP_V17_P38_03_24_03R_2023_05_19_22_31 | head
00000000  43 41 54 49 43 54 4e 52  01 00 00 00 98 34 56 00  |CATICTNR.....4V.|
00000010  43 41 54 49 01 00 00 00  00 00 00 00 4e 52 31 36  |CATI........NR16|
00000020  2e 52 32 2e 4d 54 36 38  37 39 2e 54 43 32 2e 50  |.R2.MT6879.TC2.P|
00000030  52 31 2e 53 50 00 4d 54  36 38 37 39 5f 53 30 30  |R1.SP.MT6879_S00|
00000040  00 4d 54 36 38 37 39 5f  4e 52 31 36 2e 54 43 32  |.MT6879_NR16.TC2|
00000050  2e 50 52 31 2e 53 50 2e  56 31 37 2e 50 33 38 2e  |.PR1.SP.V17.P38.|
00000060  30 33 2e 32 34 2e 30 33  52 00 32 30 32 33 2f 30  |03.24.03R.2023/0|
00000070  35 2f 31 39 20 32 32 3a  33 31 00 73 00 00 00 2b  |5/19 22:31.s...+|
00000080  ed 53 00 49 4e 54 5f 56  65 63 74 6f 72 73 00 4c  |.S.INT_Vectors.L|
00000090  08 00 00 54 08 00 00 62  72 6f 6d 5f 65 78 74 5f  |...T...brom_ext_|
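The same decompression step can be done from Python with the standard lzma module, which is convenient when unxz is not available. The sketch below is an illustration rather than part of the published tooling: the function name decompress_dbginfo is hypothetical, and the check for the CATICTNR bytes is only based on what the hexdump above shows at offset 0, not on any documented format specification.

```python
import lzma
from pathlib import Path

# First eight bytes of the decompressed file, as seen in the hexdump above.
# Treating them as a file signature is an assumption made for this example.
MAGIC = b"CATICTNR"


def decompress_dbginfo(src, dst):
    """Decompress an XZ-compressed debug info file and sanity-check its header.

    Returns the size of the decompressed data in bytes.
    """
    data = lzma.decompress(Path(src).read_bytes())
    if not data.startswith(MAGIC):
        raise ValueError(f"unexpected header: {data[:8]!r}")
    Path(dst).write_bytes(data)
    return len(data)
```

For example, decompress_dbginfo("md1img_out/018_md1_dbginfo", "md1img_out/dbginfo.bin") would write the decompressed data next to the other extracted files; both paths here are placeholders.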
To extract information from the debug info file, we made another Kaitai definition and wrapper script that extracts symbols and outputs them in a text format compatible with Ghidra’s ImportSymbolsScript.py script:
$ ./mtk_dbg_extract.py md1img_out/DbgInfo_NR16.R2.MT6879.TC2.PR1.SP_LENOVO_S0MP1_K6879V1_64_MT6879_NR16_TC2_PR1_SP_V17_P38_03_24_03R_2023_05_19_22_31 | tee dbg_symbols.txt
INT_Vectors 0x0000084c l
brom_ext_main 0x00000860 l
INT_SetPLL_Gen98 0x00000866 l
PLL_Set_CLK_To_26M 0x000009a2 l
PLL_MD_Pll_Init 0x000009da l
INT_SetPLL 0x000009dc l
INT_Initialize_Phase1 0x027b5c80 l
INT_Initialize_Phase2 0x027b617c l
init_cm 0x027b6384 l
init_cm_wt 0x027b641e l
...
(Currently the script is set to only output label definitions rather than function definitions, as it was unknown if all of the symbols were for functions.)
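Each line of the symbol file has three whitespace-separated fields: a name, a hexadecimal address, and a kind flag, where l marks a label and f marks a function in the format that ImportSymbolsScript.py reads. The following sketch shows one way to read such a file back into Python for quick checks before importing it into Ghidra. The function name parse_symbols is hypothetical and the script is not part of mtk_dbg_extract.py.

```python
def parse_symbols(text):
    """Parse 'name address kind' lines into (name, address, kind) tuples.

    kind is 'l' for a label or 'f' for a function. Lines that do not have
    exactly three fields (blank lines, a trailing "...") are skipped.
    """
    symbols = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3 or parts[2] not in ("l", "f"):
            continue
        name, addr, kind = parts
        symbols.append((name, int(addr, 16), kind))
    return symbols


# A few lines copied from the output above.
SAMPLE = """\
INT_Vectors 0x0000084c l
brom_ext_main 0x00000860 l
INT_Initialize_Phase1 0x027b5c80 l
...
"""

symbols = parse_symbols(SAMPLE)
print(f"{len(symbols)} symbols, highest address {max(a for _, a, _ in symbols):#x}")
```

A listing like this makes it easy to see which address ranges the symbols fall into, which is useful when deciding how to set up the memory map.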
Loading nanoMIPS firmware into Ghidra
Install the extension
First, we’ll have to install the nanoMIPS module for Ghidra. In the main Ghidra window, go to “File > Install Extensions”, click the “Add Extension” plus button, and select the module Zip file (e.g., ghidra_11.0.3_PUBLIC_20240424_nanomips.zip). Then restart Ghidra.
Initial loading
Load md1rom as a raw binary image. Select 000_md1rom from the md1img.img extract directory and keep “Raw Binary” as the format. For Language, click the “Browse” ellipsis and find the little endian 32-bit nanoMIPS option (nanomips:LE:32:default) using the filter, then click OK.
We’ll load the image at offset 0 so no further options are necessary. Click OK again to load the raw binary.
When Ghidra asks if you want to do an initial auto-analysis, select No. We have to set up a mirrored memory address space at 0x90000000 first.
Memory mapping
Open the “Memory Map” window and click plus for “Add Memory Block”.
We’ll name the new block “mirror”, set the starting address to ram:90000000, the length to match the length of the base image “ram” block (0x2916c40), permissions to read and execute, and the “Block Type” to “Byte Mapped” with a source address of 0 and mapping ratio of 1:1.
Also change the permissions for the original “ram” block to just read and execute. Save the memory map changes and close the “Memory Map” window.
Note that this memory map is incomplete; it’s just the minimal setup required to get disassembly working.
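To make the effect of the byte-mapped block concrete: with a 1:1 mapping, an address in the mirror refers to the same byte as the corresponding address in the base “ram” block, so both translate to the same offset in 000_md1rom. The sketch below only illustrates that arithmetic with the values used in this walkthrough; the function name to_file_offset is hypothetical and nothing here calls into Ghidra.

```python
RAM_BASE = 0x00000000     # md1rom is loaded at offset 0
MIRROR_BASE = 0x90000000  # start of the byte-mapped "mirror" block
BLOCK_LEN = 0x2916C40     # length of the "ram" block (43084864 bytes)


def to_file_offset(addr):
    """Map an address in either block to an offset into 000_md1rom.

    Raises ValueError for addresses outside both blocks, which is expected
    for this minimal memory map.
    """
    for base in (RAM_BASE, MIRROR_BASE):
        if base <= addr < base + BLOCK_LEN:
            return addr - base
    raise ValueError(f"{addr:#x} is outside the mapped blocks")


# The same byte is reachable through both address ranges.
print(hex(to_file_offset(0x0000084C)))
print(hex(to_file_offset(0x9000084C)))
```

Note that the block length matches the size reported for md1rom during extraction (43084864 bytes is 0x2916c40).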
Debug symbols
Next, we’ll load up the debug symbols. Open the Script Manager window and search for ImportSymbolsScript.py. Run the script and select the text file generated by mtk_dbg_extract.py earlier (dbg_symbols.txt). This will create a bunch of labels, most of them in the mirrored address space.
Disassembly
Now we can begin disassembly. There is a jump instruction at address 0 that will get us started, so just select the byte at address 0 and press “d” or right-click and choose “Disassemble”. Thanks to the debug symbols, you may notice this instruction jumps to the INT_Initialize_Phase1 function.
Flow-based disassembly will now start to discover a bunch of code. The initial disassembly can take several minutes to complete.
Then we can run the normal auto-analysis with “Analysis > Auto Analyze…”. This should also discover more code and spend several minutes in disassembly and decompilation. We’ve found that the “Non-Returning Functions” analyzer creates many false positives with the default configuration in these firmware images, which disrupts the code flow, so we recommend disabling it for initial analysis.
The one-shot “Decompiler Parameter ID” analyzer is a good option to run next for better detection of function input types.
Conclusion
Although the module is still a work in progress, the results are already quite usable for analysis and allowed us to reverse engineer some critical features in baseband processors.
The nanoMIPS Ghidra module and MediaTek binary file unpackers can be found on our GitHub at: