Encoded-Resource-in-PE
Description
Malware can embed resources (this is often the case for launchers and loaders), and these resources can be encrypted. This article shows how to decrypt such a resource in Python.
Context
In this example, we'll take the executable Lab12-02.exe from the Practical Malware Analysis book. The executable contains a resource section as depicted below:
This resource can be extracted with ResourceHacker:
In this case though, the resource is encrypted:
$ file Data_1.bin
Data_1.bin: data
$ xxd Data_1.bin | head
0000000: 0c1b d141 4241 4141 4541 4141 bebe 4141  ...ABAAAEAAA..AA
0000010: f941 4141 4141 4141 0141 4141 4141 4141  .AAAAAAA.AAAAAAA
0000020: 4141 4141 4141 4141 4141 4141 4141 4141  AAAAAAAAAAAAAAAA
0000030: 4141 4141 4141 4141 4141 4141 a141 4141  AAAAAAAAAAAA.AAA
0000040: 4f5e fb4f 41f5 488c 60f9 400d 8c60 1529  O^.OA.H.`.@..`.)
0000050: 2832 6131 332e 2633 202c 6122 202f 2f2e  (2a13.&3 ,a" //.
0000060: 3561 2324 6133 342f 6128 2f61 050e 1261  5a#$a34/a(/a...a
0000070: 2c2e 2524 6f4c 4c4b 6541 4141 4141 4141  ,.%$oLLKeAAAAAAA
0000080: 568c 8ad0 12ed e483 12ed e483 12ed e483  V...............
0000090: faf2 ef83 13ed e483 faf2 ee83 01ed e483  ................
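The long runs of 0x41 in this dump are a useful hint. A PE file contains many null bytes, and a single-byte XOR turns every 0x00 into the key itself, so the most frequent byte is a reasonable key candidate. The sketch below is only a heuristic, assuming a single-byte XOR; guess_xor_key is a hypothetical helper name, and the sample is the first 64 bytes copied from the xxd output above.

```python
from collections import Counter


def guess_xor_key(data: bytes) -> int:
    """Return the most frequent byte value, a candidate single-byte XOR key."""
    return Counter(data).most_common(1)[0][0]


# First 64 bytes of Data_1.bin, copied from the xxd output above.
sample = bytes.fromhex(
    "0c1bd1414241414145414141bebe4141"
    "f9414141414141410141414141414141"
    "41414141414141414141414141414141"
    "414141414141414141414141a1414141"
)

key = guess_xor_key(sample)
# Decode only the first two bytes to check for the DOS header magic.
header = bytes(b ^ key for b in sample[:2])
print(hex(key), header)  # prints: 0x41 b'MZ'
```

With this sample the candidate is 0x41 and the first two bytes decode to MZ, the DOS header magic. The disassembly in the next section confirms the key.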
Analyze decryption routine
Parameters of sub_401000
Let's analyze where the code accesses the resource and how it reads it. The decryption routine is sub_401000, which is called at address 0x00401425 (we will analyze this function later):
This function accepts 3 parameters: var_8, dwSize and 0x41.
First, FindResource is called to find the resource in UNICODE > LOCALIZATION and a handle to this resource is saved to hResInfo:
.text:00401362 loc_401362: ; CODE XREF: extractResource+2D↑j
.text:00401362 push offset Type ; "UNICODE"
.text:00401367 push offset Name ; "LOCALIZATION"
.text:0040136C mov eax, [ebp+hModule]
.text:0040136F push eax ; hModule
.text:00401370 call ds:FindResourceA
.text:00401376 mov [ebp+hResInfo], eax
Then, this handle is used by LoadResource:
.text:00401386 mov ecx, [ebp+hResInfo]
.text:00401389 push ecx ; hResInfo
.text:0040138A mov edx, [ebp+hModule]
.text:0040138D push edx ; hModule
.text:0040138E call ds:LoadResource
The size of the resource is gathered with a call to SizeofResource and saved to dwSize:
.text:004013B7 mov ecx, [ebp+hResInfo]
.text:004013BA push ecx ; hResInfo
.text:004013BB mov edx, [ebp+hModule]
.text:004013BE push edx ; hModule
.text:004013BF call ds:SizeofResource
.text:004013C5 mov [ebp+dwSize], eax
Then memory is allocated with VirtualAlloc and the returned pointer is saved to var_8:
.text:004013D0 push 4
.text:004013D2 push 1000h ; flAllocationType
.text:004013D7 mov eax, [ebp+dwSize]
.text:004013DA push eax ; dwSize
.text:004013DB push 0 ; lpAddress
.text:004013DD call ds:VirtualAlloc
.text:004013E3 mov [ebp+var_8], eax
In summary, the 3 parameters are defined as follows:
# | Parameter | Description |
---|---|---|
1 | var_8 | Memory allocation for the decoded resource |
2 | dwSize | Size of the resource |
3 | 0x41 | Hex key (likely to be used as a decryption key) |
Analysis of sub_401000
Now we can analyze sub_401000. This function processes the bytes of the resource in a loop and XORs each of them with 0x41 (the key seen previously):
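The behavior described above can be modeled in a few lines of Python. This is a sketch of the logic only, not a decompilation: xor_buffer is a hypothetical name, and its three arguments simply mirror the parameters listed earlier (buffer, size, key). It also illustrates why a single routine is enough, since XOR with a fixed byte is its own inverse.

```python
def xor_buffer(buf: bytearray, size: int, key: int) -> None:
    """XOR the first `size` bytes of `buf` with `key`, in place."""
    for i in range(size):
        buf[i] ^= key


# Start from the first four bytes of a typical PE header.
data = bytearray(b"MZ\x90\x00")

xor_buffer(data, len(data), 0x41)  # encode
encoded = bytes(data)              # same as the start of Data_1.bin: 0c 1b d1 41

xor_buffer(data, len(data), 0x41)  # the same call decodes
decoded = bytes(data)
```

The encoded bytes match the first four bytes of the hex dump shown in the Context section, which supports the reading of sub_401000 as a plain single-byte XOR.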
Decrypt the resource
Python script
It's trivial to decode the resource in Python:
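#!/usr/bin/env python
# XOR every byte of the extracted resource with the key 0x41.
with open('Data_1.bin', 'rb') as f:
    b = bytearray(f.read())
for i in range(len(b)):
    b[i] ^= 0x41
# Write the decoded bytes to disk.
with open('out_file', 'wb') as f:
    f.write(b)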
The decoded resource is a valid PE:
$ python decode.py
$ file out_file
out_file: PE32 executable (console) Intel 80386, for MS Windows
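If the file utility is not available, the same check can be approximated in Python. The sketch below tests only two things: the MZ magic at offset 0 and the PE signature at the offset stored in the e_lfanew field (offset 0x3C of the DOS header). looks_like_pe is a hypothetical helper name, and this is a quick sanity check, not a full PE parser.

```python
import struct


def looks_like_pe(data: bytes) -> bool:
    """Return True if `data` has an MZ header and a PE signature at e_lfanew."""
    # The DOS header is 0x40 bytes long and starts with the magic "MZ".
    if len(data) < 0x40 or data[:2] != b"MZ":
        return False
    # e_lfanew: little-endian 32-bit offset of the PE header, stored at 0x3C.
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    return data[e_lfanew:e_lfanew + 4] == b"PE\x00\x00"
```

Reading out_file in binary mode and passing its contents to looks_like_pe should return True if the decoding worked.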
WinHex
Using WinHex, you can also decrypt the resource (Edit > Modify Data > XOR):