SEH in Visual Studio: how does it work?
Let's begin with a brief description of a little trick that I found while analyzing a piece of malware detected by ESET as Win32/Rootkit.Avatar. For more information about it, you can check this detailed article.
In particular, the article mentions a specific behaviour of the malware: "the malware raises an exception to pass control to an installed exception-handler". That is:
...
.text:0040235B push offset sub_402B26
.text:00402360 mov eax, large fs:0
.text:00402366 push eax
.text:00402367 mov large fs:0, esp
...
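This is the standard way an exception handler is installed: the handler address and the previous head of the handler chain (read from fs:0) are pushed on the stack, and fs:0 is then updated to point to this new record. It corresponds to a try-catch statement in C++.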
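To make the four instructions above more concrete, here is a small portable model of the handler chain. It is only a sketch: the variable seh_head stands in for fs:0, the handler signature is heavily simplified, and all the names are made up for this example rather than taken from Windows.

```c
#include <stddef.h>

/* Handler signature simplified for this sketch: return non-zero to
   claim the exception, zero to pass it on to the next record. */
typedef int (*handler_fn)(unsigned int exception_code);

/* Stand-in for the record built on the stack by the listing above. */
typedef struct registration {
    struct registration *prev;    /* old list head (the pushed fs:0 value) */
    handler_fn           handler; /* the pushed handler address            */
} registration;

/* Hypothetical variable playing the role of fs:0. */
registration *seh_head = NULL;

/* push handler / push old head / make the new record the head */
void install_handler(registration *rec, handler_fn fn)
{
    rec->handler = fn;
    rec->prev    = seh_head;
    seh_head     = rec;
}

/* Unlink the newest record, as a function epilogue does. */
void remove_handler(void)
{
    if (seh_head != NULL)
        seh_head = seh_head->prev;
}

/* Offer the exception to each handler, newest first. */
int dispatch(unsigned int exception_code)
{
    registration *r;
    for (r = seh_head; r != NULL; r = r->prev)
        if (r->handler(exception_code))
            return 1;
    return 0; /* nobody claimed it */
}

/* Two trivial sample handlers. */
int accept_all(unsigned int code)  { (void)code; return 1; }
int decline_all(unsigned int code) { (void)code; return 0; }
```

Installing decline_all on top of accept_all and then calling dispatch() shows the search moving from the newest record to the older one, which is the behaviour the malware relies on when it raises an exception on purpose.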
Anyway, if we keep analyzing the code we'll notice that this isn't the only exception handler being installed. In fact, we are going to see how the malware takes advantage of another one in an attempt to hide a common debugger check (the PEB one) inside it.
Exception handlers have already been extensively documented in the past, but this one is a little bit trickier because it makes use of a Visual Studio-specific implementation: the try-except statement. Here is how it is implemented in the malware:
...
.text:00401CC5 push offset dword_4044C8
.text:00401CCA push offset __except_handler3
.text:00401CCF mov eax, large fs:0
.text:00401CD5 push eax
.text:00401CD6 mov large fs:0, esp
...
This is the installation code for a SEH in Visual Studio, and there are two substantial differences with respect to the previous code. The first one is that it seems to be installing a standard library routine ("__except_handler3") as the exception handler, which doesn't look suspicious. The second one is a little bit confusing if you haven't read the specifications before.
However, a closer look will reveal the trick. In fact another value is being pushed, that is:
.text:00401CC5 push offset dword_4044C8
We usually wouldn't expect to see this additional "push", and we would think that it isn't related to the SEH, but... it actually is!
In particular, this "push" is putting on the stack the address of a data structure named "scopetable entry", documented by Matt Pietrek, which has the following definition:
typedef struct _SCOPETABLE
{
    DWORD previousTryLevel;
    DWORD lpfnFilter;
    DWORD lpfnHandler;
} SCOPETABLE, *PSCOPETABLE;
It specifies the addresses of the code blocks to be executed for the filter expression ("lpfnFilter") and for the except body ("lpfnHandler"):
__try {
    ... code
}
__except(filter expression) {
    ... except body
}
The library routine "__except_handler3" uses this information first to call the code for the filter expression, which decides whether the exception is handled, and then, if it is, to dispatch execution to the except body. So the real exception handling code installed by the malware is not the library routine, but the code inside the except body. We can see this structure in the malware:
.rdata:004044C8 dword_4044C8 dd 0FFFFFFFFh
.rdata:004044CC dd offset filter
.rdata:004044D0 dd offset except_body
and the related code:
.text:00401CFA mov [ecx], al ; trigger exception!
.text:00401CFC jmp short loc_401D13
.text:00401CFE ; ----------------------------------------------
.text:00401CFE
.text:00401CFE filter:
.text:00401CFE mov eax, 1
.text:00401D03 retn
.text:00401D04 ; ----------------------------------------------
.text:00401D04
.text:00401D04 except_body:
.text:00401D04 mov esp, [ebp+var_18]
.text:00401D07 mov eax, large fs:30h
.text:00401D0D mov al, [eax+2]
.text:00401D10 mov [ebp+var_1C], eax
.text:00401D13
.text:00401D13 loc_401D13:
.text:00401D13 mov [ebp+var_4], 0FFFFFFFFh
.text:00401D1A mov al, byte ptr [ebp+var_1C]
.text:00401D1D mov ecx, [ebp+var_10]
.text:00401D20 mov large fs:0, ecx
.text:00401D27 pop edi
.text:00401D28 pop esi
.text:00401D29 pop ebx
.text:00401D2A mov esp, ebp
.text:00401D2C pop ebp
.text:00401D2D retn
From this listing you can see that the filter code always returns 1 (true), which means that the except body is always executed when an exception happens (and the code triggers one on purpose at address 00401CFA). On execution, the except body reads PEB.BeingDebugged (the byte at offset 2 of the PEB, whose address is taken from fs:30h) in order to detect a debugger attached to the process, and the function returns true or false depending on the result. Later, the function that called the above code checks this flag and terminates execution if a debugger was detected.
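The way a library handler can consume a scopetable like the one at 004044C8 can be modelled in a few lines of portable C. This is a simplified sketch and not the actual "__except_handler3" code: filters take no arguments here, only the "execute handler" and "continue search" outcomes are modelled, and the function and variable names are invented for the example.

```c
#include <stddef.h>

/* "No enclosing __try", like the 0FFFFFFFFh in the malware's entry. */
#define TRYLEVEL_NONE 0xFFFFFFFFu

typedef int  (*filter_fn)(void);
typedef void (*body_fn)(void);

/* Same three fields as SCOPETABLE, with typed pointers. */
typedef struct scopetable_entry {
    unsigned int previousTryLevel;
    filter_fn    lpfnFilter;
    body_fn      lpfnHandler;
} scopetable_entry;

/* Start at the current try level and move outwards. The first entry
   whose filter returns a positive value gets its except body run.
   Returns the index of that entry, or TRYLEVEL_NONE if every filter
   declined (the search would then continue in outer frames). */
unsigned int dispatch_scopetable(const scopetable_entry *table,
                                 unsigned int trylevel)
{
    while (trylevel != TRYLEVEL_NONE) {
        const scopetable_entry *e = &table[trylevel];
        if (e->lpfnFilter != NULL && e->lpfnFilter() > 0) {
            e->lpfnHandler();
            return trylevel;
        }
        trylevel = e->previousTryLevel;
    }
    return TRYLEVEL_NONE;
}

/* Sample filters and body for experimenting with the model. */
int body_ran = 0;
int  filter_always(void) { return 1; }  /* like "mov eax, 1 / retn" */
int  filter_never(void)  { return 0; }
void except_body(void)   { body_ran = 1; }
```

With a single entry { TRYLEVEL_NONE, filter_always, except_body }, the model behaves like the malware sample: whatever the exception, the except body runs.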
A better way to exploit the SEH implementation.
So, all this trouble is just to hide the debugger check inside a try-except statement and make it a bit more difficult to trace but, as it stands, the trick is not really effective. Is it possible to do better?
Well, if we put the debugger check inside the filter code rather than in the except body, we can make the filter return false when a debugger is detected. In that case the library handler "__except_handler3" won't call the except body; the exception stays unhandled and execution terminates instead. This would confuse things, because the decision on whether to terminate execution is taken inside a library routine rather than in the malware code itself. Anyone debugging the malware will find that execution always terminates while running the standard Visual Studio exception handler code, and will have to dig into it to understand what's happening.
It would look like this:
__try
{
    //...
    RaiseException(0, 0, 0, 0);
}
__except(!IsDebuggerPresent())
{
    //...
}
Briefly: the code guarded in the try block causes an exception, and the filter expression is the negated result of the IsDebuggerPresent API, which returns true if a debugger is attached and false otherwise. So, when a debugger is detected, the filter evaluates to zero and the except block is never executed, causing the process to simply crash.
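In terms of the standard filter results, the expression above maps onto two of the three values a filter may produce. The sketch below is a portable stand-in: the debugger state is passed in as a parameter instead of being queried from the OS, and the function name antidebug_filter is made up for the example.

```c
/* Values a filter expression may evaluate to, as defined in the
   Windows SDK headers. */
#define EXCEPTION_EXECUTE_HANDLER      1
#define EXCEPTION_CONTINUE_SEARCH      0
#define EXCEPTION_CONTINUE_EXECUTION (-1)

/* Portable stand-in for the filter "!IsDebuggerPresent()". */
int antidebug_filter(int debugger_present)
{
    return debugger_present ? EXCEPTION_CONTINUE_SEARCH
                            : EXCEPTION_EXECUTE_HANDLER;
}
```

So a clean run ends up in the except body, while a debugged run makes the handler keep searching for someone else to handle the exception.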
Of course, you can obfuscate the code in the filter routine to make it less obvious, which will leave the analyst puzzling over why the code is crashing inside a standard Visual Studio library routine :).
"__except_handler4"?!
"__except_handler3" is the standard library code, but the SEH data it relies on could be corrupted by a stack-based buffer overflow, and this caused security problems. So with newer versions of Visual Studio, the function was updated to "__except_handler4", which is essentially the same routine with additional features.
In particular, it uses canaries to protect the SEH data, in order to make sure that the pointers to the exception handlers have not been overwritten:
.text:004010C5 @__security_check_cookie@4 proc near ; DATA XREF: __except_handler4+11 o
.text:004010C5 cmp ecx, ___security_cookie
.text:004010CB jnz short loc_4010CF
.text:004010CD rep retn
.text:004010CF
.text:004010CF loc_4010CF: ; CODE XREF: __security_check_cookie(x)+6 j
.text:004010CF jmp ___report_gsfailure
.text:004010CF @__security_check_cookie@4 endp
Furthermore, the old "__except_handler3" was library code that was linked and embedded in the user executable, while "__except_handler4" is only a small wrapper around "_except_handler4_common", which is exported by the Visual Studio runtime DLL (module msvcr*.dll):
.text:00401799 mov edi, edi
.text:0040179B push ebp
.text:0040179C mov ebp, esp
.text:0040179E push [ebp+arg_C]
.text:004017A1 push [ebp+arg_8]
.text:004017A4 push [ebp+arg_4]
.text:004017A7 push [ebp+arg_0]
.text:004017AA push offset @__security_check_cookie@4 ; __security_check_cookie(x)
.text:004017AF push offset ___security_cookie
.text:004017B4 call _except_handler4_common
.text:004017B9 add esp, 18h
.text:004017BC pop ebp
.text:004017BD retn
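The exact data layout used by "__except_handler4" is not covered here, so take the following only as a sketch of the general canary idea: a pointer is stored XORed with a secret cookie, so that an attacker who overwrites it without knowing the cookie ends up with a garbage pointer after decoding. The XOR encoding and all the names below are assumptions made for illustration.

```c
#include <stdint.h>

/* Mirrors the "cmp ecx, ___security_cookie" test in the listing:
   the value passed in must be exactly the cookie. */
int cookie_matches(uintptr_t value, uintptr_t cookie)
{
    return value == cookie;
}

/* Store a pointer in protected form. */
uintptr_t encode_pointer(const void *p, uintptr_t cookie)
{
    return (uintptr_t)p ^ cookie;
}

/* Recover the pointer; only meaningful if "encoded" was not tampered with. */
const void *decode_pointer(uintptr_t encoded, uintptr_t cookie)
{
    return (const void *)(encoded ^ cookie);
}
```

Decoding a value that was changed on the stack no longer gives back the original pointer, and this is the kind of corruption the cookie check is meant to catch before any handler is called.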
Obfuscating algorithms.
Now that we know all the details related to the SEH implementation in Visual Studio, I would like to propose a simple yet powerful idea to obfuscate algorithms.
Briefly, you can:
- Create a set of basic virtualized opcodes, each one represented by a different function.
- Use these opcodes to write an algorithm, encoding it in a data structure (each opcode will be associated with a particular "id number").
- Execute each instruction of the program through a different filter expression. This means that if your algorithm consists of "n" opcodes, you will have "n" try-except blocks (that is, "n" filter expressions) and you will have to generate "n" exceptions as well.
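Before looking at the SEH-based version, the three steps above can be sketched in portable C. Keep in mind that this is only a model: the chain of filter expressions is replaced by a plain loop over function pointers, so no real exception is raised, and the instruction layout and all the names are invented for this example.

```c
#include <stddef.h>

enum { OPC_CMP = 0x13, OPC_JL = 0x14, OPC_HLT = 0x16, OPC_ADD = 0x18 };

typedef struct vm_instr {
    int  opcode;
    int *dst;     /* first operand (ADD, CMP)  */
    int *src;     /* second operand (ADD, CMP) */
    int  target;  /* instruction index for JL  */
} vm_instr;

typedef struct vm_state {
    int ip;       /* index of the current instruction */
    int less;     /* set by CMP when *dst < *src      */
    int halted;   /* set by HLT                       */
} vm_state;

/* A "filter" returns 1 only when the instruction is its own opcode. */
typedef int (*vm_filter)(vm_state *vm, const vm_instr *in);

int f_add(vm_state *vm, const vm_instr *in)
{
    (void)vm;
    if (in->opcode != OPC_ADD) return 0;
    *in->dst += *in->src;
    return 1;
}

int f_cmp(vm_state *vm, const vm_instr *in)
{
    if (in->opcode != OPC_CMP) return 0;
    vm->less = (*in->dst < *in->src);
    return 1;
}

int f_jl(vm_state *vm, const vm_instr *in)
{
    if (in->opcode != OPC_JL) return 0;
    if (vm->less) vm->ip = in->target - 1; /* the run loop adds 1 afterwards */
    return 1;
}

int f_hlt(vm_state *vm, const vm_instr *in)
{
    if (in->opcode != OPC_HLT) return 0;
    vm->halted = 1;
    return 1;
}

/* Runs the program; returns the number of executed instructions,
   or -1 if no filter accepted an opcode. */
int vm_run(const vm_instr *prog, size_t len)
{
    static const vm_filter filters[] = { f_add, f_cmp, f_jl, f_hlt };
    vm_state vm = { 0, 0, 0 };
    int executed = 0;

    while (!vm.halted && (size_t)vm.ip < len) {
        size_t k;
        int handled = 0;
        /* Try every filter in turn, like the nested __except filters. */
        for (k = 0; k < sizeof filters / sizeof filters[0] && !handled; k++)
            handled = filters[k](&vm, &prog[vm.ip]);
        if (!handled)
            return -1;
        vm.ip++;
        executed++;
    }
    return executed;
}
```

A five-instruction program (ADD sum, i; ADD i, one; CMP i, limit; JL 0; HLT) is enough to sum the numbers from 1 to 5 with it. The real POC does the same thing, except that the "try every filter" loop is performed by the exception dispatcher.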
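Here is the source code of a working POC that implements the RC4 algorithm (it stores pointers in DWORDs and relies on __try/__except, so it is meant to be built as a 32-bit program with Visual Studio):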
#include <windows.h>
#include <stdio.h>
// globals holding the flags set by cmp (tested by jl) and the virtual instruction pointer
int flags, eip;
// opcodes
#define OPC_MOD 0x11
#define OPC_XOR 0x12
#define OPC_CMP 0x13
#define OPC_JL 0x14
#define OPC_JMP 0x15
#define OPC_HLT 0x16
#define OPC_MOV 0x17
#define OPC_ADD 0x18
// operand types
#define OP_V 1 // variable
#define OP_C 2 // constant
#define OP_P 3 // pointer
// sizes
#define OP_BYTE 1
#define OP_DWORD 2
// opcode characterization
typedef struct _OPCODE
{
BYTE opcode;
BYTE type_op1;
BYTE type_op2;
BYTE size;
} OPCODE;
// macro to fill the opcode arrays quickly
#define MAKE_OPC(__opc, __op1, __op2, __size, __param1, __param2) \
(__opc | (__op1 << 8) | (__op2 << 16) | (__size << 24)), \
(DWORD)__param1, \
(DWORD)__param2
// EXC_RUN to execute the opcodes arrays "rc4_init_op" and "rc4_crypt_op"
#define EXC_RUN(__myprogram) \
eip = 0; flags = 0; \
while(eip != EIP_HALT){ EXC_TRY() EXC_EXCEPTION(__myprogram) EXC_USED_OPCODES() eip += 3;}
#define EIP_HALT 0xFFFFFFFF
#define EXC_TRY() \
__try{ __try{ __try{ __try{ __try{ __try{ __try{ __try{
#define EXC_EXCEPTION(__program) RaiseException(__program[eip], 0, 2, (ULONG_PTR*)(&__program[eip+1]));
#define EXC_INSTR(__opc) }__except(__opc(GetExceptionCode(), GetExceptionInformation())){}
#define EXC_USED_OPCODES() \
EXC_INSTR(cmp) EXC_INSTR(mov) EXC_INSTR(add) EXC_INSTR(hlt) \
EXC_INSTR(jmp) EXC_INSTR(jl) EXC_INSTR(mod) EXC_INSTR(xor)
// flags values after cmp
#define GT 0
#define LT 1
#define EQ 2
// checks the opcode and extracts its operands
BOOL chckopc_extr(BYTE opcode, BYTE opc, DWORD **op1, DWORD **op2, struct _EXCEPTION_POINTERS *ep)
{
EXCEPTION_RECORD *er;
if(opcode != opc) return false;
er = ep->ExceptionRecord;
*op1 = (DWORD*)(er->ExceptionInformation[0]);
*op2 = (DWORD*)(er->ExceptionInformation[1]);
return true;
}
// reads an operand given its type and size
DWORD readop(DWORD *op, BYTE type, BYTE size)
{
switch(type)
{
case OP_V:
if(size == OP_BYTE)
return *((BYTE*)op);
else
return *op;
case OP_C:
return (DWORD)op;
case OP_P:
if(size == OP_BYTE)
return *((BYTE*)(*op));
else
return *((DWORD*)(*op));
}
return 0;
}
// assigns data to an operand given its type and size
void assignop(DWORD *op, BYTE type, BYTE size, DWORD data)
{
switch(type)
{
case OP_V:
if(size == OP_BYTE)
*((BYTE*)op) = (BYTE)data;
else
*op = data;
break;
case OP_C:
*op = data;
break;
case OP_P:
if(size == OP_BYTE)
*((BYTE*)(*op)) = (BYTE)data;
else
*((DWORD*)(*op)) = data;
break;
}
}
// -----------------------------------------------------------------
// Opcodes
// x = x % y
int mod(unsigned int code, struct _EXCEPTION_POINTERS *ep)
{
DWORD *op1, *op2;
if(!chckopc_extr((((OPCODE*)&code)->opcode), OPC_MOD, &op1, &op2, ep))
return false;
*op1 = *op1 % *op2;
return true;
}
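// x = x ^ y (byte-wide: the only caller XORs two BYTE variables,
// so a DWORD access would also touch the bytes next to them)
int xor(unsigned int code, struct _EXCEPTION_POINTERS *ep)
{
DWORD *op1, *op2;
if(!chckopc_extr((((OPCODE*)&code)->opcode), OPC_XOR, &op1, &op2, ep))
return false;
*((BYTE*)op1) = *((BYTE*)op1) ^ *((BYTE*)op2);
return true;
}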
// unsigned compare
int cmp(unsigned int code, struct _EXCEPTION_POINTERS *ep)
{
DWORD src1, src2;
DWORD *op1, *op2;
if(!chckopc_extr((((OPCODE*)&code)->opcode), OPC_CMP, &op1, &op2, ep))
return false;
src1 = readop(op1, ((OPCODE*)&code)->type_op1, ((OPCODE*)&code)->size);
src2 = readop(op2, ((OPCODE*)&code)->type_op2, ((OPCODE*)&code)->size);
flags = (src1 > src2) ? GT : ((src1 < src2) ? LT : EQ);
return true;
}
// eip = x IFF flags == LT
int jl(unsigned int code, struct _EXCEPTION_POINTERS *ep)
{
DWORD *op1, *op2;
if(!chckopc_extr((((OPCODE*)&code)->opcode), OPC_JL, &op1, &op2, ep))
return false;
if(flags == LT)
eip = ((DWORD)op1 * 3) - 3;
return true;
}
// eip = x
int jmp(unsigned int code, struct _EXCEPTION_POINTERS *ep)
{
DWORD *op1, *op2;
if(!chckopc_extr((((OPCODE*)&code)->opcode), OPC_JMP, &op1, &op2, ep))
return false;
eip = ((DWORD)op1 * 3) - 3;
return true;
}
// eip = EIP_HALT
int hlt(unsigned int code, struct _EXCEPTION_POINTERS *ep)
{
DWORD *op1, *op2;
if(!chckopc_extr((((OPCODE*)&code)->opcode), OPC_HLT, &op1, &op2, ep))
return false;
eip = EIP_HALT - 3;
return true;
}
// move data
int mov(unsigned int code, struct _EXCEPTION_POINTERS *ep)
{
DWORD src2;
DWORD *op1, *op2;
if(!chckopc_extr((((OPCODE*)&code)->opcode), OPC_MOV, &op1, &op2, ep))
return false;
src2 = readop(op2, ((OPCODE*)&code)->type_op2, ((OPCODE*)&code)->size);
assignop(op1, ((OPCODE*)&code)->type_op1, ((OPCODE*)&code)->size, src2);
return true;
}
// add data
int add(unsigned int code, struct _EXCEPTION_POINTERS *ep)
{
DWORD src1, src2;
DWORD *op1, *op2;
if(!chckopc_extr((((OPCODE*)&code)->opcode), OPC_ADD, &op1, &op2, ep))
return false;
src1 = readop(op1, ((OPCODE*)&code)->type_op1, ((OPCODE*)&code)->size);
src2 = readop(op2, ((OPCODE*)&code)->type_op2, ((OPCODE*)&code)->size);
src2 += src1;
assignop(op1, ((OPCODE*)&code)->type_op1, ((OPCODE*)&code)->size, src2);
return true;
}
// -----------------------------------------------------------------
int main(void)
{
// test vector:
// hex key: 0123456789abcdef
// hex plaintext: 0000000000000000
// hex ciphertext: 7494c2e7104b0879
BYTE *temp_perm, *temp_perm2, *temp_key, *temp_plain, *temp_cipher;
BYTE perm_byte, swap_byte;
DWORD j, index1, index2, key_index, key_byte;
int i, keylen = 8, plainlen = 8;
BYTE perm[256];
BYTE key[8] = {0x01, 0x23, 0x45, 0x67, 0x89, 0xab, 0xcd, 0xef};
BYTE plaintext[8] = {0, 0, 0, 0, 0, 0, 0, 0};
BYTE ciphertext[8];
temp_perm = perm;
DWORD rc4_init_op[] = {
/* 000 */ MAKE_OPC(OPC_MOV, OP_V, OP_C, OP_DWORD, &i, 0), // init permutation box
/* 001 */ MAKE_OPC(OPC_MOV, OP_P, OP_V, OP_BYTE, &temp_perm, &i),
/* 002 */ MAKE_OPC(OPC_ADD, OP_V, OP_C, OP_DWORD, &temp_perm, 1),
/* 003 */ MAKE_OPC(OPC_ADD, OP_V, OP_C, OP_DWORD, &i, 1),
/* 004 */ MAKE_OPC(OPC_CMP, OP_V, OP_C, OP_DWORD, &i, 256),
/* 005 */ MAKE_OPC(OPC_JL, 0, 0, 0, 1, 0),
/* 006 */ MAKE_OPC(OPC_MOV, OP_V, OP_C, OP_BYTE, &index1, 0),
/* 007 */ MAKE_OPC(OPC_MOV, OP_V, OP_C, OP_BYTE, &index2, 0),
/* 008 */ MAKE_OPC(OPC_MOV, OP_V, OP_C, OP_DWORD, &j, 0), // apply the key to the permutation box
/* 009 */ MAKE_OPC(OPC_MOV, OP_V, OP_C, OP_DWORD, &i, 0),
/* 010 */ MAKE_OPC(OPC_MOV, OP_V, OP_V, OP_DWORD, &key_index, &i),
/* 011 */ MAKE_OPC(OPC_MOD, 0, 0, 0, &key_index, &keylen),
/* 012 */ MAKE_OPC(OPC_MOV, OP_V, OP_C, OP_DWORD, &temp_key, key),
/* 013 */ MAKE_OPC(OPC_ADD, OP_V, OP_V, OP_DWORD, &temp_key, &key_index),
/* 014 */ MAKE_OPC(OPC_MOV, OP_V, OP_P, OP_BYTE, &key_byte, &temp_key),
/* 015 */ MAKE_OPC(OPC_MOV, OP_V, OP_C, OP_DWORD, &temp_perm, perm),
/* 016 */ MAKE_OPC(OPC_ADD, OP_V, OP_V, OP_DWORD, &temp_perm, &i),
/* 017 */ MAKE_OPC(OPC_MOV, OP_V, OP_P, OP_BYTE, &perm_byte, &temp_perm),
/* 018 */ MAKE_OPC(OPC_ADD, OP_V, OP_V, OP_BYTE, &j, &perm_byte),
/* 019 */ MAKE_OPC(OPC_ADD, OP_V, OP_V, OP_BYTE, &j, &key_byte),
/* 020 */ MAKE_OPC(OPC_MOV, OP_V, OP_C, OP_DWORD, &temp_perm, perm), // swap bytes
/* 021 */ MAKE_OPC(OPC_ADD, OP_V, OP_V, OP_DWORD, &temp_perm, &j),
/* 022 */ MAKE_OPC(OPC_MOV, OP_V, OP_P, OP_BYTE, &swap_byte, &temp_perm),
/* 023 */ MAKE_OPC(OPC_MOV, OP_P, OP_V, OP_BYTE, &temp_perm, &perm_byte),
/* 024 */ MAKE_OPC(OPC_MOV, OP_V, OP_C, OP_DWORD, &temp_perm, perm),
/* 025 */ MAKE_OPC(OPC_ADD, OP_V, OP_V, OP_DWORD, &temp_perm, &i),
/* 026 */ MAKE_OPC(OPC_MOV, OP_P, OP_V, OP_BYTE, &temp_perm, &swap_byte),
/* 027 */ MAKE_OPC(OPC_ADD, OP_V, OP_C, OP_DWORD, &i, 1),
/* 028 */ MAKE_OPC(OPC_CMP, OP_V, OP_C, OP_DWORD, &i, 256),
/* 029 */ MAKE_OPC(OPC_JL, 0, 0, 0, 10, 0),
/* 030 */ MAKE_OPC(OPC_HLT, 0, 0, 0, 0, 0)
};
DWORD rc4_crypt_op[] = {
/* 000 */ MAKE_OPC(OPC_MOV, OP_V, OP_C, OP_DWORD, &i, 0),
/* 001 */ MAKE_OPC(OPC_MOV, OP_V, OP_C, OP_DWORD, &index1, 0),
/* 002 */ MAKE_OPC(OPC_MOV, OP_V, OP_C, OP_DWORD, &index2, 0),
/* 003 */ MAKE_OPC(OPC_MOV, OP_V, OP_C, OP_DWORD, &j, 0),
/* 004 */ MAKE_OPC(OPC_ADD, OP_V, OP_C, OP_BYTE, &index1, 1), // update indices
/* 005 */ MAKE_OPC(OPC_MOV, OP_V, OP_C, OP_DWORD, &temp_perm, perm),
/* 006 */ MAKE_OPC(OPC_ADD, OP_V, OP_V, OP_DWORD, &temp_perm, &index1),
/* 007 */ MAKE_OPC(OPC_ADD, OP_V, OP_P, OP_BYTE, &index2, &temp_perm),
/* 008 */ MAKE_OPC(OPC_MOV, OP_V, OP_P, OP_BYTE, &swap_byte, &temp_perm), // swap bytes
/* 009 */ MAKE_OPC(OPC_MOV, OP_V, OP_C, OP_DWORD, &temp_perm2, perm),
/* 010 */ MAKE_OPC(OPC_ADD, OP_V, OP_V, OP_DWORD, &temp_perm2, &index2),
/* 011 */ MAKE_OPC(OPC_MOV, OP_V, OP_P, OP_BYTE, &perm_byte, &temp_perm2),
/* 012 */ MAKE_OPC(OPC_MOV, OP_P, OP_V, OP_BYTE, &temp_perm2, &swap_byte),
/* 013 */ MAKE_OPC(OPC_MOV, OP_P, OP_V, OP_BYTE, &temp_perm, &perm_byte),
/* 014 */ MAKE_OPC(OPC_MOV, OP_V, OP_C, OP_DWORD, &temp_perm, perm), // xor
/* 015 */ MAKE_OPC(OPC_ADD, OP_V, OP_V, OP_DWORD, &temp_perm, &index1),
/* 016 */ MAKE_OPC(OPC_MOV, OP_V, OP_P, OP_BYTE, &j, &temp_perm),
/* 017 */ MAKE_OPC(OPC_MOV, OP_V, OP_C, OP_DWORD, &temp_perm, perm),
/* 018 */ MAKE_OPC(OPC_ADD, OP_V, OP_V, OP_DWORD, &temp_perm, &index2),
/* 019 */ MAKE_OPC(OPC_MOV, OP_V, OP_P, OP_BYTE, &perm_byte, &temp_perm),
/* 020 */ MAKE_OPC(OPC_ADD, OP_V, OP_V, OP_BYTE, &j, &perm_byte),
/* 021 */ MAKE_OPC(OPC_MOV, OP_V, OP_C, OP_DWORD, &temp_plain, plaintext),
/* 022 */ MAKE_OPC(OPC_ADD, OP_V, OP_V, OP_DWORD, &temp_plain, &i),
/* 023 */ MAKE_OPC(OPC_MOV, OP_V, OP_P, OP_BYTE, &perm_byte, &temp_plain),
/* 024 */ MAKE_OPC(OPC_MOV, OP_V, OP_C, OP_DWORD, &temp_perm, perm),
/* 025 */ MAKE_OPC(OPC_ADD, OP_V, OP_V, OP_DWORD, &temp_perm, &j),
/* 026 */ MAKE_OPC(OPC_MOV, OP_V, OP_P, OP_BYTE, &swap_byte, &temp_perm),
/* 027 */ MAKE_OPC(OPC_XOR, 0, 0, 0, &swap_byte, &perm_byte),
/* 028 */ MAKE_OPC(OPC_MOV, OP_V, OP_C, OP_DWORD, &temp_cipher, ciphertext),
/* 029 */ MAKE_OPC(OPC_ADD, OP_V, OP_V, OP_DWORD, &temp_cipher, &i),
/* 030 */ MAKE_OPC(OPC_MOV, OP_P, OP_V, OP_BYTE, &temp_cipher, &swap_byte),
/* 031 */ MAKE_OPC(OPC_ADD, OP_V, OP_C, OP_DWORD, &i, 1),
/* 032 */ MAKE_OPC(OPC_CMP, OP_V, OP_V, OP_DWORD, &i, &plainlen),
/* 033 */ MAKE_OPC(OPC_JL, 0, 0, 0, 4, 0),
/* 034 */ MAKE_OPC(OPC_HLT, 0, 0, 0, 0, 0)
};
EXC_RUN(rc4_init_op);
EXC_RUN(rc4_crypt_op);
printf("cipher: %02X %02X %02X %02X %02X %02X %02X %02X\n",
ciphertext[0], ciphertext[1], ciphertext[2], ciphertext[3],
ciphertext[4], ciphertext[5], ciphertext[6], ciphertext[7]);
}
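To check the output of the POC independently, you can compare it with a plain, non-obfuscated RC4. The following is a minimal sketch of the textbook algorithm; it is not part of the original POC, and the function name rc4_crypt is my own choice.

```c
#include <stddef.h>
#include <stdint.h>

/* Textbook RC4: key scheduling followed by keystream generation.
   "in" and "out" may point to the same buffer. */
void rc4_crypt(const uint8_t *key, size_t keylen,
               const uint8_t *in, uint8_t *out, size_t len)
{
    uint8_t S[256];
    uint8_t a = 0, b = 0, j = 0, t;
    size_t i, n;

    /* Initialise the permutation box. */
    for (i = 0; i < 256; i++)
        S[i] = (uint8_t)i;

    /* Apply the key to the permutation box. */
    for (i = 0; i < 256; i++) {
        j = (uint8_t)(j + S[i] + key[i % keylen]);
        t = S[i]; S[i] = S[j]; S[j] = t;
    }

    /* Generate the keystream and XOR it with the input. */
    for (n = 0; n < len; n++) {
        a = (uint8_t)(a + 1);
        b = (uint8_t)(b + S[a]);
        t = S[a]; S[a] = S[b]; S[b] = t;
        out[n] = in[n] ^ S[(uint8_t)(S[a] + S[b])];
    }
}
```

Calling it with the 8-byte key and the all-zero plaintext used by the POC should give the same ciphertext bytes as the test vector quoted in the comment at the top of main.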
Following this idea, you can easily implement any other algorithm, and this brings several advantages in terms of obfuscation. In fact, in order to understand the code, you have to analyze the array containing all the opcodes, which is built at run time:
...
.text:00401678 mov [ebp+var_160], ebx
.text:0040167E mov [ebp+var_15C], 1
.text:00401688 mov [ebp+var_158], 2020113h
.text:00401692 mov [ebp+var_154], ebx
.text:00401698 mov [ebp+var_150], 100h
.text:004016A2 mov [ebp+var_14C], 14h
.text:004016AC mov [ebp+var_148], 0Ah
.text:004016B6 xor ebx, ebx
.text:004016B8 mov [ebp+var_144], ebx
.text:004016BE mov [ebp+var_140], 16h
.text:004016C8 mov [ebp+var_13C], ebx
.text:004016CE mov [ebp+var_138], ebx
.text:004016D4 mov [ebp+var_44C], eax
.text:004016DA lea ebx, [ebp+var_458]
.text:004016E0 mov [ebp+var_448], ebx
.text:004016E6 mov [ebp+var_444], 0
.text:004016F0 mov [ebp+var_440], eax
.text:004016F6 lea ebx, [ebp+var_464]
.text:004016FC mov [ebp+var_43C], ebx
.text:00401702 mov [ebp+var_438], 0
.text:0040170C mov [ebp+var_434], eax
.text:00401712 lea ebx, [ebp+var_460]
.text:00401718 mov [ebp+var_430], ebx
.text:0040171E mov [ebp+var_42C], 0
.text:00401728 mov [ebp+var_428], eax
...
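The constants in this listing can be decoded by reversing the MAKE_OPC packing. For instance, 2020113h followed by 100h is the "cmp i, 256" instruction of the POC: opcode 13h (OPC_CMP), first operand type 1 (OP_V), second operand type 2 (OP_C) and size 2 (OP_DWORD). Here is a small sketch of a decoder; the struct and function names are hypothetical, only the bit layout comes from the macro.

```c
#include <stdint.h>

/* The four fields packed by MAKE_OPC into the first DWORD. */
typedef struct decoded_opcode {
    uint8_t opcode;    /* bits 0-7   */
    uint8_t type_op1;  /* bits 8-15  */
    uint8_t type_op2;  /* bits 16-23 */
    uint8_t size;      /* bits 24-31 */
} decoded_opcode;

/* Same packing as the MAKE_OPC macro. */
uint32_t pack_opcode(uint8_t opc, uint8_t t1, uint8_t t2, uint8_t size)
{
    return (uint32_t)opc
         | ((uint32_t)t1   << 8)
         | ((uint32_t)t2   << 16)
         | ((uint32_t)size << 24);
}

/* Split an instruction word back into its fields. */
decoded_opcode unpack_opcode(uint32_t word)
{
    decoded_opcode d;
    d.opcode   = (uint8_t)(word & 0xFF);
    d.type_op1 = (uint8_t)((word >> 8) & 0xFF);
    d.type_op2 = (uint8_t)((word >> 16) & 0xFF);
    d.size     = (uint8_t)((word >> 24) & 0xFF);
    return d;
}
```

An analyst would need to apply this kind of decoding to every third DWORD of the array before the algorithm starts to become readable.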
Moreover, the opcodes aren't referenced by any direct call, because they are executed only as a result of the "RaiseException" API call, which is guarded by several nested try-except blocks. This results in a chain of filter expressions and except bodies (which constitute an additional layer above the opcode routines) that are triggered by the scopetable mechanism:
; while(eip != EIP_HALT)
.text:00401ACE loc_401ACE: ; CODE XREF: _main+A92 j
.text:00401ACE mov dword_404370, eax
.text:00401AD3 cmp eax, 0FFFFFFFFh
.text:00401AD6 jz loc_401D87
...
; this is the code guarded inside the nested try/excepts
...
.text:00401B11 lea edx, [ebp+eax*4+Arguments]
.text:00401B18 push edx ; lpArguments
.text:00401B19 push 2 ; nNumberOfArguments
.text:00401B1B push ecx ; dwExceptionFlags
.text:00401B1C mov eax, [ebp+eax*4+dwExceptionCode]
.text:00401B23 push eax ; dwExceptionCode
.text:00401B24 call ds:RaiseException
...
.text:00401B57 jmp loc_401D71
...
; a couple of filter expressions and except bodies
...
.text:00401B5C loc_401B5C: ; DATA XREF: .rdata:00403290 o
.text:00401B5C mov eax, [ebp+var_14]
.text:00401B5F mov ecx, [eax]
.text:00401B61 mov edx, [ecx]
.text:00401B63 mov [ebp+var_4F0], edx
.text:00401B69 call sub_401050
.text:00401B6E retn
.text:00401B6F ; ---------------------------------------------------------------------------
.text:00401B6F
.text:00401B6F loc_401B6F: ; DATA XREF: .rdata:00403294 o
.text:00401B6F mov esp, [ebp+var_18]
.text:00401B72 mov [ebp+var_4], 6
.text:00401B79 mov [ebp+var_4], 5
.text:00401B80 mov [ebp+var_4], 4
.text:00401B87 mov [ebp+var_4], 3
.text:00401B8E mov [ebp+var_4], 2
.text:00401B95 mov [ebp+var_4], 1
.text:00401B9C mov [ebp+var_4], 0
.text:00401BA3 jmp loc_401D71
.text:00401BA8 ; ---------------------------------------------------------------------------
.text:00401BA8
.text:00401BA8 loc_401BA8: ; DATA XREF: .rdata:00403284 o
.text:00401BA8 mov eax, [ebp+var_14]
.text:00401BAB mov edx, [eax]
.text:00401BAD mov ecx, [edx]
.text:00401BAF mov [ebp+var_4D4], ecx
.text:00401BB5 call sub_401170
.text:00401BBA retn
.text:00401BBB ; ---------------------------------------------------------------------------
.text:00401BBB
.text:00401BBB loc_401BBB: ; DATA XREF: .rdata:00403288 o
.text:00401BBB mov esp, [ebp+var_18]
.text:00401BBE mov [ebp+var_4], 5
.text:00401BC5 mov [ebp+var_4], 4
.text:00401BCC mov [ebp+var_4], 3
.text:00401BD3 mov [ebp+var_4], 2
.text:00401BDA mov [ebp+var_4], 1
.text:00401BE1 mov [ebp+var_4], 0
.text:00401BE8 jmp loc_401D71
...
; outside the nested try/block there is the code to increase the virtual EIP
...
.text:00401D71 loc_401D71: ; CODE XREF: _main+867 j
.text:00401D71 ; _main+8B3 j
...
As you can see, the algorithm is completely broken up: it's not easy to figure out what the code is attempting to do, nor is it easy to automate the detection of specific routines.