Thursday, March 31, 2016

Introduction to Native (aka Undocumented) API

Hi all,

This is the second post on my new blog and I am even more excited than I was when I posted the first one. That is because this week I was looking at some really "hackish" stuff. Anyway, if you are here then perhaps you have already heard of the so-called "undocumented API" or "native API". If you are checking this post as part of my updates sent to forums and sites like LinkedIn, then perhaps you have not heard of it yet. So take a deep breath... this is going to be a long read...

This post is for beginners who are developers, who have already used C, C++ and the Windows API, and who want to dive into the "hackish" dimensions of Windows. Windows gives a mammoth API to programmers who develop with C/C++/assembly language. Windows itself also uses this API. However, the Windows OS also has some "hidden" interfaces that it uses apart from the (public) Windows API. I assume you are aware of this documented API, aka the Windows API.

So today it is a misnomer when we refer to it as the "undocumented API". Most of these native APIs are documented now. From the Windows NT days till today, there have been some belligerent and talented hackers who did surgery on the Windows OS and studied the internal facilities (code) that the operating system uses and also exposes to the public, especially programmers. The native API had very advanced documentation on the Sysinternals website earlier, when it wasn't part of Microsoft and the articles were "visible". There are so many books and MSDN articles written on it. For this article I would primarily thank Matt Pietrek, Mark Russinovich, Sven Schreiber, Gary Nebbett, Prasad Dabak, Milind Borate... I may have missed some names. I learned from many, so apologies to those who know I follow them but whose names I didn't mention here... I have also referred to other sites for some of the code. I used exploit-monday for the AMAZING documentation of some "undocumented" data structures and functions put up by Matt Graeber. I referred to rohitab's forum only to check if my code is "really fine".

The code I have put here is definitely different from rohitab's because I use explicit linking. But before I talk about all that... firstly, what is this so-called "native" or "undocumented" API? Well, if you think of all the application programming interfaces that Windows provides as layers, then one layer (almost the top) consists of the Windows API, which is offered by a huge set of DLL files like KernelBase (kernel32), User32, Gdi32, advapi32, crypt32 etc. Now if that is the first layer, or the interface you use as a programmer, then the next layer is the "NTDLL" layer (let us put it this way to simplify). To get a clear understanding I would ask the reader to refer to the popular Windows Internals books.

These functions are called "undocumented" because Microsoft never documented the function calls and data structures officially (until now, where there is some partial documentation). NTDLL.DLL is like a layer by itself, and most of the API calls in the other DLLs end up calling functions in NTDLL.DLL. This is one of the major "code paths", and by code path I mean the flow of code from user mode to kernel mode. NTDLL is like the last layer of user mode; after it the processor switches to kernel mode and your code (actually in the form of system calls and requests to the system) continues executing.

The user mode Windows API is quite well documented. The kernel mode programming interface is provided by ntoskrnl, hal etc. and that is documented too. Driver developers will be and should be well aware of those interfaces. The native API sits somewhere in between but isn't well documented. Microsoft uses this layer for its proprietary work of "chewing" user mode requests and then pushing the "massaged" requests to kernel mode for execution. You can imagine this as an abstraction offered to the Windows API itself. You call the Windows API. The Windows API calls NTDLL functions (the native API) and then there is a mode switch to the kernel and so on...
Did that simplify it better than any other documentation about native API? I don't know... if you think so please comment on this article. I can edit and make it better if needed.

Okay, so why do you need to know about the "native API"? You can easily achieve almost all tasks with the Windows API. Even the higher frameworks offered to programmers, like .NET, Java etc., all use the Windows API. Even the OS uses some of the functions offered by this "Windows API" layer. You would still use the native API to know more about what happens inside. You use it because it can give more information, and perhaps all of that in a single function call. You use it because you skip a layer (WinAPI), so your code saves some CPU cycles by skipping the instructions that make up the Windows API wrapper. Hackers and anti-hackers have been using the undocumented interfaces for a long time. Hooks and undocumented APIs are favorites of both. If you are lucky you will find an article about how Symantec anti-virus got into trouble for using undocumented interfaces (not necessarily the native API) after the introduction of Windows Vista. Microsoft warns against using these APIs because they can change. For most tasks programmers SHOULD use the Windows API.

Ok, enough of theory, let us put it to practical use... What I will show in this article is a famous piece of code that people usually look for: "How to list processes using the native API", or "NtQuerySystemInformation".

"NtQuerySystemInformation" is a very powerful function that resides in ntdll.dll. As the name says, it queries a lot of system information. You choose the "information" that you want from an "enumeration". Matt has re-documented it (his seems to be the latest version). It was indeed documented well, in the year 2000, by Gary Nebbett. As said earlier, Microsoft can make changes to these structures, enumerations and functions at any time. I haven't really done a diff, but I could see a few more "information types" added to this enumeration in Matt's version.

I assume the reader knows how to use LoadLibrary() and do explicit linking. I won't explain all that elementary stuff here. So here is what we need to do to enumerate the processes currently managed by the operating system... (experts, see how careful I was when I said that (lol): processes don't run, and there can be terminated-but-hanging-around processes (zombies); refer to the Mark/David/Alex discussion in their internals book as well as some forums).

You may use winternl.h to get the limited Microsoft documentation of (and access to) the native API and its structs. I prefer to have my own list, and I used Matt's documentation mentioned earlier. The code shown on rohitab uses winternl.h and links the ntdll library implicitly, if you want to do it that way... We will do explicit linking here, which is pretty much what most people are looking for...

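#include <windows.h>
#include <tchar.h>
#include <stdio.h>
#include "undoc.h" //uses - NtQuerySystemInformation.h

#define BLOB_SIZE (1024 * 1024) //Allocate a really large pool

int _tmain(int argc,PTSTR *argv)
{
    SYSTEM_PROCESS_INFORMATION *pspi_next,*pspi;
    NTSTATUS ns;
    ULONG offset;
    pspi = (PSYSTEM_PROCESS_INFORMATION)HeapAlloc(GetProcessHeap(),0,BLOB_SIZE);
    if(pspi == NULL)
        return 1;

    //One SYSTEM_PROCESS_INFORMATION entry per process is written into the blob
    ns = NtQuerySystemInformation(SystemProcessInformation,pspi,BLOB_SIZE,NULL);
    if(ns < 0){ //a negative NTSTATUS means failure, e.g. the blob was too small
        HeapFree(GetProcessHeap(),0,pspi);
        return 1;
    }

    pspi_next = pspi;
    do{ //Well, if this code is even running then there are at least 8 - 10 process in the system
        //depending upon the OS version (even WinPE!)
        //Print whatever fields you need; PID and image name are shown here.
        //ImageName.Buffer is NULL for the idle process, so fall back to a label.
        wprintf(L"%6lu  %s\n",(ULONG)(ULONG_PTR)pspi_next->UniqueProcessId,
            pspi_next->ImageName.Buffer ? pspi_next->ImageName.Buffer : L"[Idle]");
        offset = pspi_next->NextEntryOffset; //0 marks the last entry
        pspi_next = (PSYSTEM_PROCESS_INFORMATION)(((PBYTE)pspi_next) + offset);
    }while(offset != 0);

    HeapFree(GetProcessHeap(),0,pspi);
    return 0;
}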

//undoc.h header [excerpted - for the rest, refer to exploit-monday, and do give credit to the
//author, Matt, for his valuable contribution]

typedef NTSTATUS (NTAPI *_NtQuerySystemInformation)
    (IN SYSTEM_INFORMATION_CLASS SystemInformationClass,
    OUT PVOID SystemInformation,
    IN ULONG SystemInformationLength,
    OUT PULONG ReturnLength OPTIONAL);

//Explicit linking: resolve the function from ntdll.dll at run time
_NtQuerySystemInformation NtQuerySystemInformation = (_NtQuerySystemInformation)
    GetProcAddress(LoadLibrary(L"ntdll.dll"),"NtQuerySystemInformation");


We are using the undocumented NtQuerySystemInformation function to get some process information stored in memory that is managed by the operating system. This information is a SYSTEM_PROCESS_INFORMATION structure and there is one per process. We do not know how many processes there are... so allocate a large blob. Thankfully the Heap functions do support the amount I gave. If not, use VirtualAlloc. Given one chunk of this information, it has an offset to the next chunk of the same "information set". Use that "link" to traverse all the "chunks" until you reach the end of the blob. We don't check whether we are at the end of the blob; instead we see if "NextEntryOffset" goes to 0. The rest is basics...


Thursday, March 10, 2016

How to verify a PE digital signature (Extended version)

A very good morning/afternoon/evening.. I am writing this at 3:32 AM... so I may say Good Early Morning as well...  I can stay deprived of sleep but not knowledge... :)

Well, if you are checking this topic I assume you already know about digital signatures and especially how they are used on PE (Portable Executable) images. Anyway, I got a project where I had to work on scanning some cabinet files that contain digital signatures. And as of this writing, the world (matrix) is going through a significant change. SHA1 is getting deprecated and SHA2 is being implemented all over the world (corporate world).

So the best tool as of now is "signtool", which is provided with the driver development kit, the SDK or Visual Studio. Numerous sites have been published in the past couple of months, and are being published now, on what a digital signature is, how to have multiple signatures and what not. So I was reading them and I learned a lot. Anyway, I primarily started with two sources that I want the reader to read before reading further...

Most of the information you see there is already documented. What is not documented is how to check a signature via catalog files, and the Sysinternals link provides code for exactly that problem. It was not adequate, though, so I added two calls to make it better.

(Look for Karthik - emm so many pseudonyms!!!)

But some good things have happened. Microsoft has documented a few more functions that I am sure will be fully used in the coming months, and there will be many websites, perhaps MSDN itself, showing code that uses them. I didn't see anyone using them as of this writing, and of course I am using them for my project. I tried writing to a well-known forum and they rejected my article for aesthetics, so here comes my first page! Ok, enough of stories and history...

So Microsoft has documented some functions that are in WinTrust.dll: the WTHelper functions, which I assume stands for WinTrust helper functions. "Signtool" (WDK 8) apparently does not make use of these calls; only one of these functions was found in it. Anyway, what we can do now that we weren't able to do a year back or so is check a signature via the (venerable) WinVerifyTrust and also use the state that it "stores" to get additional information.

All the functions, including the data structures used by them, are documented. So let me just give the code here... (Again, I assume you already know basic Crypto API usage well, and I also assume you went through the above forums)...

BOOL VerifyEmbeddedSignature2(HANDLE _h_verify_state)
{
    CRYPT_PROVIDER_DATA *pCPD;
    CRYPT_PROVIDER_SGNR *pCPS;
    CMSG_SIGNER_INFO *pcmsgsi;

    //Resolve the helper from wintrust.dll at run time (explicit linking)
    _WTHelperProvDataFromStateData WTHelperProvDataFromStateData;
    WTHelperProvDataFromStateData = (_WTHelperProvDataFromStateData)
        GetProcAddress(LoadLibrary(L"wintrust.dll"),"WTHelperProvDataFromStateData");

    if(WTHelperProvDataFromStateData == NULL)
        return FALSE;
    pCPD = WTHelperProvDataFromStateData(_h_verify_state);

    if(pCPD == NULL)
        return FALSE;

    _WTHelperGetProvSignerFromChain WTHelperGetProvSignerFromChain;
    WTHelperGetProvSignerFromChain = (_WTHelperGetProvSignerFromChain)
        GetProcAddress(LoadLibrary(L"wintrust.dll"),"WTHelperGetProvSignerFromChain");

    if(WTHelperGetProvSignerFromChain == NULL)
        return FALSE;
    //Signer at index 0, not a counter signer
    pCPS = WTHelperGetProvSignerFromChain(pCPD,0,FALSE,0);
    if(pCPS == NULL || pCPS->psSigner == NULL)
        return FALSE;
    pcmsgsi = pCPS->psSigner;
    printf("Hash Algorithm identifier (OID) - %s : Description - %s",pcmsgsi->HashAlgorithm.pszObjId,
        GetAlgorithmName(pcmsgsi->HashAlgorithm.pszObjId));
    return TRUE;
}

Ok, so the only parameter I pass to this function is the "state data" that is obtained by calling WinVerifyTrust. How to do that is clearly shown in the MSDN code sample as well as the other sample in the Sysinternals forum.

Well, the only "extension" that I am providing here is getting the hash algorithm. This code prints the SHA-1 or SHA-2 OID of the signature. If a file is dual signed, the signature at index 0 is pulled. I tested two cases, one with a dual signature (SHA-1 + SHA-256) and the other with just SHA-256. GetAlgorithmName is a helper function I wrote to get a friendly name for the OID.

I am sure there is much more we can do now, like connecting the certificate information we get from these APIs with the "CERT functions" to get complete chaining info!!!