I love it when a plan comes together

2006-04-14

Discovering and then fixing the problem that prevented my tools from running on themselves planted the seed of an idea. My APIHook library, and any code that used it, died horribly under leak testing tools such as Purify and BoundsChecker. I’d put it down to me trying to be too clever but not being clever enough, and ignored it, but it’s a pain to have code that you can’t polish. Solving the problem of the “wrong address” suggested to me that perhaps these other tools were causing similar problems. It’s most likely that they both operate in a similar fashion to my tools (at the API interception level), and since my “fix” was specific to my hooking technique (I hacked the fix into my hook for GetProcAddress()), it made sense that a more generic solution might make the code work when running under any tool that hooked APIs…

Updated 4th May 2023 to fix broken links

The problem was this: any tool that hooks an API using IAT patching most probably also hooks LoadLibrary() and GetProcAddress(). It needs to, so that it can make sure the whole process is hooked and that the whole process has the same view of the addresses of the APIs that it has hooked. My DLL injection API hooking technique (well, Jeffrey Richter’s really) relies on being able to create a remote thread in the target process and have that thread run LoadLibrary() to load a DLL into that process. The crashes I was experiencing were due to the fact that the address I was getting for LoadLibrary() was the address of a hook, which was specific to my tool’s process, rather than the real address in Kernel32.dll, which is the same in all processes… My first attempt at fixing this was a hack that would only work with my own hooks.
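To make the failure concrete, here is a toy model of it. It is only a sketch: the "addresses" are plain integers, and ToyProcess, LookupThroughHooks() and IsUsableIn() are hypothetical names invented for the illustration rather than anything from the APIHook library or from Windows.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Toy model only: "addresses" are plain integers and every name is hypothetical.
struct ToyProcess
{
    // Real export addresses in Kernel32.dll; the post relies on these being
    // the same in every process.
    std::map<std::string, std::uintptr_t> realExports;

    // Replacement functions that a tool has installed in THIS process only.
    std::map<std::string, std::uintptr_t> hooks;

    // Models a patched GetProcAddress(): returns the hook when one exists,
    // otherwise the real export address.
    std::uintptr_t LookupThroughHooks(const std::string &name) const
    {
        const auto hook = hooks.find(name);
        if (hook != hooks.end())
        {
            return hook->second;
        }
        const auto real = realExports.find(name);
        assert(real != realExports.end());   // the toy model expects known names
        return real->second;
    }
};

// True if 'address' is the real address of 'name' in process 'to', which is
// what a thread created in 'to' needs in order to start running there.
bool IsUsableIn(const ToyProcess &to, const std::string &name, std::uintptr_t address)
{
    const auto real = to.realExports.find(name);
    return real != to.realExports.end() && real->second == address;
}
```

If the injecting process has a hook installed for LoadLibraryW and the target does not, the value from LookupThroughHooks() fails IsUsableIn() for the target, while the real export address passes.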

The usual way to obtain the address of a function within a DLL is to call GetProcAddress(). However, we must get the real address of LoadLibrary(), and we must assume that both LoadLibrary() and GetProcAddress() have already been patched by the time we try to obtain it, so we need to avoid GetProcAddress() and, instead, do things the hard way.

So, how do you find the address of an exported function from a DLL if you can’t use the normal means? Step one is to map the DLL into memory as a normal file. Once you’ve done this you can pick apart the PE file format (the format used for all Windows DLL and EXE files) and walk the various structures until you have the address that you need. Luckily the “hard way” isn’t really that hard, especially since Matt Pietrek has written a great article on the PE file format, and even more especially since that article comes with code that dumps PE files.
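One detail of walking a DLL that has been mapped as a plain file is that the RVAs stored in its headers cannot be added directly to the start of the mapping. They first have to be translated through the section table. The sketch below shows that translation in portable C++. The Section struct is a hypothetical, cut-down stand-in for the IMAGE_SECTION_HEADER fields involved, and RvaToFileOffset() is an illustrative name; a helper such as GetPtrFromRVA() presumably does something similar, but that is an assumption. Real code may also need to consider the section's raw data size, which is ignored here.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-in for the IMAGE_SECTION_HEADER fields used here.
struct Section
{
    std::uint32_t virtualAddress;    // RVA at which the section starts once loaded
    std::uint32_t virtualSize;       // size of the section in memory
    std::uint32_t pointerToRawData;  // offset of the section's data within the file
};

// Translates an RVA into an offset within the raw file.
// Returns false if no section covers the RVA.
bool RvaToFileOffset(
    const std::vector<Section> &sections,
    std::uint32_t rva,
    std::uint32_t &fileOffset)
{
    for (const Section &section : sections)
    {
        if (rva >= section.virtualAddress &&
            rva - section.virtualAddress < section.virtualSize)
        {
            fileOffset = section.pointerToRawData + (rva - section.virtualAddress);
            return true;
        }
    }
    return false;
}

// Usage with made-up section values (not taken from a real DLL).
void ExampleUsage()
{
    const std::vector<Section> sections = {{0x1000, 0x2000, 0x400}, {0x3000, 0x500, 0x2400}};
    std::uint32_t offset = 0;
    const bool found = RvaToFileOffset(sections, 0x1010, offset);
    assert(found && offset == 0x410);   // 0x400 + (0x1010 - 0x1000)
    (void)found;
}
```

Adding the resulting offset to the base of the mapped view gives a pointer that can be dereferenced in the current process.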

The resulting piece of code that works out the address of a function within a module is as follows:

DWORD CRawImage::GetExportedProcRVA(
   const _tstring &name) const
{
   const string narrowName = CStringConverter::TtoA(name);

   const DWORD exportsStartRVA = m_pNTHeader->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_EXPORT].VirtualAddress;
   const DWORD exportsEndRVA = exportsStartRVA + m_pNTHeader->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_EXPORT].Size;

   IMAGE_EXPORT_DIRECTORY *pExports = reinterpret_cast<IMAGE_EXPORT_DIRECTORY*>(GetPtrFromRVA(exportsStartRVA, m_pNTHeader, m_pImageBase));

   DWORD *pFunctions = reinterpret_cast<DWORD*>(GetPtrFromRVA(pExports->AddressOfFunctions, m_pNTHeader, m_pImageBase));
   WORD *pOrdinals = reinterpret_cast<WORD*>(GetPtrFromRVA(pExports->AddressOfNameOrdinals, m_pNTHeader, m_pImageBase));
   DWORD *pFuncNames =  reinterpret_cast<DWORD*>(GetPtrFromRVA(pExports->AddressOfNames, m_pNTHeader, m_pImageBase));

   for (size_t i = 0; i < pExports->NumberOfFunctions; ++i)
   {
      const DWORD entryPointRVA = *(pFunctions + i);

      if (entryPointRVA != 0)       // Skip over gaps in exported function ordinals
      {
         // Look for a name

         for (size_t j = 0; j < pExports->NumberOfNames; ++j)
         {
            if (pOrdinals[j] == i)
            {
               if (narrowName == reinterpret_cast<char*>(GetPtrFromRVA(pFuncNames[j], m_pNTHeader, m_pImageBase)))
               {
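                  // We don't handle forwarders: a forwarded export's RVA points at a
                  // "DLL.Function" string inside the export directory, not at code.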

                  return entryPointRVA;
               }
            }
         }
      }
   }

   return 0;
}

Note that this only returns the RVA (relative virtual address) of the function, that is, its offset from the start of the module's image. Our view of Kernel32.dll is just a file that we mapped so that we could walk its PE format; what we actually want is the address relative to where Kernel32.dll is really loaded…
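The arithmetic involved is small enough to show on its own. This is a minimal, portable sketch: AddressFromRva() is a hypothetical name, and the base addresses and RVA mentioned in the note after it are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Combines a module's load address with an RVA.
// std::uintptr_t is wide enough to hold a pointer on 32-bit and 64-bit builds.
std::uintptr_t AddressFromRva(std::uintptr_t moduleBase, std::uint32_t rva)
{
    assert(rva != 0);   // 0 is treated as "export not found", as in the code above
    return moduleBase + rva;
}
```

With a made-up RVA of 0x1D7B, a module loaded at 0x7C800000 gives 0x7C801D7B, whereas using the base of a file view mapped at, say, 0x00A50000 gives 0x00A51D7B. Only the first is meaningful as a function address, and it is the same in every process in which the module is loaded at that base.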

The following provides the address when given a handle to a properly loaded module…

FARPROC CRawImage::GetExportedProcAddressInModule(
   HMODULE hModule,
   const _tstring &name) const
{
   FARPROC pProc = 0;

   const DWORD rva = GetExportedProcRVA(name);

   if (rva)
   {
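      // An HMODULE is the module's load address; DWORD_PTR is pointer-sized, so this also works in 64-bit builds
      pProc = reinterpret_cast<FARPROC>(reinterpret_cast<DWORD_PTR>(hModule) + rva);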
   }

   return pProc;
}

So now that I’m free from using a potentially compromised GetProcAddress() and can locate the address of LoadLibrary() myself, I can remove the hacky fix that I put into my hooking code. Once the code was adjusted and the tests passed, I built the whole lot under the tools that were causing problems and everything worked nicely.

I love it when a plan comes together…