I have previously written a blog article describing the Kernel Debugger Block (KDBG) in detail (http://scudette.blogspot.ch/2012/11/finding-kernel-debugger-block.html), since Volatility uses it to "bootstrap" the analysis process. Many plugins require a list of processes, and Volatility uses the KDBG to locate the PsActiveProcessHead symbol (the head of the doubly linked list that holds the _EPROCESS objects together).

Recently, the Volatility blog reminded us that the KDBG is critical for memory analysis. In that post, the author notes that the KDBG is encoded on Windows 8 and cannot readily be found using the usual kdbgscan plugin. In particular, that blog post states:

An encoded KDBG can have a hugely negative effect on your ability to perform memory forensics. This structure contains a lot of critical details about the system, including the pointers to the start of the lists of active processes and loaded kernel modules, the address of the PspCid handle table, the ranges for the paged and non-paged pools, etc. If all of these fields are encoded, your day becomes that much more difficult.

We have previously demonstrated in our OSDFC training workshop that the KDBG can be trivially overwritten without affecting system stability. Since the kdbgscan plugin simply scans for the plain text "KDBG" signature, overwriting this signature makes it impossible to locate the KDBG or to bootstrap memory analysis. Indeed, with Volatility you are going to have a really bad day. It is still possible to work around this limitation, and our workshop describes all the available workarounds, but it is definitely not ideal.
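
To make the weakness concrete, here is a toy sketch of a plain signature scan over a byte buffer. It is not Volatility's kdbgscan code; the buffer layout and the function name scan_for_signature are made up for illustration. It only shows why changing a single byte of the tag is enough to make such a scan come back empty.

```python
def scan_for_signature(memory, signature=b"KDBG"):
    """Return every offset in `memory` at which `signature` occurs."""
    hits = []
    pos = memory.find(signature)
    while pos != -1:
        hits.append(pos)
        pos = memory.find(signature, pos + 1)
    return hits


# Toy "memory image": padding, then a fake block that carries the tag.
image = bytearray(b"\x00" * 64 + b"KDBG" + b"\x00" * 60)
before = scan_for_signature(bytes(image))

# Overwrite a single byte of the tag, as malware could do on a live system.
image[64] = ord("X")
after = scan_for_signature(bytes(image))

print("hits before overwrite:", before)
print("hits after overwrite: ", after)
```

Any approach that has to find a structure by its contents shares this property; an approach that already knows the address does not.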

This problem was also discussed in the Black Hat talk One-byte Modification for Breaking Memory Forensic Analysis.

Does Rekall use the KDBG?

Volatility Windows profiles are typically generated using the pdbparse project, via the pdb_tpi_vtypes.py script. They normally contain only the vtype definitions (embedded into Python files, for example vista_sp0_x64_vtypes.py).

While developing the Rekall profile system (which is described in detail in previous blog posts), we generated new profiles for Windows kernels. Rather than rely on the pdbparse project to parse the PDB files, we have implemented a complete Microsoft PDB parser within the Rekall framework (this will be described in a future blog post).

Microsoft PDB files contain a number of streams. One of the streams describes struct definitions and can be used to generate the vtypes. Interestingly, however, a few more streams contain the global symbols. (The pdbparse project does provide an additional script to extract the constants from the PDB file, but that script is not currently used by Volatility.)

In other words, the PDB file contains the addresses in memory of many symbols. This is akin to the System.map file we use when analyzing a Linux memory image. Let's examine a typical Rekall Windows profile:

  "PromoteNode": 611168,
  "PropertyEval": 451884,
  "PsAcquireProcessExitSynchronization": 1157620,
  "PsActiveProcessHead": 96160,
  "PsAssignImpersonationToken": 1479504,
  "PsBoostThreadIo": 219912,
  "KdD3Transition": 805316,
  "KdDebuggerDataBlock": 2003056,
  "KdDebuggerEnabled": 2562992,
  "KdDebuggerInitialize0": 805256,
  "KdDebuggerInitialize1": 805244,

We can see that the typical Microsoft kernel PDB file contains a huge number of symbols which are not exported in the PE export table. In particular, we see the symbol PsActiveProcessHead, which is required to list processes. We also see the exact location of the Kernel Debugger Block in the KdDebuggerDataBlock symbol (just in case we need it). Each symbol offset is specified relative to the kernel base address (i.e. the MZ header where the kernel is mapped into memory).
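
Since the offsets are relative to the kernel base, turning a profile entry into a virtual address is a single addition. The sketch below uses two offsets copied from the excerpt above; the kernel base value and the resolve helper are hypothetical and exist only to show the arithmetic. This is not Rekall's API.

```python
# Offsets copied from the profile excerpt (relative to the kernel base).
constants = {
    "PsActiveProcessHead": 96160,
    "KdDebuggerDataBlock": 2003056,
}

# Hypothetical kernel image base, chosen only for illustration.
KERNEL_BASE = 0xF80002600000


def resolve(symbol, kernel_base=KERNEL_BASE, table=constants):
    """Return the virtual address of `symbol`: image base plus profile offset."""
    return kernel_base + table[symbol]


for name in sorted(constants):
    print("%-22s 0x%X" % (name, resolve(name)))
```

With a real image, the only value that has to be discovered at analysis time is the kernel base; everything else comes straight from the profile.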

Let us examine in detail the steps that Rekall goes through in the pslist module by enabling verbose logging:

$ rekall --verbose -f  ~/images/win7.elf pslist
INFO:root:Autodetected physical address space Elf64CoreDump                     1
DEBUG:root:Opened url http://profiles.rekall.googlecode.com/git//pe.gz
INFO:root:Loaded profile pe from URL:http://profiles.rekall.googlecode.com/git/ 2
DEBUG:root:Verifying profile GUID/F8E2A8B5C9B74BF4A6E4A48F180099942             3
DEBUG:root:Opened url http://profiles.rekall.googlecode.com/git//GUID/F8E2A8B5C9B74BF4A6E4A48F180099942.gz
DEBUG:root:Opened url http://profiles.rekall.googlecode.com/git//ntoskrnl.exe/AMD64/6.1.7600.16385/F8E2A8B5C9B74BF4A6E4A48F180099942.gz
INFO:root:Loaded profile ntoskrnl.exe/AMD64/6.1.7600.16385/F8E2A8B5C9B74BF4A6E4A48F180099942 from URL:http://profiles.rekall.googlecode.com/git/
INFO:root:Loaded profile GUID/F8E2A8B5C9B74BF4A6E4A48F180099942 from URL:http://profiles.rekall.googlecode.com/git/
DEBUG:root:Found _EPROCESS @ 0x2818140 (DTB: 0x187000)                          4
INFO:root:Detected ntkrnlmp.pdb with GUID F8E2A8B5C9B74BF4A6E4A48F180099942
  Offset (V)   Name                    PID   PPID   Thds     Hnds   Sess  Wow64 Start                    Exit
-------------- -------------------- ------ ------ ------ -------- ------ ------ ------------------------ ------------------------
INFO:root:Detected kernel base at 0xF8000261F000                                5
0xfa80008959e0 System                    4      0     84      511 ------  False 2012-10-01 21:39:51+0000 -
0xfa8001994310 smss.exe                272      4      2       29 ------  False 2012-10-01 21:39:51+0000 -
0xfa8002259060 csrss.exe               348    340      9      436      0  False 2012-10-01 21:39:57+0000 -
1 Rekall auto-detects this image as an ELF64 core dump.
2 Rekall now contacts the profile repository to retrieve the parser for the PE file format.
3 The PE profile is used to scan for RSDS signatures. These are verified so we can be pretty confident that we loaded the exact profile for this image.
4 The Kernel DTB is located by scanning for the Idle process.
5 We now find the kernel’s base address. Once that is known, the addresses of all symbols in the kernel’s virtual address space are known directly from the profile, i.e. we do not need to scan for anything because we already know where everything is.
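
Step 3 relies on the CodeView "RSDS" record that the linker embeds in a PE file's debug directory: the 4-byte tag, a 16-byte GUID, a 32-bit age and the NUL-terminated PDB file name. The following is a minimal sketch of decoding such a record into the GUID-plus-age string seen in the log. The parse_rsds helper is hypothetical, and the sample record is rebuilt from the GUID printed in the log rather than read from the image (we assume the trailing "2" is the age). Rekall's own scanner and verification are more involved.

```python
import struct

RSDS_LAYOUT = "<IHH8sI"  # GUID (uint32, uint16, uint16, 8 raw bytes) + age


def parse_rsds(record):
    """Parse a CodeView RSDS record into (guid_age, pdb_name)."""
    if record[:4] != b"RSDS":
        raise ValueError("not an RSDS record")
    data1, data2, data3, data4, age = struct.unpack_from(RSDS_LAYOUT, record, 4)
    name_start = 4 + struct.calcsize(RSDS_LAYOUT)
    name_end = record.index(b"\x00", name_start)
    pdb_name = record[name_start:name_end].decode("utf-8")
    guid_age = "%08X%04X%04X%s%X" % (
        data1, data2, data3, data4.hex().upper(), age)
    return guid_age, pdb_name


# Sample record rebuilt from the GUID in the log (assumption: age == 2).
sample = (
    b"RSDS"
    + struct.pack(RSDS_LAYOUT, 0xF8E2A8B5, 0xC9B7, 0x4BF4,
                  bytes.fromhex("A6E4A48F18009994"), 2)
    + b"ntkrnlmp.pdb\x00"
)

guid_age, pdb_name = parse_rsds(sample)
print(pdb_name, guid_age)
print("profile name:", "GUID/" + guid_age)
```

Because the GUID and age identify one specific build of the kernel, a match on this string is what lets us fetch the exact profile rather than a "close enough" one.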

Rekall generally does not need to use the KDBG at all. This is much faster since it does not need to scan for it but, more importantly, it is much more robust because malware cannot overwrite the PsActiveProcessHead symbol without crashing the system.

Since Rekall uses a profile repository, we are able to locate the exact profile for the kernel we are analyzing. Therefore we do not need to scan for anything: we always prefer to read the exact addresses from the profile rather than guess. This makes analysis simpler and far more robust.

Another example: the callbacks plugin

Another example of this technique is the callbacks plugin. Here, Volatility resorts to disassembling various exported functions to try to locate the offsets of a number of non-exported callback pointer tables (e.g. PsSetLoadImageNotifyRoutine is disassembled to get to PspLoadImageNotifyRoutine). This algorithm is fragile and complex. It also only works on 32-bit systems at the moment, since signatures need to be developed for each architecture.

However, this algorithm is entirely unnecessary if one uses the correct profile for the exact kernel version. You can simply look up the exact addresses of the (non-exported) symbols you need. Here is the Rekall code:

        routines = ["_PspLoadImageNotifyRoutine",             1

        for symbol in routines:
            # The list is an array of 8 _EX_FAST_REF objects
            addrs = self.profile.get_constant_object(         2

            for addr in addrs:                                3
                callback = addr.dereference_as("_GENERIC_CALLBACK")
                if callback:
                    yield "GenericKernelCallback", callback.Callback, None
1 We look up each one of these symbols by name.
2 We use the profile directly to instantiate an array of 8 _EX_FAST_REF.
3 We dereference each of the addresses to find the callbacks.

There is no need to scan or disassemble anything to retrieve the symbol addresses, since we know exactly where they are already.
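
For readers unfamiliar with _EX_FAST_REF: it is a pointer-sized value whose low bits carry a reference count (4 bits on 64-bit Windows, 3 bits on 32-bit), so the object pointer is recovered by masking those bits off. Rekall's dereference_as takes care of this for us. The sketch below does the same by hand on raw bytes for the 64-bit case. The decode_fast_refs helper is hypothetical, and the sample array assumes that the value 0xFFFFF8A0001310BF shown next to PspLoadImageNotifyRoutine in the disassembly listing later in this post is the content of the first slot, with the other slots empty.

```python
import struct

FAST_REF_MASK_X64 = 0xF  # low 4 bits hold the reference count on x64


def decode_fast_refs(raw, count=8):
    """Decode `count` 64-bit _EX_FAST_REF slots; return non-empty pointers."""
    values = struct.unpack_from("<%dQ" % count, raw)
    pointers = []
    for value in values:
        pointer = value & ~FAST_REF_MASK_X64  # strip the reference count
        if pointer:                           # empty slots decode to zero
            pointers.append(pointer)
    return pointers


# Assumed content of the 8-slot table: one registered callback, seven empty.
raw_table = struct.pack("<8Q", 0xFFFFF8A0001310BF, 0, 0, 0, 0, 0, 0, 0)

for pointer in decode_fast_refs(raw_table):
    print("callback block at 0x%X" % pointer)
```

Each pointer recovered this way is what the plugin then interprets as a _GENERIC_CALLBACK.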

What else can we do with profile constants?

The amount of information provided in the kernel PDB files is truly extensive. Not only does Microsoft provide non-exported function names, but also global names, string names, import table entries and much more.

This is extremely useful when disassembling code in Rekall. Since Rekall disassembles the code which is resident in memory, all relocations, imports, exports, etc. have already been resolved by the kernel. In other words, if we see a memory reference, we can resolve it directly to a symbol name, without having to consider imports.
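
Resolving an address back to a name is the reverse of the profile lookup: subtract the kernel base and find the closest symbol at or below the resulting offset. Here is a minimal sketch using a sorted table and bisect. The offsets are copied from the profile excerpt earlier in this post and the base is the one reported in the verbose log; pairing the two is an assumption made for illustration, and symbolize is a hypothetical helper rather than Rekall's resolver.

```python
import bisect

# Offsets copied from the profile excerpt earlier in this post.
constants = {
    "PsActiveProcessHead": 96160,
    "PsBoostThreadIo": 219912,
    "PropertyEval": 451884,
    "PromoteNode": 611168,
}

# Kernel base from the verbose log (assumed to belong to the same kernel).
KERNEL_BASE = 0xF8000261F000

_by_offset = sorted((offset, name) for name, offset in constants.items())
_offsets = [offset for offset, _ in _by_offset]


def symbolize(address, kernel_base=KERNEL_BASE):
    """Return 'name' or 'name + 0xNN' for the nearest symbol at or below address."""
    relative = address - kernel_base
    index = bisect.bisect_right(_offsets, relative) - 1
    if index < 0:
        return None  # address lies below the first known symbol
    offset, name = _by_offset[index]
    delta = relative - offset
    return name if delta == 0 else "%s + 0x%X" % (name, delta)


print(symbolize(KERNEL_BASE + 219912))
print(symbolize(KERNEL_BASE + 219912 + 0x20))
```

A production resolver would also want to know where each module ends so that it does not attribute an address to the last symbol of an unrelated module.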

Here is an example of disassembling the PsSetLoadImageNotifyRoutine routine on a 64-bit image (this is what Volatility is doing in the callbacks plugin).

$ rekall -f  ~/images/win7.elf dis 'ntoskrnl.exe!PsSetLoadImageNotifyRoutine'
   Address      Rel Op Codes             Instruction                    Comment
-------------- ---- -------------------- ------------------------------ -------
------ ntoskrnl.exe!PsSetLoadImageNotifyRoutine ------
0xf80002aa1050    0 48895c2408           MOV [RSP+0x8], RBX
0xf80002aa1055    5 57                   PUSH RDI
0xf80002aa1056    6 4883ec20             SUB RSP, 0x20
0xf80002aa105a    A 33d2                 XOR EDX, EDX
0xf80002aa105c    C e8bfb1feff           CALL 0xf80002a8c220            ntoskrnl.exe!ExAllocateCallBack
0xf80002aa1061   11 488bf8               MOV RDI, RAX
0xf80002aa1064   14 4885c0               TEST RAX, RAX
0xf80002aa1067   17 7507                 JNZ 0xf80002aa1070             ntoskrnl.exe!PsSetLoadImageNotifyRoutine + 0x20
0xf80002aa1069   19 b89a0000c0           MOV EAX, 0xffffffffc000009a
0xf80002aa106e   1E eb4a                 JMP 0xf80002aa10ba             ntoskrnl.exe!PsSetLoadImageNotifyRoutine + 0x6A
0xf80002aa1070   20 33db                 XOR EBX, EBX
0xf80002aa1072   22 488d0d27d4d9ff       LEA RCX, [RIP-0x262bd9]        0xFFFFF8A0001310BF ntoskrnl.exe!PspLoadImageNotifyRoutine
0xf80002aa1079   29 4533c0               XOR R8D, R8D
0xf80002aa107c   2C 488bd7               MOV RDX, RDI
0xf80002aa107f   2F 488d0cd9             LEA RCX, [RCX+RBX*8]
0xf80002aa1083   33 e8c817f8ff           CALL 0xf80002a22850            ntoskrnl.exe!ExCompareExchangeCallBack
0xf80002aa1088   38 84c0                 TEST AL, AL
0xf80002aa108a   3A 7511                 JNZ 0xf80002aa109d             ntoskrnl.exe!PsSetLoadImageNotifyRoutine + 0x4D
0xf80002aa108c   3C ffc3                 INC EBX
0xf80002aa108e   3E 83fb08               CMP EBX, 0x8
0xf80002aa1091   41 72df                 JB 0xf80002aa1072              ntoskrnl.exe!PsSetLoadImageNotifyRoutine + 0x22
0xf80002aa1093   43 488bcf               MOV RCX, RDI
0xf80002aa1096   46 e805e9f5ff           CALL 0xf800029ff9a0            ntoskrnl.exe!IopDeallocateApc
0xf80002aa109b   4B ebcc                 JMP 0xf80002aa1069             ntoskrnl.exe!PsSetLoadImageNotifyRoutine + 0x19
0xf80002aa109d   4D f083053bd4d9ff01     LOCK ADD DWORD [RIP-0x262bc5], 0x1 0x1 ntoskrnl.exe!PspLoadImageNotifyRoutineCount
0xf80002aa10a5   55 8b05d5d3d9ff         MOV EAX, [RIP-0x262c2b]        0x7 ntoskrnl.exe!PspNotifyEnableMask
0xf80002aa10ab   5B a801                 TEST AL, 0x1
0xf80002aa10ad   5D 7509                 JNZ 0xf80002aa10b8             ntoskrnl.exe!PsSetLoadImageNotifyRoutine + 0x68
0xf80002aa10af   5F f00fba2dc8d3d9ff00   LOCK BTS DWORD [RIP-0x262c38], 0x0 0x7 ntoskrnl.exe!PspNotifyEnableMask
0xf80002aa10b8   68 33c0                 XOR EAX, EAX
0xf80002aa10ba   6A 488b5c2430           MOV RBX, [RSP+0x30]
0xf80002aa10bf   6F 4883c420             ADD RSP, 0x20
0xf80002aa10c3   73 5f                   POP RDI

We can see that addresses are resolved to the known symbols at those addresses (in the Volatility code we are actually after the PspLoadImageNotifyRoutine address).
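
As a worked example of what a disassembly-based approach has to compute, the sketch below resolves the RIP-relative operand of the LEA at 0xf80002aa1072 by hand: on x86-64 the target is the address of the next instruction plus the signed 32-bit displacement. The opcode bytes and addresses are taken from the listing above. The rip_relative_target helper and the displacement position within the instruction (byte 3 for both instructions used here) are supplied by us, not by Rekall. Subtracting the kernel base from the log assumes that both outputs come from the same image, as the command lines suggest.

```python
def rip_relative_target(insn_address, insn_hex, disp_offset):
    """Resolve [RIP+disp32]: next-instruction address plus signed displacement."""
    insn = bytes.fromhex(insn_hex)
    disp = int.from_bytes(insn[disp_offset:disp_offset + 4], "little", signed=True)
    return insn_address + len(insn) + disp


KERNEL_BASE = 0xF8000261F000  # "Detected kernel base" in the verbose log

# LEA RCX, [RIP-0x262bd9]
table = rip_relative_target(0xF80002AA1072, "488d0d27d4d9ff", 3)
# LOCK ADD DWORD [RIP-0x262bc5], 0x1
counter = rip_relative_target(0xF80002AA109D, "f083053bd4d9ff01", 3)

print("PspLoadImageNotifyRoutine      0x%x" % table)
print("offset from kernel base        0x%x" % (table - KERNEL_BASE))
print("PspLoadImageNotifyRoutineCount 0x%x" % counter)
print("distance between the two       0x%x" % (counter - table))
```

The distance between the two targets comes out as 0x40, which is consistent with a table of 8 pointer-sized _EX_FAST_REF slots followed by the counter. With the right profile, none of this arithmetic is needed: the offset is simply looked up by name.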