Rekall Tutorial

The main goals of the Rekall framework are to enhance the user experience by making common tasks easier and more intuitive, and to provide a powerful and capable interface for automation and for performing more complex operations.

1. Installation

Rekall is available as a Python package installable via the pip package manager. We support running Rekall inside a virtualenv environment, which ensures that the exact versions of all dependencies are installed.

Simply type (for example on Linux):

$ virtualenv  /tmp/MyEnv
New python executable in /tmp/MyEnv/bin/python
Installing setuptools, pip...done.
$ source /tmp/MyEnv/bin/activate
$ pip install rekall

This installs Rekall together with all of its dependencies. Python and pip must already be installed.

To be able to run the Rekall GUI, you will need to install the rekall-gui package:

$ pip install rekall-gui

For Windows, Rekall is also available as a self-contained installer package. Please check the download page for the most appropriate installer to use.

2. A Rekall walkthrough

This section is a quick tour of the Rekall user interface. The program accepts command line options, which we can learn more about by using the --help option (the output below is abbreviated to show only the important options):

$ rekal -h
usage: rekal [-p PROFILE] [-v] [-q] [--debug] [--output_style {concise,full}]
             [--logging_level {DEBUG,INFO,WARNING,ERROR,CRITICAL}]
...
             [--version] [-]

Output control:
  -v, --verbose         Set logging to debug level.
  --output_style {concise,full}
                        How much information to show. Default is 'concise'.
  --plugin [PLUGIN [PLUGIN ...]]
                        Load user provided plugin bundle.
  -h, --help            Show help about global paramters.
  --cache {file,memory,timed}
                        Type of cache to use.
  --repository_path [REPOSITORY_PATH [REPOSITORY_PATH ...]]
                        Path to search for profiles. This can take any form
                        supported by the IO Manager (e.g. zip files,
                        directories, URLs etc)
  -f FILENAME, --filename FILENAME
                        The raw image to load.
  --live                Enable live memory analysis.
  --version             Prints the Rekall version and exits.

Interface:
  --pager PAGER         The pager to use when output is larger than a screen
                        full.
  -F {text,json,wide,xls,test,data}, --format {text,json,wide,xls,test,data}
                        The output format to use. Default (text)
  --timezone TIMEZONE   Timezone to output all times (e.g. Australia/Sydney).

Plugin shell options:
  -p PROFILE, --profile PROFILE
                        Name of the profile to load. This is the filename of
                        the profile found in the profiles directory. Profiles
                        are searched in the profile path order.

Some of the most frequently used flags are described below:

--verbose

This is shorthand for setting the logging level to DEBUG. Rekall will produce debug messages about its operation. Use this if you want to know more about what Rekall is doing, or when you need to attach output to a bug report.

--plugin

If provided, Rekall loads this Python file at start time. The file may define plugins, overlays and so on, which add functionality to Rekall.

--cache

Rekall has a caching mechanism that remembers important information about images between executions. When Rekall starts up it looks in the cache to see if this particular image has been previously analyzed. This allows Rekall to load previously derived data from the cache instead of recalculating it. There are 3 cache modes: file is the usual persistent cache (by default in ~/.rekall_cache), memory is a cache that is only present in memory for the life of this process, and timed is used for live images which keep changing (it is essentially a memory cache which is flushed periodically).

--repository_path

Rekall normally loads the required profiles from a profile repository. By default, Rekall will use the Rekall public repository. If you do not have internet access or want to host the repository locally, you can use git clone to copy the entire repository somewhere and then specify --repository_path to indicate where profiles should be loaded from.

--filename

The --filename option is the name of the image to analyse. If you are using a raw memory device, this will be, for example, \\.\pmem (the device provided by the winpmem driver on Windows) or /proc/kcore (Linux). You almost always want to specify a filename to operate on. A notable exception is when using the --live option, which will set the appropriate filename automatically.

--live

The --live option enables live memory analysis. Note that you would usually need to be running as root to do this. When running on Windows, Rekall will insert the winpmem driver. On OSX, Rekall will insert the macpmem driver. On Linux, Rekall will attempt to use the /proc/kcore memory device. When specifying this option you do not need to specify the --filename option because Rekall will automatically open the right device. When Rekall exits, the driver will be unloaded.

--profile

The --profile flag specifies the profile to use. The profile is a JSON file with operating system specific data used to parse the data structures in the image. The profile used must exactly match the operating system version in the image. This parameter is usually only needed if you are generating your own profile (e.g. when analyzing an unusual Linux system). Profiles are normally autodetected in Rekall and are not usually specified by the user.

--pager

The pager is the program that will be used to inspect the result of each plugin. Since many plugins produce a great deal of output text, a pager is often needed. Rekall will write the output to a temporary file and launch this program to view it. For example, on Windows it is useful to use notepad to examine the output from each plugin (e.g. --pager notepad). On Linux, use --pager less for example, or even --pager "gvim -f".

--format

Rekall output is produced in a number of formats. The default format is text which consists of tabulated human readable output. The wide format produces expanded rows (useful for very wide tables with many columns). Often it is useful to further process the output by machine. In that case the data format can be used to produce machine parsable JSON data. Rekall can also produce an Excel compatible spreadsheet (if you have the openpyxl package installed).

--timezone

Internally Rekall always works with times in UTC (i.e. without a timezone); however, it is often convenient to output data in a particular timezone. Use this option to cause output to be produced in that timezone. Note that Rekall always fully specifies all times, so it does not really matter what timezone you use for output. For example, specifying --timezone Australia/Sydney will cause timestamps to be written as 2012-10-02 07:39:51+1000 instead of 2012-10-01 21:39:51Z.
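To see why the choice of output timezone does not change the meaning of a timestamp, the short Python sketch below renders the same instant both ways. It uses only the standard library and is not Rekall code. As a simplifying assumption it models Australia/Sydney as a fixed +10:00 offset, which holds for this date; a real timezone database would also handle daylight saving.

```python
from datetime import datetime, timedelta, timezone

# The instant from the example above, expressed in UTC.
utc_time = datetime(2012, 10, 1, 21, 39, 51, tzinfo=timezone.utc)

# Assumption: a fixed +10:00 offset stands in for Australia/Sydney.
sydney = timezone(timedelta(hours=10))
local_time = utc_time.astimezone(sydney)

utc_text = utc_time.strftime("%Y-%m-%d %H:%M:%SZ")
local_text = local_time.strftime("%Y-%m-%d %H:%M:%S%z")

print(utc_text)    # 2012-10-01 21:39:51Z
print(local_text)  # 2012-10-02 07:39:51+1000

# Both values describe the same moment in time.
assert utc_time == local_time
```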

The Rekall architecture is built around plugins. A plugin is an extension with a name (e.g. pslist) which performs some analysis task and generates some output. There are many Rekall plugins for many operating systems and configurations. Not all plugins are applicable to all images. For example, a plugin which is designed to work on Windows 7 will not be active on a Windows XP image.

We also try to keep the names of plugins consistent across operating systems. For example, the pslist plugin has the same name on all operating systems, even though it is actually implemented by different code for each OS.
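The idea of one plugin name backed by several implementations, of which only the applicable ones are active, can be modelled in a few lines of plain Python. This is only an illustrative sketch, not how Rekall is implemented: the Plugin base class, the supported_os attribute, the active_plugins helper and the LinPsList class are all hypothetical names made up for this example.

```python
class Plugin:
    """Base class for the toy plugins below (hypothetical, not Rekall's)."""
    name = None
    supported_os = ()

    @classmethod
    def is_active(cls, os_name):
        return os_name in cls.supported_os


class WinPsList(Plugin):
    name = "pslist"
    supported_os = ("windows",)


class LinPsList(Plugin):  # Hypothetical Linux counterpart.
    name = "pslist"
    supported_os = ("linux",)


class Hivescan(Plugin):
    name = "hivescan"
    supported_os = ("windows",)


def active_plugins(os_name, candidates=(WinPsList, LinPsList, Hivescan)):
    """Map plugin names to the implementation that applies to os_name."""
    return {cls.name: cls for cls in candidates if cls.is_active(os_name)}


windows = active_plugins("windows")
linux = active_plugins("linux")
print(sorted(windows))  # ['hivescan', 'pslist']
print(sorted(linux))    # ['pslist']
```

The user always asks for pslist; which class answers depends on the image being analysed.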

The interactive shell

If the command is followed by a plugin name, the tool will not use the interactive shell, but rather run the plugin and exit:

Example: Running plugin from command line.
$ rekal -f xp-laptop-2005-06-25.img pslist
Offset (V) Name                    PID   PPID   Thds     Hnds   Sess  Wow64 Start                Exit
---------- -------------------- ------ ------ ------ -------- ------ ------ -------------------- --------------------
0x823c87c0 System                    4      0     61     1140 ------  False -                    -
0x81fdf020 smss.exe                448      4      3       21 ------  False 2005-06-25 16:47:28  -
0x81f5a3b8 csrss.exe               504    448     12      596      0  False 2005-06-25 16:47:30  -
0x81f8eb10 winlogon.exe            528    448     21      508      0  False 2005-06-25 16:47:31  -
0x820e0da0 services.exe            580    528     18      401      0  False 2005-06-25 16:47:31  -
...

Some plugins accept additional options of their own. You can read plugin specific help by specifying the --help option after the name of the plugin:

$ rekal -f xp-laptop-2005-06-25.img pslist --help
usage: rekal pslist [-h] [--kdbg KDBG] [--eprocess EPROCESS [EPROCESS ...]]
                     [--phys_eprocess PHYS_EPROCESS [PHYS_EPROCESS ...]]
                     [--pid PID [PID ...]] [--proc_regex PROC_REGEX]

List processes for windows.

optional arguments:
  -h, --help            show this help message and exit
  --eprocess EPROCESS [EPROCESS ...]
                        Kernel addresses of eprocess structs.
  --phys_eprocess PHYS_EPROCESS [PHYS_EPROCESS ...]
                        Physical addresses of eprocess structs.
  --pid PID [PID ...]   One or more pids of processes to select.
  --proc_regex PROC_REGEX
                        A regex to select a profile by name.
Note

Because Rekall does not know which specific plugin will handle a particular plugin name until it inspects the image, it is essential that the image be specified with the --filename argument. If it is not specified, Rekall must assume there is no such plugin (so running rekal pslist -h without a filename will not work, because Rekall does not know which version of pslist is meant).

For this reason we recommend that users use the interactive mode as much as possible (see below).

For example to list all the svchost processes, we can apply a regex to process names:

$ rekal -f xp-laptop-2005-06-25.img pslist --proc_regex svc
Offset (V) Name                    PID   PPID   Thds     Hnds   Sess  Wow64 Start                Exit
---------- -------------------- ------ ------ ------ -------- ------ ------ -------------------- --------------------
0x81fa5aa0 svchost.exe             740    580     17      198      0  False 2005-06-25 16:47:32  -
0x81fa8650 svchost.exe             800    580     10      302      0  False 2005-06-25 16:47:33  -
0x81faba78 svchost.exe             840    580     83     1589      0  False 2005-06-25 16:47:33  -
0x81f8dda0 svchost.exe             984    580      6       90      0  False 2005-06-25 16:47:35  -
0x81f6e7e8 svchost.exe            1024    580     15      207      0  False 2005-06-25 16:47:35  -
0x82081da0 svchost.exe            1484    580      6      119      0  False 2005-06-25 16:47:59  -
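Non-interactive invocations like the ones above are easy to drive from a script. The sketch below only assembles the argument list, using the flags shown in this section (-f, --format and plugin options such as --proc_regex); the helper name build_rekal_command is hypothetical and it does not run Rekall.

```python
import shlex


def build_rekal_command(image, plugin, output_format=None, **plugin_args):
    """Build an argument list such as: rekal -f IMAGE pslist --proc_regex svc."""
    command = ["rekal", "-f", image]
    if output_format is not None:
        # Global options go before the plugin name.
        command += ["--format", output_format]
    command.append(plugin)
    for name, value in plugin_args.items():
        # Plugin specific options follow the plugin name.
        command += ["--" + name, str(value)]
    return command


command = build_rekal_command(
    "xp-laptop-2005-06-25.img", "pslist", proc_regex="svc")
print(shlex.join(command))
# rekal -f xp-laptop-2005-06-25.img pslist --proc_regex svc
```

On a machine where rekal is installed, the resulting list could be passed to subprocess.run(command, capture_output=True, text=True) to collect the output.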

In order to use the interactive shell, do not specify any plugin to run:

$ rekal -f ~/images/win7.elf
 ---------------------------------------------------------------------------
 The Rekall Memory Forensic framework 1.5.0 (Furka).

 "We can remember it for you wholesale!"

 This program is free software; you can redistribute it and/or modify it under
 the terms of the GNU General Public License.

 See http://www.rekall-forensic.com/docs/Manual/tutorial.html to get started.
 ----------------------------------------------------------------------------
 [1] win7.elf 13:54:24>

The interactive shell marks input lines with the name of the image and a time of day. In order to run a plugin simply type its name and press enter:

[1] win7.elf 13:54:24> pslist   1
---------------------> pslist() 2
Offset (V) Name                    PID   PPID   Thds     Hnds   Sess  Wow64 Start                Exit
---------- -------------------- ------ ------ ------ -------- ------ ------ -------------------- --------------------
0x823c87c0 System                    4      0     61     1140 ------  False -                    -
0x81fdf020 smss.exe                448      4      3       21 ------  False 2005-06-25 16:47:28  -
1. The user enters the bare word pslist as a command.
2. IPython is configured for auto-execution and assumes the user wants to run a function called pslist().

Plugins appear as functions in the namespace the shell is running in. IPython sees the bare word pslist and assumes you mean to run the function pslist(). Running this function will render the output to the screen.

All plugins which are applicable to the current image and profile are also collected in the variable plugins in the namespace. This means we can use command line completion to discover all the plugins we could use on the current image:

In [3]: plugins.[tab][tab]
plugins.callbacks    plugins.handles      plugins.modules      plugins.raw2dmp
plugins.cmdscan      plugins.hashdump     plugins.mutantscan   plugins.regdump
plugins.connections  plugins.hivedump     plugins.null         plugins.sockets
plugins.connscan     plugins.hivescan     plugins.pas2vas      plugins.svcscan
plugins.consoles     plugins.impscan      plugins.pedump       plugins.symlinkscan

To learn more about each of these plugins we can follow the name of the plugin with a single question mark:

[1] win7.elf 13:56:55> plugins.pslist?
file:       /home/scudette/rekall/rekall-core/rekall/plugins/windows/taskmods.py
Plugin:     WinPsList (pslist)
Parameters:
  profile:       Name of the profile to load. This is the filename of the profile found in the profiles directory. Profiles are searched in the profile path order.
  dtb:           The DTB physical address. (type: IntParser)
  eprocess:      Kernel addresses of eprocess structs. (type: ArrayIntParser)
  phys_eprocess: Physical addresses of eprocess structs. (type: ArrayIntParser)
  pid:           One or more pids of processes to select. (type: ArrayIntParser)
  proc_regex:    A regex to select a process by name. (type: RegEx)
  method:        Method to list processes (Default uses all methods). (type: ChoiceArray)
Docstring:  List processes for windows.
Link:       http://www.rekall-forensic.com/epydocs/rekall.plugins.windows.taskmods.WinPsList-class.html

Following the name of the plugin with two question marks lists the source of the plugin as well:

[1] win7.elf 13:58:57> plugins.pslist??
file:       /home/scudette/rekall/rekall-core/rekall/plugins/windows/taskmods.py
Plugin:     WinPsList (pslist)
Parameters:
  profile:       Name of the profile to load. This is the filename of the profile found in the profiles directory. Profiles are searched in the profile path order.
  dtb:           The DTB physical address. (type: IntParser)
  eprocess:      Kernel addresses of eprocess structs. (type: ArrayIntParser)
  phys_eprocess: Physical addresses of eprocess structs. (type: ArrayIntParser)
  pid:           One or more pids of processes to select. (type: ArrayIntParser)
  proc_regex:    A regex to select a process by name. (type: RegEx)
  method:        Method to list processes (Default uses all methods). (type: ChoiceArray)
Docstring:  List processes for windows.
Link:       http://www.rekall-forensic.com/epydocs/rekall.plugins.windows.taskmods.WinPsList-class.html
source:
class WinPsList(common.WinProcessFilter):
    """List processes for windows."""

    __name = "pslist"

    eprocess = None

    @classmethod
    def args(cls, metadata):
        super(WinPsList, cls).args(metadata)
        metadata.set_description("""
        Lists the processes by following the _EPROCESS.PsActiveList.

        In the windows operating system, processes are linked together through a
        doubly linked list. This plugin follows the list around, printing
        information about each process.

        To begin, we need to find any element on the list. This can be done by:

        1) Obtaining the _KDDEBUGGER_DATA64.PsActiveProcessHead - debug
           information.

        2) Finding any _EPROCESS in memory (e.g. through psscan) and following
           its list.

        This plugin supports both approaches.
        """)
...

This is useful for verifying how a particular plugin works. Note that command line completion can be used to speed up the selection of plugins and of their arguments:

[1] win7.elf 14:01:37> plugins.ps[tab][tab]
plugins.pslist   plugins.psscan   plugins.pstree   plugins.psxview
[1] win7.elf 14:01:37> plugins.pslist [tab][tab]
dtb=            eprocess=       method=         phys_eprocess=  pid=            proc_regex=     profile=
[1] win7.elf 14:01:37> plugins.pslist pr[tab][tab]
proc_regex=  profile=
[1] win7.elf 14:01:37> plugins.pslist proc_regex="svc"

We can now specify the parameters to be passed to the plugin (This is the same as the command line example above):

Example: Specifying plugin options in the interactive shell.
[1] win7.elf 14:15:55> pslist proc_regex="svc"
---------------------> pslist(proc_regex="svc")
  _EPROCESS            Name          PID   PPID   Thds    Hnds    Sess  Wow64           Start                     Exit
-------------- -------------------- ----- ------ ------ -------- ------ ------ ------------------------ ------------------------
0xfa80024f85d0 svchost.exe            236    480     19      455      0 False  2012-10-01 14:40:01Z     -
0xfa80023f6770 svchost.exe            608    480     12      352      0 False  2012-10-01 21:39:59Z     -
0xfa8002522b30 svchost.exe            624    480     16      372      0 False  2012-10-01 14:40:01Z     -
0xfa800242a350 svchost.exe            716    480      7      260      0 False  2012-10-01 14:40:00Z     -
0xfa80024589e0 svchost.exe            768    480     23      535      0 False  2012-10-01 14:40:00Z     -

The interactive shell is the most powerful and flexible interface and so the remainder of this tutorial will focus on it.

At the heart of the interactive interface is the session object. The session is an object which contains information about the current image analysis.

Plugins store the results they derive in the session. After running the pslist() plugin for the first time, a number of new objects are therefore stored in the session, and they can be reused the next time pslist() is run. At any time we can view the current session by simply printing it:

[1] win7.elf 14:09:09> print session
Rekall Memory Forensics session Started on Sun Jan 31 13:56:55 2016.

Config:
{
  __dummy = False
  autodetect = ['nt_index', 'osx', 'pe', 'windows_kernel_file', 'rsds', 'ntfs', 'linux']
  autodetect_build_local = basic
  autodetect_build_local_tracked = set(['win32k', 'ntdll', 'nt', 'tcpip'])
  autodetect_scan_length = 1000000000
...
  timezone = UTC
  verbose = False
  version = False
}

Cache (<FileCache @ /home/scudette/.rekall_cache/sessions/v1.0/sessions/de814bea95020499e782df54aa4b0d5ce4544823>):
{
  default_address_space = WindowsAMD64PagedMemory@0x00187000 (Kernel AS@0x187000)
  dtb = 1601536
  profile = nt/GUID/F8E2A8B5C9B74BF4A6E4A48F180099942
  profile_obj = <AMD64 profile nt/GUID/F8E2A8B5C9B74BF4A6E4A48F180099942 (Nt)>
  pslist_CSRSS = set([275427705947968, 275427703135024, 275427707404384, 275427696227600, 275427701196688, 2754276960 ...
  pslist_Handles = set([275427705361984, 275427703135024, 275427688164144, 275427696227600, 275427701196688, 2754277015 ...
  pslist_PsActiveProcessHead = set([275427705947968, 275427703135024, 275427701697328, 275427707404384, 275427706255504, 2754277015 ...
  pslist_PspCidTable = set([275427706779456, 275427703135024, 275427688164144, 275427696227600, 275427701196688, 2754277015 ...
  pslist_Sessions = set([275427700679504, 275427705361984, 275427705947968, 275427688984672, 275427701697328, 2754276962 ...
}

We can see that the session contains two parts - a configuration part and a cache. The cache will store files in the specified directory. The session shows us the content of the cache.

default_address_space

The default_address_space is the address space that is used by default to instantiate new objects from the interactive shell. When reading memory we must always use an address space. For example, we can use the physical address space (i.e. the raw image) or a virtual address space (for example, the kernel’s or any process’s address space). Many plugins rely on the correct default address space being set. This is typically known as the process context and can be changed using the cc plugin. By default the default_address_space is set to the kernel’s address space.

For example, suppose we want to examine the process listing from earlier in more detail. The process list has a column named _EPROCESS, indicating that this is the address of the kernel’s _EPROCESS struct. Rekall typically lists the names and addresses of critical kernel data structures in its tabular output.

Suppose we now wish to examine the _EPROCESS object reported by the pslist module above for pid 768. We read the virtual offset of the process as 0xfa80024589e0 in the default address space. We first create an instance of the _EPROCESS at the reported offset, and assign it to a variable. We then can examine all the fields of this struct by using command line completion (double tab):

[1] win7.elf 14:15:58> task = session.profile._EPROCESS(0xfa80024589e0)
[1] win7.elf 14:19:57> task.
Display all 180 possibilities? (y or n)
task.AccountingFolded              task.Flags                         task.OtherOperationCount           task.SectionBaseAddress            task.cast
...
[1] win7.elf 14:19:57> task.UniqueProcessId
               Out<19 >  [unsigned int:UniqueProcessId]: 0x00000300
[1] win7.elf 14:20:08> task.Ima
task.ImageFileName    task.ImageNotifyDone  task.ImagePathHash
[1] win7.elf 14:20:08> task.ImageFileName
               Out<20 >  [String:ImageFileName]: 'svchost.exe\x00'

When examining a member in a struct (such as _EPROCESS.ImageFileName) we receive an instance of a rekall BaseObject. This has both a type (e.g. String) and a name (e.g. ImageFileName) as well as a human readable representation (e.g. svchost.exe).

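For example, a task object has a type of _EPROCESS (which is a struct) and exists at a specific offset. The output below was captured from the xp-laptop image used earlier, where the svchost.exe process with pid 740 is located at 0x81fa5aa0: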

In [13]: task
Out[13]: [_EPROCESS _EPROCESS] @ 0x81FA5AA0

To view the entire object we can print it (you can use p as a shortcut for print; either will work).

[1] win7.elf 14:20:14> p task
---------------------> p(task)
[_EPROCESS _EPROCESS] @ 0xFA80024589E0 (pid=768)
  0x00 Pcb                          [_KPROCESS Pcb] @ 0xFA80024589E0
  0x160 ProcessLock                  [_EX_PUSH_LOCK ProcessLock] @ 0xFA8002458B40
  0x168 CreateTime                    [WinFileTime:CreateTime]: 0x5069AB40 (2012-10-01 14:40:00Z)
  0x170 ExitTime                      [WinFileTime:ExitTime]: 0x00000000 (-)
  0x178 RundownProtect               [_EX_RUNDOWN_REF RundownProtect] @ 0xFA8002458B58
  0x180 UniqueProcessId               [unsigned int:UniqueProcessId]: 0x00000300
  0x188 ActiveProcessLinks           [_LIST_ENTRY ActiveProcessLinks] @ 0xFA8002458B68
  0x198 ProcessQuotaUsage            <Array 2 x unsigned long long @ 0xFA8002458B78>
  0x1A8 ProcessQuotaPeak             <Array 2 x unsigned long long @ 0xFA8002458B88>
  0x1B8 CommitCharge                  [unsigned long long:CommitCharge]: 0x00000D7F
...

This view shows the layout of the _EPROCESS struct which is overlaid on the offset 0xFA80024589E0. The first column is the relative offset of each member, followed by the name of each member and a representation of each member. This representation consists of a type (e.g. _KPROCESS), a name (e.g. Pcb) and a human readable representation.
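What "overlaying a struct on an offset" means can be shown with Python's struct module. The sketch below is not Rekall code: the LAYOUT table and read_member helper are hypothetical, the relative offsets are copied from the listing above, and the assumption that both members are little-endian 64-bit integers is made only for the sake of the example.

```python
import struct

# Hypothetical subset of a layout: member name -> (relative offset, format).
# "<Q" is a little-endian unsigned 64-bit integer (an assumption here).
LAYOUT = {
    "UniqueProcessId": (0x180, "<Q"),
    "CommitCharge": (0x1B8, "<Q"),
}


def read_member(memory, base, name):
    """Decode one member of a struct overlaid on memory at offset base."""
    relative_offset, fmt = LAYOUT[name]
    (value,) = struct.unpack_from(fmt, memory, base + relative_offset)
    return value


# Fake "memory": zero bytes with two values planted at the right places.
base = 0x40
memory = bytearray(base + 0x200)
struct.pack_into("<Q", memory, base + 0x180, 0x300)
struct.pack_into("<Q", memory, base + 0x1B8, 0xD7F)

pid = read_member(memory, base, "UniqueProcessId")
print(pid)  # 768
```

The same layout can be applied at any base offset, which is why a single struct definition describes every process in the image.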

3. Interactive plugins

Rekall is more than just a framework for running plugins. It is a complete interactive environment for memory analysis. Many of the more interesting features Rekall provides exist as part of the interactive environment.

3.1. The Address Resolver

A large part of memory analysis is about emulating the execution environment of running code. When code is executed it can access and branch into different addresses in the virtual address space. It is Rekall’s job to learn what exists at different addresses. The address space is divided into parts and Rekall keeps track of what different addresses mean.

The Address Resolver is a special component which keeps track of the memory layout in the virtual address space. This service can answer:

  1. What is found at a specific address?

  2. What is the address of a given symbol?

In order to efficiently represent memory addresses Rekall uses a specific notation:

  • First a module name is specified. This can be the name of a kernel module, or dll.

  • Then the ! character is used to separate the module name from the symbol name.

  • A symbol name is specified in the module.

  • Optionally, an offset from the symbol follows (e.g. +0x10).
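The parts of this notation can be separated with a small regular expression. The sketch below is a plain Python illustration and not Rekall's own parser; parse_address and ADDRESS_RE are hypothetical names, and it assumes that offsets are always written in hexadecimal with a 0x prefix, as in the Rekall output shown in this tutorial.

```python
import re

# module!symbol+offset, where the symbol and the offset may be absent.
ADDRESS_RE = re.compile(
    r"^(?P<module>[^!+]+)!(?P<symbol>[^!+]*)(?:\+(?P<offset>0x[0-9a-fA-F]+))?$")


def parse_address(text):
    """Split a string such as "nt!MmLoadSystemImage" into its three parts."""
    match = ADDRESS_RE.match(text)
    if match is None:
        raise ValueError("not in module!symbol+offset form: %r" % text)
    offset = int(match.group("offset") or "0", 16)
    return match.group("module"), match.group("symbol"), offset


print(parse_address("nt!MmLoadSystemImage"))   # ('nt', 'MmLoadSystemImage', 0)
print(parse_address("_ssl!init_ssl+0x721e4"))  # ('_ssl', 'init_ssl', 467428)
```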

The following are all valid examples:

[1] win7.elf 16:47:32> dis "nt!MmLoadSystemImage"      1
---------------------> dis("nt!MmLoadSystemImage")
Address      Rel             Op Codes                     Instruction                Comment
------- -------------- -------------------- ---------------------------------------- -------
------ nt!MmLoadSystemImage ------: 0xf80002a75060
  0xf80002a75060            0x0 488bc4               mov rax, rsp
  0xf80002a75063            0x3 48895818             mov qword ptr [rax + 0x18], rbx

[1] win7.elf 17:00:20> dis "_ssl!init_ssl"            2
---------------------> dis("_ssl!init_ssl")
Address      Rel             Op Codes                     Instruction                Comment
------- -------------- -------------------- ---------------------------------------- -------
------ _ssl!init_ssl ------: 0x10001000
      0x10001000            0x0 a1e4310710           mov eax, dword ptr [0x100731e4]  3     0x1e1ec4c0 _ssl!init_ssl+0x721e4 -> python27!PyType_Type
      0x10001005            0x5 68f5030000           push 0x3f5
      0x1000100a            0xa 6a00                 push 0
      0x1000100c            0xc 68508d0a10           push 0x100a8d50                          _ssl!+0x10d25
      0x10001011           0x11 68f08c0a10           push 0x100a8cf0                          _ssl!+0x10cc5
      0x10001016           0x16 68fc700910           push 0x100970fc                          _ssl!init_ssl+0x960fc
      0x1000101b           0x1b a3dc890a10           mov dword ptr [0x100a89dc], eax          0x1e1ec4c0 _ssl!+0x109b1 -> python27!PyType_Type
      0x10001020           0x20 ff15c0310710         call dword ptr [0x100731c0]              0x1e007152 _ssl!init_ssl+0x721c0 -> python27!Py_InitModule4
      0x10001026           0x26 83c414               add esp, 0x14
      0x10001029           0x29 85c0                 test eax, eax
1. Disassembly of a kernel function. The "nt" symbol always refers to the kernel.
2. Disassembly of a Python extension (a dll). (The cc plugin was run to set the process context.)
3. The Address Resolver is also used inside plugins to resolve addresses where needed. For example, here we see a mov instruction that reads the constant python27!PyType_Type (i.e. the Python Type type). This makes it very easy to see how assembly code interacts with the rest of the address space.

Similarly, the dump plugin produces a hexdump of some memory, and it is able to annotate each row with information about what resides at those addresses:

[1] win7.elf 17:08:46> dump "nt!SeTcbPrivilege"
---------------------> dump("nt!SeTcbPrivilege")
    Offset                                   Data                                                Comment
-------------- ----------------------------------------------------------------- ----------------------------------------
0xf80002b590b8 07 00 00 00 00 00 00 00 44 02 01 00 80 f9 ff ff  ........D....... nt!SeTcbPrivilege, nt!NlsOemToUnicodeData
0xf80002b590c8 00 00 00 00 00 01 00 00 00 00 00 00 00 00 00 00  ................ nt!VfRandomVerifiedDrivers, nt!TunnelMaxEntries, nt!ExpBootLicensingData
0xf80002b590d8 bc 00 00 00 00 10 00 00 00 00 ff 07 80 f8 ff ff  ................ nt!ExpLicensingDescriptorsCount, nt!CmpStashBufferSize, nt!ExpLicensingView
0xf80002b590e8 e8 f5 00 00 a0 f8 ff ff e8 45 7a 05 a0 f8 ff ff  .........Ez..... nt!CmpHiveListHead
0xf80002b590f8 1c 00 00 00 80 f9 ff ff 16 00 00 00 00 00 00 00  ................ nt!NlsAnsiToUnicodeData, nt!SeSystemEnvironmentPrivilege
0xf80002b59108 3f 00 00 00 a4 01 00 00 00 a0 06 00 a0 f8 ff ff  ?............... nt!OemDefaultChar, nt!CmBootAcceptFirstTime, nt!CmpDiskFullWorkerPopupDisplayed, nt!ExpLastTimeZoneBia

Note the names of symbols residing in nearby memory addresses.
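Annotations like these come from answering the two questions listed at the start of this section. A toy model of such a lookup, using a sorted symbol table and binary search, might look as follows. This is a sketch under stated assumptions and not Rekall's Address Resolver: the SYMBOLS table, get_address and describe are hypothetical, and only the two addresses are taken from the output above.

```python
import bisect

# Hypothetical symbol table: (address, "module!symbol"), sorted by address.
# The two addresses are taken from the examples in this tutorial.
SYMBOLS = sorted([
    (0xF80002A75060, "nt!MmLoadSystemImage"),
    (0xF80002B590B8, "nt!SeTcbPrivilege"),
])
ADDRESSES = [address for address, _ in SYMBOLS]
BY_NAME = {name: address for address, name in SYMBOLS}


def get_address(name):
    """What is the address of a given symbol?"""
    return BY_NAME[name]


def describe(address):
    """What is found at a specific address?"""
    # Find the closest symbol at or below the address.
    index = bisect.bisect_right(ADDRESSES, address) - 1
    if index < 0:
        return hex(address)  # Below the first known symbol.
    base, name = SYMBOLS[index]
    offset = address - base
    return name if offset == 0 else "%s+%#x" % (name, offset)


print(describe(0xF80002A75063))  # nt!MmLoadSystemImage+0x3
```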

4. The Rekall Session

Rekall uses a Session to encapsulate the analysis of a single image. Much of Rekall’s speed comes from caching information in the session. Normally, when running from the interactive console, the session persists in memory and the cache therefore remains available to subsequent plugins.

For example, when running the pslist module, Rekall caches all the known processes within the image. This cache is then subsequently used by all plugins which require process listing information (e.g. those plugins which can be filtered by process id, process name, etc).

You can see this by printing the session object:

win7.elf 22:57:28> print session
Rekall Memory Forensics session Started on Mon Aug 17 14:25:26 2015.

Config:
{
  __dummy = False
  autodetect = ['nt_index', 'osx', 'pe', 'windows_kernel_file', 'rsds', 'ntfs', 'linux']
  autodetect_build_local = basic
  autodetect_build_local_tracked = set(['win32k', 'ntdll', 'nt', 'tcpip'])
  autodetect_scan_length = 1000000000
  autodetect_threshold = 1.0
  base_filename = win7.elf.E01
  buffer_size = 20971520
  cache = file
  cache_dir = .rekall_cache
  colors = auto
  debug = False
  ept = None
  filename = /home/scudette/images/win7.elf
  format = text
  help = False
  logging_level = DEBUG
  max_collector_cost = 4
  name_resolution_strategies = ['Module', 'Symbol', 'Export']
  notebook_dir = /home/scudette
  output_style = concise
  pagefile = []
  plugin = []
  profile_path = ['/home/scudette/projects/rekall-profiles/']
  quiet = False
  repository_path = []
  session_id = 1
  session_list = [<rekall.session.InteractiveSession object at 0x7fbf7a38c550>]
  session_name = Default session
  timezone = UTC
  verbose = True
}

Cache (<FileCache @ /home/scudette/.rekall_cache/sessions/de814bea95020499e782df54aa4b0d5ce4544823>):
{
  default_address_space = WindowsAMD64PagedMemory@0x17851000 (Kernel AS@0x17851000)
  dtb = 1601536
  image_fingerprint = {'tests': [(41740160L, u'7600.win7_rtm.090713-1255\x00'), (41740416L, u'7600.16385.amd64fre.win7_rtm ...
  profile = nt/GUID/F8E2A8B5C9B74BF4A6E4A48F180099942
  profile_obj = <AMD64 profile nt/GUID/F8E2A8B5C9B74BF4A6E4A48F180099942 (Nt)>
  pslist_CSRSS = set([275427705947968, 275427703135024, 275427707404384, 275427696227600, 275427701196688, 2754276960 ...
  pslist_Handles = set([275427705361984, 275427703135024, 275427688164144, 275427696227600, 275427701196688, 2754277015 ...
  pslist_PsActiveProcessHead = set([275427705947968, 275427703135024, 275427701697328, 275427707404384, 275427706255504, 2754277015 ...
  pslist_PspCidTable = set([275427706779456, 275427703135024, 275427688164144, 275427696227600, 275427701196688, 2754277015 ...
  pslist_Sessions = set([275427700679504, 275427705361984, 275427705947968, 275427688984672, 275427701697328, 2754276962 ...
}

As can be seen above, the session state can be divided into the configuration and the cache. In the above case the cache is persistent (it is a FileCache) and stores a number of useful objects (e.g. the results of the various process listing techniques). Note that the cache will only show objects that are currently loaded (i.e. that have been used in this run). There may be other objects on disk.

If Rekall is analysing a volatile image, such as a memory device, the cache will not be persistent. Instead it will be time based, to allow items to expire promptly from the cache as the live system evolves. For normal images, the cache may persist so that even if the user quits and restarts Rekall on the same image, the cache is considered valid. This significantly speeds up future operations on the same image.

If you suspect something unusual with the cache (e.g. the cache is out of sync or was created by an earlier version) it should be quite safe to remove all files from the cache directory. Alternatively you can force the memory cache to be used by passing the --cache memory flag.

Note: The on-disk version of the session cache is considered an ephemeral cache of the session data. We do not consider this a stable interchange format. This means that we do not guarantee compatibility with future versions of Rekall. The cache files may be deleted and recreated at any time. The content of the files is also not considered user viewable and should not be edited manually.
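If you prefer to clear the cache from a script, the following sketch removes everything inside a cache directory. The helper name clear_session_cache is hypothetical, and the default location is an assumption taken from the FileCache path shown in the listing above. Check the path on your own system before using it.

```python
import os
import shutil


def clear_session_cache(cache_dir):
    """Delete everything inside cache_dir; return the number of entries removed."""
    removed = 0
    if not os.path.isdir(cache_dir):
        return removed
    for name in os.listdir(cache_dir):
        path = os.path.join(cache_dir, name)
        if os.path.isdir(path) and not os.path.islink(path):
            shutil.rmtree(path)
        else:
            os.remove(path)
        removed += 1
    return removed


# Assumed location, based on the FileCache path in the listing above.
DEFAULT_CACHE_DIR = os.path.expanduser("~/.rekall_cache/sessions")
```

Calling clear_session_cache(DEFAULT_CACHE_DIR) while Rekall is not running should leave an empty directory that is repopulated on the next run.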

5. Automating Rekall

One of our main design goals is the automation of Rekall so it can be used from external programs easily, as well as making it easier to write custom scripts.

This section demonstrates how to automate the framework, both by embedding it completely inside another application and by automating the analysis from within Rekall itself.

5.1. Example: Embedding Rekall in an external python program

In this example we will run the pslist plugin on a sample image, capture the text table into a string and then print this string (in a real application, the output could instead be served over HTTP, for example).

The basic sequence of steps is:

Create a session object

All interactions with the rekall library require a session object. The session object keeps information related to the same image. There can be any number of session objects valid at the same time - no data is global.

Create an appropriate renderer object

All output is rendered using a renderer. A renderer is an abstraction which is able to format the output in some way. For example, the TextRenderer outputs tables of text, while the JSONRenderer outputs json blobs.

Instantiate the plugin

The plugin is simply an object which can be instantiated using various parameters.

Render the plugin into the renderer

Calling the plugin’s render method with a valid renderer will cause it to execute its analysis and output into this renderer.

Example of embedding Rekall in a python application
    import logging

    # Set up logging as required. Rekall will log to the standard logging service.
    logging.basicConfig(level=logging.DEBUG)

    from rekall import session
    from rekall import plugins                           # 1

    s = session.Session(                                 # 2
        filename="win7.elf",
        autodetect=["rsds"],
        logger=logging.getLogger(),
        autodetect_scan_length=18446744073709551616,     # 3
        profile_path=[
            "http://profiles.rekall-forensic.com"
        ])

    print s.plugins.pslist(method="PsActiveProcessHead") # 4

1. Importing the plugins is required to make Rekall load all the default plugins. At this point any third party plugins will need to be imported too.
2. The session object is created with initial values for some parameters. It is critical that Rekall is able to contact the profile repository at runtime, therefore you will need to provide a valid repository address here. If you plan to use autodetection (you should), this is where the relevant methods should be specified.
3. Rekall will only scan part of the image for RSDS signatures. Setting this limit ensures that we don't waste too much time trying to load an image we can't.
4. The plugin is instantiated from the session object. When a plugin instance is printed, it renders its output to the console by default.

Alternatively one can manually render the plugin output. Suppose you want to capture the output of the plugin into a string for further processing:

    import logging
    import StringIO

    # Set up logging as required. Rekall will log to the standard logging service.
    logging.basicConfig(level=logging.DEBUG)

    from rekall import session
    from rekall import plugins

    from rekall.ui import text

    s = session.Session(
        filename="win7.elf",
        autodetect=["rsds"],
        logger=logging.getLogger(),
        autodetect_scan_length=18446744073709551616,
        profile_path=[
            "http://profiles.rekall-forensic.com"
        ])

    fd = StringIO.StringIO()
    renderer = text.TextRenderer(session=s, fd=fd)   # 1

    with renderer.start():                           # 2
        plugin = s.plugins.pslist()                  # 3
        if plugin != None:
            plugin.render(renderer)                  # 4

    print fd.getvalue()

1. Create a custom renderer using any of the renderers provided by Rekall (e.g. TextRenderer, DataExportRenderer).
2. Start the renderer to receive new input.
3. Instantiate the plugin object. This does not actually run the plugin but will verify all the arguments are correct. If the plugin is not active for this profile (i.e. it is not supported for this image) this will return a NoneObject which can be compared to None.
4. Rendering the plugin into the renderer will cause the renderer to collect output into its file-like object. In this case we use a StringIO (memory stream) to collect output. Finally we can access the complete content as a string.

5.2. Receiving structured output

Most plugins will now return structured output. You can see exactly what will be returned using the describe plugin (See http://rekall-forensic.blogspot.ch/2016/07/searching-memory-with-rekall.html for details).

You do not need a renderer to receive structured output. Simply call the plugin's collect() method.

    import logging

    # Set up logging as required. Rekall will log to the standard logging service.
    logging.basicConfig(level=logging.DEBUG)

    from rekall import session
    from rekall import plugins

    s = session.Session(
        filename="win7.elf",
        autodetect=["rsds"],
        logger=logging.getLogger(),
        autodetect_scan_length=18446744073709551616,
        profile_path=[
            "http://profiles.rekall-forensic.com"
        ])

    for row in s.plugins.pslist().collect():   # 1
        print "%s, %s, %s" % (
            row["_EPROCESS"].name,             # 2
            row["_EPROCESS"].pid,
            row["process_create_time"])

1. Instantiating the plugin object allows us to call its collect() method. Each result row is returned as a dict or a tuple.
2. The results can now be used directly, for example by printing them or storing them to a file.
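As an illustration of storing structured results to a file, the sketch below converts an iterable of dict-like rows into CSV text. The helper rows_to_csv and the sample rows are hypothetical stand-ins (written for Python 3, unlike the Python 2 listings above). With Rekall, the rows would come from a plugin's collect() call, and the column names would need to match what that plugin actually returns.

```python
import csv
import io


def rows_to_csv(rows, columns):
    """Render an iterable of dict-like rows as CSV text, one column per key."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(columns)
    for row in rows:
        # str() flattens rich objects into their printable form.
        writer.writerow([str(row[column]) for column in columns])
    return out.getvalue()


# Stand-in rows; these are made-up values, not real plugin output.
sample_rows = [
    {"name": "System", "pid": 4},
    {"name": "smss.exe", "pid": 272},
]
csv_text = rows_to_csv(sample_rows, ["name", "pid"])
```

Keeping the conversion separate from the plugin call means the same helper can be reused for any plugin whose rows can be indexed by column name.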

5.3. Example: Using a custom address space

Sometimes the image files to be analyzed are not directly written to disk. For example, they may be available as a python file-like object. In this case we want to provide this "virtual image" to Rekall for analysis by wrapping it in a FDAddressSpace.

For example consider the following code which provides the image as a python file-like object (For this example we directly open the file).

Example of embedding Rekall in a python application
    import logging

    # Set up logging as required. Rekall will log to the standard logging service or
    # to a provided logger. NOTE: basicConfig must be called before importing any
    # other modules! If any dependency sets up the logging system this call will be
    # ignored!
    logging.basicConfig(level=logging.DEBUG)

    from rekall import plugins
    from rekall import session
    from rekall.plugins.addrspaces import standard

    s = session.Session(                                      # 1
        autodetect=["rsds"],
        logger=logging.getLogger(),
        autodetect_scan_length=18446744073709551616,
        profile_path=[
            "http://profiles.rekall-forensic.com"
        ])

    s.physical_address_space = standard.FDAddressSpace(       # 2
        # Open the image in binary mode so the raw bytes are read unmodified.
        fhandle=open(
            "/home/scudette/images/win7x86.raw", "rb"),
        session=s)

    print s.plugins.pslist(method="PsActiveProcessHead")

1. We create a session but do not provide a filename.
2. We can directly add the FDAddressSpace() as the physical address space to the session.
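The file-like object does not have to be a file on disk. As a rough sketch, assuming the address space only needs to seek and read on the handle (this is an assumption, not a statement about FDAddressSpace internals), an in-memory buffer behaves the same way. The helper read_at is hypothetical and only demonstrates the access pattern.

```python
import io


def read_at(fhandle, offset, length):
    """Read length bytes at offset from any seekable file-like object."""
    fhandle.seek(offset)
    return fhandle.read(length)


# An in-memory "virtual image": 16 zero bytes, a marker, then 12 zero bytes.
virtual_image = io.BytesIO(b"\x00" * 16 + b"RSDS" + b"\x00" * 12)

marker = read_at(virtual_image, 16, 4)
```

Any object offering the same seek and read behavior, such as a network-backed stream wrapper, could in principle be passed as fhandle in the same way.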

5.4. Example: Automating the Rekall Console

The Rekall interactive interface is sufficient for most analysis. Sometimes, however, we need to automate some of this analysis.

In this section, we see how interactive scripts can be written to automate Rekall. This is similar to Embedding in an external program, except that the script runs within the interactive session. When the script completes, the interactive session resumes.

For this example, we search for all processes with a name of python.exe and dump them into the temporary directory with their timestamps. This could be used for example to dump periodic snapshots of a process from a live system.

First we create the following python file named dumper.py:

Example interactive python script.
    import time

    # "session" is provided by the interactive shell's namespace.
    pslist = session.plugins.pslist(proc_regex="python.exe")
    pedumper = session.plugins.pedump()

    for task in pslist.filter_processes():
        filename = "/tmp/%s-%s.exe" % (time.time(), task.UniqueProcessId)
        # The with block closes each output file once the dump is written.
        with open(filename, "wb") as outfd:
            pedumper.WritePEFile(fd=outfd,
                                 address_space=task.get_process_address_space(),
                                 image_base=task.Peb.ImageBaseAddress)

Now we run this file directly from the interactive shell.

$ rekal -f \\.\pmem
...

In [1]: run -i dumper.py  1
In [2]: ls -l

total 300
-rw-r----- 1 user group  57344 2012-08-27 00:28 1346020082.17-4012.exe
....
1. Running the script with the -i flag ensures that the script receives the same namespace as the interactive shell. This means that it can use the same session and we can see all variables defined by the script.

In the interactive shell IPython will automatically attempt to execute plugins without requiring the brackets to be present - this is merely a usability feature. In reality all input is interpreted as python code.

This means that while in the interactive shell it is sufficient to just type pslist, when running an external script, you will need to explicitly call the function as pslist().
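The namespace sharing provided by run -i can be pictured with plain Python. The sketch below is only a conceptual illustration and not how IPython implements the magic. The helper run_in_namespace is hypothetical, and the session value is a placeholder string rather than a real Rekall session.

```python
def run_in_namespace(source, namespace):
    """Execute script source inside an existing namespace dict."""
    exec(compile(source, "<script>", "exec"), namespace)
    return namespace


# Stand-in for the interactive shell's namespace; "session" is a placeholder
# string here, not a real Rekall session object.
shell_namespace = {"session": "the-interactive-session"}

# The script reads a name defined by the shell and defines a new one.
script = "seen_session = session\ncounter = 42\n"
run_in_namespace(script, shell_namespace)
```

After the call, the names defined by the script (seen_session and counter) are visible in shell_namespace, which mirrors how variables created by dumper.py remain available in the interactive session.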

5.5. Extending Rekall

Extending Rekall is most useful when you want to add functionality which should be reused by other people, or contributed to the core. If you are just interested in automating a very specific analysis, an interactive shell script is usually sufficient and is simple to write.

There are a number of different components which can be extended. These fall roughly into the following categories:

Address Spaces

Address spaces are the way Rekall implements image reading. Most of the time you will want to implement some kind of support for new image files.

Renderers

All output in Rekall is performed through an abstract renderer. For specialist output (e.g. HTML or XML), a new renderer should be written. See BaseRenderer and TextRenderer as possible examples.

Profiles and specialized parsers

Profiles are used by Rekall to parse data structures. Sometimes plugin authors wish to extend the parsing system by providing definitions for new types, or additional behaviors via new object classes.

Plugins

A plugin is a reusable component which is available in the interactive session. A plugin should only be written if it can be widely useful and/or can be reused by other plugins. It is normally not necessary to write a plugin to automate Rekall (e.g. search for a specific malware).
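To illustrate why output goes through an abstract renderer, here is a toy sketch in which the same producer writes to a text renderer and to an HTML renderer. All class and function names (ToyRenderer, ToyTextRenderer, ToyHTMLRenderer, toy_plugin) are hypothetical and written for Python 3. They are not Rekall's BaseRenderer or TextRenderer, and a real renderer should be modelled on those classes instead.

```python
from html import escape


class ToyRenderer(object):
    """Hypothetical renderer interface: receives a header, then rows."""

    def __init__(self):
        self.chunks = []

    def table_header(self, columns):
        raise NotImplementedError

    def table_row(self, *values):
        raise NotImplementedError

    def getvalue(self):
        return "\n".join(self.chunks)


class ToyTextRenderer(ToyRenderer):
    def table_header(self, columns):
        self.chunks.append(" ".join(columns))

    def table_row(self, *values):
        self.chunks.append(" ".join(str(v) for v in values))


class ToyHTMLRenderer(ToyRenderer):
    def table_header(self, columns):
        cells = "".join("<th>%s</th>" % escape(c) for c in columns)
        self.chunks.append("<tr>%s</tr>" % cells)

    def table_row(self, *values):
        # Escape values so that process names cannot inject markup.
        cells = "".join("<td>%s</td>" % escape(str(v)) for v in values)
        self.chunks.append("<tr>%s</tr>" % cells)


def toy_plugin(renderer):
    """Stand-in for a plugin: emits made-up rows into whatever renderer it gets."""
    renderer.table_header(["name", "pid"])
    renderer.table_row("System", 4)
    renderer.table_row("a<b>.exe", 1234)
```

Because toy_plugin only talks to the renderer interface, adding a new output format means writing one new renderer class and leaving the producer untouched.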