Rekall Tutorial

The main goals of the Rekall framework are to enhance the user experience by making common tasks easier and more intuitive, and to provide a powerful and capable interface for automation and for performing more complex operations.

1. Installation

Rekall is available as a Python package installable via the pip package manager. We support running Rekall under a virtualenv environment (this guarantees that the exact versions of all dependencies are met). Simply type (for example on Linux):

$ virtualenv /tmp/MyEnv
New python executable in /tmp/MyEnv/bin/python
Installing setuptools, pip...done.
$ source /tmp/MyEnv/bin/activate
$ pip install rekall
This installs Rekall together with all of its dependencies. You still need to have Python and pip installed first.

To be able to run the Rekall GUI, you will also need to install the rekall-gui package.

For Windows, Rekall is also available as a self-contained installer package. Please check the download page for the most appropriate installer to use.

2. A Rekall walkthrough

This section is a quick tour of the Rekall user interface. The program accepts command line options, which we can learn more about by using the --help option (the output below is abbreviated to show only the important options):

$ rekal -h
usage: rekal [-p PROFILE] [-v] [-q] [--debug] [--output_style {concise,full}]
[--logging_level {DEBUG,INFO,WARNING,ERROR,CRITICAL}]
...
[--version] [-]
Output control:
-v, --verbose Set logging to debug level.
--output_style {concise,full}
How much information to show. Default is 'concise'.
--plugin [PLUGIN [PLUGIN ...]]
Load user provided plugin bundle.
-h, --help Show help about global paramters.
--cache {file,memory,timed}
Type of cache to use.
--repository_path [REPOSITORY_PATH [REPOSITORY_PATH ...]]
Path to search for profiles. This can take any form
supported by the IO Manager (e.g. zip files,
directories, URLs etc)
-f FILENAME, --filename FILENAME
The raw image to load.
--live Enable live memory analysis.
--version Prints the Rekall version and exits.
Interface:
--pager PAGER The pager to use when output is larger than a screen
full.
-F {text,json,wide,xls,test,data}, --format {text,json,wide,xls,test,data}
The output format to use. Default (text)
--timezone TIMEZONE Timezone to output all times (e.g. Australia/Sydney).
Plugin shell options:
-p PROFILE, --profile PROFILE
Name of the profile to load. This is the filename of
the profile found in the profiles directory. Profiles
are searched in the profile path order.
Some of the most frequently used flags are described below:

- --verbose
This is shorthand for setting the logging level to DEBUG. Rekall will produce debug messages about its operation. Use this if you want to know more about what Rekall is doing, and also to attach output to bug reports.

- --plugin
If provided, Rekall loads this Python file at start time. The file may define any plugins, overlays etc. which add functionality to Rekall.

- --cache
Rekall has a caching mechanism which remembers important information about images between executions. When Rekall starts up it looks in the cache to see if this particular image has been analyzed previously. This allows Rekall to load previously derived data from the cache instead of recalculating it. There are 3 cache modes: file is the usual persistent cache (by default in ~/.rekall_cache). The memory cache is only present in memory for the life of this process. The timed cache is used for live images which keep changing (it is essentially a memory cache which is flushed periodically).

- --repository_path
Rekall normally loads the required profiles from a profile repository. By default, Rekall will use the Rekall public repository. If you do not have internet access or want to host the repository locally, you can use git clone to copy the entire repository somewhere and then specify --repository_path to indicate where profiles should be loaded from.

- --filename
The --filename option is the name of the image to analyse. When reading from a raw memory device this will be, for example, \\.\pmem (the device provided by the winpmem driver on Windows) or /proc/kcore (Linux). You almost always want to specify a filename to operate on. A notable exception is the --live option, which sets the appropriate filename automatically.

- --live
The --live option enables live memory analysis. Note that you usually need to be running as root to do this. On Windows, Rekall will insert the winpmem driver. On OSX, Rekall will insert the macpmem driver. On Linux, Rekall will attempt to use the /proc/kcore memory device. When specifying this option you do not need to specify the --filename option, because Rekall will automatically open the right device. When Rekall exits, the driver will be unloaded.

- --profile
The --profile flag specifies the profile to use. The profile is a JSON file with operating system specific data used to parse the data structures in the image. The profile used must exactly match the operating system version in the image. Profiles are normally autodetected by Rekall, so this parameter is usually only needed if you are generating your own profile (e.g. when analyzing an unusual Linux system).

- --pager
The pager is the program that will be used to inspect the result of each plugin. Since many plugins produce a great deal of output text, a pager is often needed. Rekall will write the output to a temporary file and launch this program to view it. For example, on Windows it is useful to use notepad to examine the output from each plugin (e.g. --pager notepad). On Linux use --pager less for example, or even --pager "gvim -f".

- --format
Rekall output is produced in a number of formats. The default format is text, which consists of tabulated human readable output. The wide format produces expanded rows (useful for very wide tables with many columns). Often it is useful to process the output further by machine. In that case the data format can be used to produce machine parsable JSON data. Rekall can also produce an Excel compatible spreadsheet (if you have the openpyxl package installed).

- --timezone
Internally Rekall always works with times in UTC (i.e. without a timezone); however, it is often convenient to output data in a particular timezone. Use this option to cause output to be produced in that timezone. Note that Rekall always fully specifies all times, so it does not really matter which timezone you use for output. For example, specifying --timezone Australia/Sydney will cause timestamps to be written as 2012-10-02 07:39:51+1000 instead of 2012-10-01 21:39:51Z.
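The effect of --timezone on a timestamp can be illustrated with a small standalone Python 3 sketch. This is not Rekall's own code: render_timestamp is a hypothetical helper, and a fixed UTC+10:00 offset is assumed to stand in for Australia/Sydney on the date used in the example above.

```python
from datetime import datetime, timedelta, timezone

# Assumption: a fixed UTC+10:00 offset stands in for Australia/Sydney on this
# date. A real implementation would consult a timezone database instead.
SYDNEY_STANDARD = timezone(timedelta(hours=10))


def render_timestamp(utc_time, tz=timezone.utc):
    """Format an aware UTC datetime for display in the requested timezone."""
    local = utc_time.astimezone(tz)
    if local.utcoffset() == timedelta(0):
        # Use the trailing "Z" that the tutorial shows for UTC output.
        return local.strftime("%Y-%m-%d %H:%M:%SZ")
    # %z renders the offset as +HHMM, so the time stays fully specified.
    return local.strftime("%Y-%m-%d %H:%M:%S%z")


stamp = datetime(2012, 10, 1, 21, 39, 51, tzinfo=timezone.utc)
print(render_timestamp(stamp))
print(render_timestamp(stamp, SYDNEY_STANDARD))
```

Both strings describe the same instant; only the presentation differs, which is why the choice of output timezone does not lose information.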
The Rekall architecture is built around plugins. A plugin is an extension with a name (e.g. pslist) which performs some analysis tasks and generates some output. There are many Rekall plugins for many operating systems and configurations. Not all plugins are applicable to all images. For example, a plugin which is designed to work on Windows 7 will not be active on a Windows XP image. We also try to keep the names of plugins consistent across operating systems. For example, the pslist plugin has the same name on all operating systems, even though it is implemented by different code for each OS.

- The interactive shell
If the command is followed by a plugin name, the tool will not use the interactive shell, but rather run the plugin and exit:

Example: Running plugin from command line.

$ rekal -f xp-laptop-2005-06-25.img pslist
Offset (V) Name PID PPID Thds Hnds Sess Wow64 Start Exit
---------- -------------------- ------ ------ ------ -------- ------ ------ -------------------- --------------------
0x823c87c0 System 4 0 61 1140 ------ False - -
0x81fdf020 smss.exe 448 4 3 21 ------ False 2005-06-25 16:47:28 -
0x81f5a3b8 csrss.exe 504 448 12 596 0 False 2005-06-25 16:47:30 -
0x81f8eb10 winlogon.exe 528 448 21 508 0 False 2005-06-25 16:47:31 -
0x820e0da0 services.exe 580 528 18 401 0 False 2005-06-25 16:47:31 -
...
Some plugins take additional options specific to their module. You can read plugin specific help by specifying the --help option after the name of the plugin:

$ rekal -f xp-laptop-2005-06-25.img pslist --help
usage: rekal pslist [-h] [--kdbg KDBG] [--eprocess EPROCESS [EPROCESS ...]]
[--phys_eprocess PHYS_EPROCESS [PHYS_EPROCESS ...]]
[--pid PID [PID ...]] [--proc_regex PROC_REGEX]
List processes for windows.
optional arguments:
-h, --help show this help message and exit
--eprocess EPROCESS [EPROCESS ...]
Kernel addresses of eprocess structs.
--phys_eprocess PHYS_EPROCESS [PHYS_EPROCESS ...]
Physical addresses of eprocess structs.
--pid PID [PID ...] One or more pids of processes to select.
--proc_regex PROC_REGEX
A regex to select a profile by name.
Note: Because Rekall does not know which specific plugin will handle a particular plugin name until it inspects the image, it is essential that the image be specified with the --filename argument. If it is not specified, Rekall must assume there is no such plugin (so running rekal pslist -h without an image will not work, because Rekall does not know which version of pslist we are asking about). For this reason we recommend that users use the interactive mode as much as possible (see below).
For example, to list all the svchost processes, we can apply a regex to process names:

$ rekal -f xp-laptop-2005-06-25.img pslist --proc_regex svc
Offset (V) Name PID PPID Thds Hnds Sess Wow64 Start Exit
---------- -------------------- ------ ------ ------ -------- ------ ------ -------------------- --------------------
0x81fa5aa0 svchost.exe 740 580 17 198 0 False 2005-06-25 16:47:32 -
0x81fa8650 svchost.exe 800 580 10 302 0 False 2005-06-25 16:47:33 -
0x81faba78 svchost.exe 840 580 83 1589 0 False 2005-06-25 16:47:33 -
0x81f8dda0 svchost.exe 984 580 6 90 0 False 2005-06-25 16:47:35 -
0x81f6e7e8 svchost.exe 1024 580 15 207 0 False 2005-06-25 16:47:35 -
0x82081da0 svchost.exe 1484 580 6 119 0 False 2005-06-25 16:47:59 -
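The selection performed by --proc_regex can be approximated with Python's re module. This is only a sketch: filter_processes is a hypothetical helper, and it assumes the expression is applied as an unanchored, case-sensitive search over the process name, which may differ from what Rekall actually does.

```python
import re

# Process names taken from the listings above.
PROCESS_NAMES = [
    "System", "smss.exe", "csrss.exe", "winlogon.exe",
    "services.exe", "svchost.exe",
]


def filter_processes(names, proc_regex):
    """Return the names matching proc_regex.

    Assumption: the regex is searched anywhere in the name (unanchored).
    """
    pattern = re.compile(proc_regex)
    return [name for name in names if pattern.search(name)]


print(filter_processes(PROCESS_NAMES, "svc"))
```

Under these assumptions "svc" selects svchost.exe but not services.exe, which matches the listing above.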
In order to use the interactive shell, do not specify any plugin to run:

$ rekal -f ~/images/win7.elf
---------------------------------------------------------------------------
The Rekall Memory Forensic framework 1.5.0 (Furka).
"We can remember it for you wholesale!"
This program is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License.
See http://www.rekall-forensic.com/docs/Manual/tutorial.html to get started.
----------------------------------------------------------------------------
[1] win7.elf 13:54:24>
The interactive shell marks input lines with the name of the image and the time of day. In order to run a plugin, simply type its name and press enter:

[1] win7.elf 13:54:24> pslist
---------------------> pslist()
Offset (V) Name PID PPID Thds Hnds Sess Wow64 Start Exit
---------- -------------------- ------ ------ ------ -------- ------ ------ -------------------- --------------------
0x823c87c0 System 4 0 61 1140 ------ False - -
0x81fdf020 smss.exe 448 4 3 21 ------ False 2005-06-25 16:47:28 -
In the listing above, the user enters the bare word pslist as a command. IPython is configured for auto-execution and assumes that the user wants to run a function called pslist().

Plugins appear as functions in the namespace the shell is running in, so running this function renders the plugin's output to the screen. All plugins which are applicable to the current image and profile are also collected in the variable plugins in the namespace. This means we can use command line completion to discover all the plugins we could use on the current image:

In [3]: plugins.[tab][tab]
plugins.callbacks plugins.handles plugins.modules plugins.raw2dmp
plugins.cmdscan plugins.hashdump plugins.mutantscan plugins.regdump
plugins.connections plugins.hivedump plugins.null plugins.sockets
plugins.connscan plugins.hivescan plugins.pas2vas plugins.svcscan
plugins.consoles plugins.impscan plugins.pedump plugins.symlinkscan
To learn more about each of these plugins we can follow the name of the plugin with a single question mark:

[1] win7.elf 13:56:55> plugins.pslist?
file: /home/scudette/rekall/rekall-core/rekall/plugins/windows/taskmods.py
Plugin: WinPsList (pslist)
Parameters:
profile: Name of the profile to load. This is the filename of the profile found in the profiles directory. Profiles are searched in the profile path order.
dtb: The DTB physical address. (type: IntParser)
eprocess: Kernel addresses of eprocess structs. (type: ArrayIntParser)
phys_eprocess: Physical addresses of eprocess structs. (type: ArrayIntParser)
pid: One or more pids of processes to select. (type: ArrayIntParser)
proc_regex: A regex to select a process by name. (type: RegEx)
method: Method to list processes (Default uses all methods). (type: ChoiceArray)
Docstring: List processes for windows.
Link: http://www.rekall-forensic.com/epydocs/rekall.plugins.windows.taskmods.WinPsList-class.html
Following the name of the plugin with two question marks lists the source of the plugin as well:

[1] win7.elf 13:58:57> plugins.pslist??
file: /home/scudette/rekall/rekall-core/rekall/plugins/windows/taskmods.py
Plugin: WinPsList (pslist)
Parameters:
profile: Name of the profile to load. This is the filename of the profile found in the profiles directory. Profiles are searched in the profile path order.
dtb: The DTB physical address. (type: IntParser)
eprocess: Kernel addresses of eprocess structs. (type: ArrayIntParser)
phys_eprocess: Physical addresses of eprocess structs. (type: ArrayIntParser)
pid: One or more pids of processes to select. (type: ArrayIntParser)
proc_regex: A regex to select a process by name. (type: RegEx)
method: Method to list processes (Default uses all methods). (type: ChoiceArray)
Docstring: List processes for windows.
Link: http://www.rekall-forensic.com/epydocs/rekall.plugins.windows.taskmods.WinPsList-class.html
source:
class WinPsList(common.WinProcessFilter):
"""List processes for windows."""
__name = "pslist"
eprocess = None
@classmethod
def args(cls, metadata):
super(WinPsList, cls).args(metadata)
metadata.set_description("""
Lists the processes by following the _EPROCESS.PsActiveList.
In the windows operating system, processes are linked together through a
doubly linked list. This plugin follows the list around, printing
information about each process.
To begin, we need to find any element on the list. This can be done by:
1) Obtaining the _KDDEBUGGER_DATA64.PsActiveProcessHead - debug
information.
2) Finding any _EPROCESS in memory (e.g. through psscan) and following
its list.
This plugin supports both approaches.
""")
...
This is useful for verifying how a particular plugin works. Note that command line completion can be used to speed up the selection of plugins and of their arguments:

[1] win7.elf 14:01:37> plugins.ps[tab][tab]
plugins.pslist plugins.psscan plugins.pstree plugins.psxview
[1] win7.elf 14:01:37> plugins.pslist [tab][tab]
dtb= eprocess= method= phys_eprocess= pid= proc_regex= profile=
[1] win7.elf 14:01:37> plugins.pslist pr[tab][tab]
proc_regex= profile=
[1] win7.elf 14:01:37> plugins.pslist proc_regex="svc"
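The completion behaviour shown above amounts to prefix matching over the known names. The following is a minimal sketch of that idea, not the completer that IPython or Rekall actually use; complete is a hypothetical helper and the name lists are copied from the listings above.

```python
# Names copied from the completion listings above.
PLUGIN_NAMES = ["modules", "handles", "pslist", "psscan", "pstree", "psxview"]
PSLIST_ARGS = [
    "dtb", "eprocess", "method", "phys_eprocess", "pid",
    "proc_regex", "profile",
]


def complete(prefix, candidates):
    """Return every candidate that starts with the typed prefix, sorted."""
    return sorted(name for name in candidates if name.startswith(prefix))


print(complete("ps", PLUGIN_NAMES))
print(complete("pr", PSLIST_ARGS))
```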
We can now specify the parameters to be passed to the plugin (this is the same as the command line example above):

Example: Specifying plugin options in the interactive shell.

[1] win7.elf 14:15:55> pslist proc_regex="svc"
---------------------> pslist(proc_regex="svc")
_EPROCESS Name PID PPID Thds Hnds Sess Wow64 Start Exit
-------------- -------------------- ----- ------ ------ -------- ------ ------ ------------------------ ------------------------
0xfa80024f85d0 svchost.exe 236 480 19 455 0 False 2012-10-01 14:40:01Z -
0xfa80023f6770 svchost.exe 608 480 12 352 0 False 2012-10-01 21:39:59Z -
0xfa8002522b30 svchost.exe 624 480 16 372 0 False 2012-10-01 14:40:01Z -
0xfa800242a350 svchost.exe 716 480 7 260 0 False 2012-10-01 14:40:00Z -
0xfa80024589e0 svchost.exe 768 480 23 535 0 False 2012-10-01 14:40:00Z -
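The rewriting of a bare command into a function call, as echoed by the shell on the arrow line, can be sketched in a few lines. This is a simplified illustration of the idea and not IPython's implementation; autocall and the stand-in pslist function are hypothetical.

```python
def autocall(line, namespace):
    """Rewrite 'name args' into 'name(args)' when name is callable.

    Lines whose first word is not a callable in the namespace are
    returned unchanged.
    """
    name, _, rest = line.strip().partition(" ")
    if callable(namespace.get(name)):
        return "%s(%s)" % (name, rest.strip())
    return line


def pslist(proc_regex=None):
    """Stand-in for a plugin: simply reports the arguments it received."""
    return ("pslist", proc_regex)


namespace = {"pslist": pslist}
rewritten = autocall('pslist proc_regex="svc"', namespace)
print(rewritten)
print(eval(rewritten, namespace))
```

The rewritten line is ordinary Python, which is why plugin arguments in the shell use keyword syntax such as proc_regex="svc".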
The interactive shell is the most powerful and flexible interface, so the remainder of this tutorial will focus on it.

At the heart of the interactive interface is the session object. The session is an object which contains information about the current image analysis. After the pslist() plugin has run for the first time, a number of new objects are stored in the session, and they can be reused the next time pslist() is run. At any time we can view the current session by simply printing it:

[1] win7.elf 14:09:09> print session
Rekall Memory Forensics session Started on Sun Jan 31 13:56:55 2016.
Config:
{
__dummy = False
autodetect = ['nt_index', 'osx', 'pe', 'windows_kernel_file', 'rsds', 'ntfs', 'linux']
autodetect_build_local = basic
autodetect_build_local_tracked = set(['win32k', 'ntdll', 'nt', 'tcpip'])
autodetect_scan_length = 1000000000
...
timezone = UTC
verbose = False
version = False
}
Cache (<FileCache @ /home/scudette/.rekall_cache/sessions/v1.0/sessions/de814bea95020499e782df54aa4b0d5ce4544823>):
{
default_address_space = WindowsAMD64PagedMemory@0x00187000 (Kernel AS@0x187000)
dtb = 1601536
profile = nt/GUID/F8E2A8B5C9B74BF4A6E4A48F180099942
profile_obj = <AMD64 profile nt/GUID/F8E2A8B5C9B74BF4A6E4A48F180099942 (Nt)>
pslist_CSRSS = set([275427705947968, 275427703135024, 275427707404384, 275427696227600, 275427701196688, 2754276960 ...
pslist_Handles = set([275427705361984, 275427703135024, 275427688164144, 275427696227600, 275427701196688, 2754277015 ...
pslist_PsActiveProcessHead = set([275427705947968, 275427703135024, 275427701697328, 275427707404384, 275427706255504, 2754277015 ...
pslist_PspCidTable = set([275427706779456, 275427703135024, 275427688164144, 275427696227600, 275427701196688, 2754277015 ...
pslist_Sessions = set([275427700679504, 275427705361984, 275427705947968, 275427688984672, 275427701697328, 2754276962 ...
}
We can see that the session contains two parts: a configuration part and a cache. The cache stores its files in the specified directory, and printing the session shows us the content of the cache.

- default_address_space
The default_address_space is the address space that is used by default to instantiate new objects from the interactive shell. When reading memory we must always use an address space. For example, we can use the physical address space (i.e. the raw image) or a virtual address space (for example, the kernel's or any process's address space). Many plugins rely on the correct default address space being set. This is typically known as the Process Context and can be changed using the cc plugin. Initially the default_address_space is set to the kernel's address space.
For example, suppose we want to examine the process listing from earlier in more detail. The process list has a column named _EPROCESS, indicating that this is the address of the kernel's _EPROCESS struct. Rekall typically lists the names and addresses of critical kernel data structures in its tabular output.

Suppose we now wish to examine the _EPROCESS object reported by the pslist plugin above for pid 768. We read the virtual offset of the process as 0xfa80024589e0 in the default address space. We first create an instance of _EPROCESS at the reported offset and assign it to a variable. We can then examine all the fields of this struct by using command line completion (double tab):

[1] win7.elf 14:15:58> task = session.profile._EPROCESS(0xfa80024589e0)
[1] win7.elf 14:19:57> task.
Display all 180 possibilities? (y or n)
task.AccountingFolded task.Flags task.OtherOperationCount task.SectionBaseAddress task.cast
...
[1] win7.elf 14:19:57> task.UniqueProcessId
Out<19 > [unsigned int:UniqueProcessId]: 0x00000300
[1] win7.elf 14:20:08> task.Ima
task.ImageFileName task.ImageNotifyDone task.ImagePathHash
[1] win7.elf 14:20:08> task.ImageFileName
Out<20 > [String:ImageFileName]: 'svchost.exe\x00'
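The idea of instantiating a struct at an offset and then reading its members can be sketched with Python's struct module. This is a toy model and not how Rekall implements profiles: StructOverlay is a hypothetical class, the layout is heavily trimmed, the ImageFileName offset and length are assumptions made for illustration, and the "memory" is a synthetic buffer rather than an address space.

```python
import struct

# Hypothetical, heavily trimmed layout: member -> (relative offset, format).
# The ImageFileName offset and length are assumptions for illustration.
EPROCESS_LAYOUT = {
    "UniqueProcessId": (0x180, "<I"),
    "ImageFileName": (0x2E0, "15s"),
}


class StructOverlay(object):
    """Interpret the bytes at a given offset according to a layout."""

    def __init__(self, layout, memory, offset):
        self._layout = layout
        self._memory = memory
        self._offset = offset

    def __getattr__(self, name):
        # Only called for names not found normally, i.e. struct members.
        try:
            relative, fmt = self._layout[name]
        except KeyError:
            raise AttributeError(name)
        (value,) = struct.unpack_from(fmt, self._memory, self._offset + relative)
        return value


# Build synthetic memory holding one fake struct at offset 0x1000.
memory = bytearray(0x1300)
base = 0x1000
struct.pack_into("<I", memory, base + 0x180, 768)
struct.pack_into("15s", memory, base + 0x2E0, b"svchost.exe")

task = StructOverlay(EPROCESS_LAYOUT, memory, base)
print(hex(task.UniqueProcessId))
print(task.ImageFileName)
```

Nothing is copied when the overlay is created; each member is decoded from the underlying bytes only when it is accessed.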
When examining a member in a struct (such as _EPROCESS.ImageFileName) we receive an instance of a Rekall BaseObject. This has both a type (e.g. String) and a name (e.g. ImageFileName) as well as a human readable representation (e.g. svchost.exe). For example, in the task object we have a type of _EPROCESS (which is a struct), and it exists at offset 0x81fa5aa0:

In [13]: task
Out[13]: [_EPROCESS _EPROCESS] @ 0x81FA5AA0
To view the entire object we can print it (p is a shortcut for print; either will work):

[1] win7.elf 14:20:14> p task
---------------------> p(task)
[_EPROCESS _EPROCESS] @ 0xFA80024589E0 (pid=768)
0x00 Pcb [_KPROCESS Pcb] @ 0xFA80024589E0
0x160 ProcessLock [_EX_PUSH_LOCK ProcessLock] @ 0xFA8002458B40
0x168 CreateTime [WinFileTime:CreateTime]: 0x5069AB40 (2012-10-01 14:40:00Z)
0x170 ExitTime [WinFileTime:ExitTime]: 0x00000000 (-)
0x178 RundownProtect [_EX_RUNDOWN_REF RundownProtect] @ 0xFA8002458B58
0x180 UniqueProcessId [unsigned int:UniqueProcessId]: 0x00000300
0x188 ActiveProcessLinks [_LIST_ENTRY ActiveProcessLinks] @ 0xFA8002458B68
0x198 ProcessQuotaUsage <Array 2 x unsigned long long @ 0xFA8002458B78>
0x1A8 ProcessQuotaPeak <Array 2 x unsigned long long @ 0xFA8002458B88>
0x1B8 CommitCharge [unsigned long long:CommitCharge]: 0x00000D7F
...
This view shows the layout of the _EPROCESS struct which is overlaid on the offset 0xFA80024589E0. The first column is the relative offset of each member, followed by the name of each member and a representation of each member. This representation consists of a type (e.g. _KPROCESS), a name (e.g. Pcb) and a human readable representation.

3. Interactive plugins

Rekall is more than just a framework for running plugins. It is a complete interactive environment for memory analysis. Many of the more interesting features Rekall provides exist as part of the interactive environment.

3.1. The Address Resolver.

A large part of memory analysis is about emulating the execution environment of running code. When code is executed it can access and branch into different addresses in the virtual address space. It is Rekall's job to learn what exists at different addresses. The address space is divided into parts and Rekall can keep track of what different addresses mean.

The Address Resolver is a special component which keeps track of the memory layout in the virtual address space. This service can answer:

- What is found at a specific address?
- What is the address of a given symbol?

In order to represent memory addresses efficiently, Rekall uses a specific notation:

- First a module name is specified. This can be the name of a kernel module or a dll.
- Then the ! character is used to separate the module name from the symbol name.
- A symbol name is specified within the module.
- Optionally, an offset from the symbol name is given.
The following are all valid examples:

[1] win7.elf 16:47:32> dis "nt!MmLoadSystemImage"
---------------------> dis("nt!MmLoadSystemImage")
Address Rel Op Codes Instruction Comment
------- -------------- -------------------- ---------------------------------------- -------
------ nt!MmLoadSystemImage ------: 0xf80002a75060
0xf80002a75060 0x0 488bc4 mov rax, rsp
0xf80002a75063 0x3 48895818 mov qword ptr [rax + 0x18], rbx
[1] win7.elf 17:00:20> dis "_ssl!init_ssl"
---------------------> dis("_ssl!init_ssl")
Address Rel Op Codes Instruction Comment
------- -------------- -------------------- ---------------------------------------- -------
------ _ssl!init_ssl ------: 0x10001000
0x10001000 0x0 a1e4310710 mov eax, dword ptr [0x100731e4] 0x1e1ec4c0 _ssl!init_ssl+0x721e4 -> python27!PyType_Type
0x10001005 0x5 68f5030000 push 0x3f5
0x1000100a 0xa 6a00 push 0
0x1000100c 0xc 68508d0a10 push 0x100a8d50 _ssl!+0x10d25
0x10001011 0x11 68f08c0a10 push 0x100a8cf0 _ssl!+0x10cc5
0x10001016 0x16 68fc700910 push 0x100970fc _ssl!init_ssl+0x960fc
0x1000101b 0x1b a3dc890a10 mov dword ptr [0x100a89dc], eax 0x1e1ec4c0 _ssl!+0x109b1 -> python27!PyType_Type
0x10001020 0x20 ff15c0310710 call dword ptr [0x100731c0] 0x1e007152 _ssl!init_ssl+0x721c0 -> python27!Py_InitModule4
0x10001026 0x26 83c414 add esp, 0x14
0x10001029 0x29 85c0 test eax, eax
In the listings above:

- The first listing is the disassembly of a kernel function. The "nt" symbol always refers to the kernel.
- The second listing is the disassembly of a Python extension (a dll). The cc plugin was run to set the process context.
- The Address Resolver is also used inside plugins to resolve addresses where needed. For example, here we see a mov instruction from the constant python27!PyType_Type (i.e. the Python Type type). This makes it very easy to see how assembly code interacts with the rest of the address space.
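The two questions the Address Resolver answers, and the module!symbol+offset notation described earlier, can be illustrated with a toy model. This is a sketch under stated assumptions and not Rekall's resolver: ToyResolver and parse_address are hypothetical names, the symbol table holds only two addresses copied from the listings in this section, and resolution of module-relative forms such as _ssl!+0x10d25 is left out.

```python
import bisect
import re

# module ! symbol [+0xoffset]; the symbol part may be empty.
ADDRESS_RE = re.compile(
    r"^(?P<module>[^!]+)!(?P<symbol>[^+]*)(?:\+(?P<offset>0x[0-9a-fA-F]+))?$")


def parse_address(text):
    """Split 'module!symbol+0xoffset' into (module, symbol, offset)."""
    match = ADDRESS_RE.match(text)
    if match is None:
        raise ValueError("not in module!symbol notation: %r" % text)
    offset = int(match.group("offset") or "0", 16)
    return match.group("module"), match.group("symbol"), offset


class ToyResolver(object):
    """Maps names to addresses and addresses back to the nearest name."""

    def __init__(self, symbols):
        # symbols: {(module, symbol): address}
        self._by_name = dict(symbols)
        self._sorted = sorted(
            (address, module, symbol)
            for (module, symbol), address in symbols.items())
        self._addresses = [entry[0] for entry in self._sorted]

    def get_address(self, text):
        """What is the address of a given symbol (plus optional offset)?"""
        module, symbol, offset = parse_address(text)
        return self._by_name[(module, symbol)] + offset

    def format_address(self, address):
        """What is found at a specific address? Uses the closest symbol below."""
        index = bisect.bisect_right(self._addresses, address) - 1
        if index < 0:
            return hex(address)
        base, module, symbol = self._sorted[index]
        name = "%s!%s" % (module, symbol)
        delta = address - base
        return name if delta == 0 else "%s+0x%x" % (name, delta)


resolver = ToyResolver({
    ("nt", "MmLoadSystemImage"): 0xF80002A75060,
    ("nt", "SeTcbPrivilege"): 0xF80002B590B8,
})
print(hex(resolver.get_address("nt!MmLoadSystemImage+0x3")))
print(resolver.format_address(0xF80002A75063))
```

Keeping the addresses sorted lets the reverse lookup use a binary search, which matters when a module exports many thousands of symbols.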
Similarly, the dump plugin produces a hexdump of some memory, and annotates it with information about what resides at each address:

[1] win7.elf 17:08:46> dump "nt!SeTcbPrivilege"
---------------------> dump("nt!SeTcbPrivilege")
Offset Data Comment
-------------- ----------------------------------------------------------------- ----------------------------------------
0xf80002b590b8 07 00 00 00 00 00 00 00 44 02 01 00 80 f9 ff ff ........D....... nt!SeTcbPrivilege, nt!NlsOemToUnicodeData
0xf80002b590c8 00 00 00 00 00 01 00 00 00 00 00 00 00 00 00 00 ................ nt!VfRandomVerifiedDrivers, nt!TunnelMaxEntries, nt!ExpBootLicensingData
0xf80002b590d8 bc 00 00 00 00 10 00 00 00 00 ff 07 80 f8 ff ff ................ nt!ExpLicensingDescriptorsCount, nt!CmpStashBufferSize, nt!ExpLicensingView
0xf80002b590e8 e8 f5 00 00 a0 f8 ff ff e8 45 7a 05 a0 f8 ff ff .........Ez..... nt!CmpHiveListHead
0xf80002b590f8 1c 00 00 00 80 f9 ff ff 16 00 00 00 00 00 00 00 ................ nt!NlsAnsiToUnicodeData, nt!SeSystemEnvironmentPrivilege
0xf80002b59108 3f 00 00 00 a4 01 00 00 00 a0 06 00 a0 f8 ff ff ?............... nt!OemDefaultChar, nt!CmBootAcceptFirstTime, nt!CmpDiskFullWorkerPopupDisplayed, nt!ExpLastTimeZoneBia
Note the names of symbols residing in nearby memory addresses.

4. The Rekall Session

Rekall uses a Session to encapsulate the analysis of a single image. Rekall is fast largely because information is cached in the session. Normally, when running from the interactive console, the session persists in memory and the cache therefore remains available to subsequent plugins. For example, when running the pslist plugin, Rekall caches all the known processes within the image. This cache is then used by all plugins which require process listing information (e.g. those plugins which can be filtered by process id, process name, etc.). You can see this by printing the session object:

win7.elf 22:57:28> print session
Rekall Memory Forensics session Started on Mon Aug 17 14:25:26 2015.
Config:
{
__dummy = False
autodetect = ['nt_index', 'osx', 'pe', 'windows_kernel_file', 'rsds', 'ntfs', 'linux']
autodetect_build_local = basic
autodetect_build_local_tracked = set(['win32k', 'ntdll', 'nt', 'tcpip'])
autodetect_scan_length = 1000000000
autodetect_threshold = 1.0
base_filename = win7.elf.E01
buffer_size = 20971520
cache = file
cache_dir = .rekall_cache
colors = auto
debug = False
ept = None
filename = /home/scudette/images/win7.elf
format = text
help = False
logging_level = DEBUG
max_collector_cost = 4
name_resolution_strategies = ['Module', 'Symbol', 'Export']
notebook_dir = /home/scudette
output_style = concise
pagefile = []
plugin = []
profile_path = ['/home/scudette/projects/rekall-profiles/']
quiet = False
repository_path = []
session_id = 1
session_list = [<rekall.session.InteractiveSession object at 0x7fbf7a38c550>]
session_name = Default session
timezone = UTC
verbose = True
}
Cache (<FileCache @ /home/scudette/.rekall_cache/sessions/de814bea95020499e782df54aa4b0d5ce4544823>):
{
default_address_space = WindowsAMD64PagedMemory@0x17851000 (Kernel AS@0x17851000)
dtb = 1601536
image_fingerprint = {'tests': [(41740160L, u'7600.win7_rtm.090713-1255\x00'), (41740416L, u'7600.16385.amd64fre.win7_rtm ...
profile = nt/GUID/F8E2A8B5C9B74BF4A6E4A48F180099942
profile_obj = <AMD64 profile nt/GUID/F8E2A8B5C9B74BF4A6E4A48F180099942 (Nt)>
pslist_CSRSS = set([275427705947968, 275427703135024, 275427707404384, 275427696227600, 275427701196688, 2754276960 ...
pslist_Handles = set([275427705361984, 275427703135024, 275427688164144, 275427696227600, 275427701196688, 2754277015 ...
pslist_PsActiveProcessHead = set([275427705947968, 275427703135024, 275427701697328, 275427707404384, 275427706255504, 2754277015 ...
pslist_PspCidTable = set([275427706779456, 275427703135024, 275427688164144, 275427696227600, 275427701196688, 2754277015 ...
pslist_Sessions = set([275427700679504, 275427705361984, 275427705947968, 275427688984672, 275427701697328, 2754276962 ...
}
As can be seen above, the session state can be divided into the configuration and the cache. In the above case the cache is persistent (it is a FileCache) and stores a number of useful objects (e.g. the output of the various process listing techniques). Note that the cache will only show objects which are currently loaded (i.e. which have been used in this run). There may be other objects on disk.

If Rekall is analysing a volatile image, such as a memory device, the cache will not be persistent. Instead it will be time based, to allow items to expire promptly from the cache as the live system evolves. For normal images, the cache persists, so that even if the user quits and restarts Rekall on the same image, the cache is considered valid. This significantly speeds up future operations on the same image.

If you suspect something unusual with the cache (e.g. the cache is out of sync or was created by an earlier version), it should be quite safe to remove all files from the cache directory. Alternatively, you can force the memory cache to be used by passing the --cache memory flag.

Note: The on-disk version of the session cache is considered an ephemeral cache of the session data. We do not consider it a stable interchange format, which means that we do not guarantee compatibility with future versions of Rekall. The cache files can be deleted and recreated at any time. The content of the files is also not considered user viewable and should not be edited manually.
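The difference between a memory cache and a time-based cache can be sketched as follows. This illustrates the concept only and is not Rekall's cache implementation: MemoryCache, TimedCache, the max_age parameter and the injectable clock are all hypothetical.

```python
import time


class MemoryCache(object):
    """Keeps results for the lifetime of the process only."""

    def __init__(self):
        self._data = {}

    def get(self, key, default=None):
        return self._data.get(key, default)

    def set(self, key, value):
        self._data[key] = value


class TimedCache(MemoryCache):
    """A memory cache that is flushed once it is older than max_age seconds."""

    def __init__(self, max_age, clock=time.time):
        super(TimedCache, self).__init__()
        self._max_age = max_age
        self._clock = clock  # injectable so that tests can control time
        self._flushed_at = clock()

    def _maybe_flush(self):
        now = self._clock()
        if now - self._flushed_at >= self._max_age:
            self._data.clear()
            self._flushed_at = now

    def get(self, key, default=None):
        self._maybe_flush()
        return super(TimedCache, self).get(key, default)

    def set(self, key, value):
        self._maybe_flush()
        super(TimedCache, self).set(key, value)


fake_now = [0.0]
cache = TimedCache(max_age=10, clock=lambda: fake_now[0])
cache.set("pslist_pids", {4, 448, 504})
print(cache.get("pslist_pids"))   # still cached
fake_now[0] = 11.0                # the live system has moved on
print(cache.get("pslist_pids"))   # flushed, so the plugin would recompute
```

On a live system a stale process list is worse than a slow one, which is why a cache that forgets is the safer default there.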
5. Automating Rekall

One of our main design goals is the automation of Rekall, so that it can be used easily from external programs and so that custom scripts are easier to write. This section demonstrates how to automate the framework, both by embedding it completely inside another application and by automating the analysis from within Rekall itself.

5.1. Example: Embedding Rekall in an external python program

In this example we will run the pslist plugin on a sample image, capture the text table into a string and then print this string (in a real application, this could be served over HTTP, for example). The basic sequence of steps is:

- Create a session object
All interactions with the Rekall library require a session object. The session object keeps information related to the same image. There can be any number of session objects valid at the same time; no data is global.

- Create an appropriate renderer object
All output is rendered using a renderer. A renderer is an abstraction which is able to format the output in some way. For example, the TextRenderer outputs tables of text, while the JSONRenderer outputs JSON blobs.

- Instantiate the plugin
The plugin is simply an object which can be instantiated using various parameters.

- Render the plugin into the renderer
Calling the plugin's render method with a valid renderer will cause it to execute its analysis and write its output into this renderer.

Some points to keep in mind when following these steps:

- Importing the plugins is required to make Rekall load all the default plugins. At this point any third party plugins will need to be imported too.
- The session object is created with initial values for some parameters. It is critical that Rekall is able to contact the profile repository at runtime, therefore you will need to provide a valid repository address here. If you plan to use autodetection (you should), this is where the relevant methods should be specified.
- Rekall will only scan some of the image for RSDS signatures. Setting this limit (autodetect_scan_length) ensures that we don't waste too much time trying to load an image we can't.
- The plugin is instantiated from the session object. When a plugin instance is printed, it renders its output to the console by default.
Alternatively, one can render the plugin output manually. Suppose you want to capture the output of the plugin into a string for further processing:

import logging
import StringIO

# Set up logging as required. Rekall will log to the standard logging service.
logging.basicConfig(level=logging.DEBUG)

from rekall import session
from rekall import plugins  # Loads the default plugins.

from rekall.ui import text

s = session.Session(
    filename="win7.elf",
    autodetect=["rsds"],
    logger=logging.getLogger(),
    autodetect_scan_length=18446744073709551616,
    profile_path=[
        "http://profiles.rekall-forensic.com"
    ])

fd = StringIO.StringIO()
renderer = text.TextRenderer(session=s, fd=fd)

with renderer.start():
    plugin = s.plugins.pslist()
    # A NoneObject compares equal to None, so use != rather than "is not".
    if plugin != None:
        plugin.render(renderer)

print fd.getvalue()

Notes on this listing:

- text.TextRenderer(session=s, fd=fd): Create a custom renderer using any of the renderers provided by Rekall (e.g. TextRenderer, DataExportRenderer).
- renderer.start(): Start the renderer to receive new input.
- s.plugins.pslist(): Instantiate the plugin object. This does not actually run the plugin but verifies that all the arguments are correct. If the plugin is not active for this profile (i.e. it is not supported for this image), this returns a NoneObject, which can be compared to None.
- plugin.render(renderer): Rendering the plugin into the renderer causes the renderer to collect output into its file-like object. In this case we use a StringIO (memory stream) to collect the output. Finally, we can access the complete content as a string.
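As one example of the further processing mentioned above, the captured string can be filtered line by line. The sketch below is independent of Rekall: the helper name matching_lines is hypothetical, and the sample table is made-up placeholder text standing in for fd.getvalue(), so the real column layout may differ.

```python
def matching_lines(table_text, needle):
    """Return the lines of a captured text table that mention `needle`."""
    needle = needle.lower()
    return [line for line in table_text.splitlines()
            if needle in line.lower()]


# Made-up placeholder text standing in for fd.getvalue(); not real Rekall output.
captured = "\n".join([
    "name          pid",
    "------------  ----",
    "python.exe    4012",
    "explorer.exe  1234",
])

for line in matching_lines(captured, "python"):
    print(line)
```

Text filtering like this is fragile because it depends on the table layout; when the individual fields matter, structured output is usually the better choice.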
5.2. Receiving structured output

To receive structured output you do not need a renderer; simply call the plugin's collect() method.

import logging

# Set up logging as required. Rekall will log to the standard logging service.
logging.basicConfig(level=logging.DEBUG)

from rekall import session
from rekall import plugins  # Loads the default plugins.

s = session.Session(
    filename="win7.elf",
    autodetect=["rsds"],
    logger=logging.getLogger(),
    autodetect_scan_length=18446744073709551616,
    profile_path=[
        "http://profiles.rekall-forensic.com"
    ])

for row in s.plugins.pslist().collect():
    print "%s, %s, %s" % (
        row["_EPROCESS"].name,
        row["_EPROCESS"].pid,
        row["process_create_time"])

Notes on this listing:

- s.plugins.pslist().collect(): Instantiating the plugin object allows us to call its collect() method. Each row it produces is a dict or a tuple of results.
- row["_EPROCESS"].name: The results can now be used directly, for example by printing them or storing them in a file.
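Once rows are available as dicts, storing them is ordinary Python. The following is a minimal sketch that writes one JSON object per row. The helper names and the flat sample row are hypothetical; real rows hold Rekall objects (such as row["_EPROCESS"]), so converting each value with str() is an assumption about what is good enough for the report.

```python
import json


def row_to_record(row, columns):
    """Flatten one result row into a dict of plain strings."""
    return dict((name, str(row[name])) for name in columns)


def rows_to_json_lines(rows, columns):
    """Serialise rows as one JSON object per line."""
    return "\n".join(
        json.dumps(row_to_record(row, columns), sort_keys=True)
        for row in rows)


# Stand-in rows; with Rekall these would come from s.plugins.pslist().collect().
sample_rows = [
    {"name": "python.exe", "pid": 4012},
]

print(rows_to_json_lines(sample_rows, ["name", "pid"]))
```

Keeping the conversion in a separate row_to_record function makes it easy to swap str() for something more specific once the actual field types are known.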
5.3. Example: Using a custom address space

Sometimes the image files to be analyzed are not directly written to disk. For example, they may be available as a python file-like object. In this case we can provide this "virtual image" to Rekall for analysis by wrapping it in an FDAddressSpace. The following code provides the image as a python file-like object (for this example we simply open the file directly).

Example of embedding Rekall in a python application

import logging

# Set up logging as required. Rekall will log to the standard logging service or
# to a provided logger. NOTE: basicConfig must be called before importing any
# other modules! If any dependency sets up the logging system this call will be
# ignored!
logging.basicConfig(level=logging.DEBUG)

from rekall import plugins  # Loads the default plugins.
from rekall import session
from rekall.plugins.addrspaces import standard

s = session.Session(
    autodetect=["rsds"],
    logger=logging.getLogger(),
    autodetect_scan_length=18446744073709551616,
    profile_path=[
        "http://profiles.rekall-forensic.com"
    ])

# Open the image in binary mode so the bytes are read unmodified.
s.physical_address_space = standard.FDAddressSpace(
    fhandle=open(
        "/home/scudette/images/win7x86.raw", "rb"),
    session=s)

print s.plugins.pslist(method="PsActiveProcessHead")

Notes on this listing:

- session.Session(...): We create a session but do not provide a filename.
- s.physical_address_space = ...: We can directly add the FDAddressSpace() as the physical address space to the session.
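The file-like object does not have to come from open(). The sketch below builds an in-memory "virtual image" with io.BytesIO and reads from it by seeking to an offset. It assumes that a file-descriptor backed address space only needs seek() and read() from its handle; this has not been checked against the Rekall source. The helper read_at is hypothetical and only mimics that access pattern.

```python
import io


def read_at(fhandle, offset, length):
    """Read `length` bytes at `offset` from any seekable file-like object."""
    fhandle.seek(offset)
    return fhandle.read(length)


# An in-memory "virtual image": 16 zero bytes followed by a 4-byte marker.
image = io.BytesIO(b"\x00" * 16 + b"RSDS")

print(read_at(image, 16, 4))
```

Under that assumption, an object like image could be passed as fhandle= in place of the open(...) call when the data arrives over a network or is decompressed on the fly.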
5.4. Example: Automating the Rekall Console

The Rekall interactive interface is sufficient for most analysis. Sometimes, however, we need to automate some of this analysis. In this section, we see how interactive scripts can be written to automate Rekall. This is similar to embedding Rekall in an external program, except that the script runs within the interactive session. When the script completes, the interactive session resumes.

For this example, we search for all processes named python.exe and dump them into the temporary directory, with a timestamp in each file name. This could be used, for example, to dump periodic snapshots of a process from a live system. First we create the following python file named dumper.py:

Example interactive python script.

import time

# "session" is provided by the interactive shell's namespace (see run -i).
pslist = session.plugins.pslist(proc_regex="python.exe")
pedumper = session.plugins.pedump()

for task in pslist.filter_processes():
    path = "/tmp/%s-%s.exe" % (time.time(), task.UniqueProcessId)
    # The with block closes each output file once it has been written.
    with open(path, "wb") as outfd:
        pedumper.WritePEFile(fd=outfd,
                             address_space=task.get_process_address_space(),
                             image_base=task.Peb.ImageBaseAddress)

Now we run this file directly from the interactive shell.

$ rekal -f \\.\pmem
...
In [1]: run -i dumper.py
In [2]: ls -l
total 300
-rw-r----- 1 user group 57344 2012-08-27 00:28 1346020082.17-4012.exe
....

Notes on this session:

- run -i dumper.py: Running the script with the -i flag ensures that the script receives the same namespace as the interactive shell. This means that it can use the same session, and that we can see all variables defined by the script.

Note: In the interactive shell, IPython automatically attempts to execute plugins without requiring the brackets to be present. This is merely a usability feature; in reality all input is interpreted as python code. So while it is sufficient to type just pslist in the interactive shell, an external script must call the function explicitly as pslist().
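The dumper.py script hard-codes /tmp, which may not exist on every platform. A small helper can build the same timestamp-and-pid file name under the platform's temporary directory instead. This is a sketch using only the standard library; the name snapshot_path and its parameters are hypothetical and not part of Rekall.

```python
import os
import tempfile
import time


def snapshot_path(pid, directory=None, now=None):
    """Build '<directory>/<timestamp>-<pid>.exe' for one process dump."""
    if directory is None:
        # Fall back to the platform's temporary directory.
        directory = tempfile.gettempdir()
    if now is None:
        now = time.time()
    return os.path.join(directory, "%s-%s.exe" % (now, pid))


print(snapshot_path(4012))
```

Inside the loop of dumper.py, open(snapshot_path(task.UniqueProcessId), "wb") would then replace the hard-coded path. Passing now explicitly keeps all dumps from one run under the same timestamp.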
5.5. Extending Rekall

Extending Rekall is most useful when you want to add functionality which should be reused by other people, or contributed to the core. If you are only interested in automating a very specific analysis, an interactive shell script is usually sufficient and is simple to write.

A number of different components can be extended. They fall roughly into these categories:

- Address Spaces
  Address spaces are the way Rekall implements image reading. Most of the time you will want to implement some kind of support for new image file formats.
- Renderers
  All output in Rekall is performed through an abstract renderer. For specialist output (e.g. HTML or XML), a new renderer should be written. See BaseRenderer and TextRenderer as possible examples.
- Profiles and specialized parsers
  Profiles are used by Rekall to parse data structures. Sometimes plugin authors wish to extend the parsing system by providing definitions for new types, or additional behaviors via new object classes.
- Plugins
  A plugin is a reusable component which is available in the interactive session. A plugin should only be written if it can be widely useful and/or can be reused by other plugins. It is normally not necessary to write a plugin to automate Rekall (e.g. to search for a specific malware).
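To make the renderer category above more concrete, here is a standalone toy that shows the general idea of sending table output through an abstract interface. It does not subclass or reproduce Rekall's BaseRenderer; the class names and the getvalue method are invented for illustration, and a real renderer should follow BaseRenderer and TextRenderer in the Rekall source.

```python
from xml.sax.saxutils import escape


class ToyRenderer(object):
    """Minimal stand-in for the renderer idea; NOT Rekall's BaseRenderer."""

    def table_header(self, columns):
        raise NotImplementedError

    def table_row(self, *values):
        raise NotImplementedError


class ToyHTMLRenderer(ToyRenderer):
    """Collects table output as an HTML fragment."""

    def __init__(self):
        self.parts = []

    def table_header(self, columns):
        cells = "".join("<th>%s</th>" % escape(str(c)) for c in columns)
        self.parts.append("<table><tr>%s</tr>" % cells)

    def table_row(self, *values):
        # Escape every value so that process names cannot inject markup.
        cells = "".join("<td>%s</td>" % escape(str(v)) for v in values)
        self.parts.append("<tr>%s</tr>" % cells)

    def getvalue(self):
        return "".join(self.parts) + "</table>"


renderer = ToyHTMLRenderer()
renderer.table_header(["name", "pid"])
renderer.table_row("a<b.exe", 4012)
print(renderer.getvalue())
```

The point of the abstraction is that code producing rows never needs to know whether they end up as text, JSON or HTML; only the renderer changes.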