Volatility 3 Basics
Volatility breaks memory analysis down into several components. The main ones are:
Memory layers
Templates and Objects
Symbol Tables
Volatility 3 stores all of these within a Context, which acts as a container for all the various layers and tables necessary to conduct memory analysis.
Memory layers
A memory layer is a body of data that can be accessed by requesting data at a specific address. At its lowest level this data is stored on a physical medium (RAM), and very early computers addressed locations in memory directly. However, as memory grew and became more difficult to manage, most architectures moved to a “paged” model of memory, where the available memory is cut into fixed-size pages. To help further, programs can ask for any address and the processor will look up that (virtual) address in a map to find the (physical) address where the data actually lives in the memory of the system.
Volatility can work with these layers as long as it knows the map (for example, that virtual address 1 maps to physical address 9). The automagic that runs at the start of every Volatility session often locates the kernel’s memory map and creates a kernel virtual layer, which allows kernel addresses to be looked up and the correct data returned. There can, however, be several maps, and in general there is a different map for each process (although a portion of the operating system’s memory is usually mapped to the same location across all processes). Two maps may contain the same virtual address but point it at different parts of physical memory. It also means that two processes could share memory, by each having a virtual address mapped to the same physical address. See the worked example below for more information.
To translate an address on a layer, call layer.mapping(offset, length, ignore_errors) and it will return a list of non-overlapping chunks, in order, for the requested range. If a portion cannot be mapped, an exception will be thrown unless ignore_errors is true. Each chunk will contain the original offset of the chunk, the translated offset, the original size and the translated size of the chunk, as well as the lower layer the chunk lives within.
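The chunking behaviour described above can be sketched with a toy model. This is not Volatility's implementation: the class name, the page size, the tuple ordering (taken from the order in the prose) and the exception type are all assumptions made for illustration, so the real return shape should be checked against the library's API reference.

```python
from typing import Dict, List, Tuple

PAGE_SIZE = 0x1000  # assumed page size for this toy model

# Hypothetical chunk layout, following the order used in the prose:
# (original offset, translated offset, original size, translated size, lower layer)
Chunk = Tuple[int, int, int, int, str]


class ToyTranslationLayer:
    """Illustrative stand-in for a translation layer; not Volatility's class."""

    def __init__(self, page_map: Dict[int, int], lower_layer: str) -> None:
        self.page_map = page_map        # virtual page number -> physical page number
        self.lower_layer = lower_layer  # name of the layer the chunks live within

    def mapping(self, offset: int, length: int, ignore_errors: bool = False) -> List[Chunk]:
        chunks: List[Chunk] = []
        end = offset + length
        while offset < end:
            page, within = divmod(offset, PAGE_SIZE)
            # A chunk never crosses a page boundary, so chunks cannot overlap
            size = min(PAGE_SIZE - within, end - offset)
            if page in self.page_map:
                translated = self.page_map[page] * PAGE_SIZE + within
                chunks.append((offset, translated, size, size, self.lower_layer))
            elif not ignore_errors:
                raise LookupError(f"address {offset:#x} is not mapped")
            offset += size
        return chunks


layer = ToyTranslationLayer({1: 9, 2: 3}, lower_layer="physical")
chunks = layer.mapping(0x1800, 0x1000)  # a range that spans two pages
```

Because the requested range straddles a page boundary, the toy layer returns two chunks, one per page, each pointing at a different physical location.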
Worked example
The operating system and two programs may all appear to have access to all of physical memory, but actually the maps they each have mean they each see something different:
Operating system map          Physical Memory
1 -> 9                        1  - Free
2 -> 3                        2  - OS.4, Process 1.4, Process 2.4
3 -> 7                        3  - OS.2
4 -> 2                        4  - Free
                              5  - Free
Process 1 map                 6  - Process 1.2, Process 2.3
1 -> 12                       7  - OS.3
2 -> 6                        8  - Process 1.3
3 -> 8                        9  - OS.1
4 -> 2                        10 - Process 2.1
                              11 - Free
Process 2 map                 12 - Process 1.1
1 -> 10                       13 - Free
2 -> 15                       14 - Free
3 -> 6                        15 - Process 2.2
4 -> 2                        16 - Free
In this example, part of the operating system is visible across all processes (although not all processes can write to that memory; there is a permissions model for Intel addressing which is not discussed further here).
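The three maps from the worked example can be written out as plain dictionaries to check which physical pages are shared and which are free. The variable and function names here are illustrative only; the page numbers are taken directly from the example.

```python
# Virtual page -> physical page, copied from the worked example
os_map = {1: 9, 2: 3, 3: 7, 4: 2}
process1_map = {1: 12, 2: 6, 3: 8, 4: 2}
process2_map = {1: 10, 2: 15, 3: 6, 4: 2}

maps = {"OS": os_map, "Process 1": process1_map, "Process 2": process2_map}


def users_of(physical_page):
    """Return 'owner.virtual_page' labels for every map that reaches a physical page."""
    return [
        f"{owner}.{virtual}"
        for owner, page_map in maps.items()
        for virtual, physical in page_map.items()
        if physical == physical_page
    ]


# Physical pages reachable through more than one mapping
shared = {page: users_of(page) for page in range(1, 17) if len(users_of(page)) > 1}

# Physical pages that no map points at
free = [page for page in range(1, 17) if not users_of(page)]
```

Physical page 2 is reached from virtual page 4 of all three maps, and physical page 6 is shared between the two processes at different virtual pages, which matches the right-hand column of the example.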
In Volatility 3 mappings are represented by a directed graph of layers, whose end nodes are DataLayers and whose internal nodes are TranslationLayers.
In this way, a raw memory image in the LiME file format and a page file can be combined to form a single Intel virtual memory layer. When an address is requested from the Intel layer, it will use the Intel memory mapping algorithm, along with the address of the directory table base or page table map, to translate that address into a physical address, which will then be directed towards either the swap layer or the LiME layer. Should it be directed towards the LiME layer, the LiME file format algorithm will translate the new address to determine where within the file the data is stored. When the layer.read() method is called, the translation is done automatically and the correct data gathered and combined.
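A minimal sketch of a read that is routed through a translating node to one of two data-holding end nodes. The class names, the four-byte page size and the map format are hypothetical, and the file-format translation step (such as LiME) is left out to keep the example short.

```python
from typing import Dict, Tuple


class ToyDataLayer:
    """End node of the graph: holds bytes directly."""

    def __init__(self, data: bytes) -> None:
        self.data = data

    def read(self, offset: int, length: int) -> bytes:
        return self.data[offset:offset + length]


class ToyPagedLayer:
    """Internal node: translates each page, then reads from a lower layer."""

    def __init__(self, page_size: int,
                 page_map: Dict[int, Tuple[str, int]],
                 layers: Dict[str, ToyDataLayer]) -> None:
        self.page_size = page_size
        self.page_map = page_map  # virtual page -> (lower layer name, page in that layer)
        self.layers = layers

    def read(self, offset: int, length: int) -> bytes:
        out = bytearray()
        end = offset + length
        while offset < end:
            page, within = divmod(offset, self.page_size)
            size = min(self.page_size - within, end - offset)
            name, lower_page = self.page_map[page]
            # Each piece is fetched from whichever lower layer the page maps to
            out += self.layers[name].read(lower_page * self.page_size + within, size)
            offset += size
        return bytes(out)


layers = {
    "memory": ToyDataLayer(b"AAAABBBBCCCC"),
    "swap": ToyDataLayer(b"ssssSSSS"),
}
virtual = ToyPagedLayer(4, {0: ("memory", 2), 1: ("swap", 1), 2: ("memory", 0)}, layers)
data = virtual.read(2, 8)  # spans three virtual pages and both lower layers
```

The caller only asks the top layer for a range; the pieces coming from the memory image and from swap are gathered and joined behind the scenes.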
Note
Volatility 2 had a similar concept, called address spaces, but these could only stack linearly one on top of another.
The list of layers supported by Volatility can be determined by running the frameworkinfo plugin.
Templates and Objects
Once we can address contiguous chunks of memory with a means to translate a virtual address (as seen by the programs) into the actual data used by the processor, we can start pulling out Objects by taking a Template and constructing it on the memory layer at a specific offset. A Template contains all the information you can know about the structure of the object without actually being populated by any data. As such, a Template can tell you the size of a structure and its members, how far into the structure a particular member lives and potentially what various values in that field would mean, but not what resides in a particular member.
Using a Template on a memory layer at a particular offset, an Object can be constructed. In Volatility 3, once an Object has been created, the data has been read from the layer and is not read again. An object allows its members to be interrogated and in particular allows pointers to be followed, providing easy access to the data contained in the object.
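The split between a data-free Template and a populated Object can be illustrated with Python's struct module. Everything here is a toy: ToyTemplate, ToyObject, the "toy_record" structure and its two members are invented for the example and do not correspond to Volatility classes or to any real operating system structure.

```python
import struct
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class ToyTemplate:
    """Describes a structure's layout without holding any of its data."""
    name: str
    size: int
    members: Dict[str, Tuple[int, str]]  # member name -> (relative offset, struct format)

    def construct(self, layer: bytearray, offset: int) -> "ToyObject":
        # The bytes are copied out of the layer once, at construction time
        raw = bytes(layer[offset:offset + self.size])
        return ToyObject(self, offset, raw)


class ToyObject:
    """A template applied at an offset, holding a snapshot of the data."""

    def __init__(self, template: ToyTemplate, offset: int, raw: bytes) -> None:
        self.template = template
        self.offset = offset
        self._raw = raw

    def member(self, name: str) -> int:
        relative_offset, fmt = self.template.members[name]
        return struct.unpack_from(fmt, self._raw, relative_offset)[0]


template = ToyTemplate("toy_record", 12, {"pid": (0, "<I"), "next": (4, "<Q")})

layer = bytearray(32)                          # stand-in for a memory layer
struct.pack_into("<IQ", layer, 8, 1234, 0x10)  # place a record at offset 8

obj = template.construct(layer, 8)
pid_before = obj.member("pid")

struct.pack_into("<I", layer, 8, 9999)         # the underlying data changes
pid_after = obj.member("pid")                  # the existing object is unaffected
fresh_pid = template.construct(layer, 8).member("pid")  # reconstructing sees the change
```

The template alone can answer layout questions (size, member offsets), while only a constructed object can answer what value a member holds, and that value is fixed at construction.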
Note
Volatility 2 would re-read the data, which was useful for live memory forensics but quite inefficient for the more common static memory analysis typically conducted. Volatility 3 requires that objects be manually reconstructed if the data may have changed. Volatility 3 also constructs actual Python integers and floats, whereas Volatility 2 created proxy objects which would sometimes cause problems with type checking.
Symbol Tables
Most compiled programs know of their own templates, and define the structure (and location within the program) of these templates as a Symbol. A Symbol is often an address and a template and can be used to refer to either independently. Lookup tables of these symbols are often produced as debugging information alongside the compilation of the program. Volatility 3 provides access to these through a SymbolTable, many of which can be collected within a Context as a SymbolSpace. A Context can store only one SymbolSpace at a time, although a SymbolSpace can store as many SymbolTable items as necessary.
Volatility 3 uses the de facto naming convention for symbols of module!symbol to refer to them. It reads them from its own JSON formatted file, which acts as a common intermediary between Windows PDB files, Linux DWARF files, other symbol formats and the internal Python format that Volatility 3 uses to represent a Template or a Symbol.
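The module!symbol convention can be sketched as a lookup across several tables held in one space. ToySymbolSpace, the module names, the symbol name and the addresses below are all hypothetical and only show how a module prefix keeps identically named symbols apart.

```python
from typing import Dict, Optional, Tuple

# Hypothetical symbol record: (address, name of the associated template, if any)
ToySymbol = Tuple[int, Optional[str]]


class ToySymbolSpace:
    """Holds many symbol tables, each keyed by its module name."""

    def __init__(self) -> None:
        self._tables: Dict[str, Dict[str, ToySymbol]] = {}

    def add_table(self, module: str, symbols: Dict[str, ToySymbol]) -> None:
        self._tables[module] = dict(symbols)

    def get_symbol(self, name: str) -> ToySymbol:
        module, separator, symbol = name.partition("!")
        if not separator:
            raise ValueError("expected a name of the form module!symbol")
        return self._tables[module][symbol]


space = ToySymbolSpace()
space.add_table("module_a", {"shared_name": (0x1000, "toy_record")})
space.add_table("module_b", {"shared_name": (0x2000, None)})

first = space.get_symbol("module_a!shared_name")
second = space.get_symbol("module_b!shared_name")
```

Both tables define a symbol with the same name, yet each lookup resolves to its own address because the module prefix selects the table first.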
Note
Volatility 2’s name for a SymbolSpace was a profile, but it could not differentiate between symbols from different modules and required special handling for 32-bit programs that used Wow64 on Windows. This meant that all symbols lived in a single namespace with the possibility of symbol name collisions. It read the symbols using a format called vtypes, written in Python code directly. This made it less transferable and harder for other software to use.
Plugins
A plugin acts as a means of requesting data from the user interface (and so the user) and then using it to carry out a specific form of analysis on the Context (containing whatever symbol tables and memory layers it may). The means of communication between the user interface and the library is the configuration tree, which is used by components within the Context to store configurable data. After the plugin has been run, it returns the results in a specific format known as a TreeGrid. This ensures that the data can be handled by consumers of the library, without knowing exactly what the data is or how it’s formatted.
Output Renderers
User interfaces can choose how best to present the results to their users. The library always responds from every plugin with a TreeGrid, and the user interface can then determine how best to display it. For the Command Line Interface, that might be via text output as a table, or it might output to an SQLite database or a CSV file. For a web interface, the best output is probably JSON, which could be displayed as a table, or inserted into a database like Elasticsearch and trawled using an existing frontend such as Kibana.
The renderers only need to know how to process very basic types (booleans, strings, integers, bytes) and a few additional specific ones (disassembly and various absent values).
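To illustrate why a single result format lets each interface pick its own presentation, the sketch below feeds one set of rows to two toy renderers. The row representation (a depth plus a list of basic values), the column list and both function names are assumptions for this example and are not the TreeGrid interface itself.

```python
import csv
import io
import json

# Hypothetical result data: column names, then (tree depth, values) rows
columns = ["PID", "Name"]
rows = [(0, [4, "System"]), (1, [88, "child.exe"])]


def render_csv(columns, rows) -> str:
    """Render the rows as CSV text, with the tree depth as the first column."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["TreeDepth"] + columns)
    for depth, values in rows:
        writer.writerow([depth] + values)
    return buffer.getvalue()


def render_json(columns, rows) -> str:
    """Render the rows as a JSON list of records."""
    records = [{"depth": depth, **dict(zip(columns, values))} for depth, values in rows]
    return json.dumps(records)


csv_text = render_csv(columns, rows)
json_text = render_json(columns, rows)
```

Neither renderer needs to know which plugin produced the rows; each only handles basic value types and the column names it is given.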
Configuration Tree
The configuration tree acts as the interface between the calling program and the Volatility 3 library. Elements of the library (such as a Plugin, a TranslationLayer, an Automagic, etc.) can use the configuration tree to inform the calling program of the options they require and/or optionally support, and the calling program can use it to provide that information when the library is then called.
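One way to picture this exchange is as a set of declared options keyed by a path, which the calling program fills in before running the component. The dotted paths, the ToyPlugin name, the dictionary layout and the unsatisfied() helper are all hypothetical and only illustrate the idea of required versus optional options.

```python
# Hypothetical declarations a component might publish: path -> description and optionality
requirements = {
    "plugins.ToyPlugin.primary": {"description": "Memory layer to analyse", "optional": False},
    "plugins.ToyPlugin.pid": {"description": "Process ID to filter on", "optional": True},
}


def unsatisfied(requirements, config):
    """Return the required paths that the calling program has not yet provided."""
    return [
        path
        for path, requirement in requirements.items()
        if not requirement["optional"] and path not in config
    ]


config = {}                                   # the calling program starts with nothing set
missing_before = unsatisfied(requirements, config)

config["plugins.ToyPlugin.primary"] = "layer_name"  # the caller supplies the required value
missing_after = unsatisfied(requirements, config)
```

The component never talks to the user directly: it states what it needs, and the calling program decides how to collect and supply the values.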
Automagic
There are certain setup tasks that establish the context in a way favorable to a plugin before it runs, removing several tasks that are repetitive and also easy to get wrong. These are called Automagic, since they do things like magically taking a raw memory image and automatically providing the plugin with an appropriate Intel translation layer and an accurate symbol table without either the plugin or the calling program having to specify all the necessary details.
Note
Volatility 2 used to do this as well, but it wasn’t a particularly modular mechanism, and was used only for stacking address spaces (rather than identifying profiles), and it couldn’t really be disabled/configured easily. Automagics in Volatility 3 are a core component which consumers of the library can call or not at their discretion.