mquire: Linux memory forensics without debug symbols

Memory forensics on Linux is a lot more painful than it should be. Tools like Volatility work well, but they require debug symbols that match the exact kernel version you are analyzing. Those symbols are often not available: custom kernels never had their symbols published anywhere, some distributions do not ship them at all, and on production systems nobody thinks to collect them until it is already too late. Even when they do exist, tracking down the right version, for the right build, on the right distribution, is tedious.

The root cause is that these tools depend entirely on external files to understand the memory layout of the kernel. Take those away and you cannot do anything useful with the dump.

The insight behind mquire is that modern Linux kernels already carry most of the information you need, embedded directly in the kernel image. There are two pieces that together are enough to reconstruct what is in memory.

The first is BTF. I wrote about BTF before, but to summarize: it is a compact format the kernel uses to describe its own data structures, originally designed to support BPF programs that need to read kernel memory safely. Every struct, every field, every offset is in there. BTF has been part of the kernel since 4.18, and most modern distributions ship with it enabled. On a running system the data is accessible at /sys/kernel/btf/vmlinux, and the same data is embedded in the memory dump itself.
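To make the format a little more concrete, here is a minimal Python sketch that decodes the fixed header at the start of a BTF blob and works out where the type and string sections live. The header layout follows the kernel's struct btf_header; the function names are made up for this example, the code assumes a little-endian blob, and it says nothing about how mquire or btfparse actually do it.

```python
import struct
from typing import NamedTuple, Tuple

BTF_MAGIC = 0xEB9F      # magic value defined by the BTF format
BTF_HEADER_SIZE = 24    # size of the fixed struct btf_header fields


class BtfHeader(NamedTuple):
    magic: int
    version: int
    flags: int
    hdr_len: int
    type_off: int
    type_len: int
    str_off: int
    str_len: int


def parse_btf_header(blob: bytes) -> BtfHeader:
    """Decode the fixed header at the start of a little-endian BTF blob."""
    if len(blob) < BTF_HEADER_SIZE:
        raise ValueError("blob is too short to hold a BTF header")
    # u16 magic, u8 version, u8 flags, then five u32 fields.
    header = BtfHeader(*struct.unpack_from("<HBBIIIII", blob, 0))
    if header.magic != BTF_MAGIC:
        raise ValueError("missing BTF magic (or blob is big-endian)")
    return header


def section_bounds(header: BtfHeader) -> Tuple[Tuple[int, int], Tuple[int, int]]:
    """Return the (start, end) byte ranges of the type and string sections.

    Section offsets are relative to the end of the header, hence hdr_len.
    """
    type_start = header.hdr_len + header.type_off
    str_start = header.hdr_len + header.str_off
    return (
        (type_start, type_start + header.type_len),
        (str_start, str_start + header.str_len),
    )
```

On a live system with BTF enabled, passing the contents of /sys/kernel/btf/vmlinux to parse_btf_header should be enough to try it out. Decoding the type section itself is where the real work starts, and that is what a BTF parsing library is for.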

The second is kallsyms. On a running kernel you can read /proc/kallsyms to get the address of every kernel symbol. The same data is present in a memory dump, and mquire scans for it to find the exact addresses of the structures it needs to traverse. This requires kernel 6.4 or later due to a change in how the kallsyms data is laid out.
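For reference, the text that /proc/kallsyms exposes on a running system is easy to turn into a lookup table. The sketch below parses that text format only; the in-memory representation that has to be located in a dump is a different, compressed encoding and is not covered here. The KernelSymbol type and parse_kallsyms function are names invented for this example.

```python
from typing import Dict, Iterable, NamedTuple, Optional


class KernelSymbol(NamedTuple):
    address: int
    kind: str               # single-letter symbol type, e.g. "T" or "D"
    name: str
    module: Optional[str]   # None for symbols that belong to the core kernel


def parse_kallsyms(lines: Iterable[str]) -> Dict[str, KernelSymbol]:
    """Parse '/proc/kallsyms'-style lines into a name -> symbol mapping.

    Each line looks like 'ffffffff81000000 T _text', optionally followed
    by a '[module]' column. When a name appears more than once, the first
    occurrence is kept.
    """
    symbols: Dict[str, KernelSymbol] = {}
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        address = int(fields[0], 16)
        module = fields[3].strip("[]") if len(fields) > 3 else None
        symbols.setdefault(
            fields[2], KernelSymbol(address, fields[1], fields[2], module)
        )
    return symbols
```

Keep in mind that on most systems an unprivileged read of /proc/kallsyms returns zeroed addresses, so a quick experiment needs root.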

With the type layout from BTF and the symbol addresses from kallsyms, mquire can locate and walk kernel data structures without any files that did not come from the dump itself.
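As an illustration of how the two pieces fit together, the following Python sketch walks a circular task list inside a small synthetic buffer. Everything specific in it is an assumption made for the demo: the BASE address, the field offsets in OFFSETS (stand-ins for what BTF would report for task_struct), and the helper names. It also reads "virtual addresses" by simple subtraction, whereas a real dump requires page-table translation first. It is not mquire's implementation.

```python
import struct

BASE = 0xFFFF_8880_0000_0000  # hypothetical virtual address of buffer byte 0
# Hypothetical offsets standing in for the task_struct layout BTF would provide.
OFFSETS = {"tasks": 0x10, "pid": 0x20, "comm": 0x24}
TASK_SIZE = 0x40              # hypothetical size of one demo task object
COMM_LEN = 16                 # length of the comm buffer


def read(memory: bytes, vaddr: int, size: int) -> bytes:
    """Read bytes at a virtual address (no page-table translation here)."""
    start = vaddr - BASE
    if start < 0 or start + size > len(memory):
        raise ValueError(f"address {vaddr:#x} is outside the buffer")
    return memory[start:start + size]


def walk_tasks(memory: bytes, init_task_addr: int):
    """Follow tasks.next from init_task until the list loops back."""
    results = []
    seen = set()  # also guards against corrupted, never-ending lists
    node = init_task_addr + OFFSETS["tasks"]
    while node not in seen:
        seen.add(node)
        task = node - OFFSETS["tasks"]  # container_of(node, task_struct, tasks)
        pid = struct.unpack("<i", read(memory, task + OFFSETS["pid"], 4))[0]
        raw = read(memory, task + OFFSETS["comm"], COMM_LEN)
        comm = raw.split(b"\0", 1)[0].decode("ascii", "replace")
        results.append((task, pid, comm))
        # list_head.next is the first pointer of the embedded list node.
        node = struct.unpack("<Q", read(memory, node, 8))[0]
    return results


def build_demo_memory(tasks) -> bytes:
    """Lay out fake task objects linked into a circular list."""
    memory = bytearray(TASK_SIZE * len(tasks))
    for i, (pid, comm) in enumerate(tasks):
        task = i * TASK_SIZE
        nxt = BASE + ((i + 1) % len(tasks)) * TASK_SIZE + OFFSETS["tasks"]
        prv = BASE + ((i - 1) % len(tasks)) * TASK_SIZE + OFFSETS["tasks"]
        struct.pack_into("<QQ", memory, task + OFFSETS["tasks"], nxt, prv)
        struct.pack_into("<i", memory, task + OFFSETS["pid"], pid)
        name = comm.encode("ascii")[:COMM_LEN - 1]
        start = task + OFFSETS["comm"]
        memory[start:start + len(name)] = name
    return bytes(memory)
```

In the demo, BASE plays the role of the init_task address that kallsyms would supply, so walk_tasks(build_demo_memory([(0, "swapper/0"), (1, "systemd")]), BASE) returns one (address, pid, comm) tuple per fake task.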

Querying memory with SQL

The interface follows the same model as osquery: you run SQL queries against tables that represent different parts of the kernel state. Available tables cover processes, open files, memory mappings, network connections, loaded kernel modules, kernel logs, and more.

For example, finding which processes have SQLite databases open:

SELECT tasks.pid, task_open_files.path
FROM task_open_files
JOIN tasks ON tasks.tgid = task_open_files.tgid
WHERE task_open_files.path LIKE '%.sqlite'
LIMIT 2;

Or listing systemd-related processes with their full command lines:

SELECT comm, command_line
FROM tasks
WHERE command_line IS NOT NULL AND comm LIKE '%systemd%'
LIMIT 2;

One thing worth knowing about the schema: tables use virtual_address as the join key rather than pid or other user-visible identifiers. The reason is that the same PID can appear multiple times when you enumerate processes through different discovery paths, while a kernel virtual address uniquely identifies a specific object in memory. This matters more than it might seem when you are hunting for hidden processes.
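A toy example shows why the address is the safer key. The sketch below uses Python's sqlite3 module with an invented tasks table; the source column and the discovery path names task_list and pid_hash are hypothetical and exist only to make the point, so do not read them as mquire's actual schema. Addresses are stored as hex strings because they do not fit in SQLite's signed 64-bit integers.

```python
import sqlite3

# Hypothetical rows: (virtual_address, pid, comm, source).
# "source" names the discovery path that produced the row.
DEMO_ROWS = [
    ("0xffff888100000000", 1, "systemd", "task_list"),
    ("0xffff888100000000", 1, "systemd", "pid_hash"),
    ("0xffff888100004000", 42, "sshd", "task_list"),
    ("0xffff888100004000", 42, "sshd", "pid_hash"),
    # Reachable through one path only: a candidate hidden process.
    ("0xffff888100008000", 1337, "kworker/u8:3", "pid_hash"),
]


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database holding the demo rows."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE tasks "
        "(virtual_address TEXT, pid INTEGER, comm TEXT, source TEXT)"
    )
    db.executemany("INSERT INTO tasks VALUES (?, ?, ?, ?)", DEMO_ROWS)
    return db


def rows_for_pid(db: sqlite3.Connection, pid: int) -> int:
    """Count rows for a PID; more than one means the PID is ambiguous as a key."""
    query = "SELECT COUNT(*) FROM tasks WHERE pid = ?"
    return db.execute(query, (pid,)).fetchone()[0]


def objects_missing_from(db: sqlite3.Connection, found_by: str, missing_from: str):
    """List objects seen by one discovery path but absent from another."""
    query = """
        SELECT DISTINCT virtual_address, pid, comm
        FROM tasks
        WHERE source = ?
          AND virtual_address NOT IN (
              SELECT virtual_address FROM tasks WHERE source = ?
          )
        ORDER BY virtual_address
    """
    return db.execute(query, (found_by, missing_from)).fetchall()
```

With this data, PID 1 matches two rows even though there is only one systemd object, and objects_missing_from(db, "pid_hash", "task_list") singles out the one task that was unlinked from the task list.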

There is also a .dump command that extracts files directly from the kernel page cache. This works even for files that have been deleted from disk but are still mapped in memory, which is useful for recovering artifacts during incident response.

btfparse

mquire is written in Rust, and parsing BTF is one of its core requirements. The original btfparse library I wrote is in C++, and calling into it from Rust would have meant maintaining a C++ dependency and writing FFI bindings. That was not worth doing when the format is well-specified and a clean Rust implementation would be straightforward. I rewrote it as a Rust crate, which is now published separately on crates.io, with the repository at https://github.com/alessandrogario/btfparse.
