Everything starts with BPF and the portability issues that come with it. Whenever your probe traces kernel functions or reads kernel data, you need a way to import the types you are referencing. In the not-so-distant past, the only solution was to parse the kernel headers one way or another. The most common approach was to use the BPF Compiler Collection (BCC) from IO Visor, which is basically a just-in-time compiler that uses Clang and LLVM to build and load BPF probes written in Python and C. While this was reliable enough to survive minor kernel updates, the combined footprint of Python, LLVM, and the kernel headers has kept this technology out of many environments where system administrators keep a close eye on the installed packages.
The BPF kernel developers set out to solve this problem by finding a way to compile BPF probes once and run them everywhere. And this is exactly what the acronym they chose for the new effort stands for: CO-RE (Compile Once, Run Everywhere). It works like this: you compile your C probes once, and the loader then implements support for relocations based on the Linux kernel debug symbols. Similar to what happens with a PE or ELF executable image, the loader applies the fixups and makes sure that the data types the probe is accessing line up with the layout used by the running kernel.
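To make the relocation idea more concrete, here is a toy model in Python. It is only a sketch: the struct layouts, offsets, and function names are all made up for illustration and do not reflect how any real loader stores or applies its relocations. The "probe" records which field each instruction touches, and the "loader" later patches the offsets using the layout of the target kernel.

```python
# Toy model of a field relocation. Every name and offset here is hypothetical.

# Layout the probe was compiled against: field name -> byte offset.
LOCAL_TASK_STRUCT = {"pid": 0, "comm": 8}

# Layout described by the running kernel's type information (made-up values).
TARGET_TASK_STRUCT = {"state": 0, "pid": 16, "comm": 32}


def compile_probe(accesses, local_layout):
    """Emit one 'instruction' per field access, plus a relocation record for it."""
    instructions = [{"op": "load", "offset": local_layout[f]} for f in accesses]
    relocations = [{"insn": i, "field": f} for i, f in enumerate(accesses)]
    return instructions, relocations


def apply_relocations(instructions, relocations, target_layout):
    """Return a copy of the instructions patched with the target's offsets."""
    patched = [dict(insn) for insn in instructions]
    for reloc in relocations:
        field = reloc["field"]
        if field not in target_layout:
            raise KeyError(f"field {field!r} is missing from the target kernel")
        patched[reloc["insn"]]["offset"] = target_layout[field]
    return patched


insns, relocs = compile_probe(["pid", "comm"], LOCAL_TASK_STRUCT)
fixed = apply_relocations(insns, relocs, TARGET_TASK_STRUCT)
# prints: [0, 8] -> [16, 32]
print([i["offset"] for i in insns], "->", [i["offset"] for i in fixed])
```

The important part is that the compiled probe never hardcodes its final offsets. It carries enough information (here, the field name) for the loader to recompute them on whatever kernel it lands on.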
The idea itself was sound, but the major problem was that the DWARF symbol format was not only too complex to handle, it also increased the kernel size dramatically. The next logical step was to find a way to reduce this memory usage, and this is where BTF (the BPF Type Format) comes in.
BTF itself is rather simple: it consists of a tree of types, each one addressed by an identifier. This identifier can then be used to reference the declared type elsewhere, for example in a structure or in a typedef. All entities have a name, a size, and a bit offset where applicable, making this information easy to consume both inside and outside BPF.
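The following Python sketch shows what such an ID-indexed type tree could look like in memory. The class names, the IDs, and the `example` structure are assumptions made up for this illustration; this is not the on-disk BTF encoding, nor the API of any existing parser.

```python
from dataclasses import dataclass, field


@dataclass
class Int:
    name: str
    size: int  # size in bytes


@dataclass
class Typedef:
    name: str
    type_id: int  # ID of the aliased type


@dataclass
class Member:
    name: str
    type_id: int  # ID of the member's type
    bit_offset: int  # offset from the start of the struct, in bits


@dataclass
class Struct:
    name: str
    size: int
    members: list = field(default_factory=list)


# Hypothetical type table: every entry is reachable through its ID.
TYPES = {
    1: Int("int", 4),
    2: Typedef("pid_t", 1),
    3: Int("char", 1),
    4: Struct("example", 8, [Member("pid", 2, 0), Member("flag", 3, 32)]),
}


def resolve(type_id):
    """Follow typedef links until a concrete type is reached."""
    entry = TYPES[type_id]
    while isinstance(entry, Typedef):
        entry = TYPES[entry.type_id]
    return entry


def describe(struct_id):
    """List (member name, resolved type name, byte offset) for a struct."""
    struct = TYPES[struct_id]
    return [(m.name, resolve(m.type_id).name, m.bit_offset // 8)
            for m in struct.members]


print(describe(4))
```

Because members only store the ID of their type, a structure that uses `pid_t` in ten places still describes `pid_t` exactly once.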
A new set of options was added to the pahole command-line tool, adding support for converting DWARF symbols to the new format. The aggressive deduplication algorithm applied to both the type data and the string data dramatically decreased the amount of memory originally required to describe all the types used by the kernel.
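To give a feel for why deduplication pays off, here is a minimal sketch of a deduplicated string table in Python. It is not the algorithm that pahole actually runs, which also has to merge equivalent type definitions across compilation units. The function name and the sample input are made up for this example.

```python
def dedup_strings(names):
    """Store each distinct name once and return (table, offset of each input)."""
    table = bytearray(b"\x00")  # offset 0 holds the empty string
    offsets = {"": 0}
    result = []
    for name in names:
        if name not in offsets:
            # First time this name is seen: append it, NUL-terminated.
            offsets[name] = len(table)
            table += name.encode() + b"\x00"
        result.append(offsets[name])
    return bytes(table), result


names = ["int", "pid", "int", "pid", "comm"]
table, offsets = dedup_strings(names)
print(len(table), offsets)
```

Five names end up as three entries in the table, and every repeated name is just a repeated offset. The same principle applies to type descriptions: identical definitions pulled in by many source files only need to be stored once.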
Finally, this data was made available through a pseudo-file under the /sys folder, making the kernel headers no longer necessary.
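As a quick illustration of how approachable this data is, the sketch below reads the fixed-size header found at the start of a BTF blob. On kernels built with BTF support, the pseudo-file is usually exposed as /sys/kernel/btf/vmlinux. The field layout follows `struct btf_header` from the kernel sources; the function names are made up for this example, and the code assumes little-endian data for simplicity.

```python
import struct

BTF_MAGIC = 0xEB9F

# struct btf_header: u16 magic, u8 version, u8 flags, then five u32 fields.
# Assumption: the data is little-endian.
HEADER = struct.Struct("<HBBIIIII")


def parse_btf_header(data):
    """Decode the BTF header found at the start of `data`."""
    if len(data) < HEADER.size:
        raise ValueError("buffer is too small to contain a BTF header")
    (magic, version, flags, hdr_len,
     type_off, type_len, str_off, str_len) = HEADER.unpack_from(data)
    if magic != BTF_MAGIC:
        raise ValueError(f"unexpected magic value: {magic:#06x}")
    return {
        "version": version,
        "flags": flags,
        "hdr_len": hdr_len,
        # Section offsets are relative to the end of the header.
        "type_off": type_off,
        "type_len": type_len,
        "str_off": str_off,
        "str_len": str_len,
    }


def read_kernel_btf_header(path="/sys/kernel/btf/vmlinux"):
    """Read only the header bytes from the kernel's BTF pseudo-file."""
    with open(path, "rb") as f:
        return parse_btf_header(f.read(HEADER.size))
```

Calling `read_kernel_btf_header()` on a machine that exposes the file returns the sizes of the type and string sections, which is all you need to start walking the type data.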
I am planning on using this library to improve the BPF tracing capabilities of osquery, enhancing both the process and socket events tables.
You can find the library in the Trail of Bits repository: https://github.com/trailofbits/btfparse