Aqua Blog

What is vmlinux.h and Why is It Important for Your eBPF Programs?

eBPF is a powerful and exciting technology that lets developers attach custom code to strategic points in the Linux kernel and interact with it from simple C or Go programs. The eBPF programs you write and run can inspect data in the memory of the processes they attach to. To do so, however, they need to know what data structures they’re dealing with. You can achieve that with one simple line: #include "vmlinux.h". In this blog, I’ll explain what vmlinux.h is and why you should start using it when writing your eBPF programs.

vmlinux.h in a nutshell

vmlinux.h is generated code. It contains all the type definitions that your running Linux kernel uses in its own source code. When you build Linux, one of the output artifacts is a file called vmlinux: an ELF binary that contains the compiled bootable kernel. Major distributions typically package it as well.

There’s a tool, aptly named bpftool, that is maintained within the Linux repository. It can read the type information (BTF) that describes the vmlinux image and generate a vmlinux.h file from it. On kernels built with BTF support, that information is exposed at /sys/kernel/btf/vmlinux. Since the header contains every type definition that the installed kernel uses, it’s a very large file.

The actual command is:
bpftool btf dump file /sys/kernel/btf/vmlinux format c > vmlinux.h
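The command above only works if /sys/kernel/btf/vmlinux exists and holds raw BTF data, which requires a kernel built with CONFIG_DEBUG_INFO_BTF. As a rough sanity check, the sketch below tests whether a file starts with the BTF magic number (0xeB9F, as defined by BTF_MAGIC in the kernel's uapi btf.h). The function name looks_like_raw_btf and the DEMO_BTF_MAGIC macro are made up for this illustration, and the check assumes the file is in the host's byte order.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Magic number at the start of raw BTF data (BTF_MAGIC in the kernel's
 * uapi btf.h). The macro name here is made up for this sketch. */
#define DEMO_BTF_MAGIC 0xeB9F

/*
 * Hypothetical helper: returns 1 if the file at `path` starts with the BTF
 * magic in host byte order, 0 if it starts with something else, and -1 if
 * the file cannot be opened or is shorter than two bytes.
 */
int looks_like_raw_btf(const char *path)
{
    assert(path != NULL);

    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;

    uint16_t magic = 0;
    size_t got = fread(&magic, sizeof(magic), 1, f);
    fclose(f);

    if (got != 1)
        return -1;
    return magic == DEMO_BTF_MAGIC;
}
```

Calling looks_like_raw_btf("/sys/kernel/btf/vmlinux") and getting -1 back is a hint that the running kernel does not expose BTF, in which case bpftool has nothing to dump from that path.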

Now when you include this header file, your bpf program can read raw memory and know which bytes correspond to which fields of the structs that you want to use!

For example, Linux represents the concept of a process with a type called task_struct. If you want to inspect values in a task_struct from your bpf program, you’re going to need to know its definition.
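To make the "which bytes correspond to which fields" idea concrete, here is a small userspace sketch. The struct demo_task is a hypothetical, heavily simplified stand-in for task_struct (the real definition in vmlinux.h has far more fields and a different layout), and the helper names are invented for this example. It is ordinary C rather than an eBPF program, but the principle is the same: once the type definition is known, an offset into an untyped buffer becomes a named field.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_COMM_LEN 16

/* Hypothetical, simplified stand-in for the kernel's task_struct. */
struct demo_task {
    uint64_t state;
    int32_t  pid;
    char     comm[DEMO_COMM_LEN];
};

/* Read the pid out of an untyped byte buffer known to hold a demo_task. */
int32_t demo_task_pid(const unsigned char *raw)
{
    assert(raw != NULL);

    int32_t pid;
    /* The struct definition tells us where the field starts. */
    memcpy(&pid, raw + offsetof(struct demo_task, pid), sizeof(pid));
    return pid;
}

/* Copy the command name into `out` and make sure it is NUL-terminated. */
void demo_task_comm(const unsigned char *raw, char out[DEMO_COMM_LEN])
{
    assert(raw != NULL && out != NULL);

    memcpy(out, raw + offsetof(struct demo_task, comm), DEMO_COMM_LEN);
    out[DEMO_COMM_LEN - 1] = '\0';
}
```

Without the definition of demo_task, the buffer is just bytes. With it, offsetof gives the exact position of pid and comm, which is the role vmlinux.h plays for real kernel types.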

vmlinux.h diagram

Compile once, run everywhere

Since the vmlinux.h file is generated from your installed kernel, your bpf program could break if you run it, without recompiling, on another machine that is running a different kernel version. This is because the definitions of internal structs change from version to version within the Linux source code.

However, using libbpf enables something called “CO-RE”, or “Compile once, run everywhere”. Macros defined in libbpf (such as BPF_CORE_READ) record which fields you’re accessing in the types defined in your vmlinux.h. When the program is loaded, libbpf compares those accesses against the type information of the running kernel, so if a field has moved within the struct definition that kernel uses, the access is adjusted to the right offset. Therefore, it doesn’t matter if you compile your bpf program with the vmlinux.h file you generated from your own kernel and then run it on a different one, as long as the fields you use still exist there.
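The idea behind this relocation can be sketched in plain userspace C. This is an analogy only, not how libbpf is implemented: the two struct versions, the field_info table standing in for the kernel's type information, and the helper names are all hypothetical. It shows why an offset fixed at compile time breaks when a field moves, and how looking the field up by name in the target's layout fixes the access.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout the program was compiled against. */
struct task_v1 {
    int32_t pid;
    char    comm[16];
};

/* Hypothetical layout on the target machine: a new field moves pid. */
struct task_v2 {
    uint64_t flags;
    int32_t  pid;
    char     comm[16];
};

/* Minimal stand-in for published type information: names and offsets. */
struct field_info {
    const char *name;
    size_t      offset;
};

const struct field_info task_v2_layout[] = {
    { "flags", offsetof(struct task_v2, flags) },
    { "pid",   offsetof(struct task_v2, pid)   },
    { "comm",  offsetof(struct task_v2, comm)  },
};

/*
 * "Relocate" a field access: find where the named field lives in the
 * target layout. Returns 0 on success, -1 if the field does not exist.
 */
int relocate_field(const struct field_info *layout, size_t n,
                   const char *name, size_t *offset_out)
{
    assert(layout != NULL && name != NULL && offset_out != NULL);

    for (size_t i = 0; i < n; i++) {
        if (strcmp(layout[i].name, name) == 0) {
            *offset_out = layout[i].offset;
            return 0;
        }
    }
    return -1;
}

/* Read a 32-bit value at a byte offset inside an untyped buffer. */
int32_t read_i32_at(const unsigned char *raw, size_t offset)
{
    assert(raw != NULL);

    int32_t v;
    memcpy(&v, raw + offset, sizeof(v));
    return v;
}
```

Reading a task_v2 buffer at offsetof(struct task_v1, pid) returns part of flags rather than the pid, while the offset returned by relocate_field for "pid" yields the right value. In the real toolchain this bookkeeping is handled for you when you use the CO-RE macros.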

Wrap up

By generating your own vmlinux.h header containing all the Linux kernel types, you can get rid of the dependency on kernel headers when writing your eBPF programs. In my next post, I will guide you through writing a bpf program that combines the convenience of vmlinux.h, the flexibility of libbpf, and the safety of Go. Stay tuned!

This post first appeared on Grant Seltzer’s blog.

Aqua Team