Hunting Rootkits with eBPF: Detecting Linux Syscall Hooking Using Tracee
Cloud native platforms increasingly rely on eBPF-based security technology. eBPF enables monitoring and analysis of applications’ runtime behavior by creating safe hooks that trace internal functions and capture data for forensic purposes. Tracee is an open source runtime security and forensics tool for Linux that is powered by eBPF and designed for secure tracing.
In this blog, we’ll explore ways to control eBPF events and examine a case of using a BPF event to detect rootkits, a sophisticated type of malware that lives in the kernel space. We’ll also introduce Tracee’s new feature for detecting syscall hooking, which uses eBPF events in the kernel in a unique way.
eBPF: It’s not just for tracing
Extended Berkeley Packet Filter (eBPF) is a Linux kernel technology that allows programs to run inside the kernel without changing the kernel source code or loading new kernel modules. This lets eBPF hook events safely, with far less risk of crashing the kernel.
Specifically, an eBPF program attaches to kernel mechanisms such as kprobes, kretprobes, Linux Security Module (LSM) hooks, uprobes, and tracepoints to set up its hooks. Before a program is loaded, the eBPF verifier checks that it will run safely and won’t crash the kernel. Loading a kernel module to interact with the kernel offers no such check and can crash the system if not done properly.
Traditionally, eBPF has been used to trace kernel events. However, a lesser-known capability lets user space deliberately trigger and control eBPF programs, which allows safe, on-demand interaction with the Linux kernel.
Why do threat actors hook kernel functions?
These days, sophisticated attacks that use rootkits tend to target the kernel space. That’s because threat actors are trying to avoid detection by security endpoint solutions and forensic tools that monitor events in the user space or analyze basic system logs. Also, embedding malware in the kernel space makes it harder for forensic and incident response teams to find it. The lower in the stack malware lives, the more challenging it is to detect.
Below, we’ll look at an example from TeamTNT’s campaign, examine how they use the Diamorphine rootkit, and show how it can be detected with Tracee, even when the malware was installed before Tracee started running.
It's worth noting that a kernel module loaded before Tracee or any other security tool has almost complete control over the system and could therefore disrupt Tracee's normal operation.
Function manipulation in the kernel
Threat actors target kernel-level functions for their own benefit. A common method used in the wild is function hooking, which hides malicious activity by manipulating functions in the kernel. Kernel functions carry out the tasks requested from the user space, so if they’re compromised, malicious actors can control the behavior of all user space programs.
A good example of function hooking is syscall hooking, in which attackers hook system call (syscall) functions, the higher-level kernel functions that serve as entry points for user space requests. The primary goal of hooking them is to hide malicious activity. For example, attackers hook the getdents syscall to hide malicious files and processes from programs that list directory entries, such as ls, and from ps and top, which read process information from /proc.
Commonly, attackers hook syscall functions by reading the syscall table and obtaining the functions’ addresses. Once they have a syscall function’s address, they save the original address and overwrite the table entry with the address of a new function that contains malicious code.
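The save-and-overwrite pattern described above can be sketched with a toy model. This is plain Python rather than kernel code, nothing in real kernel memory is touched, and the names `syscall_table`, `hooked_getdents`, and the `secret_` prefix are illustrative assumptions, not taken from any rootkit's source.

```python
# Toy model of syscall-table hooking; no real kernel memory is touched.
HIDDEN_PREFIX = "secret_"  # hypothetical marker for entries the rootkit hides


def real_getdents(directory):
    """Stand-in for the genuine getdents handler: returns every entry."""
    return list(directory)


# The "syscall table": syscall name -> handler currently installed.
syscall_table = {"getdents": real_getdents}

# Step 1: the attacker saves the original handler.
original_getdents = syscall_table["getdents"]


def hooked_getdents(directory):
    """Malicious wrapper: call the original, then drop the hidden entries."""
    entries = original_getdents(directory)
    return [name for name in entries if not name.startswith(HIDDEN_PREFIX)]


# Step 2: the attacker overwrites the table slot with the wrapper.
syscall_table["getdents"] = hooked_getdents

# Callers still go through the table, so they silently get the filtered view.
listing = syscall_table["getdents"](["notes.txt", "secret_miner", "bin"])
print(listing)
```

Because callers only ever look up the table entry, they have no indication that the handler changed, which is why the hook has to be detected from the address stored in the table rather than from the results it returns.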
How threat actors hook kernel functions in the wild
Now let’s examine how attackers hijack kernel functions in real-world cyberattacks.
In order to hook functions, adversaries must first gain access to the object they want to hook, for example the syscall table, which holds the addresses of all system call functions. Then, they need to save the original address of the function and overwrite it. In some cases, because the target memory is write-protected, they’ll also need to disable that protection, for example by clearing the write-protect bit in a CPU control register (CR0 on x86).
A great illustration of this method is TeamTNT’s use of Diamorphine to hide cryptomining activities as part of their attack.
Detecting syscall hooking using kernel memory boundaries
Now that we’ve established the attackers’ motivation and how they can modify the kernel behavior, the question is, how can we detect this activity?
Our goal is to find a way to distinguish between the original internal functions of the kernel, including the syscalls that belong to the core kernel, and code that comes from a new kernel module, in other words, a manipulated function.
We can achieve this with the kernel core_text boundaries.
Kernel core_text boundary
The memory in the kernel is divided into several regions. One of them is core_text, which holds the original functions of the kernel. This region occupies a dedicated address range that is normally write-protected. Moreover, if we load a new kernel module (that is, write a new function or an override of an original function), the new function will be placed in a separate memory region reserved for modules.
You can see this in a virtual memory map below. Note that there’s a difference between the address range that is assigned to the original kernel code (text section, aka “core kernel text”) and the address range that is assigned to a new kernel module.
So, our current goal is to fetch a syscall address and then compare it to the kernel core_text boundaries, which, as we saw, represent the address range of the original kernel code.
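As a rough illustration of the boundary check, the sketch below parses text in the /proc/kallsyms format, takes the `_stext` and `_etext` symbols as the start and end of core kernel text, and tests whether an address falls inside that range. It is written in Python for readability and is not Tracee's implementation; every address in the sample is made up, and the helper names are our own.

```python
# Sample text in /proc/kallsyms format; every address here is made up.
SAMPLE_KALLSYMS = """\
ffffffff81000000 T _stext
ffffffff812a4c10 T __x64_sys_kill
ffffffff81e00000 T _etext
ffffffffc0a3b000 t hacked_kill\t[diamorphine]
"""


def parse_kallsyms(text):
    """Return (address, name, module) tuples; module is None for core symbols."""
    symbols = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        module = fields[3].strip("[]") if len(fields) > 3 else None
        symbols.append((int(fields[0], 16), fields[2], module))
    return symbols


def core_text_bounds(symbols):
    """Use _stext and _etext as the start and end of core kernel text."""
    by_name = {name: addr for addr, name, module in symbols if module is None}
    return by_name["_stext"], by_name["_etext"]


def in_core_text(address, bounds):
    """True when the address lies inside the half-open range [start, end)."""
    start, end = bounds
    return start <= address < end


symbols = parse_kallsyms(SAMPLE_KALLSYMS)
bounds = core_text_bounds(symbols)
for address, name, module in symbols:
    verdict = "core text" if in_core_text(address, bounds) else "outside core text"
    print(f"{address:#x} {name}: {verdict}")
```

On a live system the same parser could be pointed at the contents of /proc/kallsyms, keeping in mind that unprivileged readers usually see zeroed addresses there.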
Detecting syscall hooking with Tracee
Now that we understand how and why malware targets kernel functions and how to detect hooked kernel functions, let’s see how we can use eBPF to extract the functions’ addresses. We’ll use Tracee to determine if a function is hooked, even if the hook was placed before Tracee was executed.
We’ll create a program that triggers an event from the user space and catches it with a BPF program in the kernel space. If the kernel program needs information from the user space, we’ll pass it through a BPF map.
For example, we’ll create an event in Tracee that fetches the syscall addresses from the syscall table. Then, we’ll investigate whether a syscall is hooked by a kernel module. If it’s hooked, we’ll create a derived event (an event created in response to another event that occurs in the kernel), which will alert us about the syscall hooking.
It will look like this:
First, we’ll use the libbpfgo kallsyms helpers to obtain the syscall table address and add it to the event kernel symbols dependencies.
Note that our detect_hooked_syscalls event is a derived event. This means that after we receive the syscalls’ addresses and examine them, we’ll create a new detect_hooked_syscalls event.
Then, we pass the syscall table address, together with the numbers of the syscalls to check, into the kernel space using a BPF map.
To check those syscalls in the kernel space, we’ll create an event based on a kprobe on security_file_ioctl, an LSM hook function that the kernel calls while handling the ioctl syscall. We chose it because we can control the flow of the program by triggering the syscall with specific arguments from the user space.
As you can see, we trigger the ioctl with a specific command:
In the kernel space, we check if the ioctl command is the same and if the process that called that syscall is Tracee. This way, we can verify that the detection will happen only if the user asks Tracee to check it.
The detection part is straightforward. We iterate over the syscall map and, by using READ_KERN(), fetch the address of each syscall from the syscall table.
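The gate and the lookup loop can be modeled in a few lines. The sketch below is a pure-Python simulation, not eBPF code: the magic command value, the PID, the dictionaries standing in for the BPF map and the syscall table, and the addresses are all hypothetical. The syscall numbers are the x86_64 numbers for kill (62), getdents (78), and getdents64 (217).

```python
# Pure-Python simulation of the kernel-side gate; the constants are hypothetical.
CHECK_SYSCALLS_CMD = 0x5EC  # hypothetical magic ioctl command
TRACEE_PID = 4242           # hypothetical PID of the tracing process

# Stand-in for the BPF map filled from user space: index -> syscall number.
syscalls_to_check = {0: 62, 1: 78, 2: 217}

# Stand-in for the syscall table in kernel memory: number -> handler address.
fake_syscall_table = {
    1: 0xffffffff812b0000,
    62: 0xffffffffc0a3b000,
    78: 0xffffffffc0a3b200,
    217: 0xffffffffc0a3b400,
}


def on_ioctl(cmd, caller_pid):
    """Mimic the probe: ignore unrelated ioctl calls, otherwise read table entries."""
    if cmd != CHECK_SYSCALLS_CMD or caller_pid != TRACEE_PID:
        return None  # not our trigger, so do nothing
    # Iterate over the requested syscall numbers and fetch each handler address.
    return {nr: fake_syscall_table[nr] for nr in syscalls_to_check.values()}


print(on_ioctl(0x1234, TRACEE_PID))               # unrelated command is ignored
print(on_ioctl(CHECK_SYSCALLS_CMD, 1))            # other callers are ignored
print(on_ioctl(CHECK_SYSCALLS_CMD, TRACEE_PID))   # the real trigger
```

Checking both the command and the caller keeps the probe cheap: ordinary ioctl traffic on the system returns immediately, and the table is only read when the tracing process asks for it.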
Then, in the user space, we compare the addresses with the libbpfgo helpers:
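The user-space comparison might look roughly like the following sketch. It is Python rather than Go, it does not use the libbpfgo helpers, and the boundaries, symbol table, and addresses are hypothetical sample data. Each reported address is checked against the core text range, and any address outside it is resolved to the nearest preceding symbol and its owner.

```python
from bisect import bisect_right

# Hypothetical data: core text bounds and a sorted (address, symbol, owner) table.
STEXT, ETEXT = 0xffffffff81000000, 0xffffffff81e00000
SYMBOLS = [
    (0xffffffff812a4c10, "__x64_sys_kill", "system"),
    (0xffffffffc0a3b000, "hacked_kill", "diamorphine"),
    (0xffffffffc0a3b200, "hacked_getdents", "diamorphine"),
    (0xffffffffc0a3b400, "hacked_getdents64", "diamorphine"),
]
SYSCALL_NAMES = {62: "kill", 78: "getdents", 217: "getdents64"}


def owner_of(address):
    """Return (symbol, owner) for the closest symbol starting at or below address."""
    starts = [addr for addr, _, _ in SYMBOLS]
    index = bisect_right(starts, address) - 1
    if index < 0:
        return ("unknown", "unknown")
    _, symbol, owner = SYMBOLS[index]
    return (symbol, owner)


def find_hooked(addresses):
    """Report every syscall whose handler lies outside [STEXT, ETEXT)."""
    hooked = []
    for number, address in sorted(addresses.items()):
        if not (STEXT <= address < ETEXT):
            symbol, owner = owner_of(address)
            hooked.append((SYSCALL_NAMES.get(number, str(number)), symbol, owner))
    return hooked


# Two handlers point into module memory; getdents64 still points into core text.
reported = find_hooked({
    62: 0xffffffffc0a3b000,
    78: 0xffffffffc0a3b200,
    217: 0xffffffff81300000,
})
for syscall, symbol, owner in reported:
    print(f"{syscall} is hooked by {symbol} (owner: {owner})")
```

Resolving the owner is what turns a bare "address out of range" finding into something an analyst can act on, since it names the module that installed the hook.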
Hunting hour: Detecting Diamorphine with eBPF
Now, let’s run Tracee and see how it will detect the Diamorphine rootkit.
The first step is to load Diamorphine’s kernel object file (.ko) with the insmod command. Our goal is to find out what Tracee will detect.
Typically, in this scenario (loading a kernel module), and also when Tracee starts up, if the detect_hooked_syscall event is selected, Tracee will send a hooked_syscalls event to verify that the system isn’t compromised:
Tracee detected three hooked syscalls: getdents, getdents64, and kill. TeamTNT uses the getdents and getdents64 hooks to hide the high CPU usage that results from extensive cryptomining activity. The kill syscall is normally used to send signals, such as a termination request, from the user space to processes. However, in this case, kill -63 is used by the rootkit as a communication channel between the user space and the kernel space.
Also, if we run Diamorphine and Tracee again but with JSON output, the arguments will reveal Diamorphine’s malicious hooks:
If we run Tracee-rules, we can see the new signature of the detect_hooked_syscall event:
Modern threat actors target various layers of the operating system, including the kernel level. Moreover, offensive cyber tools are becoming more available due to popular open source projects such as Diamorphine. Therefore, security practitioners need to improve their defenses and develop suitable detection methods. We at Team Nautilus focus on raising awareness about the emerging threats in the cloud native ecosystem to help in those efforts.
Security in the kernel is a game of cat and mouse in which defenders and attackers both have almost full control over the system. Defenses can be bypassed and attacks can be detected, provided each side knows what to look for.
In the coming months, we plan to publish new blogs exploring other attack vectors in the Linux kernel space. In parallel, we’ll be releasing new relevant events in Tracee.