Intro to Fileless Malware in Containers
A fileless attack is a technique that takes incremental steps toward gaining control of your environment while remaining undetected. In a fileless attack, the malware is loaded directly into memory and executed, evading common defenses and static scanning. Attackers often also use compression or encryption to cloak the malware and avoid detection. Although fileless techniques have most commonly been used against Windows, we have recently seen a growing trend in their use against Linux and, more specifically, within containers. In this guide, we will break down a fileless attack by creating our own fileless demo and show which tools are required to detect the activity we are seeing.
Malware is malicious code intended to damage your software, steal information, or take full control of your supply chain. Malware can take on several forms: viruses, worms, Trojans, ransomware, bots, adware, fileless malware, etc., some of which are very sophisticated.
Fileless malware is an advanced kind of attack in which the malware is loaded and executed from memory rather than from the file system. It is one of the most dangerous security threats today. According to a 2017 Ponemon Institute report, fileless attacks are ten times more successful than file-based attacks. In fact, up to 77% of successful attacks can be attributed to fileless techniques or exploits. Further, a 2020 WatchGuard report noted that use of this technique has increased by nearly 900% since 2019. Fileless attacks go undetected by most antivirus software, endpoint detection and response (EDR), and traditional security tools because these usually discover compromises only through file systems, file descriptors on Unix systems, and handles on Windows. A fileless attack is executed from a memory address, making it extremely hard to collect evidence or forensic clues about what happened. For more details about this form of attack, see here.
Fileless attacks use common artifacts to hide themselves. They are often camouflaged within popular, trusted software or inject malicious code into widely used applications. Quietly hidden, they are able to launch assaults on software supply chains and spread, exploiting trusted software relationships and networks to penetrate organizations.
In the past, most successful fileless attacks occurred on Windows via hijacked artifacts such as PowerShell, Microsoft Office macros, WMI, scripting languages (such as VBScript), and popular post-exploitation tools (PowerShell Empire, PowerSploit, Metasploit, Cobalt Strike, etc.).
Today we're seeing a sharp increase in attacks on Linux as well as on containers, a technology based on the Linux kernel that uses namespaces and cgroups. Let's take a look at one of the ways in which process injection works on Linux and see how easy it is for bad actors to perform fileless attacks in containers.
Fileless Malware Attacks on Linux
Fileless malware attacks targeting Linux systems follow a series of steps, starting with the infection and ending with the execution of malicious code. From there, the attacker can compromise both the server and its data. An attack might begin in several ways (phishing email, malicious link, etc.), but the most common and easiest path to success is exploiting existing vulnerabilities. Using vulnerability scanning tools is the first step to identify and manage common vulnerabilities and exposures (CVEs), but they are not enough to stop advanced attacks that take place at runtime.
Step 1: Infection via Exploitation of a Vulnerability. First, the attack begins with the discovery of an unpatched vulnerability or weakness, such as a flaw in a network protocol. This is where tools that scan for vulnerabilities, misconfigurations, compliance issues, and secrets provide the greatest value.
Since exploitable vulnerabilities may be used as a gateway to gain access to the target system, vulnerability scanning is an important shift-left control in the software lifecycle (Figure 2). Still, a fileless attack cannot be detected with static scanning tools because it happens at runtime.
Step 2: Modification of a Linux Process. Once access is gained through vulnerability exploitation, the malicious program can employ many techniques to perform process injection, including the ptrace syscall, the LD_PRELOAD environment variable, and more.
Step 3: Insertion of Malicious Code in Memory. Using a fileless technique, it’s possible to insert malicious code into memory without writing files. For example, the memfd_create syscall creates an anonymous file descriptor whose contents can then be inserted into a running process.
Step 4: Execution of Malicious Code. Once the malicious code has run, the system is compromised. Whether the payload is malware, a crypto miner, or another malicious technique, your system is left defenseless against the theft of sensitive data, damage to servers, the encryption of critical files, and much more.
Accomplishing the Code Injection on Linux
Code injection is a common technique used by fileless attacks to modify a Linux process after gaining initial access, usually through a vulnerability or exploit. In the Linux context, a process is a running instance of a program, and its identifier is known as the PID. The code, usually an ELF binary, is then injected using the memory address of that active process. Executing code in the context of another process may require access to other resources (memory, networking, etc.). Code injection on Linux often uses syscalls such as ptrace and memfd_create, or environment variables such as LD_PRELOAD.
The ptrace system call is used by debuggers (such as gdb and dbx) on Unix systems to inspect and control the execution of an attached process, including the system calls it invokes. The main actors are the tracer, the process that takes control of the execution of another process, and the tracee, the process controlled by the tracer. Using ptrace is trickier for an attacker because attaching to a process that is not its own child typically requires privileged access.
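A quick defensive check follows from this tracer/tracee relationship: the kernel records the tracer of each process in the TracerPid field of /proc/<pid>/status. The Python sketch below lists processes that currently have a tracer attached. It is only an illustration, assuming a Linux host with /proc mounted; the function names are hypothetical and do not come from any tool mentioned in this article.

```python
import os


def parse_tracer_pid(status_text):
    """Return TracerPid from the text of /proc/<pid>/status (0 means not traced)."""
    for line in status_text.splitlines():
        if line.startswith("TracerPid:"):
            return int(line.split(":", 1)[1].strip())
    return 0


def find_traced_processes(proc_root="/proc"):
    """Map every traced PID under proc_root to the PID of its tracer."""
    traced = {}
    if not os.path.isdir(proc_root):
        return traced
    for name in os.listdir(proc_root):
        if not name.isdigit():
            continue  # skip non-process entries such as /proc/meminfo
        try:
            with open(os.path.join(proc_root, name, "status")) as handle:
                tracer = parse_tracer_pid(handle.read())
        except (OSError, ValueError):
            continue  # the process exited or its status is unreadable
        if tracer:
            traced[int(name)] = tracer
    return traced


if __name__ == "__main__":
    for pid, tracer in sorted(find_traced_processes().items()):
        print(f"PID {pid} is being traced by PID {tracer}")
```

A traced process is not necessarily under attack, since debuggers and tools such as strace use the same mechanism, so a result here is a lead to investigate rather than a verdict.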
The LD_PRELOAD environment variable instructs the dynamic linker to load the listed shared objects into a process before any other library and before its main function is invoked, which makes it one way to inject code. However, this requires the target process to be restarted. Preloading is a feature of the dynamic linker (ld.so) available on most Unix systems. It allows loading a user-defined shared library before any other libraries linked to an executable.
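Because preloading is driven by an environment variable, one simple way to look for it is to read the environment each process was started with, which Linux exposes as NUL-separated entries in /proc/<pid>/environ. The following Python sketch does that. It assumes a Linux host with /proc mounted and enough privileges to read other processes' environments; the function names and the example library paths in the usage note are hypothetical.

```python
import os


def preload_from_environ(raw):
    """Extract LD_PRELOAD entries from the raw bytes of /proc/<pid>/environ."""
    prefix = b"LD_PRELOAD="
    for entry in raw.split(b"\0"):
        if entry.startswith(prefix):
            value = entry[len(prefix):].decode(errors="replace")
            # The dynamic linker accepts both colons and spaces as separators.
            return value.replace(":", " ").split()
    return []


def find_preloaded_processes(proc_root="/proc"):
    """Map each PID started with LD_PRELOAD set to the libraries it listed."""
    found = {}
    if not os.path.isdir(proc_root):
        return found
    for name in os.listdir(proc_root):
        if not name.isdigit():
            continue  # skip non-process entries
        try:
            with open(os.path.join(proc_root, name, "environ"), "rb") as handle:
                libs = preload_from_environ(handle.read())
        except OSError:
            continue  # the process exited or its environment is unreadable
        if libs:
            found[int(name)] = libs
    return found


if __name__ == "__main__":
    for pid, libs in sorted(find_preloaded_processes().items()):
        print(f"PID {pid} was started with LD_PRELOAD={libs}")
```

For example, preload_from_environ(b"PATH=/usr/bin\0LD_PRELOAD=/tmp/hook.so\0") returns ["/tmp/hook.so"]. Note that /proc/<pid>/environ reflects the environment at process start, so this check only covers that moment.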
memfd_create is a system call added in Linux kernel 3.17 (syscall number 356 on 32-bit x86 and 319 on x86-64). It allows the creation of anonymous files that live in RAM and have volatile storage. Because of this, it can be used to load arbitrary binaries, including malware. It doesn't require superuser privileges.
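To see what an anonymous file means in practice, the harmless Python sketch below creates one, writes placeholder bytes into it, and asks the kernel how it names the descriptor. It assumes Linux with /proc mounted and Python 3.8 or later, where os.memfd_create is available; describe_memfd is an illustrative helper name, and nothing is executed.

```python
import os


def describe_memfd(name, payload):
    """Create an anonymous in-memory file and report how the kernel exposes it."""
    fd = os.memfd_create(name)  # Linux only; returns an ordinary file descriptor
    try:
        os.write(fd, payload)
        # The descriptor is visible only as a /proc link, not as a file on disk.
        link = os.readlink(f"/proc/self/fd/{fd}")
        size = os.fstat(fd).st_size
        return link, size
    finally:
        os.close(fd)  # closing the last descriptor releases the memory


if __name__ == "__main__":
    if hasattr(os, "memfd_create"):
        link, size = describe_memfd("demo", b"harmless placeholder bytes")
        print(f"{size} bytes stored at {link}")
    else:
        print("os.memfd_create is not available on this platform")
```

On a typical Linux system the link reads /memfd:demo (deleted): the data exists only in memory, and no path on any mounted file system refers to it.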
In short, code injection can happen in many ways, from syscall hooking to a zero-day vulnerability, as in the case of Log4j CVE-2021-44228, which exploits a flaw in the Log4j logging library for Java, allowing for the execution of fileless malware.
Simulating a Fileless Attack
1. Memfd_create, a native Linux syscall where everything begins.
There are many ways to execute a fileless injection. To keep things simple, we have chosen to perform an injection that involves the native capabilities of the Linux kernel using the memfd_create system call.
We can mirror the process of fileless malware by creating a program that uses the syscall memfd_create, which makes an anonymous descriptor and uses it to inject code.
memfd_create is a Linux system call that creates an anonymous file and returns a file descriptor for it. The descriptor appears under /proc/PID/fd/ and can be passed to execve to run its contents. Because the file never appears on a mounted device, in temporary file storage (tmpfs), or in shared memory (/dev/shm), it stays invisible to security tools that rely only on file system scanning. A program using this system call can be written in practically any programming language available to Linux developers (see Figure 3).
Figure 3 - Programming languages supporting the memfd_create system call (number 319)
2. A program to accomplish fileless execution.
By calling memfd_create, we obtain an anonymous descriptor that can be used to load arbitrary binaries, such as malware. The execve syscall then executes the program by pointing to the anonymous descriptor created by memfd_create in the previous step. One of the many implementations of this logic found on GitHub is memrun (Figure 4). This program, written in Go, contains all the necessary steps to perform a code injection into memory. Its main parameters are the path of the ELF binary being injected and the live process whose memory address space is being used. In the sample demo, the running process "nginx" is used to inject the "invasive binary" with the command "./memrun nginx invasive_binary".
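From the defender's side, a program executed from a memfd descriptor leaves a visible trace: its /proc/<pid>/exe link typically points to a name of the form /memfd:<name> (deleted) instead of a path on disk. The Python sketch below looks for such processes. It assumes a Linux host with /proc mounted and enough privileges to read other processes' exe links; the function names are hypothetical and this is not how Tracee or memrun work internally.

```python
import os


def is_memfd_target(link):
    """True when a /proc/<pid>/exe target refers to a memfd-backed file."""
    return link.startswith("/memfd:")


def find_memfd_processes(proc_root="/proc"):
    """Map each PID whose executable lives in an anonymous memory file to its link."""
    found = {}
    if not os.path.isdir(proc_root):
        return found
    for name in os.listdir(proc_root):
        if not name.isdigit():
            continue  # skip non-process entries
        try:
            target = os.readlink(os.path.join(proc_root, name, "exe"))
        except OSError:
            continue  # kernel thread, exited process, or insufficient permissions
        if is_memfd_target(target):
            found[int(name)] = target
    return found


if __name__ == "__main__":
    for pid, target in sorted(find_memfd_processes().items()):
        print(f"PID {pid} is running from {target}")
```

A polling check like this can miss short-lived processes, which is one reason event-based runtime tooling is better suited to this kind of detection.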
3. Adding the program to the container image.
For demo purposes, we’ve created a new image (Figure 5), adding the binary inside the official nginx image to be used when the container is running.
Now that we have simulated how fileless malware injection is accomplished, let’s move on to analyzing the code injection itself. See the note about memrun for more details.
Detecting the Attack with Tracee
Using Tracee, an open-source runtime security tool that identifies suspicious behavior, we can detect this fileless execution technique (Figure 6). Tracee analyzes events collected at the kernel level in real time using eBPF technology. In the demo (Figure 6), the main syscalls used were execve, close, openat, and memfd_create, along with other key events. Tracee uses the term “signatures” for the abstraction that analyzes these events and identifies security threats such as code injection, dynamic code loading, and fileless execution. These signatures act as behavioral indicators developed by Team Nautilus, security research experts in cloud native software. See the complete signature list here.
- Executing the demo container
- Executing Tracee
- Code injection detected by Tracee
Figure 6 - Fileless execution using the nginx process to perform code injection.
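The idea behind a behavioral signature can be sketched in a few lines of Python: watch a stream of syscall events and raise an alert when a process that previously called memfd_create goes on to execve a file descriptor path. This is only a conceptual illustration and not Tracee's actual signature API; the event dictionaries, their field names, and the function name are all hypothetical.

```python
import re

# Paths such as /proc/self/fd/3 or /proc/1234/fd/3 refer to an open descriptor.
FD_PATH = re.compile(r"^/proc/(self|\d+)/fd/\d+$")


def detect_fileless_exec(events):
    """Return alerts for execve calls on descriptor paths after memfd_create."""
    memfd_pids = set()  # processes that have created an anonymous memory file
    alerts = []
    for event in events:
        pid, syscall = event["pid"], event["syscall"]
        if syscall == "memfd_create":
            memfd_pids.add(pid)
        elif syscall == "execve":
            path = event.get("pathname", "")
            if pid in memfd_pids and FD_PATH.match(path):
                alerts.append({"pid": pid, "pathname": path})
    return alerts


if __name__ == "__main__":
    # Hypothetical event stream; real tools collect these from the kernel.
    sample = [
        {"pid": 101, "syscall": "openat", "pathname": "/etc/nginx/nginx.conf"},
        {"pid": 202, "syscall": "memfd_create", "name": "demo"},
        {"pid": 202, "syscall": "execve", "pathname": "/proc/self/fd/3"},
        {"pid": 101, "syscall": "execve", "pathname": "/usr/sbin/nginx"},
    ]
    for alert in detect_fileless_exec(sample):
        print(alert)
```

Correlating two events per process, rather than matching on a single syscall, keeps ordinary uses of memfd_create from triggering an alert on their own.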
View the complete instructions to reproduce the demo yourself in this repository:
Detecting the Attack with Cloud Native Detection & Response
We recommend using Cloud Native Detection and Response (CNDR) to detect fileless malware attacks and benefit from an improved UI over our open source project Tracee. CNDR is a part of Aqua Platform’s runtime capabilities and is built on top of Tracee, with a larger database of behavioral indicators, a comprehensive, easy-to-use interface, and enterprise-level support. For example, Tracee ships with 10+ default rules, whereas CNDR provides 100+ security signatures. These behavioral indicators are written by Aqua Nautilus Security Research based on actual cloud native attacks observed in the real world.
Scanning your artifacts such as code, container images, Kubernetes manifests, and infrastructure as code is the first step to avoid misconfigurations, hardcoded secrets, and vulnerabilities. This limits attackers' ability to gain an early foothold. Remember that a new threat to your system could be discovered at any moment, and that advanced, hard-to-detect techniques such as fileless malware may already be present in your environment. Thinking about the next Log4j or Spring4Shell zero-day vulnerability? You need a way to detect sophisticated threats that can gain a foothold in your environment and evade detection.
Can you currently detect these threats? Pay attention to the Real-time Malware Protection control in the demo.
Figure 7 - Cloud Native Detection and Response (CNDR) detecting fileless malware
Related resource: View a demo video that shows the difference between detection and basic risk posture for your cloud environment (CSPM).
Fileless malware is a powerful attack technique that has grown in prominence because it’s incredibly difficult to detect and can be cleverly hidden from security tools. It’s vital to use security tools that help you discover and respond to advanced attacks like fileless malware by identifying suspicious events at runtime in cloud native environments. Threats, including zero-days, are constantly evolving and being discovered. That's why it's important to stay up to date with the malicious-behavior signatures defined by security research experts, which help you prioritize your real risks.
Trivy is your Swiss army knife for security scanning of vulnerabilities, misconfigurations, and secrets; you can get more details about it here. Tracee is a runtime security and forensics tool for Linux that, together with Aqua, gives you detection and response capabilities in a unique platform integrated into all phases of your application, from code to runtime. Get more details on how Aqua delivers a complete cloud native security solution here.