CVE-2016-9962: Run Container Run
RunC Like the Wind
Recently, an interesting vulnerability was discovered (CVE-2016-9962) that enables container escape to the host. The vulnerability stems from a bug found in opencontainers' runc code, which is used by several container engines, including Docker.
The vulnerability is exploited when exec-ing a command inside an already running container. When that happens, a malicious process inside the container can access a “forgotten” file descriptor of a directory that resides on the host. This in turn can be used to perform directory traversal to the host's file system, thus facilitating a nasty and easy escape.
Your FD is Showing
The issue of an open file descriptor is part of a broader problem regarding exec-ing commands inside a running container. There is a (very) small “window” of opportunity, before the runc init process execs the command inside the container, during which the container has access to the runc init process that came from the host. This is because runc enters the namespaces of the container before it execs the final command. This window could enable a container, for example, to list the file descriptors of that host process, which can then lead it to the host’s file system. Because many containers run as root, this has serious implications.
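To make the "list file descriptors" step concrete, here is a minimal Python sketch of how any Linux process can enumerate the descriptors of another process through /proc and pick out the ones that refer to directories. It is a diagnostic illustration, not the exploit itself: the function name directory_fds and the proc_root parameter are hypothetical, and whether the listing succeeds for a given PID depends on the kernel's access checks for that process.

```python
import os
import stat


def directory_fds(pid, proc_root="/proc"):
    """Return (fd, target) pairs for descriptors of `pid` that refer to directories."""
    fd_dir = os.path.join(proc_root, str(pid), "fd")
    found = []
    for name in sorted(os.listdir(fd_dir), key=int):
        link = os.path.join(fd_dir, name)
        try:
            # os.stat() follows the link, so it describes the object the
            # descriptor points at rather than the link itself.
            if stat.S_ISDIR(os.stat(link).st_mode):
                found.append((int(name), os.readlink(link)))
        except OSError:
            # The descriptor was closed in the meantime, or access was denied.
            continue
    return found
```

Calling directory_fds(os.getpid()) on a Linux host lists the directories the current process holds open. A directory descriptor is interesting because paths can be resolved relative to it, which is what makes a leaked one dangerous.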
Let’s imagine for a minute that you are debugging a container in your environment. It has been “acting funny” for a while, and you decide to run a shell inside the container to better understand the issue. If that container is malicious, the act of running that shell could enable it to escape to the host. After a successful escape, the container can return to “normal” behavior, and you can go back to doing whatever it is you were doing, feeling proud, but oblivious to the exploit that just took place.
How to Reproduce
The discoverer of this vulnerability (who also authored the runc bug fixes) was kind enough to supply a patch that enables easy demonstration of the exploit. It simply widens the window of opportunity by inserting a sleep instruction before the exec call, which is the call that overwrites the runc init process with the requested command:
You can also find a simple manual on how to reproduce this vulnerability, using this patch.
After a few iterations of patches, the final fix was released. The fix makes sure that no host file descriptors are present in the runc init process. Additionally, it protects the runc init process from processes inside the container by marking the process as non-dumpable before it calls setns to enter the container. This somewhat protects the runc init process from being exploited by the container. For example, a non-dumpable process cannot be manipulated with ptrace.
Reports About ptrace are Greatly Exaggerated
The talk of non-dumpable processes and ptrace surrounding this vulnerability created the impression that it can only be exploited in a container that has the CAP_SYS_PTRACE capability. This is not the case. While it may be easier for a container with the CAP_SYS_PTRACE capability to access the file descriptors, it can be done without it. Surprising? Well, it shouldn’t be. Just a couple of paragraphs ago I linked to a reproducer patch that proves just that. It showed how one can escape a container without adding any capabilities to the container, by simply patching runc to sleep before calling exec. Therefore, an exploit just needs to be timed correctly, which, while not easy to do, doesn’t require control of the runc init process.
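If you want to see whether a given process actually holds CAP_SYS_PTRACE, its effective capability mask is exposed as the CapEff line of /proc/<pid>/status. Here is a minimal Python sketch of that check; the function names are hypothetical. Keep the point above in mind when reading the result: a False answer does not mean the container was safe from this vulnerability.

```python
CAP_SYS_PTRACE = 19  # capability number from <linux/capability.h>


def effective_caps(status_text):
    """Extract the effective capability bitmask from /proc/<pid>/status text."""
    for line in status_text.splitlines():
        if line.startswith("CapEff:"):
            # The value is a hexadecimal bitmask, one bit per capability.
            return int(line.split(":", 1)[1].strip(), 16)
    raise ValueError("no CapEff line found")


def has_capability(status_text, cap):
    """Return True if bit number `cap` is set in the effective mask."""
    return bool((effective_caps(status_text) >> cap) & 1)
```

Inside a container, something like has_capability(open("/proc/self/status").read(), CAP_SYS_PTRACE) reports on the calling process itself.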
Let’s Fix This
Make sure that your engine includes a patch. For Docker, that means running version 1.12.6 or higher. If you're not sure of the version, you should have a runtime container monitoring tool that can detect vulnerable versions and alert on them or block them. Aqua can detect and block container breakout attempts to the host by identifying unauthorized resource access from the container to the host - this is what the event looks like: