The Ticking Supply Chain Attack Bomb of Exposed Kubernetes Secrets
Exposed Kubernetes secrets pose a critical threat of supply chain attack. Aqua Nautilus researchers found that the exposed Kubernetes secrets of hundreds of organizations and open-source projects allow access to sensitive environments in the Software Development Life Cycle (SDLC) and open a severe supply chain attack threat. Among the affected organizations were SAP, whose artifacts management system holds over 95 million artifacts, two top blockchain companies, and various other Fortune 500 companies. These encoded Kubernetes configuration secrets were uploaded to public repositories. In this blog we explore the inherent risks of mismanaged Kubernetes Secrets, the inefficacy of common secrets scanners in detecting such vulnerabilities, the reality in the wild, and the possible impact of this exposure.
Kubernetes secrets configuration
On the official Kubernetes website, Kubernetes.io, there is a comprehensive section dedicated to the configuration of Secrets within Kubernetes. It explicitly mentions that "Kubernetes Secrets are, by default, stored unencrypted in the API server's underlying datastore (etcd)." Users can create a YAML file for Secrets using kubectl, or they can manually craft a file and upload it to the cluster.
As outlined in Table 1 below, Kubernetes supports eight built-in types of Secrets. Our research primarily concentrated on two types: dockercfg and dockerconfigjson. Although all Secrets types inherently contain sensitive information, the potential for exploitation varies. Secrets such as basic-auth, tls, and ssh-auth may pertain to external services linked to the cluster. However, exploiting these requires discovering the IP address or URL associated with the cluster, which can be challenging and resource-intensive. The service-account-token is more relevant to internal services and can be a valuable asset for post-exploitation and lateral movement within a network, yet it is more challenging to exploit from an external standpoint. Regarding the bootstrap.kubernetes.io/token, we conducted an in-depth investigation but did not uncover anything noteworthy. Lastly, the Opaque type is too generic to pinpoint a range of exploitative scenarios, making it a potential subject for future research.
|1||Opaque||arbitrary user-defined data|
|2||kubernetes.io/service-account-token||ServiceAccount token|
|3||kubernetes.io/dockercfg||serialized ~/.dockercfg file|
|4||kubernetes.io/dockerconfigjson||serialized ~/.docker/config.json file|
|5||kubernetes.io/basic-auth||credentials for basic authentication|
|6||kubernetes.io/ssh-auth||credentials for SSH authentication|
|7||kubernetes.io/tls||data for a TLS client or server|
|8||bootstrap.kubernetes.io/token||bootstrap token data|
Table 1: Eight built-in types of Secrets as they appear in Kubernetes.io
As previously noted, our research ultimately centered on the dockercfg and dockerconfigjson types of Secrets, as these are specifically intended to store credentials for access to external registries. We have demonstrated in past studies that registries can hold the keys to the kingdom, representing a significant risk with potentially extensive impact.
Misplacing Kubernetes secrets configuration files on GitHub
In our research, we used the GitHub API, splitting our queries recursively to work around its limit of 1,000 results per search. We searched for various keys and used complex regular expressions to narrow our scope to YAML files containing dockercfg or dockerconfigjson with base64-encoded secrets.
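The exact queries used in the research are not published here, so the following is only a minimal sketch of one common way to get past a per-query result cap: recursively bisect a search dimension until every slice returns fewer results than the cap. The file-size dimension, the `size:lo..hi` qualifier, and the names `split_until_under_cap`, `build_query`, and `fake_count` are assumptions made for illustration; `fake_count` stands in for a real API call.

```python
from typing import Callable, List, Tuple

SEARCH_CAP = 1000  # the per-search result limit mentioned in the text


def split_until_under_cap(
    count_results: Callable[[int, int], int],
    lo: int,
    hi: int,
    cap: int = SEARCH_CAP,
) -> List[Tuple[int, int]]:
    """Bisect the file-size range [lo, hi] until each slice fits under the cap.

    A slice of a single size (lo == hi) cannot be split further and is
    returned as-is, even if it still exceeds the cap.
    """
    if lo == hi or count_results(lo, hi) <= cap:
        return [(lo, hi)]
    mid = (lo + hi) // 2
    return (
        split_until_under_cap(count_results, lo, mid, cap)
        + split_until_under_cap(count_results, mid + 1, hi, cap)
    )


def build_query(term: str, lo: int, hi: int) -> str:
    """Compose a search string restricted to files of lo..hi bytes (assumed syntax)."""
    return f"{term} size:{lo}..{hi}"


def fake_count(lo: int, hi: int) -> int:
    # Hypothetical stand-in for an API call: 10 matching files per byte size.
    return 10 * (hi - lo + 1)


ranges = split_until_under_cap(fake_count, 0, 399)
queries = [build_query("dockerconfigjson", lo, hi) for lo, hi in ranges]
print(queries)
```

Passing the counting function in as a parameter keeps the splitting logic independent of any particular API client, so it can be tried out offline before being pointed at a real search endpoint.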
We uncovered hundreds of instances in public repositories, which underscored the severity of the issue, affecting private individuals, open-source projects, and large organizations alike. This raises the question: Why would anyone knowingly upload these secrets to GitHub?
There are several legitimate reasons, such as uploading Kubernetes YAML files for version control, sharing templates or examples, and managing public configurations. We observed plenty of such instances where, in most cases, practitioners responsibly omitted secrets from documents publicly exposed on GitHub.
However, we also encountered instances where, due to misunderstanding or error, practitioners inadvertently uploaded secrets to publicly accessible GitHub repositories. Given the significant number of encoded secrets compared to plaintext ones, we suspect that some practitioners mistakenly upload encoded secrets in the belief that they are secure or not easily decodable. They fail to recognize that, from a security standpoint, encoding is tantamount to plaintext.
Our research findings
We conducted a search using GitHub's API to retrieve all entries containing .dockerconfigjson and .dockercfg. The initial query yielded over 8,000 results, prompting us to refine our search to include only those records that contained user and password values encoded in base64. This refinement led us to 438 records that potentially held valid credentials for registries. Out of these, 203 records, approximately 46%, contained valid credentials that provided access to the respective registries. In the majority of cases, these credentials allowed for both pulling and pushing privileges. Moreover, we often discovered private container images within most of these registries. We informed the relevant stakeholders about the exposed secrets and steps they should take to remediate the risk.
The dockerconfigjson field, as shown in Figure 1 below, is a type of Secret in Kubernetes designed to store credentials for Docker registry access. This file includes the necessary authentication data, such as tokens or credentials, enabling Docker to pull images from or push images to a registry. Within a Kubernetes environment, when a Pod needs to pull a private image from a Docker registry—like Docker Hub, Google Container Registry, or Quay.io—authentication details must be supplied so that Kubernetes can retrieve the image.
Figure 1: An example of an exposed YAML file
When you decode the base64-encoded data/secret value, you obtain the JSON structure as shown in Figure 2 below.
Figure 2: The encoded secrets in the ‘dockerconfigjson’ field.
During our research, we found that practitioners sometimes neglect to remove secrets from the files they commit to public repositories on GitHub. Consequently, this sensitive information is left exposed, merely a single base64 decode command away from being revealed as plaintext secrets.
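To make the "single decode away" point concrete, here is a minimal sketch of what recovering credentials from a `.dockerconfigjson` value involves. The registry name and credentials are hypothetical and are generated inside the example so that it is self-contained; the helper name `decode_dockerconfigjson` is likewise an illustration, not a tool used in the research.

```python
import base64
import json

# Hypothetical credentials, built here only so the example runs on its own.
docker_config = {
    "auths": {
        "registry.example.com": {
            "username": "demo-user",
            "password": "demo-pass",
            "auth": base64.b64encode(b"demo-user:demo-pass").decode(),
        }
    }
}
# This is the kind of value that sits under `.dockerconfigjson` in the YAML file.
encoded_value = base64.b64encode(json.dumps(docker_config).encode()).decode()


def decode_dockerconfigjson(value: str) -> dict:
    """Return {registry: (username, password)} from a base64-encoded config value."""
    config = json.loads(base64.b64decode(value))
    recovered = {}
    for registry, entry in config.get("auths", {}).items():
        if "auth" in entry:
            # The `auth` field is itself base64 of "username:password".
            decoded = base64.b64decode(entry["auth"]).decode()
            user, _, password = decoded.partition(":")
        else:
            user, password = entry.get("username", ""), entry.get("password", "")
        recovered[registry] = (user, password)
    return recovered


print(decode_dockerconfigjson(encoded_value))
```

No key or passphrase is involved at any step, which is why an encoded secret should be treated exactly like a plaintext one.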
In discussions with some of these practitioners, explanations varied: some attributed the oversight to human error, others to shadow IT practices, and some to flaws in their systems or processes.
As depicted in Figure 3 below, a significant portion of these projects are relatively new and actively maintained: more than 67% were created, and over 72% received updates, within the past three years.
Figure 3: Secrets file created and secrets file last updated (Year)
From Table 2 below, we observe that the majority of the discovered registries are hosted on private domains or directly via IP addresses. This trend prompts an intriguing question: Is it generally preferred to create and manage a private registry under an organization's domain or network? Alternatively, this data might suggest that privately maintained registries are more susceptible to credential leaks.
It's also important to highlight the findings related to AWS (row 16) and GCR (Google Container Registry, row 11) container registries. In all the instances we examined, the credentials were temporary and had, in fact, expired, rendering access to the registry impossible. This reflects a sound security practice that could be emulated by other registry service providers.
Furthermore, in numerous instances, GitHub Container Registry (row 5) required two-factor authentication (2FA), which blocked further unauthorized access. This is yet another exemplary security measure. While 2FA may not always be compatible with the applicative use of credentials, it is highly suitable for credentials issued to end-users—and many of the cases reported in this study were of this nature. Implementing 2FA can significantly enhance security for these types of credentials.
|#||Registry||Counter||Valid Creds (#)||Valid Creds (%)|
Table 2: Exposed registries analysis
In our assessment of the strength of the credentials in use, we analyzed 438 passwords and found that approximately 21.2% (93 passwords) seemed to be manually set by individuals, as opposed to the 345 that were generated by computers. We used the PESrank model to calculate the strength of each password. Among the manually set passwords, 43 were deemed weak and could be easily compromised by attackers. Alarmingly, we identified commonly known weak passwords such as password, test123456, windows12, ChangeMe, dockerhub, and others in active use. This underscores the critical need for organizational password policies that enforce strict password creation rules to prevent the use of such vulnerable passwords.
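As an illustration of what a password creation rule can look like in practice, the sketch below rejects blocklisted, short, or low-variety passwords. It does not reproduce the PESrank model used in the research; the thresholds, the function name `violates_policy`, and the choice of rules are assumptions, and the blocklist contains only the weak passwords named above.

```python
# Weak passwords named in this study, lower-cased for comparison.
KNOWN_WEAK = {"password", "test123456", "windows12", "changeme", "dockerhub"}


def violates_policy(password: str, min_length: int = 12) -> list:
    """Return the reasons a password fails this illustrative policy (empty if none)."""
    reasons = []
    if password.lower() in KNOWN_WEAK:
        reasons.append("appears on the weak-password blocklist")
    if len(password) < min_length:
        reasons.append(f"shorter than {min_length} characters")
    character_classes = [
        any(c.islower() for c in password),
        any(c.isupper() for c in password),
        any(c.isdigit() for c in password),
        any(not c.isalnum() for c in password),
    ]
    if sum(character_classes) < 3:
        reasons.append("uses fewer than three character classes")
    return reasons


print(violates_policy("ChangeMe"))
print(violates_policy("Tr4ck-Lantern-92x"))
```

A check like this is only a floor; registry credentials used by automation are better generated by a machine than chosen by a person.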
We also reviewed the creation dates of the files containing these secrets, which ranged from as far back as five years ago to more recent times. This highlights the necessity for vigilant IT department oversight. Many organizations implement policies requiring passwords to be changed periodically, a practice our findings support as valuable. In some instances, we encountered passwords that had already been invalidated, suggesting that organizations had taken timely measures to secure their registries even after potential exposure.
Additionally, our study revealed shortcomings in the performance of secrets scanners. Each scanner we tested failed to detect these leaks, indicating that they primarily search for plaintext passwords and tokens, thereby overlooking encoded secrets. This gap may be due to a lack of recognition of the risks posed by encoded secrets or an underestimation of how easily encoded text can be decoded, which poses risks equivalent to plaintext secrets. There is a clear need for open-source tools and secrets scanners to improve their detection capabilities to include encoded secrets. We elaborate on this issue below.
Use cases worth mentioning
Among the 203 registries with valid credentials that we examined, we identified multiple cases that starkly illustrate the risks posed by an exposed registry to an organization or an open-source project. In this section, we will explore selected findings to emphasize the potential repercussions and gravity of these security lapses.
A closer review of Table 2 above reveals that the majority of valid credentials pertained to Red Hat, Quay, and Docker Hub. Consequently, we concentrated our research efforts on these registries to gather more in-depth information.
Use Case #1: SAP SE artifacts repository
We discovered valid credentials for the Artifacts repository of SAP SE. These credentials provided access to more than 95 million artifacts, along with permissions for download and limited deploy operations. The exposure of this Artifacts repository key represented a considerable security risk. The potential threats stemming from such access included the leakage of proprietary code, data breaches, and the risk of supply chain attacks, all of which could compromise the integrity of the organization and the security of its customers. We immediately reported this issue to the SAP security team, which responded in the most professional and efficient manner. They promptly closed the exposure, conducted an investigation and maintained communication with us.
Figure 5: The leaked YAML file that disclosed SAP’s secret
Use Case #2: Blockchain companies
We found secrets to the registries of two top-tier blockchain companies, which allowed pull and push privileges to these organizations’ registries. Some of the container images involved have gained millions of pulls and affect highly popular projects and cryptocurrencies. We reported these issues to the security teams of these organizations.
Use Case #3: Docker Hub accounts
Out of the 94 Docker Hub credentials we uncovered, 64 (equivalent to 68%) were still valid and granted full access to the Docker Hub accounts. These credentials were associated with 2,948 unique container images, which together accounted for a staggering total of 46 million image pulls. Alarmingly, 768 of these container images, representing 26%, were designated as private, implying that they should not have been accessible to unauthorized external parties.
|Docker Hub ID #||Total pull count||# of container images|
Table 3: Top 10 accounts with aggregative pull count and number of container images
Why can't my secrets scanner find these tokens?
At this juncture, one might argue that secrets scanners could be employed to detect these secrets. However, it is surprising to note that most scanners actually fail to identify such secrets, likely for the same reasons that practitioners overlook them. It appears that scanners are not configured to detect base64 encoded secrets, or at least that is our presumption.
Secrets scanning tools differ inherently from one another. Some are slow, while others need considerable preparation and configuration to detect secrets, causing overhead for the end user. The biggest problem with some tools is that they generate a lot of false positives. Most secrets scanning tools don’t look for encoded (base64) secrets; we speculate that this is because they wish to minimize the false positive rate.
In this research we focused on 3 open-source secrets scanners: Gitleaks, TruffleHog, and Trivy.
We scanned a target repository with each tool, using its default password detection configuration.
In the target repository, we inserted some YAML files with various samples of the more popular patterns of leaks we found in our research. As illustrated in Figures 7 to 9 below, none of the tools could detect these secrets with the default settings.
We then used Trivy’s feature that allows users to create custom rules, and wrote a simple rule that detects these exposed encoded Kubernetes secrets. You can find this rule here.
As illustrated in Figure 11 below, when running Trivy with that custom rule, it can detect the exposed encoded Kubernetes tokens.
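The custom Trivy rule itself is not reproduced in this post. The Python sketch below only illustrates the general detection idea: match the `.dockerconfigjson` or `.dockercfg` key followed by a base64 blob, then confirm that the blob decodes to JSON carrying credentials. The regular expression, the function name `find_encoded_registry_secrets`, and the sample manifest are assumptions for illustration, not the rule used in the research.

```python
import base64
import binascii
import json
import re

# Matches `.dockerconfigjson: <base64>` or `.dockercfg: <base64>` in YAML text.
SECRET_FIELD = re.compile(
    r"\.(dockerconfigjson|dockercfg)\s*:\s*[\"']?([A-Za-z0-9+/]{20,}={0,2})"
)


def find_encoded_registry_secrets(yaml_text: str) -> list:
    """Return (field, registry) pairs whose decoded payload holds credentials."""
    findings = []
    for field, blob in SECRET_FIELD.findall(yaml_text):
        try:
            payload = json.loads(base64.b64decode(blob, validate=True))
        except (binascii.Error, ValueError):
            continue  # not valid base64 or not JSON, so likely a placeholder
        if not isinstance(payload, dict):
            continue
        # dockerconfigjson nests registries under "auths"; the legacy
        # dockercfg format lists them at the top level.
        registries = payload.get("auths", payload)
        if not isinstance(registries, dict):
            continue
        for registry, entry in registries.items():
            if isinstance(entry, dict) and ("auth" in entry or "password" in entry):
                findings.append((field, registry))
    return findings


# Hypothetical manifest with made-up credentials, generated for the example.
sample_config = {
    "auths": {
        "registry.example.com": {
            "auth": base64.b64encode(b"demo-user:demo-pass").decode()
        }
    }
}
sample_manifest = (
    "apiVersion: v1\n"
    "kind: Secret\n"
    "metadata:\n"
    "  name: regcred\n"
    "type: kubernetes.io/dockerconfigjson\n"
    "data:\n"
    "  .dockerconfigjson: "
    + base64.b64encode(json.dumps(sample_config).encode()).decode()
    + "\n"
)

print(find_encoded_registry_secrets(sample_manifest))
```

Decoding before reporting is one way to keep false positives down: template files with placeholder values fail the base64 or JSON step and are skipped.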
Summary and mitigation
In our research we searched GitHub for configuration files containing .dockerconfigjson and .dockercfg, revealing a troubling number of public repositories inadvertently exposing base64-encoded secrets. Despite the common use of secrets scanners, our analysis showed that the ones we tested failed to detect these encoded secrets with their basic scanning configuration. We later configured Trivy to detect these secrets.
The implications of these findings are profound, affecting not only individual developers but also large organizations, as evidenced by our discovery of valid credentials for a Fortune 500 company's container image registries and artifact repositories. The potential for data breaches, loss of proprietary code, and supply chain attacks is a stark reminder of the need for stringent security practices.
Mitigation for configuration files on GitHub
A Kubernetes secrets YAML file in your GitHub repository, especially one containing .dockerconfigjson, .dockercfg and so on, is not secure. Such a file contains encoded (not encrypted) credentials for Docker registry access, which can be easily decoded and misused if exposed.
Best practice seen during our research:
- Expiration dates on GCP and AWS keys: GCP and AWS are good examples. Their secrets and tokens were found exposed in public repositories but were no longer usable, because the time between the exposure and our research exceeded the expiration date.
- Encrypting data: In some cases, the keys were encrypted, so nothing could be done with them.
- Least-privilege philosophy: In some cases, the key was valid but had minimal privileges, often just to pull or download a specific artifact or image. In that case, an attacker needs to invest a lot of effort to gain something from the key, and in most cases that effort will be in vain. For instance, many keys weren’t able to list the items in the repository.
- Two-factor authentication for human users: While this suggestion won’t apply to applicative keys or secrets, it can help when a key is issued to a human user who accidentally misplaces it.
To secure this:
- Remove files containing sensitive information from GitHub: Immediately remove any Kubernetes secret YAML file, especially one containing .dockerconfigjson, .dockercfg and so on, from your publicly exposed SCM repositories. Make sure to remove it from the commit history as well, as sensitive data can still be accessed from old commits.
- Use a Secrets Management Tool: Store such secrets in a secure secrets management tool like HashiCorp Vault, AWS Secrets Manager, or Azure Key Vault. Integrate these with Kubernetes to inject secrets into your deployments without exposing them in your source code.
- Use Environment Variables: When deploying applications, use environment variables to pass secrets. This method keeps the secrets out of the source code.
- Encrypt Data at Rest: Ensure that your Kubernetes setup encrypts secrets at rest. This way, even if someone gains access to your data storage, they cannot read the secrets without the decryption keys.
- Audit and Rotate Secrets: Regularly audit your secrets for exposure risks and rotate them frequently to minimize the impact of any potential leaks.
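As a minimal sketch of the environment-variable approach from the list above, the registry credentials can be read at deploy time and assembled into the config payload in memory, so that no encoded secret ever needs to be committed. The variable names `REGISTRY_HOST`, `REGISTRY_USER`, and `REGISTRY_PASSWORD` and the function name `dockerconfig_from_env` are hypothetical; in practice the values would be supplied by a CI system or a secrets management tool.

```python
import base64
import json
import os


def dockerconfig_from_env(environ=os.environ) -> str:
    """Build the Docker config JSON from environment variables at deploy time."""
    try:
        registry = environ["REGISTRY_HOST"]
        user = environ["REGISTRY_USER"]
        password = environ["REGISTRY_PASSWORD"]
    except KeyError as missing:
        # Fail loudly rather than falling back to a hard-coded default.
        raise RuntimeError(f"missing environment variable: {missing}") from None
    auth = base64.b64encode(f"{user}:{password}".encode()).decode()
    return json.dumps({"auths": {registry: {"auth": auth}}})


# Example with made-up values passed explicitly instead of the real environment.
demo_env = {
    "REGISTRY_HOST": "registry.example.com",
    "REGISTRY_USER": "demo-user",
    "REGISTRY_PASSWORD": "demo-pass",
}
payload = dockerconfig_from_env(demo_env)
```

The resulting payload is meant to be handed to the cluster during deployment and never written to a tracked file.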