With the increasing complexity of environments and workloads of modern cloud-native applications, having the means to quickly spot misconfigurations and prevent them from hitting production is a key component to help teams move fast and with confidence that they aren’t violating any security policy.
To achieve this, companies must have well-defined policies and enforcement mechanisms in place. In this article, we show how Policy as Code helps enforce security policies and how simple policy checks can increase security by preventing Kubernetes deployment misconfigurations.
Policy as Code is a method to define and manage policy rules through code. In this approach, policies are decoupled from the application and infrastructure layers [2] and are defined and managed in a declarative way (specifying the result and not how to reach the result), analogous to Infrastructure as Code.
Using Policy as Code has several benefits:
Policies can be stored in a version control system (e.g., Git)
The same process used during a software development lifecycle can be applied: code review, unit tests, etc.
GitOps can be used to apply the policies
Policies can run at any level (application, infrastructure, and quality checks)
Policy as Code extends the shift-left model, which is the practice of moving tests, quality checks, and performance evaluation earlier in the development process [1]. The idea is to add compliance and security checks so that issues are caught early [2,3].
A typical Policy as Code workflow is depicted in Figure 1, where a policy rule is created and pushed to a version control system (Git) and then validated by a Policy validation engine like Open Policy Agent (OPA)/Gatekeeper [4,5], which uses the Rego language [6]. When developers push new code to be deployed, it is checked against all policies, and if it meets all defined rules, it gets deployed to the cloud.
Figure 1: Policy as Code workflow diagram (Yashpal Matharu, CC BY-SA 4.0).
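The gate at the end of this workflow can be modeled in a few lines. The following is a minimal Python sketch, not how OPA/Gatekeeper is implemented: it assumes a policy is a plain function that returns violation messages, and the `require_owner_label` policy, the `evaluate` helper, and the `can_deploy` helper are hypothetical names introduced only for illustration.

```python
from typing import Any, Callable, Dict, List

# Assumption: a policy is a function that takes a manifest (a dict) and
# returns a list of violation messages. An empty list means "passes".
Policy = Callable[[Dict[str, Any]], List[str]]


def require_owner_label(manifest: Dict[str, Any]) -> List[str]:
    """Hypothetical policy: every workload must carry an 'owner' label."""
    labels = (manifest.get("metadata") or {}).get("labels") or {}
    if "owner" not in labels:
        return ["missing required label: owner"]
    return []


def evaluate(manifest: Dict[str, Any], policies: List[Policy]) -> List[str]:
    """Check the manifest against every policy and collect all messages."""
    violations: List[str] = []
    for policy in policies:
        violations.extend(policy(manifest))
    return violations


def can_deploy(manifest: Dict[str, Any], policies: List[Policy]) -> bool:
    """The deployment gate: deploy only when no policy is violated."""
    return not evaluate(manifest, policies)


labeled = {"metadata": {"name": "api", "labels": {"owner": "team-a"}}}
unlabeled = {"metadata": {"name": "api"}}

print(can_deploy(labeled, [require_owner_label]))
print(can_deploy(unlabeled, [require_owner_label]))
```

Because the policies are ordinary code kept next to the manifests, they can be reviewed and unit tested like any other change before the gate is trusted in a pipeline.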
The 2023 Kubernetes Benchmark report [9] evaluated 150,000 workloads and found that the number of workloads with security misconfigurations has increased since 2022. According to the report, in 33% of organizations more than 90% of workloads run with insecure capabilities. This is a large and unexpected change from the 2022 report, where 24% of organizations were impacted by this issue.
This is only one of many misconfiguration issues presented in the report, and one that can be easily mitigated by adopting Policy as Code. Other Policy as Code use cases are:
General Infrastructure as Code Policies
Authorization and Access Control
Government Legislation and Regulation
Let's go over some of these security issues and see how they can be solved by applying simple and effective rules.
Regarding security, one of the most common mistakes is not setting readOnlyRootFilesystem to true [7]. Kubernetes defaults this setting to false. Setting it to true prevents an attacker, for example, from writing executables to disk.
The report states that 56% of organizations are unaware of this issue in their workloads [9]. This issue was seen in only 23% of the workloads in 2022.
An example rule that avoids this misconfiguration is presented in Figure 2. Using the Rego language [6], it is possible to define a violation rule that will:
Receive (via the input_containers method) a container manifest document to be tested
Check if the readOnlyRootFilesystem is not set to true in PodSecurityPolicy [8]
Assign the reason for failure to the msg variable if all previous statements are true
Note the way the Rego language works: it executes each line inside the violation rule as an AND statement and stops executing when the first statement returns false.
Figure 2: Example of a read-only root filesystem rule using the Rego language.
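For readers unfamiliar with Rego, the three steps of the rule can be modeled in Python. This is a sketch of the logic only, not the Rego code from the figure: it assumes the input is a plain Pod manifest with its containers under `spec.containers`, and the function `read_only_root_fs_violations` and the message wording are illustrative choices.

```python
from typing import Any, Dict, List


def input_containers(manifest: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return the containers to test (assumed to live under spec.containers)."""
    return (manifest.get("spec") or {}).get("containers") or []


def read_only_root_fs_violations(manifest: Dict[str, Any]) -> List[str]:
    """Collect one message per container whose root filesystem is writable."""
    violations: List[str] = []
    for container in input_containers(manifest):
        security_context = container.get("securityContext") or {}
        # "Not set to true" covers both a missing field and an explicit false.
        if security_context.get("readOnlyRootFilesystem") is True:
            continue  # a condition failed, so no violation for this container
        # Every condition held: record the reason for the failure.
        msg = f"container {container.get('name')} must set readOnlyRootFilesystem to true"
        violations.append(msg)
    return violations


pod = {
    "spec": {
        "containers": [
            {"name": "web", "securityContext": {"readOnlyRootFilesystem": True}},
            {"name": "worker", "securityContext": {"readOnlyRootFilesystem": False}},
            {"name": "sidecar"},
        ]
    }
}

for message in read_only_root_fs_violations(pod):
    print(message)
```

In this example only the web container passes. The worker container sets the field to false and the sidecar container omits it, so each of them produces a message.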
Another setting with an insecure default in Kubernetes is allowPrivilegeEscalation [8,9]: when it is not set, privilege escalation is allowed. This configuration defines whether a process can gain more privileges than its parent. A common mistake is to set runAsNonRoot to true without also setting allowPrivilegeEscalation to false. This is potentially dangerous because, even though the container process runs as non-root, it can escalate its privileges and gain access to host resources.
Again, by applying Policy as Code, this kind of mistake can be easily mitigated, as shown in Figure 3. The rule uses a method called is_privilege_escalation_allowed, which verifies whether the workload is missing the securityContext field or whether allowPrivilegeEscalation is not set to false. Once again, if all statements return true, the policy is violated and the reason for the failure is available in the msg variable.
Figure 3: Example of a privilege escalation rule using the Rego language.
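The check performed by the helper can also be modeled in Python. As before, this is a sketch of the logic under stated assumptions rather than the Rego code from the figure: the helper is spelled `is_privilege_escalation_allowed` here, the containers are assumed to live under `spec.containers`, and `privilege_escalation_violations` is a hypothetical name.

```python
from typing import Any, Dict, List


def is_privilege_escalation_allowed(container: Dict[str, Any]) -> bool:
    """True when securityContext is missing or the flag is not explicitly false."""
    security_context = container.get("securityContext")
    if security_context is None:
        return True
    return security_context.get("allowPrivilegeEscalation") is not False


def privilege_escalation_violations(manifest: Dict[str, Any]) -> List[str]:
    """Collect one message per container that may escalate its privileges."""
    containers = (manifest.get("spec") or {}).get("containers") or []
    return [
        f"container {c.get('name')} must set allowPrivilegeEscalation to false"
        for c in containers
        if is_privilege_escalation_allowed(c)
    ]


pod = {
    "spec": {
        "containers": [
            # Runs as non-root but leaves the escalation flag unset.
            {"name": "api", "securityContext": {"runAsNonRoot": True}},
            {
                "name": "db",
                "securityContext": {
                    "runAsNonRoot": True,
                    "allowPrivilegeEscalation": False,
                },
            },
        ]
    }
}

for message in privilege_escalation_violations(pod):
    print(message)
```

The api container illustrates the mistake described above: it sets runAsNonRoot to true but is still flagged, because allowPrivilegeEscalation was never set to false.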
Policy as Code contributes to compliance traceability by keeping security and compliance rules in a version control system. It also ensures issues are found before reaching production because they are validated before deployment.
Therefore, in the face of an ever-growing demand for security and compliance, with cloud-native workloads as one example, companies must adopt best practices toward this goal. Policy as Code can help enterprises achieve it in a scalable, declarative, and version-controlled way.
This piece was written by Marcelo da Silva Pires, Senior DevOps Engineer and João Longo, Systems Architect and Innovation Leader at Encora’s Engineering Technology Practices group. Thanks to João Caleffi and André Scandaroli for their reviews and insights.