by Jirka Kremser, Kedify
January 20, 2025
It is a good practice to specify resources for all Kubernetes pods. If no resources are specified, Kubernetes considers them to be unlimited. This can lead to a noisy neighbor scenario where one pod can consume a significant amount of CPU or, in the worst case, all node memory.
Kubernetes allows two ways of defining resources: requests, which are taken into account when scheduling pods, and limits, which are enforced at runtime (by cgroups for CPU and the OOM killer for memory). Example:
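As a minimal illustration of such a resources block (the original example isn't reproduced here, so the pod name, container, image, and values are placeholders):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: resource-demo            # placeholder name
spec:
  containers:
    - name: app                  # placeholder container
      image: nginx:1.27          # placeholder image
      resources:
        requests:                # used by the scheduler for pod placement
          cpu: 250m
          memory: 256Mi
        limits:                  # enforced at runtime (cgroups / OOM killer)
          cpu: 500m
          memory: 512Mi
```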
In addition to the well-understood memory and cpu, both sections can also contain ephemeral-storage for managing local ephemeral storage, or extended resources that can look like this:
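For instance, an extended resource entry might look roughly like the following (example.com/dongle is a placeholder resource name; a GPU resource such as nvidia.com/gpu follows the same pattern):

```yaml
resources:
  requests:
    ephemeral-storage: 2Gi       # local ephemeral storage
    example.com/dongle: 1        # placeholder extended resource
  limits:
    ephemeral-storage: 4Gi
    example.com/dongle: 1        # extended resources cannot be overcommitted, so requests must equal limits
```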
For the purposes of this blog post, we will keep things simple and focus only on cpu and memory, neglecting other resources and the new DRA (Dynamic Resource Allocation).
In an ideal world, your workload wouldn’t contain any memory leaks and would require a relatively constant amount of CPU and memory during its lifetime. However, the world is not perfect, and this is often not the case.
In the case of a memory leak, no amount of resources will be sufficient, as memory usage will grow until the container in the pod is killed. However, if the application has a predictable resource consumption profile, we can do better.
Until recently, updating requests and limits was treated as a change to the pod’s spec, resulting in a pod restart, forcing all containers to start from the beginning.
In general, adding more resources to a workload is referred to as vertical scaling and is suitable for certain types of workloads. Simply increasing the number of replicas without resolving synchronization logic may not enhance the overall performance of the distributed system. Consider a legacy monolithic application that hasn't been broken into smaller microservices yet.
Another suitable use case is an application that does some heavy lifting during its startup. Java or Python applications have to do a lot of bytecode bootstrapping, archive expanding, class loading, etc. Jobs are also strong candidates, since they typically execute a sequence of tasks in a deterministic and predictable order.
The Kedify Agent has a built-in controller that can help with such use cases. Since version v0.1.4, it also creates a Custom Resource Definition for so-called PodResourceProfiles (PRPs for short). The main idea of this Kubernetes resource is to declaratively specify when a workload should receive a resource update. The timing here is relative to the workload's start. In other words, it is possible to specify that only after, for example, 60 seconds of the pod being ready should the resources be updated to lower values. This feature and all its possible knobs are well described in our documentation.
Example:
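A PodResourceProfile for the Prometheus scenario described below might look roughly like this. Only .spec.target is referenced in this post; the apiVersion, the remaining field names, and the memory values are assumptions for illustration, so please refer to the Kedify documentation for the authoritative schema:

```yaml
# A sketch only: .spec.target is mentioned in the post, but the apiVersion and the
# other field names are assumptions; see the Kedify docs for the exact schema.
apiVersion: keda.kedify.io/v1alpha1      # assumed API group/version
kind: PodResourceProfile
metadata:
  name: prometheus-server-after-startup
spec:
  target:                                # the Deployment whose pods should be patched
    kind: Deployment
    name: prometheus-server
  containerName: prometheus-server       # assumed field name
  delay: 60s                             # assumed field name: 60s after the pod becomes ready
  newResources:                          # assumed field name: values applied in place
    requests:
      memory: 1Gi                        # lower than the startup value set in the workload
    limits:
      memory: 2Gi
```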
Creating such a resource will make the Kedify Agent update the resources of all pods belonging to the prometheus-server deployment, namely the container called prometheus-server. This is very useful, because the Prometheus server performs a WAL (Write-Ahead Log) replay during startup and requires significantly more memory then than during its normal (non-startup) phase.
Related Prometheus issues for this topic:
People often solve this issue by giving Prometheus much more resources than it really needs during normal runtime and end up in a sub-optimal resource situation. With PRPs, one can describe the required resource allocation in greater detail, better capturing the workload's resource needs.
This CRD-based API doesn't break immutable infrastructure or the GitOps contract, because the controller doesn't update the workload definition (Deployment, StatefulSet); it only updates its pods, which should be fungible and transient and shouldn't be part of the declarative configuration.
The PodResourceProfile can also target a set of pods using well-known podSelector labels; in fact, it's the same internal type that Kubernetes Services use. However, using .spec.selector.matchLabels or .spec.selector.matchExpressions is mutually exclusive with using .spec.target (which was used in the example above). Using an arbitrary label selector allows for smooth integration with other Kubernetes tools/primitives such as Jobs, CronJobs, Argo Rollouts, Knative Services, KubeVela Applications, etc.
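For illustration, a selector-based PRP could look roughly like this; only .spec.selector.matchLabels and .spec.selector.matchExpressions are confirmed by the text above, and the remaining field names and values are assumptions:

```yaml
# Sketch: selector-based targeting; matchLabels/matchExpressions and .spec.target
# are mutually exclusive. Field names outside .spec.selector are assumptions.
apiVersion: keda.kedify.io/v1alpha1      # assumed API group/version
kind: PodResourceProfile
metadata:
  name: throttle-batch-workers
spec:
  selector:
    matchLabels:
      app: nightly-report                # hypothetical pod label
  containerName: worker                  # assumed field name
  delay: 2m                              # assumed field name
  newResources:                          # assumed field name
    requests:
      cpu: 100m
```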
This simple API addresses a single change to a container in a fleet of pods, but it can be composed in such a way as to achieve quite complex emergent behavior. The Kedify Agent also supports the use case where multiple PRP resources target a single workload.
Imagine the following application memory profile:
This chart shows the memory profile of a demo application with four replicas. Each pod exhibits the following pattern:
To match this curve using the PodResourceProfile (PRP) API, we simply create one PRP for each step. At the very beginning, when the application requires 500 MB, we define this in the workload itself.
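For example, the initial 500 MB step could be expressed directly in the Deployment (the name, labels, and image below are placeholders):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: memory-steps-demo                    # placeholder name
spec:
  replicas: 4
  selector:
    matchLabels:
      app: memory-steps-demo
  template:
    metadata:
      labels:
        app: memory-steps-demo
    spec:
      containers:
        - name: app                          # placeholder container
          image: ghcr.io/example/demo:latest # placeholder image
          resources:
            requests:
              memory: 500Mi                  # the first step lives in the workload itself
            limits:
              memory: 500Mi
```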
That should be it. The Kedify controller recognizes which PRP is the next unapplied one and applies them in the right order. Also, if multiple PRPs match a single pod, one can override the default (time-based) ordering using the priority field, .spec.priority.
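As a sketch, two of the follow-up steps could then be modeled with PRPs like these; the step values and delays are hypothetical, and all field names except .spec.priority and .spec.selector are assumptions:

```yaml
# Hypothetical second and third steps of the memory curve.
apiVersion: keda.kedify.io/v1alpha1    # assumed API group/version
kind: PodResourceProfile
metadata:
  name: step-2
spec:
  selector:
    matchLabels:
      app: memory-steps-demo
  delay: 5m                            # assumed field name
  priority: 10                         # overrides the default time-based ordering
  newResources:                        # assumed field name
    requests:
      memory: 300Mi
---
apiVersion: keda.kedify.io/v1alpha1
kind: PodResourceProfile
metadata:
  name: step-3
spec:
  selector:
    matchLabels:
      app: memory-steps-demo
  delay: 15m
  priority: 20
  newResources:
    requests:
      memory: 200Mi
```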
If the internal architecture of a workload isn't inherently suitable for horizontal scaling out of the box, its refactoring or rewriting into a microservice architecture is too expensive, or the initial bootstrapping can't be done using initContainers, you can leverage the Kubernetes In-place Resource Resize. Kedify supports this feature and provides a convenient CRD-driven API, allowing resource updates to be specified declaratively in a GitOps way.
We have demonstrated how multiple PodResourceProfiles can effectively model any resource consumption curve. There are numerous tools that can help you with right-sizing the requests and limits for your application, and Kedify also offers this feature in its dashboard; however, sometimes the resource needs are simply a function of time and can be modeled upfront.
An interesting use case for this feature could be implementing a standby mode. For example, you could allocate 2 CPU cores to the application for the first hour, and then throttle it to just a fraction of a core for the remainder of its lifecycle. This could be particularly useful for running a test suite against the application and debugging it after the tests have completed.
We invite you to try out Kedify’s solutions and experience the benefits firsthand. For any questions or support, please contact us at info@kedify.io.