Pod Resource Profiles
Pod Resource Profiles (PRP) describe a future in-place update of a pod’s resources (CPU, memory) that does not require a pod restart. Because a PRP doesn’t change the number of replicas of a workload, but instead works at the pod level by adjusting the resources of a container running in a pod, it functions as a vertical scaler.
In-place Updates
In-place updates allow a container’s resources to be adjusted without restarting the container or the pod. This feature must be enabled on the Kubernetes cluster; otherwise, the patch operation will result in an error state. For more details, consult the InPlacePodVerticalScaling feature gate.
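To make the patch operation more concrete, here is a minimal sketch that builds a strategic merge patch body changing the resource requests of a single container. The helper name build_resize_patch and the sample values are assumptions for illustration only; how the Kedify agent actually issues the patch is not described here.

```python
import json


def build_resize_patch(container_name, requests=None, limits=None):
    """Build a strategic-merge-patch body for one container's resources.

    Hypothetical helper for illustration. Containers in a pod spec are
    merged by their "name" key, so only the named container is touched.
    """
    resources = {}
    if requests:
        resources["requests"] = dict(requests)
    if limits:
        resources["limits"] = dict(limits)
    return {
        "spec": {
            "containers": [
                {"name": container_name, "resources": resources},
            ]
        }
    }


# Sample values only: request 50M of memory and 200 millicores of CPU.
patch = build_resize_patch("nginx", requests={"memory": "50M", "cpu": "200m"})
print(json.dumps(patch, indent=2))
```

Depending on the Kubernetes version, a body like this may need to be sent to the pod’s resize subresource rather than to the pod itself.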
Pod Resource Profile (PRP) CRD
The Kedify agent contains a controller that reconciles Pod Resource Profile (PRP) resources and also manages pods annotated with the following:
Based on the rules specified in the PRP custom resource, the controller either acts immediately or schedules an event for a later time.
Example PRP:
This PodResourceProfile ensures that the container named nginx in a pod matching the specified selector (app=nginx) is updated 30 seconds after it becomes ready. Once the readiness probe passes, the timer starts, and the memory request will eventually be set to 50 MB and the CPU request to 200 millicores.
The controller can be enabled or disabled on the Kedify Agent using the environment variable PRP_ENABLED. By default, it is disabled. Additionally, the requirement for annotated pods can be turned off using the PRP_REQUIRES_ANNOTATED_PODS environment variable. However, this may have performance implications, since the controller filters out pod events that do not change container or pod readiness status or that concern pods not referenced by a PRP resource.
Addressing Pods
In the example above, a common label selector was used. It has the same spec as a Deployment’s selector, so anything that can appear under deployment.spec.selector can be used here as well. Another way to target pods is by using the target field.
For example:
The selector and target fields are mutually exclusive. Besides deployment, the statefulset and daemonset kinds are also supported.
It is assumed that the workload is present in the same namespace as the created PRP resource.
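Because the selector follows the Deployment selector spec, its matching semantics are those of a standard Kubernetes label selector. The following is a minimal sketch of those semantics, assuming the selector and the pod labels are plain dictionaries; the function name selector_matches is hypothetical and is not part of Kedify.

```python
def selector_matches(selector, labels):
    """Return True if the labels satisfy a Kubernetes-style label selector.

    All matchLabels and matchExpressions requirements are ANDed together.
    """
    for key, value in selector.get("matchLabels", {}).items():
        if labels.get(key) != value:
            return False
    for expr in selector.get("matchExpressions", []):
        key, op = expr["key"], expr["operator"]
        values = expr.get("values", [])
        if op == "In":
            ok = key in labels and labels[key] in values
        elif op == "NotIn":
            # NotIn also matches objects that lack the key entirely.
            ok = key not in labels or labels[key] not in values
        elif op == "Exists":
            ok = key in labels
        elif op == "DoesNotExist":
            ok = key not in labels
        else:
            raise ValueError(f"unsupported operator: {op}")
        if not ok:
            return False
    return True


selector = {"matchLabels": {"app": "nginx"}}
print(selector_matches(selector, {"app": "nginx", "tier": "web"}))
print(selector_matches(selector, {"app": "redis"}))
```

The first call matches because the pod carries app=nginx (extra labels are ignored); the second does not.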
Triggers
Allowed values include:
containerReady: (default value) specifies whether the container is currently passing its readiness check. The value will change as readiness probes continue executing. If no readiness probes are specified, this field defaults to true once the container is fully started.
- field: pod.status.containerStatuses.ready
- time: pod.status.containerStatuses.state.running.startedAt
containerStarted: indicates whether the container has completed its postStart lifecycle hook and passed its startup probe. Initialized as false, it becomes true after the startupProbe is considered successful. Resets to false if the container is restarted or if kubelet temporarily loses state. In both cases, startup probes will run again. Always true if no startupProbe is defined, and the container is running and has passed the postStart lifecycle hook. The null value must be treated the same as false.
- field: pod.status.containerStatuses.started
- time: pod.status.containerStatuses.state.running.startedAt
podReady: indicates that the pod can service requests and should be added to the load balancing pools of all matching services.
- field: pod.status.conditions[?(.type=='Ready')].status
- time: pod.status.conditions[?(.type=='Ready')].lastTransitionTime
podScheduled: represents the status of the scheduling process for this pod.
- field: pod.status.conditions[?(.type=='PodScheduled')].status
- time: pod.status.conditions[?(.type=='PodScheduled')].lastTransitionTime
podRunning: indicates that the pod has been bound to a node and all containers have started. At least one container is still running or is being restarted.
- field: pod.status.phase
- time: pod.status.startTime
Use-cases
Pod Resource Profiles are useful in scenarios where workloads exhibit predictable resource consumption behavior. Certain application frameworks require a significant amount of memory or CPU during startup for initialization but then need less during steady operation.
Another example could be a job that runs to completion but requires different computational resources at different stages. Instead of allocating the maximum resources for all phases, the PRP can match the workload’s actual utilization profile, allowing for more efficient bin-packing by the Kubernetes scheduler.
The current design allows multiple PRP resources to target the same pods. In such cases, matching PRPs are sorted first by priority (.spec.priority), followed by the delay. The PRP with the smallest unapplied delay is selected over one with a higher delay. If multiple PRPs still match, they are sorted alphabetically, with the “smaller” one winning. This enables multiple PRPs to be set up for the same workload, changing resource allocations multiple times throughout the pod’s lifecycle.
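The ordering rules above can be sketched as a single sort key. This is an illustration under stated assumptions, not the controller’s code: the Profile class and select_profile function are hypothetical, the text does not say whether a higher or lower priority value wins (so the direction is a parameter, defaulting to higher wins), and “unapplied” is modeled as a simple boolean flag.

```python
from dataclasses import dataclass


@dataclass
class Profile:
    """Hypothetical stand-in for a PRP that matches a given pod."""
    name: str
    priority: int
    delay_seconds: float
    applied: bool = False


def select_profile(profiles, higher_priority_wins=True):
    """Pick the next profile: priority, then smallest delay, then name."""
    candidates = [p for p in profiles if not p.applied]
    if not candidates:
        return None
    # Assumption: priority direction is not stated in the text.
    sign = -1 if higher_priority_wins else 1
    return min(
        candidates,
        key=lambda p: (sign * p.priority, p.delay_seconds, p.name),
    )


# Two profiles for the same workload: one for startup, one for steady state.
profiles = [
    Profile("steady-state", priority=0, delay_seconds=30),
    Profile("startup", priority=0, delay_seconds=0),
]
first = select_profile(profiles)
first.applied = True
second = select_profile(profiles)
print(first.name, second.name)
```

With equal priorities, the profile with the shorter delay is chosen first; once it is marked as applied, the next call returns the profile with the longer delay, which mirrors changing resources several times during a pod’s lifecycle.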
Quick Start
Create a sample nginx deployment with one pod. It requests 45 MB of memory.
Now, let’s create a PRP resource; after 20 seconds, the memory request will be changed to 30 MB.
Finally, check that everything works as expected.