by Jirka Kremser
September 09, 2025
Most conversations about “scaling to zero” focus on replicas: spin every Pod down when demand drops, spin them back up later. That’s great if your app can cold-start in milliseconds. But what if you still need a single instance alive (think health-checks, long-lived connections, or stubborn frameworks) yet you don’t want it hogging 250 MB of RAM while users are away?
Vertical shrinking is the missing half of the puzzle: keep the Pod running, just ask the kernel for less. With Kedify you can now do exactly that by combining two building blocks you already know:

- **KEDA ScaledObjects**, which track real-world signals and flip between activated and deactivated states, and
- **Kedify PodResourceProfiles (PRPs)**, which resize a running container's resource requests in place.
Starting today, `.spec.target.kind: "scaledobject"` binds the two together, letting a PRP react to a ScaledObject's `activated` / `deactivated` lifecycle. The result: your last replica sips resources when idle and bulks up the moment traffic crosses your activation threshold, all without a restart.
When does a ScaledObject change state?
A ScaledObject flips state based on its scaling metric:

- **Deactivated**: the metric stays below `targetValue` for the configured cooldown period
- **Activated**: the metric crosses `targetValue` (10 req/s in the example below)

These state flips are now first-class PRP triggers.
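As a rough mental model, the state machine just described can be sketched in a few lines of Python. This is a simplification for illustration only: KEDA actually distinguishes a separate activation threshold from `targetValue`, the real logic lives inside the KEDA operator, and the function and variable names here are made up.

```python
TARGET_VALUE = 10       # req/s, matching the example below
COOLDOWN_SECONDS = 300  # KEDA's default cooldownPeriod

def lifecycle_events(samples, target=TARGET_VALUE, cooldown=COOLDOWN_SECONDS):
    """Yield (timestamp, event) pairs for a stream of (timestamp, req_per_s).

    Activation fires as soon as the metric crosses the target; deactivation
    fires only after the metric has stayed below it for a full cooldown.
    """
    active = False
    below_since = None
    for ts, rate in samples:
        if rate >= target:
            below_since = None
            if not active:
                active = True
                yield ts, "activated"    # a PRP with 'after: activated' fires here
        else:
            if below_since is None:
                below_since = ts
            if active and ts - below_since >= cooldown:
                active = False
                yield ts, "deactivated"  # a PRP with 'after: deactivated' fires here

# One-second samples: a one-minute burst of traffic, then a quiet spell.
samples = [(t, 50 if t < 60 else 0) for t in range(600)]
events = list(lifecycle_events(samples))
# → [(0, 'activated'), (360, 'deactivated')]
```

Note the asymmetry: activation is instant, while deactivation waits out the cooldown, which is exactly why the `delay` field on the standby profile below adds its own grace period on top.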
```yaml
apiVersion: keda.kedify.io/v1alpha1
kind: PodResourceProfile
metadata:
  name: nginx-active
spec:
  target:              # <-- bind to ScaledObject instead of Deployment
    kind: scaledobject
    name: nginx
    containerName: nginx
  trigger:
    after: activated
    delay: 0s          # apply immediately on activation
  newResources:
    requests:
      memory: 250M
---
apiVersion: keda.kedify.io/v1alpha1
kind: PodResourceProfile
metadata:
  name: nginx-standby
spec:
  target:
    kind: scaledobject
    name: nginx
    containerName: nginx
  trigger:
    after: deactivated
    delay: 10s         # wait 10 s before shrinking
  newResources:
    requests:
      memory: 30M
---
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: nginx
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx
  minReplicaCount: 1
  maxReplicaCount: 8
  triggers:
    - type: kedify-http
      metadata:
        hosts: www.my-app.com
        service: http-demo-service
        port: '8080'
        scalingMetric: requestRate
        targetValue: '10'
```
| Phase | Requests/sec | Active replicas | Memory per Pod |
|---|---|---|---|
| Quiet time | < 10 | 1 (min) | 30 MB (thanks to `nginx-standby`) |
| Surge | ≥ 10 | up to 8 | first Pod expands to 250 MB instantly via in-place resize; new Pods inherit the normal resource spec |
| Cool-down | back < 10 | scales back to 1 | after 10 s the lone survivor shrinks to 30 MB again |
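To get a feel for the payoff, here is a back-of-the-envelope estimate of the average memory request for that single always-on replica. The 250 MB / 30 MB figures come from the manifests above; the 70% idle fraction is an assumption made up for illustration.

```python
ACTIVE_MB = 250   # requests.memory while activated (from nginx-active)
STANDBY_MB = 30   # requests.memory while deactivated (from nginx-standby)

idle_fraction = 0.70  # assumption: the app sits idle ~70% of the day

# Average requested memory with and without the standby profile:
without_prp = ACTIVE_MB  # 250 MB requested around the clock
with_prp = idle_fraction * STANDBY_MB + (1 - idle_fraction) * ACTIVE_MB

savings_pct = 100 * (without_prp - with_prp) / without_prp
print(f"avg request: {with_prp:.0f} MB, savings: {savings_pct:.0f}%")
# → avg request: 96 MB, savings: 62%
```

The quieter your workload, the closer the savings approach the full 88% gap between the two profiles.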
No restarts, no cold-start latency, just smarter utilisation.
A couple of caveats:

- `selector` and `target` are mutually exclusive.
- When `kind` is set to `scaledobject`, the only supported `trigger.after` values are `activated` and `deactivated`.

We recorded a short terminal cast that shows `kubectl top pod` in real time while bombarding the nginx endpoint. Watch RAM drop to 30 MB, then shoot back up the instant `hey` pushes past 10 RPS:
Together they give you a continuum of options: from zero replicas, to one tiny container, up to dozens of beefy workers, all driven by the real-world signals your application emits.
To try it yourself:

- Enable the `InPlacePodVerticalScaling` feature gate on your cluster if it's not already on.
- Apply a `-standby` / `-active` PodResourceProfile pair and watch your requested memory plummet.