
FREE E-BOOK
From Intent to Impact:
The 2025 Kubernetes Autoscaling Playbook
Scale on the signals that matter: HTTP intent,
push‑based telemetry, and GPU economics, without DIY glue.
Written by Zbynek Roubalik, Founder & CTO of Kedify and
maintainer of the KEDA project.
Get the ebook
What you’ll learn:
Scale on the right signals. Use HTTP RPS/concurrency, backlog age, and tail latency, not just CPU.
Push beats scrape: wire OpenTelemetry into autoscaling to cut the “lag chain.”
Master every type of autoscaling on Kubernetes: safely scale out workloads and schedule jobs.
A preview of predictive autoscaling and why you should use it.
GPU‑aware scaling: blend inflight intent with VRAM/SM headroom; hide cold starts.
Who’s it for?
DevOps, platform, and SRE leads operating multi‑cluster K8s on AWS, GCP, or
Azure, running event‑driven or spiky workloads under cost pressure and latency SLOs.
Inside the book
The Evolution of Kubernetes Autoscaling (2014–2025)
A quick history of HPA, VPA, CA/Karpenter, and event-driven loops, and what changed by 2025.
Why Traditional Scaling Models Break at Today’s Latency & Cost SLOs
How CPU and memory metrics lag real demand and what “intent-aware” scaling fixes.
Event-Driven Architecture, Simply
How event streams and asynchronous workloads reshape autoscaling beyond request-per-second thinking.
HTTP & gRPC Workloads: What Most Get Wrong
Designing for concurrency, cold starts, and backpressure when scaling real-time APIs.
GPU-Aware Autoscaling for AI & ML
Keeping GPU workloads efficient with pre-warm strategies and VRAM-safe scaling behavior.
Cluster & Node Autoscaling: Provisioning Capacity That Matches Your Workloads
Coordinating KEDA, HPA, and Cluster Autoscaler to balance speed, placement, and efficiency.
Predictive Autoscaling: From Reactive Loops to Forecast-Driven Capacity
Using time-series forecasts and lead times to prepare for demand before it hits.
Build vs. Buy: The True Cost of DIY Autoscaling
Where in-house scaling platforms shine and when managed or enterprise tooling wins on ROI.

About Zbynek
Founder and CTO, Kedify
Zbynek Roubalik is the co-creator of KEDA (Kubernetes Event-Driven Autoscaler) and the founding maintainer behind one of the most adopted autoscaling projects in the Kubernetes ecosystem. While at Red Hat, Zbynek helped design and scale KEDA to power thousands of production workloads worldwide, laying the foundation for modern, event-driven autoscaling.
Recognizing that enterprise teams needed more visibility, multi-cluster management, and production-grade reliability, Zbynek partnered with Open Core Ventures (founded by GitLab’s Sid Sijbrandij) to create Kedify, a commercial platform that extends KEDA’s power into enterprise environments.
As Founder & CTO, Zbynek leads Kedify’s engineering and product direction, building a platform that combines real-time observability, GPU and HTTP autoscaling, enterprise-level security, and multi-cloud simplicity. His mission is to help teams scale smarter, reduce cloud costs by 20–40%, and focus on building, not babysitting, infrastructure.
Zbynek continues to contribute to the open-source ecosystem while shaping the future of autoscaling: open-source built, enterprise tuned, and relentlessly focused on ROI.
Why now?
Autoscaling is being rewritten for the AI era. CPU-based heuristics can't keep up with GPU economics, sub-second SLOs, or multi-cluster workloads.
From Intent to Impact shows how to scale on real signals like HTTP, OpenTelemetry, and GPU intent, so your systems stay fast, efficient, and cost-smart in 2025.
Who already uses the technology
KEDA powers autoscaling for companies you know, including Microsoft, FedEx, Grab,
Qonto, Alibaba Cloud, Red Hat, and many more. Kedify delivers these capabilities
as a turnkey platform for enterprises that don’t want to build and maintain them themselves.