Explore Use Cases


Reduce AI Workload Costs & Complexity

Problem:

LLM inference / AI pipelines are GPU‑heavy, bursty, and expensive to keep warm.

Kedify solution:

GPU‑aware autoscaling driven by OTel‑based signals scales on real usage (RPS, concurrency, custom metrics), then scales back down, including to zero when appropriate.

How it works (example signals):

  • HTTP/OTel for request rate, concurrency, or token throughput
  • PRP (vertical right‑sizing) to shrink warm pods when idle (an alternative to scaling to replicas=0)
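A minimal ScaledObject sketch for the signals above might look like the following. This is illustrative only: the Deployment name, host, and target values are hypothetical, and it assumes Kedify's HTTP scaler is exposed as the `kedify-http` trigger type (check the Kedify docs for the exact field names).

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: llm-inference
spec:
  scaleTargetRef:
    name: llm-inference          # hypothetical GPU-backed Deployment
  minReplicaCount: 0             # scale to zero when idle
  maxReplicaCount: 8
  triggers:
    - type: kedify-http          # assumed Kedify HTTP scaler trigger type
      metadata:
        hosts: llm.example.com   # hypothetical host
        service: llm-inference
        port: "8080"
        scalingMetric: concurrency
        targetValue: "4"         # aim for ~4 in-flight requests per replica
```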

Migrate from AWS Lambda, Azure Functions, or Google Cloud Run

Problem:

A fragmented mix of serverless and Kubernetes leads to complexity, limited visibility, and cold‑start trade‑offs.

Kedify solution:

Bring serverless‑style, HTTP‑triggered autoscaling to Kubernetes (scale‑to‑zero supported) with unified observability and security.
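As a sketch of the migration target, an HTTP‑triggered, scale‑to‑zero workload mirrors the request‑driven model of Lambda or Cloud Run. The service name and rate target below are hypothetical, and the `kedify-http` trigger type and its fields are an assumption based on Kedify's HTTP scaler:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: checkout-fn              # hypothetical migrated function, now a Deployment
spec:
  scaleTargetRef:
    name: checkout-fn
  minReplicaCount: 0             # zero replicas when no requests, like a function
  maxReplicaCount: 50
  triggers:
    - type: kedify-http          # assumed Kedify HTTP scaler trigger type
      metadata:
        hosts: checkout.example.com
        service: checkout-fn
        port: "8080"
        scalingMetric: requestRate
        targetValue: "100"       # target ~100 req/s per replica
```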


Scale‑to‑Zero Developer & Preview Environments

Problem:

Preview and dev environments often run 24/7, wasting spend.

Kedify solution:

HTTP scaler + autowiring + waiting/maintenance pages hold traffic safely during cold starts and scale down when idle.
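A sketch of a scale‑to‑zero preview environment, assuming the same hypothetical `kedify-http` trigger; `cooldownPeriod` is the standard KEDA field that controls how long an idle environment stays up before dropping to zero:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: preview-pr-123           # hypothetical per-PR preview environment
spec:
  scaleTargetRef:
    name: preview-pr-123
  minReplicaCount: 0             # idle previews cost nothing
  maxReplicaCount: 2
  cooldownPeriod: 300            # scale to zero after 5 min without traffic
  triggers:
    - type: kedify-http          # assumed Kedify HTTP scaler trigger type
      metadata:
        hosts: pr-123.preview.example.com
        service: preview-pr-123
        port: "8080"
        scalingMetric: requestRate
        targetValue: "10"
```

On the first request after idling, a waiting page holds the visitor while the pod cold‑starts.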


Handle Spiky & Seasonal Traffic

Problem:

Launches, flash sales, closes/rollovers, or viral spikes cause over‑provisioning or outages.

Kedify solution:

Real‑time HTTP scaler with burst‑friendly behavior, backed by production‑grade Envoy‑based proxying. Combine it with the Predictive scaler to forecast demand and scale ahead of expected traffic spikes.
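The Predictive scaler's own configuration isn't shown here, but as an illustration of scaling ahead of a known spike, KEDA's standard cron trigger can pre‑provision capacity for a scheduled window. KEDA takes the maximum across triggers, so this can sit alongside an HTTP trigger that handles the live burst:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: storefront               # hypothetical workload with a known launch window
spec:
  scaleTargetRef:
    name: storefront
  minReplicaCount: 2
  maxReplicaCount: 40
  triggers:
    - type: cron                 # standard KEDA cron scaler: pre-scale for the window
      metadata:
        timezone: America/New_York
        start: 0 8 * * 5         # hypothetical Friday 08:00 launch
        end: 0 20 * * 5
        desiredReplicas: "20"    # floor during the window; other triggers can push higher
```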


Multi‑Cluster / Multi‑Region Scaling

Problem:

Edge + multi‑region workloads need capacity close to users; cluster outages shouldn’t require manual failover.

Kedify solution:

Scale Deployments and long‑running Jobs across a fleet with weighted placement and automatic rebalancing.


Dynamic Batch Processing

Problem:

Nightly ETL, log analysis, or periodic model training doesn’t need constant compute.

Kedify solution:

Use ScaledJobs driven by event queues (Kafka, SQS, Redis, etc.) to spin up capacity just‑in‑time and scale back to zero afterward.
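A minimal ScaledJob sketch using KEDA's Kafka scaler; the image, topic, and threshold are hypothetical, and Kafka authentication (via a TriggerAuthentication) is omitted for brevity:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledJob
metadata:
  name: nightly-etl
spec:
  jobTargetRef:
    template:
      spec:
        containers:
          - name: etl-worker
            image: registry.example.com/etl-worker:latest  # hypothetical image
        restartPolicy: Never
  maxReplicaCount: 20            # cap concurrent Jobs
  pollingInterval: 30            # check lag every 30s
  triggers:
    - type: kafka                # standard KEDA Kafka scaler
      metadata:
        bootstrapServers: kafka.example.svc:9092  # hypothetical brokers
        consumerGroup: etl-workers
        topic: raw-events
        lagThreshold: "100"      # roughly one Job per 100 messages of lag
```

When the queue drains, no new Jobs are created and compute returns to zero.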


Optimize Event‑Driven Architectures

Problem:

Queues spike unpredictably; consumers sit idle for hours.

Kedify solution:

Scale on queue depth/lag across Kafka, RabbitMQ, Pulsar, Redis, SQS, etc. (70+ scalers supported).
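As one example of the queue‑depth pattern, KEDA's SQS scaler targets a fixed number of visible messages per replica; the queue URL here is hypothetical, and IAM credentials (normally wired via a TriggerAuthentication) are omitted:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: order-consumer
spec:
  scaleTargetRef:
    name: order-consumer         # hypothetical consumer Deployment
  minReplicaCount: 0             # no consumers when the queue is empty
  maxReplicaCount: 30
  triggers:
    - type: aws-sqs-queue        # standard KEDA SQS scaler
      metadata:
        queueURL: https://sqs.us-east-1.amazonaws.com/123456789012/orders  # hypothetical
        queueLength: "5"         # target ~5 messages per replica
        awsRegion: us-east-1
```

Swapping the trigger type and metadata covers Kafka, RabbitMQ, Pulsar, Redis, and the rest of the scaler catalog with the same pattern.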


Prevent Latency & Service Delays

Problem:

Mission‑critical APIs must stay responsive under any load; cold starts can hurt UX.

Kedify solution:

The HTTP scaler absorbs bursts by scaling on live traffic, while Waiting/Maintenance Pages protect UX during scale‑from‑zero or maintenance. The Predictive scaler anticipates demand to minimize cold starts.