Explore Use Cases


Reduce AI Workload Costs & Complexity

Problem:

LLM inference / AI pipelines are GPU‑heavy, bursty, and expensive to keep warm.

Kedify solution:

GPU‑aware autoscaling and OTel‑based signals scale on real usage (RPS, concurrency, custom metrics), then scale down (including to zero when appropriate).

How it works (example signals):

  • HTTP/OTel for request rate, concurrency, or token throughput
  • PRP (vertical right‑sizing) to shrink warm pods when idle (an alternative to scaling to zero replicas)
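A minimal sketch of what an HTTP/request-rate trigger could look like, assuming Kedify's `kedify-http` trigger type; the host, service, and metric values here are illustrative, not a definitive configuration:

```yaml
# Hypothetical ScaledObject for an LLM inference service; the trigger
# metadata fields (hosts, service, port, scalingMetric) are illustrative.
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: llm-inference
spec:
  scaleTargetRef:
    name: llm-inference          # your inference Deployment
  minReplicaCount: 0             # scale to zero when idle
  maxReplicaCount: 10
  triggers:
    - type: kedify-http
      metadata:
        hosts: inference.example.com
        service: llm-inference
        port: "8080"
        scalingMetric: requestRate   # or concurrency
        targetValue: "20"
```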

Migrate from AWS Lambda, Azure Functions, or Google Cloud Run

Problem:

Fragmented serverless + K8s leads to complexity, limited visibility, and cold‑start trade‑offs.

Kedify solution:

Bring serverless‑style, HTTP‑triggered autoscaling to Kubernetes (scale‑to‑zero supported) with unified observability and security.
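As a sketch of the serverless-style model on Kubernetes, an `HTTPScaledObject` (the KEDA HTTP add-on API) routes traffic by host and scales the target between zero and a cap; the names below are illustrative:

```yaml
# Sketch of an HTTPScaledObject; workload and host names are illustrative.
apiVersion: http.keda.sh/v1alpha1
kind: HTTPScaledObject
metadata:
  name: orders-api
spec:
  hosts:
    - orders.example.com
  scaleTargetRef:
    name: orders-api
    service: orders-api
    port: 8080
  replicas:
    min: 0        # serverless-style scale-to-zero
    max: 20
```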


Scale‑to‑Zero Developer & Preview Environments

Problem:

Preview and dev environments often run 24/7, wasting spend.

Kedify solution:

The HTTP scaler, autowiring, and waiting/maintenance pages hold traffic safely during cold starts and scale environments down when idle.

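A simple complementary pattern for dev environments is KEDA's cron scaler, which keeps a preview environment up only during working hours; the schedule and workload names here are illustrative:

```yaml
# Sketch: run a preview environment only during working hours.
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: preview-env
spec:
  scaleTargetRef:
    name: preview-app
  minReplicaCount: 0            # fully off outside the window
  triggers:
    - type: cron
      metadata:
        timezone: Europe/Prague
        start: 0 8 * * 1-5      # 08:00 Mon-Fri
        end: 0 18 * * 1-5       # 18:00 Mon-Fri
        desiredReplicas: "1"
```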

Handle Spiky & Seasonal Traffic

Problem:

Launches, flash sales, closes/rollovers, or viral spikes cause over‑provisioning or outages.

Kedify solution:

Real‑time HTTP scaler with burst‑friendly behavior, backed by production‑grade Envoy‑based proxying. Combine it with the Predictive scaler to forecast demand and scale ahead of expected traffic spikes.
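Burst-friendly behavior can be tuned through the HPA behavior fields a ScaledObject exposes: aggressive scale-up, conservative scale-down. A sketch, using an illustrative Prometheus trigger (the query and threshold are assumptions):

```yaml
# Sketch: fast scale-up during bursts, slow scale-down afterward.
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: checkout
spec:
  scaleTargetRef:
    name: checkout
  maxReplicaCount: 100
  advanced:
    horizontalPodAutoscalerConfig:
      behavior:
        scaleUp:
          stabilizationWindowSeconds: 0    # react immediately to spikes
          policies:
            - type: Percent
              value: 100                   # may double replicas every 15s
              periodSeconds: 15
        scaleDown:
          stabilizationWindowSeconds: 300  # avoid flapping after the spike
  triggers:
    - type: prometheus                     # illustrative signal source
      metadata:
        serverAddress: http://prometheus.monitoring:9090
        query: sum(rate(http_requests_total{app="checkout"}[1m]))
        threshold: "100"
```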


Dynamic Batch Processing

Problem:

Nightly ETL, log analysis, or periodic model training doesn’t need constant compute.

Kedify solution:

Use ScaledJobs on event queues (Kafka, SQS, Redis, etc.) to spin up capacity just in time and scale back to zero afterward.
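A sketch of a ScaledJob that launches batch workers only while a queue has messages; the queue URL, image, and thresholds are illustrative:

```yaml
# Sketch: one Job per batch of queued messages, zero Jobs when the queue is empty.
apiVersion: keda.sh/v1alpha1
kind: ScaledJob
metadata:
  name: nightly-etl
spec:
  jobTargetRef:
    template:
      spec:
        containers:
          - name: etl-worker
            image: registry.example.com/etl-worker:latest
        restartPolicy: Never
  pollingInterval: 30          # seconds between queue checks
  maxReplicaCount: 50          # cap on parallel Jobs
  triggers:
    - type: aws-sqs-queue
      metadata:
        queueURL: https://sqs.us-east-1.amazonaws.com/123456789012/etl-jobs
        queueLength: "5"       # messages per Job
        awsRegion: us-east-1
```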


Optimize Event‑Driven Architectures

Problem:

Queues spike unpredictably; consumers sit idle for hours.

Kedify solution:

Scale on queue depth/lag across Kafka, RabbitMQ, Pulsar, Redis, SQS, etc. (70+ scalers supported).
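For example, scaling consumers on Kafka consumer-group lag looks like the sketch below; broker, topic, and threshold values are illustrative:

```yaml
# Sketch: add a consumer replica for every ~50 messages of lag.
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: order-consumer
spec:
  scaleTargetRef:
    name: order-consumer
  minReplicaCount: 0
  maxReplicaCount: 30
  triggers:
    - type: kafka
      metadata:
        bootstrapServers: kafka.svc:9092
        consumerGroup: order-consumers
        topic: orders
        lagThreshold: "50"     # target lag per replica
```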


Prevent Latency & Service Delays

Problem:

Mission‑critical APIs must stay responsive under any load; cold starts can hurt UX.

Kedify solution:

The HTTP scaler reacts instantly to live traffic, and Waiting/Maintenance Pages protect UX during scale‑from‑zero or maintenance windows. The Predictive scaler anticipates demand to minimize cold starts.
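For mission-critical APIs, a warm replica floor plus KEDA's fallback setting keeps capacity available even if the metrics source fails; the values below are illustrative assumptions, not recommendations:

```yaml
# Sketch: warm floor avoids cold starts; fallback guards against a
# metrics outage by pinning a safe replica count.
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: payments-api
spec:
  scaleTargetRef:
    name: payments-api
  minReplicaCount: 2           # never scale below a warm floor
  maxReplicaCount: 40
  fallback:
    failureThreshold: 3        # consecutive scaler failures before fallback
    replicas: 10               # safe capacity if metrics go dark
  triggers:
    - type: prometheus         # illustrative signal source
      metadata:
        serverAddress: http://prometheus.monitoring:9090
        query: sum(rate(http_requests_total{app="payments"}[1m]))
        threshold: "200"
```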