Kedify | Predictive

Use predictive metrics to trigger autoscaling with Kedify and KEDA

An AI-powered autoscaler that uses time series models to predict future load and scale Kubernetes workloads before demand spikes occur.


Overview of Predictive Scaler in Kedify

The Predictive Scaler uses AI-powered time series models to forecast future workload demand. Instead of only reacting to current load, it scales infrastructure ahead of predicted traffic spikes, which helps maintain performance while avoiding over-provisioning. It continuously learns from your application metrics to keep its forecasting models accurate.

Key Features

1. AI-Powered Forecasting: Uses advanced time series models trained on your actual application data to predict future load patterns.

2. Proactive Scaling: Scales infrastructure before demand spikes occur, preventing performance degradation during traffic surges.

3. Continuous Learning: Models automatically retrain on new data, adapting to changing application behavior and seasonal patterns.

4. Flexible Horizon: Configurable prediction horizons from minutes to hours, allowing optimization for different scaling scenarios.

5. Hybrid Scaling: Combines predictive forecasting with reactive metrics using scaling modifiers for balanced decision making.

How It Works

The Predictive Scaler operates in two phases:

Training Phase: The scaler collects historical metrics from your application and trains time series models to understand usage patterns. This happens automatically as your application runs.

Prediction Phase: Once trained, the model generates forecasts for future load based on the specified horizon. These predictions are combined with current metrics to make informed scaling decisions.
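
To make the two phases concrete, here is a minimal sketch of a seasonal-average forecaster in Python. It is an illustration only and is not the model Kedify uses; the function names, the fixed season length, and the sample data are all assumptions made for this example.

```python
from statistics import mean


def train_seasonal_profile(history, season_length):
    """Training phase (illustrative): average the samples that fall at the
    same position within each season, e.g. the same minute of each day."""
    if season_length <= 0 or len(history) < season_length:
        raise ValueError("need at least one full season of history")
    buckets = [[] for _ in range(season_length)]
    for index, value in enumerate(history):
        buckets[index % season_length].append(value)
    return [mean(bucket) for bucket in buckets]


def forecast(profile, history_length, horizon_steps):
    """Prediction phase (illustrative): look up the learned profile at the
    position that lies horizon_steps after the last observed sample."""
    future_index = history_length - 1 + horizon_steps
    return profile[future_index % len(profile)]


# Hypothetical request rates with a repeating three-step pattern.
history = [10, 50, 10, 12, 54, 8]
profile = train_seasonal_profile(history, season_length=3)
next_value = forecast(profile, len(history), horizon_steps=1)
```

A production model would also handle trend, noise, and retraining, but the split is the same: learn a pattern from history first, then read that pattern at a point in the future set by the horizon.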

Model Training

Models are trained continuously on the incoming metrics data. The scaler supports:

Confidence-based predictions: Provides confidence values to determine when to use predicted versus current metrics
Seasonal pattern detection: Recognizes daily, weekly, and custom patterns
Drift adaptation: Adjusts to changing application behavior over time
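
One way to picture confidence-based predictions is a simple gate that trusts the forecast only when its confidence is high enough. The sketch below is hypothetical: the 0.7 threshold, the 0-to-1 confidence scale, and the function name are assumptions, and this page does not describe how Kedify actually exposes or applies confidence values.

```python
def effective_metric(current, predicted, confidence, min_confidence=0.7):
    """Return the predicted value when the model is confident enough,
    otherwise fall back to the currently observed value.

    min_confidence is a hypothetical threshold chosen for illustration.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    if confidence >= min_confidence:
        return predicted
    return current


# A confident forecast is used; an uncertain one is ignored.
confident = effective_metric(current=100, predicted=250, confidence=0.9)
uncertain = effective_metric(current=100, predicted=250, confidence=0.4)
```

With a gate like this, a freshly deployed model that has not yet seen enough data simply defers to the reactive metric.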

Configuration Best Practices

Use horizon values at least as long as your application takes to scale up, including node provisioning and pod startup time
Combine predictive triggers with reactive ones using scalingModifiers
Set appropriate stabilizationWindowSeconds to prevent oscillation

Learn More

Documentation: Kedify Predictive scaler documentation.
Introduction: Check out our blog post on predictive scaling.

Featured Use Cases

Scenario:

Predict and scale ahead of expected traffic spikes during flash sales, product launches, or seasonal events. By analyzing historical traffic patterns, the scaler can prepare infrastructure before users arrive.

Predictive Scaler Usage:

The scaler combines real-time metrics with predictive forecasting to scale proactively. It trains a model on historical data such as request rates and user activity patterns, then forecasts future load to scale before demand hits.

KEDA Usage:

Configure a ScaledObject with both kedify-http and kedify-predictive triggers. Use scalingModifiers to average the current and predicted metrics with the formula "(current + predicted)/2", so that neither signal alone drives the scaling decision.

Get Started

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: ecommerce-predictive
spec:
  scaleTargetRef:
    name: ecommerce-service
  minReplicaCount: 0
  maxReplicaCount: 20
  advanced:
    scalingModifiers:
      formula: "(current + predicted)/2"
      target: "100"
      metricType: "AverageValue"
  triggers:
    - type: kedify-http
      name: current
      metadata:
        hosts: ecommerce-service.company.com
        service: ecommerce-service
        port: "8080"
        scalingMetric: requestRate
        targetValue: "100"
    - type: kedify-predictive
      name: predicted
      metadata:
        modelName: "ecommerce*traffic-model"
        horizon: 15m
        targetValue: "100"
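
The arithmetic behind the formula in the manifest above can be sketched in a few lines of Python. This is a simplified illustration: it assumes the desired replica count for an AverageValue target is roughly ceil(metric / target), and it ignores the autoscaler's tolerance, stabilization windows, and KEDA's separate handling of scaling to zero. The traffic numbers are made up.

```python
import math


def composite_metric(current, predicted):
    """Mirror of the scalingModifiers formula "(current + predicted)/2"."""
    return (current + predicted) / 2


def desired_replicas(metric_value, target, min_replicas, max_replicas):
    """Approximate replica count for an AverageValue target, clamped to
    the ScaledObject's min and max replica counts."""
    raw = math.ceil(metric_value / target)
    return max(min_replicas, min(max_replicas, raw))


# Hypothetical: 300 req/s now, 900 req/s forecast 15 minutes ahead.
value = composite_metric(current=300, predicted=900)   # 600.0
replicas = desired_replicas(value, target=100, min_replicas=0, max_replicas=20)  # 6
```

Under these assumptions, the current rate alone would ask for 3 replicas and the forecast alone for 9, so averaging starts the scale-up early without committing fully to the prediction.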

Scenario:

Prepare GPU resources for machine learning training jobs based on predicted workload patterns. Training jobs often have predictable schedules and resource requirements that can be forecasted.

Predictive Scaler Usage:

Scale ML training infrastructure by predicting GPU utilization and job queue depth based on historical training patterns, ensuring resources are available when needed without waste.

KEDA Usage:

Combine a kedify-predictive trigger with a rabbitmq trigger that monitors the job queue. The modelName references a model trained on historical GPU usage patterns. Set horizon: 30m to forecast resource needs 30 minutes ahead, allowing time for GPU nodes to spin up.

Get Started

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: ml-training-predictive
spec:
  scaleTargetRef:
    name: ml-training-workers
  minReplicaCount: 0
  maxReplicaCount: 10
  advanced:
    scalingModifiers:
      formula: "(queue + gpuPrediction)/2"
      target: "5"
      metricType: "AverageValue"
  triggers:
    - type: rabbitmq
      name: queue
      metadata:
        host: amqp://rabbitmq.cluster:5672
        queueName: ml-training-jobs
        queueLength: "5"
    - type: kedify-predictive
      name: gpuPrediction
      metadata:
        modelName: "ml-training*gpu-usage"
        horizon: 30m
        targetValue: "5"

Scenario:

Scale API gateway instances based on predicted traffic patterns. API gateways often see predictable daily and weekly traffic patterns that can be learned and forecasted.

Predictive Scaler Usage:

Predict API gateway load using time series analysis of request patterns to scale infrastructure ahead of traffic spikes, reducing latency during peak periods.

KEDA Usage:

Configure the ScaledObject with a kedify-predictive trigger that learns from historical API request data and a kedify-otel trigger for real-time metrics. Use horizon: 10m for near-term predictions, and combine the forecast with current metrics using scalingModifiers to balance reactive and predictive scaling.

Get Started

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: api-gateway-predictive
spec:
  scaleTargetRef:
    name: api-gateway
  minReplicaCount: 1
  maxReplicaCount: 15
  advanced:
    scalingModifiers:
      formula: "(current + predicted)/2"
      target: "200"
      metricType: "AverageValue"
    horizontalPodAutoscalerConfig:
      behavior:
        scaleUp:
          stabilizationWindowSeconds: 30
        scaleDown:
          stabilizationWindowSeconds: 300
  triggers:
    - type: kedify-otel
      name: current
      metadata:
        metricQuery: 'sum(rate(http_requests_total{service="api-gateway"}[1m]))'
        targetValue: "200"
    - type: kedify-predictive
      name: predicted
      metadata:
        modelName: "api-gateway*traffic-forecast"
        horizon: 10m
        targetValue: "200"
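
The effect of the two stabilizationWindowSeconds values in the manifest above can be sketched as follows. This is a simplified model of how the Kubernetes Horizontal Pod Autoscaler stabilizes recommendations, written for illustration: the function name and the sample history are assumptions, and rate-limiting policies are left out.

```python
def stabilize(current_replicas, desired_replicas, history, now,
              scale_up_window, scale_down_window):
    """Pick a replica count from recent recommendations.

    history is a list of (timestamp_seconds, recommended_replicas) pairs.
    Scaling up is limited to the lowest recommendation seen inside the
    scale-up window; scaling down is limited to the highest one seen
    inside the scale-down window.
    """
    up_limit = desired_replicas
    down_limit = desired_replicas
    for timestamp, recommendation in history:
        age = now - timestamp
        if age < scale_up_window:
            up_limit = min(up_limit, recommendation)
        if age < scale_down_window:
            down_limit = max(down_limit, recommendation)
    result = current_replicas
    if result < up_limit:
        result = up_limit
    if result > down_limit:
        result = down_limit
    return result


# Hypothetical: a dip to 3 desired replicas shortly after a 10-replica peak.
replicas = stabilize(current_replicas=4, desired_replicas=3,
                     history=[(50, 10), (280, 6)], now=300,
                     scale_up_window=30, scale_down_window=300)
```

Here the deployment stays at 4 replicas because a higher recommendation is still inside the 300-second scale-down window. The short 30-second scale-up window lets a rising forecast take effect quickly, while the longer scale-down window keeps a brief dip from removing capacity that the next predicted spike would need.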