

Use events from OpenTelemetry to trigger autoscaling with Kedify and KEDA

Autoscaler for Kubernetes workloads based on real-time metrics from Prometheus, OpenTelemetry, and other compliant sources.

[Diagram: OpenTelemetry Scaler architecture]

Overview of OpenTelemetry Scaler in Kedify

The OpenTelemetry (OTEL) Scaler is designed to enable precise, data-driven scaling for Kubernetes workloads. Using OpenTelemetry, it can capture a wide range of metrics, allowing the Kedify Scaler to adjust resources dynamically, improving response times while reducing costs. It uses a push-based approach for metrics, which offers significant advantages over traditional pull-based models such as Prometheus scraping.
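
Because the scaler accepts pushed metrics, a standard OpenTelemetry Collector pipeline can forward them directly. Below is a minimal sketch of a Collector configuration, assuming the Kedify OTel scaler is reachable at keda-otel-scaler.keda.svc:4317 over OTLP gRPC (the service name and port are assumptions; adjust them for your installation):

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
exporters:
  otlp:
    # Assumed address of the Kedify OTel scaler; adjust per installation
    endpoint: keda-otel-scaler.keda.svc:4317
    tls:
      insecure: true
service:
  pipelines:
    metrics:
      # Push application metrics straight to the scaler, no scraping involved
      receivers: [otlp]
      exporters: [otlp]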

Key Features

1. Real-Time Metric Collection: Gathers metrics in real time, enabling timely scaling based on traffic demand.

2. Wide Metric Range: Supports a variety of metrics, such as request rates and concurrency, for granular scaling configurations (see the trigger sketch after this list).

3. No Need for a Prometheus Server: Eliminates the need to deploy a Prometheus server, reducing infrastructure overhead and resource usage.

4. Faster Response Times: With a push-based model, metrics are sent directly to KEDA, avoiding the delays typical of scrape intervals and enabling faster scaling responses.

5. Flexible Integration Options: OpenTelemetry's support for multiple protocols and integrations enables streamlined observability setups across diverse environments with minimal configuration.
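
To illustrate a simple case beyond AI workloads, a kedify-otel trigger scaling on HTTP request rate might look like the sketch below. The metric name http_server_request_count and the service label are hypothetical placeholders for whatever your instrumentation actually emits:

triggers:
  - type: kedify-otel
    metadata:
      # Hypothetical metric name and label; substitute the metric your app emits
      metricQuery: 'sum(http_server_request_count{service=my-app})'
      operationOverTime: 'rate'   # scale on the rate of change, not the raw counter
      targetValue: '100'          # assumed target of 100 requests/sec per replica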


Featured Use Cases

Scenario:

Scale AI/ML training workloads dynamically based on metrics such as tokens per minute and GPU memory usage. No Prometheus server is required, enabling real-time scaling that adapts to intensive computational loads.

OpenTelemetry Scaler Usage:

Scaling is driven by AI/ML-specific metrics, such as tokens per minute and GPU memory usage, allowing rapid resource adjustments based on model training demands, without the overhead of a Prometheus setup.

KEDA Usage:

The ScaledObject is configured with kedify-otel triggers for high-frequency training metrics, as in the example below. The metricQuery might specify 'avg(model_training_tokens{model=my_model, job=training})' to adjust replicas based on real-time usage and performance, which is essential for latency-sensitive AI workloads.
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: ai-training
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: ai-training-service
  minReplicaCount: 1
  maxReplicaCount: 15
  triggers:
    # Scale on the rate of tokens processed over time
    - type: kedify-otel
      metadata:
        metricQuery: 'avg(model_training_tokens{model=my_model, job=training})'
        operationOverTime: 'rate'
        targetValue: '500'
    # Scale on average GPU memory usage across training pods
    - type: kedify-otel
      metadata:
        metricQuery: 'avg(gpu_memory_usage{model=my_model, job=training})'
        targetValue: '800'