

Use events from OpenTelemetry to trigger autoscaling with Kedify and KEDA

Autoscaler for Kubernetes workloads based on real-time metrics from Prometheus, OpenTelemetry, and other compliant sources.

[Diagram: OpenTelemetry Scaler architecture]

Overview of OpenTelemetry Scaler in Kedify

The OpenTelemetry (OTEL) Scaler is designed to enable precise, data-driven scaling for Kubernetes workloads. Using OpenTelemetry, it can capture a wide range of metrics, allowing the Kedify Scaler to adjust resources dynamically, improving response times while keeping costs in check. It uses a push-based approach to metrics delivery, which offers significant advantages over traditional pull-based models such as Prometheus scraping.
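As a sketch of this push-based flow, the configuration below shows a minimal OpenTelemetry Collector pipeline that receives application metrics over OTLP and forwards them to the scaler. The scaler address keda-otel-scaler.keda.svc:4317 is an assumption for illustration; substitute the service address of your Kedify OTel scaler installation.

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317  # applications push metrics here over OTLP/gRPC
exporters:
  otlp:
    # Assumed scaler address; replace with your Kedify OTel scaler service
    endpoint: keda-otel-scaler.keda.svc:4317
    tls:
      insecure: true
service:
  pipelines:
    metrics:
      receivers: [otlp]
      exporters: [otlp]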

Key Features

1. Real-Time Metric Collection: Gathers metrics in real time, enabling timely scaling based on traffic demand.

2. Wide Metric Range: Supports a variety of metrics, such as request rates and concurrency, for granular scaling configurations.

3. No Need for a Prometheus Server: The push-based approach eliminates the need to deploy a Prometheus server, reducing infrastructure overhead and resource usage.

4. Faster Response Times: With a push-based model, metrics are sent directly to KEDA, avoiding the delays typical of scrape intervals and allowing faster scaling responses.

5. Flexible Integration Options: OpenTelemetry’s support for multiple protocols and integrations enables streamlined observability setups across diverse environments with minimal configuration (see the Collector sketch after this list).
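To illustrate those integration options, here is a hedged sketch that fans several sources into the same push pipeline: the Collector accepts OTLP over both gRPC and HTTP, and also scrapes an existing Prometheus endpoint, so workloads that already expose Prometheus metrics can feed the scaler without running a Prometheus server. The prometheus receiver ships with the Collector's contrib distribution, and the ai-training-service:9090 target is hypothetical.

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318
  prometheus:  # contrib receiver; scrapes existing Prometheus-format endpoints
    config:
      scrape_configs:
        - job_name: training
          static_configs:
            - targets: ['ai-training-service:9090']  # hypothetical scrape target
exporters:
  otlp:
    endpoint: keda-otel-scaler.keda.svc:4317  # assumed scaler address, as above
    tls:
      insecure: true
service:
  pipelines:
    metrics:
      receivers: [otlp, prometheus]
      exporters: [otlp]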


Featured Use Cases

Scenario:

Scale AI/ML training workloads dynamically based on metrics such as tokens per minute and GPU memory usage. Because no Prometheus server sits in the metrics path, scaling reacts in real time as intensive computational loads shift.

OpenTelemetry Scaler Usage:

Scaling is driven by AI/ML-specific metrics, such as tokens per minute and GPU memory usage, allowing resources to adjust rapidly to model-training demand without the overhead of a Prometheus setup.

KEDA Usage:

The ScaledObject is configured with kedify-otel triggers for high-frequency training metrics. For example, the metricQuery might specify "avg(model_training_tokens{model=my_model, job=training})" to adjust replicas based on real-time usage and performance, which is essential for latency-sensitive AI workloads.
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: ai-training
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: ai-training-service
  minReplicaCount: 1
  maxReplicaCount: 15
  triggers:
    # Scale on the rate of tokens processed over time
    - type: kedify-otel
      metadata:
        metricQuery: 'avg(model_training_tokens{model=my_model, job=training})'
        operationOverTime: 'rate'
        targetValue: '500'
    # Scale on average GPU memory usage across training pods
    - type: kedify-otel
      metadata:
        metricQuery: 'avg(gpu_memory_usage{model=my_model, job=training})'
        targetValue: '800'
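In this example, the first trigger applies operationOverTime: 'rate' so replicas track how fast tokens are being processed, targeting roughly 500 per replica, while the second trigger scales on the average GPU memory usage with a target of 800. When a ScaledObject defines multiple triggers, KEDA evaluates each one independently and the underlying HPA scales to the highest replica count any trigger requests, so whichever signal is under the most pressure wins.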