Speaker: Jiri Kremser & Vincent Hou
Event: KubeCon Europe 2025
April 03, 2025
Balancing resource provisioning for LLM workloads is critical for maintaining both cost efficiency and service quality. Kubernetes' Horizontal Pod Autoscaling offers a cloud-native capability to address these challenges, relying on metrics to make autoscaling decisions. However, the efficiency of metrics collection affects how quickly and accurately the autoscaler responds to LLM workload demands. This session explores strategies to enhance metrics collection for autoscaling LLM workloads with: