Amigo builds AI agents for the medical field, supporting hospitals, clinics, and online medical services. A key part of their mission is trust and reliability, ensuring their AI systems produce predictable outcomes even in high-stakes environments.
Amigo runs Kubernetes with one cluster per region and environment. Inside those clusters, they operate a monolithic backend server, multiple asynchronous workers, and custom AI workloads including GPU-based services such as text-to-speech engines. Each of these workloads needs to scale in a different way, and Amigo needed a consistent approach that could handle all of them without introducing complexity or unpredictable behavior.
Their main goal was to guarantee performance under bursty traffic, reacting quickly when demand spikes so the platform remains responsive.
Amigo chose Kedify to unify autoscaling across their Kubernetes workloads using one consistent framework and language, while still being able to scale each service based on the metric that best matches its behavior.
With Kedify, Amigo can define scaling rules in a predictable and repeatable way across very different workload types. Instead of treating every service the same, they can scale based on real demand signals such as HTTP request rate, WebSocket concurrency, queue depth, and Kubernetes workload state. This made it fast for the team to make new workloads autoscalable while keeping scaling behavior consistent and reliable.
Below are examples of the autoscaling patterns Amigo uses with Kedify. This is a subset of the scaling strategies they apply across their stack.
HTTP scaling for backend services
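A minimal sketch of what this pattern can look like as a ScaledObject using Kedify's HTTP scaler. All names, hosts, ports, and thresholds are illustrative placeholders rather than Amigo's actual configuration, and the exact metadata keys should be checked against Kedify's HTTP scaler documentation:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: backend-http            # hypothetical name
spec:
  scaleTargetRef:
    name: backend               # the Deployment to scale (placeholder)
  minReplicaCount: 2
  maxReplicaCount: 20
  triggers:
    - type: kedify-http
      metadata:
        hosts: api.example.com        # hostname whose traffic drives scaling (placeholder)
        service: backend              # Service receiving the traffic
        port: "8080"
        scalingMetric: requestRate    # scale on HTTP requests per second
        targetValue: "100"            # target request rate per replica (illustrative)
```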
WebSocket scaling
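For long-lived WebSocket connections, request rate is a weak signal, so the same HTTP scaler can target concurrency instead. A sketch of just the trigger, which would slot into a ScaledObject like the one above; the service name and target are again assumptions:

```yaml
triggers:
  - type: kedify-http
    metadata:
      hosts: ws.example.com        # placeholder host
      service: ws-gateway          # hypothetical WebSocket-serving Service
      port: "8080"
      scalingMetric: concurrency   # scale on concurrent open connections
      targetValue: "500"           # target concurrent connections per replica (illustrative)
```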
Amazon SQS scaling for asynchronous workers
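Asynchronous workers can scale on queue depth with KEDA's standard aws-sqs-queue scaler. The queue URL and the referenced TriggerAuthentication are placeholders; queueLength sets the target number of visible messages per replica:

```yaml
triggers:
  - type: aws-sqs-queue
    metadata:
      queueURL: https://sqs.us-east-1.amazonaws.com/123456789012/worker-queue  # placeholder queue
      queueLength: "5"        # target messages per replica
      awsRegion: us-east-1
    authenticationRef:
      name: aws-credentials   # assumed TriggerAuthentication holding AWS credentials
```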
Kubernetes workload-based scaling for multi-component AI services
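One way to keep a dependent component, such as a GPU text-to-speech engine, sized in proportion to another workload is KEDA's kubernetes-workload scaler, which scales on the number of pods matching a label selector. The selector and ratio here are hypothetical:

```yaml
triggers:
  - type: kubernetes-workload
    metadata:
      podSelector: 'app=session-worker'  # pods of the driving component (hypothetical label)
      value: '4'                         # target matching pods per replica of the scaled workload
```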
“We chose Kedify because it lets us autoscale everything, from HTTP to SQS to GPU AI engines, in one consistent framework.”
Yi Hong
Member of Technical Staff, Amigo
Amigo selected Kedify because they wanted one uniform autoscaling framework they could apply across services that behave very differently, while still scaling each workload using the most relevant metric. Kedify gave them predictable scaling behavior and made it quick to turn new services into autoscaled workloads without having to build custom tooling for every use case.
The team also valued how responsive Kedify was to real-world needs. Amigo required autoscaling based on a value stored inside a Kubernetes Secret, and Kedify implemented that feature quickly after the request. That flexibility helped Amigo move faster while keeping scaling logic aligned with how their platform actually operates.
“Our custom feature request was implemented extremely quickly and made a real difference.”
Yi Hong
Member of Technical Staff, Amigo
With Kedify, Amigo improved reliability and performance under bursty demand by ensuring workloads scale quickly and predictably based on the right signals for each service.
This shift reduced operational overhead for the team, improved confidence in scaling behavior, and helped ensure consistent performance across backend APIs, async workers, and GPU-based AI services.
Industry
AI platform for the healthcare sector delivering trusted and reliable agents for medical use cases
Overview
Kedify helped Amigo implement a unified, metric-driven autoscaling framework that spans multiple workload types and ensures predictable performance at scale.
Amigo unified HTTP, WebSocket, and queue-based autoscaling under one framework
Faster reaction to load spikes with less overhead
A single scaling framework applied across highly diverse infrastructure
Looking to learn more hands-on?
Let a Kedify team member show you what you have been missing
Get Started
Please reach out for more information or to try a demo:
www.kedify.io