Directing requests to the right model, chain, or agent based on intent and constraints.
Try cheaper models first and escalate to more capable ones only when the cheaper model's confidence falls below a threshold, reducing costs while maintaining output quality.
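A minimal sketch of the cascade idea, assuming each model is a callable that returns an answer together with a confidence score in [0, 1]. The names `cascade`, `small_model`, `large_model`, and the 0.8 threshold are hypothetical placeholders, not part of any particular library; how a confidence score is actually obtained (log-probabilities, a verifier model, self-evaluation) is left open here.

```python
from typing import Callable, List, Tuple

# Assumption: a model takes a query and returns (answer, confidence in [0, 1]).
Model = Callable[[str], Tuple[str, float]]


def cascade(query: str, tiers: List[Tuple[str, Model]], threshold: float = 0.8) -> Tuple[str, str]:
    """Try tiers from cheapest to most capable; return (tier_name, answer)."""
    if not tiers:
        raise ValueError("at least one tier is required")
    # Every tier except the last must clear the confidence threshold.
    for name, model in tiers[:-1]:
        answer, confidence = model(query)
        if confidence >= threshold:
            return name, answer
    # The last tier is the fallback: its answer is accepted regardless of confidence.
    name, model = tiers[-1]
    answer, _ = model(query)
    return name, answer


# Stand-in models for illustration only; real ones would call an LLM API.
def small_model(query: str) -> Tuple[str, float]:
    # Pretend the small model is only confident on short queries.
    confidence = 0.9 if len(query.split()) <= 5 else 0.4
    return "small: " + query, confidence


def large_model(query: str) -> Tuple[str, float]:
    return "large: " + query, 0.95


TIERS = [("small", small_model), ("large", large_model)]
```

With these stand-ins, `cascade("What is 2 + 2?", TIERS)` is answered by the small tier, while a longer query falls through to the large one. The cost saving depends on how often the cheap tier clears the threshold, so the threshold is usually tuned against a labeled sample.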
Route queries to the right model tier based on estimated complexity to optimize cost without sacrificing quality on harder tasks.
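As a rough illustration of complexity-based routing, the sketch below scores a query and maps the score to a tier. The scoring heuristic (query length plus a few reasoning words), the tier names, and the cut-off values are all assumptions made for the example; a production system would more likely use a trained classifier as the estimator.

```python
# Assumption: words that hint at multi-step reasoning. Purely illustrative.
REASONING_MARKERS = {"prove", "derive", "compare", "analyze", "step-by-step", "why"}

# Assumption: (exclusive upper bound on score, tier name), cheapest first.
TIER_BOUNDS = [(0.2, "small"), (0.5, "medium")]


def estimate_complexity(query: str) -> float:
    """Crude stand-in score in [0, 1] built from length and reasoning markers."""
    words = query.lower().split()
    length_score = min(len(words) / 50.0, 1.0)
    marker_hits = sum(w.strip(".,?!") in REASONING_MARKERS for w in words)
    marker_score = min(marker_hits / 2.0, 1.0)
    return 0.5 * length_score + 0.5 * marker_score


def route_by_complexity(query: str) -> str:
    """Pick the cheapest tier whose bound exceeds the estimated complexity."""
    score = estimate_complexity(query)
    for upper_bound, tier in TIER_BOUNDS:
        if score < upper_bound:
            return tier
    return "large"  # anything above the last bound goes to the most capable tier
```

Unlike a cascade, this decides the tier before any model is called, so a hard query pays for only one call, at the price of depending on how well the estimator predicts difficulty.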
Classify query intent using embeddings and route to the appropriate handler, tool, or agent pipeline without relying on keyword rules.
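The routing step can be sketched as nearest-centroid matching over embeddings. To keep the example self-contained, `embed` below is a toy bag-of-words stand-in, which is an assumption of this sketch; in practice it would be replaced by a real sentence-embedding model, and that replacement is what lets the router match paraphrases rather than shared words. The route names, example utterances, and the 0.2 similarity floor are likewise hypothetical.

```python
import math
import re
from collections import Counter
from typing import Dict, List


def embed(text: str) -> Counter:
    """Toy stand-in: sparse bag-of-words vector. Swap in a real embedding model."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


# Assumption: each route is described by a few example utterances.
ROUTES: Dict[str, List[str]] = {
    "billing": ["refund my payment", "update my credit card", "invoice is wrong"],
    "tech_support": [
        "the app crashes on startup",
        "cannot log in to my account",
        "error when uploading a file",
    ],
}

# One centroid per route: the sum of its example embeddings
# (summing instead of averaging does not change cosine similarity).
CENTROIDS = {
    name: sum((embed(example) for example in examples), Counter())
    for name, examples in ROUTES.items()
}


def route_by_intent(query: str, min_similarity: float = 0.2, fallback: str = "general") -> str:
    """Return the most similar route, or the fallback if nothing is close enough."""
    query_vec = embed(query)
    best_name, best_score = fallback, 0.0
    for name, centroid in CENTROIDS.items():
        score = cosine(query_vec, centroid)
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= min_similarity else fallback
```

The similarity floor matters: without it, every query is forced into some route, including ones that belong to none of them.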