LoRAX

Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs

LoRAX (LoRA eXchange) is a framework that allows users to serve thousands of fine-tuned models on a single GPU, dramatically reducing the cost of serving without compromising on throughput or latency.

https://github.com/predibase/lorax
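
The core idea is that one base model stays resident on the GPU while many small LoRA adapters are loaded and swapped per request. Below is a minimal sketch of querying such a server over HTTP; the local URL, the TGI-style /generate endpoint, the adapter_id parameter, and the adapter names are assumptions based on LoRAX's documented request format, so check them against the current docs before relying on them.

```python
# Minimal sketch: send prompts to a running LoRAX server, each request
# optionally routed through a different fine-tuned LoRA adapter on top of
# the same base model.
# Assumptions: the server listens on localhost:8080 and exposes a
# TGI-style /generate endpoint accepting an "adapter_id" parameter.
import requests

LORAX_URL = "http://127.0.0.1:8080/generate"  # assumed local deployment


def generate(prompt: str, adapter_id: str | None = None, max_new_tokens: int = 64) -> str:
    """Request a completion, optionally through a specific LoRA adapter."""
    parameters = {"max_new_tokens": max_new_tokens}
    if adapter_id is not None:
        parameters["adapter_id"] = adapter_id  # e.g. a Hugging Face Hub repo id
    response = requests.post(
        LORAX_URL,
        json={"inputs": prompt, "parameters": parameters},
        timeout=60,
    )
    response.raise_for_status()
    return response.json()["generated_text"]


if __name__ == "__main__":
    # Same server and base model, two different adapters: only the small
    # LoRA weights differ between the two requests.
    print(generate("Summarize: LoRAX serves many adapters on one GPU.",
                   adapter_id="some-org/summarization-lora"))  # hypothetical adapter
    print(generate("Translate to French: Hello, world.",
                   adapter_id="some-org/translation-lora"))    # hypothetical adapter
```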
