# general
:hello-dog: Hello folks, we recently published a blog where we benchmarked three 7-billion-parameter large language models (LLMs) 🚀 — Llama 2, Mistral, and Gemma — with six different inference engines (vLLM, TensorRT-LLM, DeepSpeed-MII, CTranslate2, TGI, and Triton Server + vLLM). 📊 This blog can help you identify the right model for your use case and understand which model will give you the best throughput with the right inference engine. 🔗 Check out the full report here: https://www.inferless.com/learn/exploring-llms-speed-benchmarks-independent-analysis For any discussion, just drop me a message 🙂
Hey @Rajdeep Borgohain, only 20% of the content shared by someone in the community can be links.
The rest has to be actual engagement like discussions, questions, etc.
Since this is your first post I won’t delete it, but please keep that in mind.