Tutorial Intro
Let's discover LLMs.
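Basic Introduction to the Frameworks
Which inference and deployment frameworks for large language models and multimodal models are the most popular today? The model card for DeepSeek-V3 offers some hints.
Several framework names stand out there: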
- LMDeploy: Developed jointly by the MMDeploy and MMRazor teams, LMDeploy is a complete solution for compressing, deploying, and serving LLMs.
- vLLM: A high-throughput and memory-efficient inference and serving engine for LLMs.
- TensorRT-LLM: NVIDIA's open-source inference library. TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.