Tutorial Intro

Let's discover LLMs.

A Basic Introduction to the Frameworks

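Which inference and deployment frameworks for large language models and multimodal models are the most popular today? The model introduction page for DeepSeek-V3 offers some clues.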

Several prominent framework names stand out there.

  • LMDeploy: Developed jointly by the MMDeploy and MMRazor teams, LMDeploy is a complete solution for compressing, deploying, and serving LLMs.

  • vLLM: A high-throughput and memory-efficient inference and serving engine for LLMs.

  • TensorRT-LLM: NVIDIA's open-source inference library. TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.