vita_audio_on_vllm

info

vita-audio

https://github.com/VITA-MLLM/VITA-Audio

vllm

https://github.com/vllm-project/vllm

https://zhuanlan.zhihu.com/p/27662961075

vLLM documentation: Adding a New Model
https://docs.vllm.ai/en/latest/contributing/model/index.html
Adding a New Model

  1. Fork the vLLM repository.
  2. Bring your model code.
  3. Make your code compatible with vLLM (initialization code and computation code).
    1. (Optional) Implement tensor parallelism and quantization support.
  4. Implement the weight loading logic.
  5. Register your model.
  6. Out-of-Tree Model Integration.
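For steps 5 and 6, a minimal sketch of out-of-tree registration might look like the following. `ModelRegistry.register_model` and `get_supported_archs` are the calls shown in the vLLM docs; the module path `vita_audio_vllm.modeling` is a hypothetical placeholder for wherever the ported model class ends up, and the architecture name is the one found in the VITA-Audio weights repository.

```python
# Architecture name as given in the weights repository.
ARCH = "Qwen2MTPSenseVoiceForCausalLM"
# Hypothetical "module:Class" location of the ported model code.
MODEL_REF = "vita_audio_vllm.modeling:Qwen2MTPSenseVoiceForCausalLM"


def register():
    """Register the model with vLLM if it is not already known."""
    # Imported lazily so this file can be imported without vLLM installed.
    from vllm import ModelRegistry

    if ARCH not in ModelRegistry.get_supported_archs():
        # Passing a "module:Class" string defers importing the model code
        # until vLLM actually needs it.
        ModelRegistry.register_model(ARCH, MODEL_REF)
```

Calling `register()` before constructing the engine (or exposing it as a vLLM plugin entry point) should be enough for vLLM to resolve the architecture name, assuming the model class itself is already vLLM-compatible.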

problems

  • VITA-Audio needs extra dependencies: pip install funasr omegaconf
  • VITA-Audio predicts 10 audio tokens directly from historical inputs and LLM hidden states, without requiring additional LLM forward passes
  • The model name given in the weights repository is Qwen2MTPSenseVoiceForCausalLM
  • Model registration is split into two places: encoder-decoder and decoder-only
  • export VLLM_LOGGING_LEVEL=DEBUG turns on vLLM DEBUG log output
  • export PYTHONDONTWRITEBYTECODE=1 stops Python from writing pycache (.pyc) files
  • --disable-frontend-multiprocessing disables multiprocessing in the frontend

TODO

06-10 -> 06-12

  • Study the Transformer
  • VITA-Audio paper notes
  • Learn Obsidian?
  • Draw diagrams?

VLLM

LLM
└── LLM Engine
    ├── Scheduler
    │   ├── Block Manager
    │   │   └── Block Allocator
    │   │       └── Physical Token Block
    │   └── policy
    └── executor
        └── worker
            ├── Cache engine
            │   └── KV cache Tensor
            └── Model Runner
                └── model
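To make the Scheduler → Block Manager → Block Allocator → Physical Token Block branch of the tree more concrete, here is a toy sketch of how paged KV cache bookkeeping could work. It is a simplified illustration, not vLLM's actual implementation: the class names only mirror the tree, and every method and field is an assumption.

```python
import math
from dataclasses import dataclass


@dataclass
class PhysicalTokenBlock:
    """One fixed-size slot of KV cache memory."""
    block_number: int
    block_size: int
    ref_count: int = 0


class BlockAllocator:
    """Hands out free physical blocks and takes them back."""

    def __init__(self, num_blocks: int, block_size: int):
        self.block_size = block_size
        self.free_blocks = [
            PhysicalTokenBlock(i, block_size) for i in range(num_blocks)
        ]

    def allocate(self) -> PhysicalTokenBlock:
        if not self.free_blocks:
            raise RuntimeError("out of KV cache blocks")
        block = self.free_blocks.pop()
        block.ref_count = 1
        return block

    def free(self, block: PhysicalTokenBlock) -> None:
        block.ref_count -= 1
        if block.ref_count == 0:
            self.free_blocks.append(block)


class BlockManager:
    """Maps each sequence to the physical blocks holding its KV cache."""

    def __init__(self, allocator: BlockAllocator):
        self.allocator = allocator
        self.block_tables = {}  # seq_id -> list of PhysicalTokenBlock

    def blocks_needed(self, num_tokens: int) -> int:
        return math.ceil(num_tokens / self.allocator.block_size)

    def can_allocate(self, num_tokens: int) -> bool:
        return self.blocks_needed(num_tokens) <= len(self.allocator.free_blocks)

    def allocate(self, seq_id: str, num_tokens: int) -> None:
        if not self.can_allocate(num_tokens):
            raise RuntimeError("not enough free blocks for this sequence")
        self.block_tables[seq_id] = [
            self.allocator.allocate()
            for _ in range(self.blocks_needed(num_tokens))
        ]

    def free(self, seq_id: str) -> None:
        for block in self.block_tables.pop(seq_id):
            self.allocator.free(block)


manager = BlockManager(BlockAllocator(num_blocks=4, block_size=16))
manager.allocate("seq-0", num_tokens=33)  # 33 tokens need 3 blocks of 16
print(len(manager.block_tables["seq-0"]), manager.can_allocate(17))
manager.free("seq-0")  # all 4 blocks are free again
```

In this picture the Scheduler would ask `can_allocate` before admitting a request, and the Cache engine would own the real KV cache tensors that the block numbers index into.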
