LLM Inference 01: Lookahead Decoding Performance Test

Terminology: LADE is short for lookahead decoding.
Introduction: overview of the LADE method: https://lmsys.org/blog/2023-11-21-lookahead-decoding/ LADE GitHub repository: https://github.com/hao-ai-lab/LookaheadDecoding
Test procedure: download and install: git clone https://github.c...

Published in LLM