๐A curated list of Awesome LLM/VLM Inference Papers with codes: WINT8/4, Flash-Attention, Paged-Attention, Parallelism, etc. ๐๐
mla sora vllm llm-inference awesome-llm flash-attention tensorrt-llm paged-attention deepseek flash-attention-3 deepseek-v3 minimax-01 deepseek-r1
-
Updated
Feb 19, 2025