Update run-vllm.md: pin vLLM to 0.10.1+gptoss (#1990)
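
The install step now pins an exact build, so it can be worth confirming that the environment actually resolved to it. A minimal sketch, assuming the virtual environment from the article is active: `check_vllm_pin` and `EXPECTED_VLLM` are hypothetical names introduced here for illustration and are not part of the article or of vLLM.

```shell
# Hypothetical helper (not part of the article): compare a version string
# against the pin used in the install command.
EXPECTED_VLLM="0.10.1+gptoss"

check_vllm_pin() {
    # $1: version string to check; prints a verdict, returns 0 only on a match
    if [ "$1" = "$EXPECTED_VLLM" ]; then
        echo "ok: vllm $1"
    else
        echo "mismatch: expected $EXPECTED_VLLM, found ${1:-nothing}"
        return 1
    fi
}

# Read the installed version from package metadata; empty if vllm is absent.
installed="$(python3 -c 'from importlib.metadata import version; print(version("vllm"))' 2>/dev/null || true)"
check_vllm_pin "$installed" || true
```

A mismatch would suggest the resolver picked a different index or build than the one pinned here, in which case the `--extra-index-url` and `--index-strategy` flags are the first thing to re-check.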

volsgd · web-flow · commit 33cae367628f · 2025-08-05T14:29:53.000-07:00
diff --git a/articles/gpt-oss/run-vllm.md b/articles/gpt-oss/run-vllm.md
@@ -26,7 +26,10 @@ Both models are **MXFP4 quantized** out of the box.
 ```shell
 uv venv --python 3.12 --seed
 source .venv/bin/activate
-uv pip install vllm --torch-backend=auto
+uv pip install --pre vllm==0.10.1+gptoss \
+    --extra-index-url https://wheels.vllm.ai/gpt-oss/ \
+    --extra-index-url https://download.pytorch.org/whl/nightly/cu128 \
+    --index-strategy unsafe-best-match
 ```
 
 2. **Start up a server and download the model**