Commit d0b5fa9

committed
typo
1 parent 6ea5cdb commit d0b5fa9

1 file changed: +1 -1 lines changed

inference/vllm/README.md

Lines changed: 1 addition & 1 deletion
@@ -93,7 +93,7 @@ working on improving this performance so stay tuned!
 
 ## Pre-sharded Quantized Checkpoint
 
-The main Arctic checkpoint is ~900GB of bfloat16 weights, which may be cumbersome if being moved or loaded into vLLM frequenctly. To assuage this issue, we've also created a checkpoint that's already quantized to fp8 using DeepSpeed. This checkpoint is ~460GB and is only compatible with vLLM using tensor-parallelism of size 8.
+The main Arctic checkpoint is ~900GB of bfloat16 weights, which may be cumbersome if being moved or loaded into vLLM frequently. To assuage this issue, we've also created a checkpoint that's already quantized to fp8 using DeepSpeed. This checkpoint is ~460GB and is only compatible with vLLM using tensor-parallelism of size 8.
 
 Checkpoint: https://huggingface.co/Snowflake/snowflake-arctic-instruct-vllm
 