Commit fa02a65

Update README.md
1 parent 3faf2d2 commit fa02a65

File tree: 1 file changed (+7 lines, -3 lines)


inference/README.md

Lines changed: 7 additions & 3 deletions
@@ -12,7 +12,6 @@ now you will need to use our forks.
 
 ```bash
 deepspeed>=0.14.2
-git+git://github.com/Snowflake-Labs/transformers.git@arctic
 git+git://github.com/Snowflake-Labs/vllm.git@arctic
 huggingface_hub[hf_transfer]
 ```
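The first hunk drops the Snowflake-Labs transformers fork from the requirements, leaving only the vllm fork pinned. As a quick post-install sanity check, the remaining runtime dependencies can be queried with the standard-library `importlib.metadata` (a minimal sketch, not part of the README; the package names are taken from the requirement lines above):

```python
from importlib.metadata import PackageNotFoundError, version

def installed_version(pkg: str):
    """Return the installed version string for pkg, or None if it is absent."""
    try:
        return version(pkg)
    except PackageNotFoundError:
        return None

# The runtime dependencies named in the updated requirements list.
for pkg in ("deepspeed", "vllm", "huggingface_hub"):
    print(pkg, installed_version(pkg) or "not installed")
```

This reports "not installed" rather than raising, so it is safe to run before or after the `pip install` step.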
@@ -39,14 +38,19 @@ os.environ["HF_HUB_ENABLE_HF_TRANSFER"] = "1"
 import torch
 from transformers import AutoModelForCausalLM, AutoTokenizer
 from transformers.models.arctic.configuration_arctic import ArcticQuantizationConfig
+from deepspeed.linear.config import QuantizationConfig
 
-tokenizer = AutoTokenizer.from_pretrained("Snowflake/snowflake-arctic-instruct")
+tokenizer = AutoTokenizer.from_pretrained(
+    "Snowflake/snowflake-arctic-instruct",
+    trust_remote_code=True
+)
 
-quant_config = ArcticQuantizationConfig(q_bits=8)
+quant_config = QuantizationConfig(q_bits=8)
 
 model = AutoModelForCausalLM.from_pretrained(
     "Snowflake/snowflake-arctic-instruct",
     low_cpu_mem_usage=True,
+    trust_remote_code=True,
     device_map="auto",
     ds_quantization_config=quant_config,
     max_memory={i: "150GiB" for i in range(8)},
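The second hunk switches the quantization config from the transformers fork's `ArcticQuantizationConfig` to DeepSpeed's `QuantizationConfig` and adds `trust_remote_code=True`, which is required because the Arctic model code is loaded from the Hub. The `max_memory={i: "150GiB" for i in range(8)}` argument caps per-GPU usage on an 8-GPU node. A GPU-free sketch of what that comprehension evaluates to and the total budget it implies (the arithmetic here is illustrative, not part of the README):

```python
# Per-device memory caps, in the form passed to from_pretrained:
# keys are GPU indices, values are human-readable size strings.
max_memory = {i: "150GiB" for i in range(8)}

# Strip the "GiB" suffix and sum: 8 devices x 150 GiB = 1200 GiB total.
total_gib = sum(int(cap.removesuffix("GiB")) for cap in max_memory.values())

print(len(max_memory), total_gib)  # 8 1200
```

At `q_bits=8` the weights take roughly one byte per parameter, about half of fp16, which is why a budget on this order can hold Arctic's roughly 480B parameters across eight devices.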
