
Commit 686f1ba

Merge pull request Snowflake-Labs#11 from Snowflake-Labs/trust_remote_code
Update inference tutorials to use trust_remote_code instead of transformers fork
2 parents: 9fe7ac7 + 2b3b6eb

2 files changed: +8 -7 lines changed

inference/README.md

Lines changed: 7 additions & 4 deletions

````diff
@@ -12,7 +12,6 @@ now you will need to use our forks.
 
 ```bash
 deepspeed>=0.14.2
-git+git://github.com/Snowflake-Labs/transformers.git@arctic
 git+git://github.com/Snowflake-Labs/vllm.git@arctic
 huggingface_hub[hf_transfer]
 ```
@@ -38,15 +37,19 @@ os.environ["HF_HUB_ENABLE_HF_TRANSFER"] = "1"
 
 import torch
 from transformers import AutoModelForCausalLM, AutoTokenizer
-from transformers.models.arctic.configuration_arctic import ArcticQuantizationConfig
+from deepspeed.linear.config import QuantizationConfig
 
-tokenizer = AutoTokenizer.from_pretrained("Snowflake/snowflake-arctic-instruct")
+tokenizer = AutoTokenizer.from_pretrained(
+    "Snowflake/snowflake-arctic-instruct",
+    trust_remote_code=True
+)
 
-quant_config = ArcticQuantizationConfig(q_bits=8)
+quant_config = QuantizationConfig(q_bits=8)
 
 model = AutoModelForCausalLM.from_pretrained(
     "Snowflake/snowflake-arctic-instruct",
     low_cpu_mem_usage=True,
+    trust_remote_code=True,
     device_map="auto",
     ds_quantization_config=quant_config,
     max_memory={i: "150GiB" for i in range(8)},
````
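For context on why the diff adds `trust_remote_code=True`: instead of shipping Arctic's modeling code in a transformers fork, the code lives in the checkpoint repository and is imported dynamically at load time, gated behind an explicit opt-in. The sketch below is a toy illustration of that dynamic-import mechanism, not transformers' actual implementation; the file name `modeling_arctic.py`, the `ArcticModel` class, and the `load_remote_class` helper are all invented for this example.

```python
import importlib.util
import pathlib
import tempfile

# Toy stand-in for a checkpoint repo that ships its own modeling code.
repo = pathlib.Path(tempfile.mkdtemp())
(repo / "modeling_arctic.py").write_text(
    "class ArcticModel:\n"
    "    def __init__(self):\n"
    "        self.loaded_from = 'remote code'\n"
)

def load_remote_class(repo_dir, module_file, class_name, trust_remote_code=False):
    # Mirrors the safety gate: refuse to execute repo-shipped code unless opted in.
    if not trust_remote_code:
        raise ValueError("Loading this model requires trust_remote_code=True")
    spec = importlib.util.spec_from_file_location("dyn_mod", repo_dir / module_file)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # executes the checkpoint's Python code
    return getattr(module, class_name)

ArcticModel = load_remote_class(
    repo, "modeling_arctic.py", "ArcticModel", trust_remote_code=True
)
print(ArcticModel().loaded_from)  # prints: remote code
```

The flag exists because this path executes arbitrary Python from the downloaded repository, which is why transformers requires callers to opt in explicitly rather than enabling it by default.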

inference/requirements.txt

Lines changed: 1 addition & 3 deletions
```diff
@@ -1,4 +1,2 @@
 deepspeed>=0.14.2
-git+git://github.com/Snowflake-Labs/transformers.git@arctic
-git+git://github.com/Snowflake-Labs/vllm.git@arctic
-huggingface_hub[hf_transfer]
+huggingface_hub[hf_transfer]
```
