Commit cfe0330

Merge pull request inclusionAI#12 from inclusionAI/zhujiangang-patch-1
Update README.md
2 parents 35d38cb + 059fedc commit cfe0330

File tree

1 file changed: +13 −0 lines changed


README.md

Lines changed: 13 additions & 0 deletions
@@ -125,6 +125,19 @@ outputs = llm.generate([text], sampling_params)
 
 ```
 
+We utilize YaRN in vLLM to handle long context by adding a `rope_scaling` field to the `config.json` file of the model. For example,
+
+```json
+{
+    ...,
+    "rope_scaling": {
+        "factor": 4.0,
+        "original_max_position_embeddings": 16384,
+        "type": "yarn"
+    }
+}
+```
+
 #### Online Inference:
 
 ```bash

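For reference (not part of the commit), below is a minimal sketch of applying the addition above: it patches a local checkpoint's `config.json` with the `rope_scaling` block from the diff and then loads the model with vLLM's offline `LLM` API, mirroring the `llm.generate([text], sampling_params)` call visible in the hunk context. The model path, prompt, and sampling settings are placeholders, not values from the commit.

```python
import json
from pathlib import Path

from vllm import LLM, SamplingParams

# Hypothetical local checkpoint directory; replace with the actual model path.
model_dir = Path("/path/to/model")
config_path = model_dir / "config.json"

# Add the YaRN rope_scaling block shown in the diff to the model's config.json.
config = json.loads(config_path.read_text())
config["rope_scaling"] = {
    "factor": 4.0,
    "original_max_position_embeddings": 16384,
    "type": "yarn",
}
config_path.write_text(json.dumps(config, indent=2))

# Load the patched model and run a generation, as in the README's offline example.
llm = LLM(model=str(model_dir))
sampling_params = SamplingParams(temperature=0.7, max_tokens=256)
outputs = llm.generate(["<your long prompt here>"], sampling_params)
print(outputs[0].outputs[0].text)
```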