```
export REPO_DIR=<path to the llm-reranker directory>
```

## 1. Retrieval
We use [contriever](https://github.com/facebookresearch/contriever) as the underlying retrieval model. The precomputed query and passage embeddings for BEIR are available [here](https://huggingface.co/datasets/rryisthebest/Contreiever_BEIR_Embeddings).
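If you prefer to fetch the precomputed embeddings programmatically, a minimal sketch with `huggingface_hub` is shown below (assuming the library is installed; the local directory is an arbitrary example, not a path the repository scripts expect):

```python
from huggingface_hub import snapshot_download

# Download the precomputed Contriever BEIR embeddings from the Hugging Face Hub.
# `local_dir` is an arbitrary example path, not a location the repo scripts expect.
snapshot_download(
    repo_id="rryisthebest/Contreiever_BEIR_Embeddings",
    repo_type="dataset",
    local_dir="data/contriever_beir_embeddings",
)
```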

**Note:** If you do not wish to run the retrieval yourself, the retrieval results are provided [here](https://drive.google.com/drive/folders/1eMiqwiTVwJy_Zcss7LQF9hQ1aeTFMZUm?usp=sharing), and you can jump directly to [Reranking](#2-reranking).

To run the contriever retrieval using the precomputed encodings
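The exact retrieval command is not shown in this excerpt. As a rough, illustrative sketch of dense retrieval over precomputed encodings (file names and array layout below are assumptions, not the repository's format):

```python
import numpy as np

# Hypothetical files: query/passage embeddings and passage ids.
# The real layout of the precomputed BEIR embeddings may differ.
query_emb = np.load("queries.npy")        # (num_queries, dim)
passage_emb = np.load("passages.npy")     # (num_passages, dim)
passage_ids = np.load("passage_ids.npy", allow_pickle=True)

scores = query_emb @ passage_emb.T            # inner-product similarity
top_k = np.argsort(-scores, axis=1)[:, :100]  # top-100 passages per query

# TREC-style run: query index -> {passage id: score}
run = {
    str(q): {str(passage_ids[p]): float(scores[q, p]) for p in top_k[q]}
    for q in range(query_emb.shape[0])
}
```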

To get the retrieval scores, run:
```
bash bash/beir/run_eval.sh rank
```

## 2. Reranking
### 2a. Baseline Cross-encoder reranking

To run the baseline cross-encoder reranking, run:
```
bash bash/beir/run_rerank.sh
```
### 2b. FIRST LLM Reranking

To convert the retrieval results to input for LLM reranking, run:

```
bash bash/beir/run_convert_results.sh
```

We provide the trained FIRST reranker [here](https://huggingface.co/rryisthebest/First_Model).
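The checkpoint can be loaded with the `transformers` library, for example (a minimal sketch; prompt construction and the reranking logic are handled by the repository's scripts and are not shown here):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the released FIRST reranker (a causal LM checkpoint) from the Hub.
tokenizer = AutoTokenizer.from_pretrained("rryisthebest/First_Model")
model = AutoModelForCausalLM.from_pretrained("rryisthebest/First_Model")
```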

To run the FIRST reranking, run:

```
bash bash/beir/run_rerank_llm.sh
```

To evaluate the reranking performance, run:

```
bash bash/run_eval.sh rerank
```

**Note:** Set the `--suffix` flag to `llm_FIRST_alpha` for FIRST reranker evaluation or `ce` for the cross-encoder reranker.
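BEIR results are commonly reported as nDCG@10. As a rough illustration of the metric itself (this is not the repository's evaluation code), assuming a reranked list of passage ids and graded relevance judgments for one query:

```python
import math

def ndcg_at_10(ranked_ids, relevance):
    """nDCG@10 for one query.

    ranked_ids: passage ids in reranked order.
    relevance:  dict mapping passage id -> graded relevance (missing ids count as 0).
    """
    def dcg(gains):
        return sum(g / math.log2(i + 2) for i, g in enumerate(gains))

    gains = [relevance.get(pid, 0) for pid in ranked_ids[:10]]
    ideal_dcg = dcg(sorted(relevance.values(), reverse=True)[:10])
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0

print(ndcg_at_10(["d3", "d1", "d7"], {"d1": 2, "d2": 1, "d3": 3}))
```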

## 3. Model Training
We also provide the data and scripts to train the LLM reranker yourself, if you wish to do so.
### 3a. Training Dataset
The converted training dataset (alphabetic IDs) is available on [HF](https://huggingface.co/datasets/rryisthebest/rank_zephyr_training_data_alpha). The standard numeric training dataset can be found [here](https://huggingface.co/datasets/castorini/rank_zephyr_training_data).
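Either dataset can be pulled with the `datasets` library, e.g. (a sketch; the split layout depends on how the dataset repository is organized):

```python
from datasets import load_dataset

# Load the alphabetic-ID training data from the Hugging Face Hub.
train_data = load_dataset("rryisthebest/rank_zephyr_training_data_alpha")
print(train_data)
```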

We support three training objectives:
- **Combined**: The Combined objective, which we introduce in our paper, is a novel weighted approach that seamlessly integrates both ranking and generation principles, and is the setting applied to the FIRST model (sketched schematically below).
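Schematically, a weighted combination of a ranking loss and the language-modeling (generation) loss can be written as follows; this is an illustration only, not the repository's training code, and the weight name `alpha` is hypothetical:

```python
import torch

def combined_objective(ranking_loss: torch.Tensor,
                       generation_loss: torch.Tensor,
                       alpha: float = 0.5) -> torch.Tensor:
    # Weighted mix of a learning-to-rank loss over the candidate ordering
    # and the standard LM loss; `alpha` is an illustrative mixing weight.
    return alpha * ranking_loss + (1.0 - alpha) * generation_loss
```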

To train the model, run:
```
bash bash/beir/run_train.sh
```

To train a gated model, log in to Hugging Face and get token access at huggingface.co/settings/tokens.
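For example, you can authenticate programmatically with `huggingface_hub` (the token string below is a placeholder):

```python
from huggingface_hub import login

# Paste the access token created at huggingface.co/settings/tokens.
login(token="hf_...")  # "hf_..." is a placeholder, not a real token
```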