From 7d0f019562723e3ab435baae09ccd8627f7118f3 Mon Sep 17 00:00:00 2001
From: Silas Marvin <19626586+SilasMarvin@users.noreply.github.com>
Date: Fri, 7 Jun 2024 11:08:19 -0700
Subject: [PATCH 1/4] Added pgml.rank docs

---
 pgml-cms/docs/api/sql-extension/pgml.rank.md | 34 ++++++++++++++++++++
 1 file changed, 34 insertions(+)
 create mode 100644 pgml-cms/docs/api/sql-extension/pgml.rank.md

diff --git a/pgml-cms/docs/api/sql-extension/pgml.rank.md b/pgml-cms/docs/api/sql-extension/pgml.rank.md
new file mode 100644
index 000000000..e8699b66e
--- /dev/null
+++ b/pgml-cms/docs/api/sql-extension/pgml.rank.md
@@ -0,0 +1,34 @@
+---
+description: Rank documents against a piece of text using the specified ranking model.
+---
+
+# pgml.rank()
+
+The `pgml.rank` function is used to rank text documents against some text. This function is primarly used as the last step in a search system where the results returned from the initial search are reranked before being used.
+
+## API
+
+```postgresql
+pgml.rank(
+    transformer TEXT, -- transformer name
+    query TEXT, -- text to rank against
+    documents TEXT[], -- documents to rank
+    kwargs JSON -- optional arguments (see below)
+)
+```
+
+## Example
+
+```postgresql
+SELECT pgml.rank('mixedbread-ai/mxbai-rerank-base-v1', 'test', ARRAY['doc1', 'doc2']);
+```
+
+```postgresql
+SELECT pgml.chunk('mixedbread-ai/mxbai-rerank-base-v1', 'test', ARRAY['doc1', 'doc2'], '{"return_documents": false, "top_k": 10}'::JSONB);
+```
+
+## Supported Ranking Models
+
+We support the following ranking models:
+
+* `mixedbread-ai/mxbai-rerank-base-v1`

From 4a71e4cffd5303db85e45945813999cddd56d1a0 Mon Sep 17 00:00:00 2001
From: Silas Marvin <19626586+SilasMarvin@users.noreply.github.com>
Date: Fri, 7 Jun 2024 11:52:17 -0700
Subject: [PATCH 2/4] Update pgml-cms/docs/api/sql-extension/pgml.rank.md

Co-authored-by: Lev Kokotov
---
 pgml-cms/docs/api/sql-extension/pgml.rank.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/pgml-cms/docs/api/sql-extension/pgml.rank.md b/pgml-cms/docs/api/sql-extension/pgml.rank.md
index e8699b66e..e34bf986f 100644
--- a/pgml-cms/docs/api/sql-extension/pgml.rank.md
+++ b/pgml-cms/docs/api/sql-extension/pgml.rank.md
@@ -4,7 +4,7 @@ description: Rank documents against a piece of text using the specified ranking
 
 # pgml.rank()
 
-The `pgml.rank` function is used to rank text documents against some text. This function is primarly used as the last step in a search system where the results returned from the initial search are reranked before being used.
+The `pgml.rank()` function is used to rank text documents against some text. This function is primarily used as the last step in a search system where the results returned from the initial search are re-ranked before being used.
 
 ## API
 

From 46dfe4aa1154e1c48b23d37f051289a2effd6f65 Mon Sep 17 00:00:00 2001
From: Silas Marvin <19626586+SilasMarvin@users.noreply.github.com>
Date: Fri, 7 Jun 2024 11:57:19 -0700
Subject: [PATCH 3/4] Clarify rank and go sentence case

---
 pgml-cms/docs/api/sql-extension/pgml.rank.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/pgml-cms/docs/api/sql-extension/pgml.rank.md b/pgml-cms/docs/api/sql-extension/pgml.rank.md
index e34bf986f..a8caf5efd 100644
--- a/pgml-cms/docs/api/sql-extension/pgml.rank.md
+++ b/pgml-cms/docs/api/sql-extension/pgml.rank.md
@@ -4,7 +4,7 @@ description: Rank documents against a piece of text using the specified ranking
 
 # pgml.rank()
 
-The `pgml.rank()` function is used to rank text documents against some text. This function is primarily used as the last step in a search system where the results returned from the initial search are re-ranked before being used.
+The `pgml.rank()` function is used to compute a relevance score between documents and some text. This function is primarily used as the last step in a search system where the results returned from the initial search are re-ranked by relevance before being used.
 
 ## API
 
@@ -27,7 +27,7 @@ SELECT pgml.rank('mixedbread-ai/mxbai-rerank-base-v1', 'test', ARRAY['doc1', 'do
 SELECT pgml.chunk('mixedbread-ai/mxbai-rerank-base-v1', 'test', ARRAY['doc1', 'doc2'], '{"return_documents": false, "top_k": 10}'::JSONB);
 ```
 
-## Supported Ranking Models
+## Supported ranking models
 
 We support the following ranking models:
 

From f9c5ff2e82ecd96ffef0daaa08c92bb57099032c Mon Sep 17 00:00:00 2001
From: Silas Marvin <19626586+SilasMarvin@users.noreply.github.com>
Date: Fri, 7 Jun 2024 12:13:54 -0700
Subject: [PATCH 4/4] Clarify examples and add some more information about cross-encoders

---
 pgml-cms/docs/api/sql-extension/pgml.rank.md | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/pgml-cms/docs/api/sql-extension/pgml.rank.md b/pgml-cms/docs/api/sql-extension/pgml.rank.md
index a8caf5efd..897f13993 100644
--- a/pgml-cms/docs/api/sql-extension/pgml.rank.md
+++ b/pgml-cms/docs/api/sql-extension/pgml.rank.md
@@ -19,16 +19,22 @@ pgml.rank(
 
 ## Example
 
+Ranking documents is as simple as calling the function with the documents you want to rank, and text you want to rank against:
+
 ```postgresql
 SELECT pgml.rank('mixedbread-ai/mxbai-rerank-base-v1', 'test', ARRAY['doc1', 'doc2']);
 ```
 
+By default the `pgml.rank()` function will return and rank all of the documents. The function can be configured to only return the relevance score and index of the top k documents by setting `return_documents` to `false` and `top_k` to the number of documents you want returned.
+
 ```postgresql
-SELECT pgml.chunk('mixedbread-ai/mxbai-rerank-base-v1', 'test', ARRAY['doc1', 'doc2'], '{"return_documents": false, "top_k": 10}'::JSONB);
+SELECT pgml.rank('mixedbread-ai/mxbai-rerank-base-v1', 'test', ARRAY['doc1', 'doc2'], '{"return_documents": false, "top_k": 10}'::JSONB);
 ```
 
 ## Supported ranking models
 
-We support the following ranking models:
+We currently support cross-encoders for re-ranking. Check out [Sentence Transformer's documentation](https://sbert.net/examples/applications/cross-encoder/README.html) for more information on how cross-encoders work.
+
+By default we provide the following ranking models:
 
 * `mixedbread-ai/mxbai-rerank-base-v1`
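
To make the `top_k` and `return_documents` options described in the last patch concrete, here is a minimal sketch in plain Python of what a re-ranking step does. It is not how `pgml.rank()` is implemented. The `toy_score` function is a hypothetical word-overlap stand-in for a real cross-encoder, and the `rerank` helper and the result keys `index`, `score`, and `document` are assumptions made for illustration only.

```python
from typing import Callable, Dict, List, Optional


def toy_score(query: str, document: str) -> float:
    """Hypothetical stand-in for a cross-encoder.

    Returns the fraction of query words that also appear in the document.
    A real cross-encoder scores the (query, document) pair with a model.
    """
    query_words = set(query.lower().split())
    if not query_words:
        return 0.0
    document_words = set(document.lower().split())
    return len(query_words & document_words) / len(query_words)


def rerank(
    query: str,
    documents: List[str],
    score_fn: Callable[[str, str], float] = toy_score,
    top_k: Optional[int] = None,
    return_documents: bool = True,
) -> List[Dict]:
    """Score every document against the query and sort by descending score."""
    results = [
        {"index": i, "score": score_fn(query, doc), "document": doc}
        for i, doc in enumerate(documents)
    ]
    # sorted() is stable, so documents with equal scores keep their input order.
    results = sorted(results, key=lambda r: r["score"], reverse=True)
    if top_k is not None:
        results = results[:top_k]
    if not return_documents:
        # Keep only the score and the position in the input array.
        results = [{"index": r["index"], "score": r["score"]} for r in results]
    return results


candidates = [
    "postgres is a database",
    "how to rank search results",
    "rank results with a cross-encoder",
]
top = rerank("rank search results", candidates, top_k=2, return_documents=False)
print([r["index"] for r in top])  # indices in ranked order: [1, 2]
```

The sketch scores each document independently against the query, which mirrors the general shape of cross-encoder re-ranking: the initial search supplies a small candidate set, and the more expensive scorer only decides the final order.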