Skip to content

feat: Add get_by_ids and aget_by_ids to vectorstore #276

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Mar 3, 2025

Conversation

dishaprakash
Copy link
Contributor

feat: Add get_by_ids and aget_by_ids to vectorstore

@dishaprakash dishaprakash requested review from a team as code owners February 27, 2025 08:02
@product-auto-label product-auto-label bot added the api: cloudsql-postgres Issues related to the googleapis/langchain-google-cloud-sql-pg-python API. label Feb 27, 2025
@dishaprakash dishaprakash changed the base branch from main to std-tests February 27, 2025 08:02

column_names = ", ".join(f'"{col}"' for col in columns)

query = f'SELECT {column_names} FROM "{self.schema_name}"."{self.table_name}" WHERE {self.id_column} IN ({id_list_str});'
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

id_column should be in quotes

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Resolved

Document(
page_content=row[self.content_column],
metadata=metadata,
id=row[self.id_column],
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am wondering if we should cast this to a string just in case

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we have that option, but wouldn't that negate the value of having customizable ID columns?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Document interface is enforcing the str Id in the document object https://github.com/langchain-ai/langchain/blob/33354f984fba660e71ca0039cfbd3cf37643cfab/libs/core/langchain_core/documents/base.py#L34. Users will still be able to insert and use the id data type of their choice

@@ -291,6 +291,54 @@ async def __aadd_embeddings(

return ids

async def aget_by_ids(self, ids: Sequence[str]) -> List[Document]:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should use lower case "list" to follow guidance of https://peps.python.org/pep-0585/

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

resolved

@dishaprakash dishaprakash requested a review from averikitsch March 2, 2025 05:12
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a test for get_by_ids ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, added a test for the sync function in the async vectorstore

@dishaprakash dishaprakash merged commit 7028b10 into std-tests Mar 3, 2025
8 checks passed
@dishaprakash dishaprakash deleted the add-get-by-ids branch March 3, 2025 19:05
dishaprakash added a commit to googleapis/langchain-google-alloydb-pg-python that referenced this pull request Mar 4, 2025
chore: Minor cleanup of vectorstore 

All the review comments from the [cloud sql pr](googleapis/langchain-google-cloud-sql-pg-python#276) are being propogated here
dishaprakash added a commit to googleapis/langchain-google-alloydb-pg-python that referenced this pull request Mar 4, 2025
* chore: Minor cleanup of vectorstore 

chore: Minor cleanup of vectorstore 

All the review comments from the [cloud sql pr](googleapis/langchain-google-cloud-sql-pg-python#276) are being propogated here

* Use `list` instead of `List`

* remove unused import

* Add sync test in async vectorstore
dishaprakash added a commit to googleapis/langchain-google-alloydb-pg-python that referenced this pull request Mar 19, 2025
…402)

* chore: Add Langchain standard vectorstore tests. (#356)

* chore: Add Langchain standard vectorstore tests.

* add header to test file

* change header to 2025

* suppress warnings

* chore: support ids in Documents and update insert SQL query to upsert (#361)

* chore: support ids in Documents and update insert SQL query to upsert

* remove pulling id from metadata

* Fix previous vectorstore tests

* Add comment

* feat: Add get_by_ids and aget_by_ids to vectorstore (#364)

* feat: Add get_by_ids and aget_by_ids to vectorstore

* remove embedding column from being fetched

* fix test

* chore: Minor cleanup of vectorstore (#371)

* chore: Minor cleanup of vectorstore 

chore: Minor cleanup of vectorstore 

All the review comments from the [cloud sql pr](googleapis/langchain-google-cloud-sql-pg-python#276) are being propogated here

* Use `list` instead of `List`

* remove unused import

* Add sync test in async vectorstore

* feat: Enable DB-Agnostic metadata filtering in vectorstores (#382)

* feat!: Enable DB-Agnostic metadata filtering in vectorstores

* Add header to util file

* Minor change

* Remove older filter tests

* Change the data path

* Linter fix

* Remove async from func definition

* Add sync test to async class

* Remove table after test

* Add fixture marker

* Test

* debug

* debug

* debug

* debug

* debug

* debug

* debug

* debug

* debug

* debug

* Linter fix

* debug

* debug

* debug

* Make non breaking change by supporting string filter

* add text filter tests

* Unify and/or operator

* minor fix

* minor fix

* debug cloud build

* debug cloud build

* debug cloud build

* debug cloud build

* debug cloud build

* debug cloud build

* debug cloud build

* debug cloud build

* Debug

* Make sure add documents does not alter original documents.

* Minor change

* Minor change

* Review changes

* Update test_vectorstore_search.py

* Add extra test for code coverage

* Add extra negative tests
dishaprakash added a commit that referenced this pull request Mar 19, 2025
…#290)

* chore: Add Langchain vectorstore standard suite tests (#272)

* chore: Add Langchain vectorstore standard suite tests

* fix test

* chore: support ids in Documents and update insert SQL query to upsert (#274)

* chore: support ids in Documents and update insert SQL query to upsert

* chore: Add Langchain vectorstore standard suite tests (#272)

* chore: Add Langchain vectorstore standard suite tests

* fix test

* Newline

* remove Newline

* Review changes

* feat: Add get_by_ids and aget_by_ids to vectorstore (#276)

* feat: Add get_by_ids and aget_by_ids to vectorstore

* Review changes

* Add sync test in async vectorstore

* chore: Convert id field in document to str (#278)

chore: Convert id field in document to str

* feat: Enable DB-Agnostic metadata filtering in vectorstores (#289)

* feat: Enable DB-Agnostic metadata filtering in vectorstores

* minor change

* minor change

* minor change

* minor change

* Add negative tests for code coverager

* Add extra tests for code coverage
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: cloudsql-postgres Issues related to the googleapis/langchain-google-cloud-sql-pg-python API.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants