Skip to content

Support the passing of arrays as single arguments to BigQuery #194

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
banuni opened this issue Jun 24, 2021 · 0 comments · Fixed by #253
Closed

Support the passing of arrays as single arguments to BigQuery #194

banuni opened this issue Jun 24, 2021 · 0 comments · Fixed by #253
Assignees
Labels
api: bigquery Issues related to the googleapis/python-bigquery-sqlalchemy API. type: feature request ‘Nice-to-have’ improvement, new feature or different behavior or design.

Comments

@banuni
Copy link

banuni commented Jun 24, 2021

Hey guys! thanks for the awesome plugin!

Is your feature request related to a problem? Please describe.
BigQuery only supports 10K parameters to be passed in a single query. while this is enough for the most part, sometimes that could cause real issues, for example:

book_ids = [2,5,22,3123,124123,...]  # a list of 23K book ids
session.query(Book.name).filter(Book.book_id.in_(book_ids)).all()
# this returns a BadRequest (400) for having too many parameters (each id is translated to a parameter.

Describe the solution you'd like
Fortunately, BQ supports passing an array as a single parameter, but SQLAlchemy+pybigquery doesn't take advantage of that. I have made a little workaround to fulfill my own needs, I guess this could be used in the pybigquery implementation of Column.in_

import random
from sqlalchemy import bindparam, text
def _workaround_in(schema, column, iterable):
    bind_name = column.key + str(random.randint(0, 1_000_000))
    bound = bindparam(bind_name, iterable)
    return text(
        f"`{schema.__tablename__}`.`{column.key}` in UNNEST(:{bind_name})"
    ).bindparams(bound)

# usage:
book_ids = [2,5,22,3123,124123,...]  # a list of 23K book ids
session.query(Book.name).filter(_workaround_in(Book, Book.book_id, book_ids)).all()
# this will work!

Describe alternatives you've considered
I did try to split the statement in the query, but it didn't help.
the only other thing that worked was making the entire query with text.

@product-auto-label product-auto-label bot added the api: bigquery Issues related to the googleapis/python-bigquery-sqlalchemy API. label Jun 24, 2021
@yoshi-automation yoshi-automation added triage me I really want to be triaged. 🚨 This issue needs some love. labels Jun 25, 2021
@tswast tswast added type: feature request ‘Nice-to-have’ improvement, new feature or different behavior or design. and removed 🚨 This issue needs some love. triage me I really want to be triaged. labels Jul 1, 2021
@jimfulton jimfulton self-assigned this Aug 9, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: bigquery Issues related to the googleapis/python-bigquery-sqlalchemy API. type: feature request ‘Nice-to-have’ improvement, new feature or different behavior or design.
Projects
None yet
4 participants