Closed
Description
Hey guys! thanks for the awesome plugin!
Is your feature request related to a problem? Please describe.
BigQuery only supports 10K parameters to be passed in a single query. while this is enough for the most part, sometimes that could cause real issues, for example:
book_ids = [2,5,22,3123,124123,...] # a list of 23K book ids
session.query(Book.name).filter(Book.book_id.in_(book_ids)).all()
# this returns a BadRequest (400) for having too many parameters (each id is translated to a parameter.
Describe the solution you'd like
Fortunately, BQ supports passing an array as a single parameter, but SQLAlchemy+pybigquery doesn't take advantage of that. I have made a little workaround to fulfill my own needs, I guess this could be used in the pybigquery implementation of Column.in_
import random
from sqlalchemy import bindparam, text
def _workaround_in(schema, column, iterable):
bind_name = column.key + str(random.randint(0, 1_000_000))
bound = bindparam(bind_name, iterable)
return text(
f"`{schema.__tablename__}`.`{column.key}` in UNNEST(:{bind_name})"
).bindparams(bound)
# usage:
book_ids = [2,5,22,3123,124123,...] # a list of 23K book ids
session.query(Book.name).filter(_workaround_in(Book, Book.book_id, book_ids)).all()
# this will work!
Describe alternatives you've considered
I did try to split the statement in the query, but it didn't help.
the only other thing that worked was making the entire query with text.
Activity