Skip to content

Running a chain of statements that depend on each other #74

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
mistercrunch opened this issue Nov 7, 2020 · 7 comments
Closed

Running a chain of statements that depend on each other #74

mistercrunch opened this issue Nov 7, 2020 · 7 comments
Labels
api: bigquery Issues related to the googleapis/python-bigquery-sqlalchemy API. type: feature request ‘Nice-to-have’ improvement, new feature or different behavior or design.

Comments

@mistercrunch
Copy link
Contributor

mistercrunch commented Nov 7, 2020

DECLARE d DATE DEFAULT CURRENT_DATE();
CREATE TEMP FUNCTION Add(x INT64, y INT64) AS (x + y);
SELECT Add(3, 4);

I'm having an issue running a chain of statements that depend on each other in a common session. I'm using a single connection and cursor behind the scene.

It appears that the statements are not chained and the session isn't persisted. Any clue on how to solve this?

Other DBAPI implementations allow this type of chaining, and using the same logic with this driver does not work properly.

@tswast
Copy link
Collaborator

tswast commented Nov 9, 2020 via email

@mistercrunch
Copy link
Contributor Author

Do you know how this chain is being executed by the underlying DB-API?

I'm not sure if you're asking about how we run it, or how the underlying DB-API in google-cloud-bigquery does it (?). Assuming the former, we open a connection object and a cursor object and cursor.execute the series of statements.

Perhaps this would be fixed by changing the executemany() function to issue the statements in one batch?
I think that executemany is a way to run the same SQL statement multiple times with different parameters (?) https://googleapis.dev/python/bigquery/latest/dbapi.html#google.cloud.bigquery.dbapi.Cursor.executemany

Now I'm realizing the issue may be in google-cloud-bigquery, do you think I should open an issue there?

@tswast
Copy link
Collaborator

tswast commented Nov 9, 2020

Yes, https://github.com/googleapis/python-bigquery would be the appropriate location for this issue.

Are you calling execute once? Or once per statement?

If once per statement, are there any transaction semantics you're using? The BigQuery API doesn't hold onto cursor/connection state, so we'll need some kind of additional information in order for the client to batch these up into a single request.

@mistercrunch
Copy link
Contributor Author

mistercrunch commented Nov 10, 2020

We're calling execute for each statement, most other DB-API driver's execute assumes a single statement. All other DB-API drivers we use in Apache Superset respond well to this approach

so we'll need some kind of additional information in order for the client to batch these up into a single request

Wondering what that may be. I saw the notion of job_id reading some source code in google-cloud-bigquery, but unclear as to how to get this to work.

@mistercrunch
Copy link
Contributor Author

I opened an issue on python-bigquery with more details, and really appreciate all the help so far!

@tswast tswast added api: bigquery Issues related to the googleapis/python-bigquery-sqlalchemy API. type: feature request ‘Nice-to-have’ improvement, new feature or different behavior or design. labels Mar 23, 2021
@tswast
Copy link
Collaborator

tswast commented Mar 23, 2021

Updating this issue from our discussion on the DB-API. Currently we workaround this by batching all statements in to a single query request. This triggers the BigQuery "scripting" path.

The ideal solution is for the BigQuery backend to support true sessions for scripting. If/when they add support for sessions, we can make the update to the DB-API and test it with this SQLAlchemy connector.

@mattalbr
Copy link

mattalbr commented Nov 6, 2024

Sorry to drag up an old issue, but this is where I was looking to figure out how to write such a query using bigquery sqlalchemy.

I'm assuming the SET/DECLARE statements need to be created using text(), but is there a good way to combine these SET/DECLAREs with a selectable at the end without compiling the selectable?

Relatedly, is there a way to do a CREATE TEMP TABLE foo AS () and input a selectable?

Happy to file a new issue along the lines of Document multi-statement query usage if that's easier!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: bigquery Issues related to the googleapis/python-bigquery-sqlalchemy API. type: feature request ‘Nice-to-have’ improvement, new feature or different behavior or design.
Projects
None yet
Development

No branches or pull requests

4 participants