-
Notifications
You must be signed in to change notification settings - Fork 138
feat: Allow jobs to be run in a different project #1180
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request. |
c0befe0
to
6392440
Compare
Could you add some tests for this feature? |
No problem. I'll change it to As far as tests go, I'll look into that. I didn't want to write the tests if it was unlikely to be merged. |
I've believe I've fixed the existing unit tests and added one that is billing_project_id aware, but not sure how to wire in any cursor/bqclient based tests, since it would mean making the existing mocks project aware. Any suggestions? |
62ab83d
to
addb925
Compare
We can change the mock dbapi to take project_id but with a default value:
|
I think this change is breaking the system tests, could you fix that too? Also it would be nice if we could add a system test for this change. You can query bigquery public dataset so it's in a different project. |
Ok. I've spent quite a bit of time looking at the system tests for the past few days and was struggling to get the tests working as I wanted them to, covering all of the different combinations. I've created a In short, I wanted to be confident that the library would function correctly with all of the URL/table scenarios listed below. REMOTE_TESTCASES = [
("bigquery://bigquery-public-data", "bigquery-public-data.fda_drug.drug_enforcement"),
("bigquery://bigquery-public-data", "fda_drug.drug_enforcement"),
("bigquery://bigquery-public-data/fda_drug", "bigquery-public-data.fda_drug.drug_enforcement"),
("bigquery://bigquery-public-data/fda_drug", "drug_enforcement"),
("bigquery://bigquery-public-data/fda_drug", "fda_drug.drug_enforcement"),
] I was especially trying to get the second entry to work. As part of this I dug deeper into this library and the native bigquery library and found the There is now a matrix of project settings, since depending what is specified it will use the project from the URL, the billing_project_id field or the default from authentication. I'm not sure if there are any tests right now that cover all three being different. If you feel these are necessary can you outline how they might look? I believe all of the pre-existing unit tests and system tests should now be working. On a related note... I'm not that familiar with bigquery-public-data datasets and especially how often they change. The existing query I am using in the above tests is just asserting against the values in a specific row of |
47d0f5b
to
4b562bc
Compare
Hello. Do you know when you're likely to cut a new tagged release? |
Thanks for asking! I will make a release now. |
EDIT: For clarity modified PR definition to reference
billing_project_id
instead ofjob_project_id
based on comments below.Fixes #1181 🦕
This implements the logic to allow jobs to be run in a different project and allows for an explicit separation of jobs and data.
If the
billing_project_id
engine parameter is specified:project_id=billing_project_id
project_id
which is the data project, specified in the bigquery:// URL.If
billing_project_id
is not specified it will just default toproject_id
from the bigquery:// URL.Thank you for opening a Pull Request! Before submitting your PR, there are a few things you can do to make sure it goes smoothly: