Skip to content

jimfulton/python-bigquery-sqlalchemy

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

17 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

SQLAlchemy dialect for BigQuery.

Usage

from sqlalchemy import *
from sqlalchemy.engine import create_engine
from sqlalchemy.schema import *
engine = create_engine('bigquery://project')
logs = Table('dataset.table', MetaData(bind=engine), autoload=True)
print(select([func.count('*')], from_obj=logs).scalar())

Project

project in bigquery://project is used to instantiate BigQuery client with the specific project ID. To infer project from the environment, use bigquery:// – without project

Table names

To query tables from non-default projects, use the following format for the table name: project.dataset.table, e.g.:

sample_table = Table('bigquery-public-data.samples.natality')

Batch size

By default, arraysize is set to 5000. arraysize is used to set the batch size for fetching results. To change it, pass arraysize to create_engine():

engine = create_engine('bigquery://project', arraysize=1000)

Requirements

Install using

  • pip install pybigquery

Testing

Load sample tables:

./scripts/load_test_data.sh

This will create a dataset test_pybigquery with tables named sample_one_row and sample

Set up an environment and run tests:
pyvenv .env source .env/bin/activate pip install -r dev_requirements.txt pytest

TODO

  • Support for Record column type

About

SQLAlchemy dialect for BigQuery

Resources

License

Code of conduct

Security policy

Stars

Watchers

Forks

Packages

No packages published

Languages

  • Python 88.5%
  • Shell 10.9%
  • Dockerfile 0.6%