feat: allow to set clustering and time partitioning options at table creation #928

Merged on Jan 10, 2024 (33 commits)

Commits
077dc3b
refactor: standardize bigquery options handling to manage more options
nlenepveu Dec 4, 2023
aae6359
feat: handle table partitioning, table clustering and more table opti…
nlenepveu Dec 4, 2023
f1f334b
fix: having clustering fields and partitioning exposed has table inde…
nlenepveu Dec 4, 2023
16c63e9
docs: update README to describe how to create clustered and partition…
nlenepveu Dec 5, 2023
2cae630
test: adjust system tests since indexes are no longer populated from …
nlenepveu Dec 9, 2023
913c4fc
test: alembic now supports creating partitioned tables
nlenepveu Dec 10, 2023
39bbd56
test: run integration tests with all the new create_table options
nlenepveu Dec 11, 2023
9d00844
Merge branch 'main' into main
nlenepveu Dec 11, 2023
a0c3737
Merge branch 'main' into main
nlenepveu Dec 19, 2023
743d5a4
chore: rename variables to represent what it is a bit more clearly
nlenepveu Dec 20, 2023
4fd649c
fix: assertions should no be used to validate user inputs
nlenepveu Dec 20, 2023
e9aeedd
refactor: extract process_option_value() from post_create_table() for…
nlenepveu Dec 20, 2023
bbd19ce
docs: add docstring to post_create_table() and _process_option_value()
nlenepveu Dec 20, 2023
b56979f
test: increase code coverage by testing error cases
nlenepveu Dec 21, 2023
285e32d
refactor: better represent the distinction between the option value d…
nlenepveu Dec 21, 2023
f756cb1
test: adding test cases for _validate_option_value_type() and _proces…
nlenepveu Dec 21, 2023
cc81f77
chore: coding style
nlenepveu Dec 21, 2023
1795d16
chore: reformat files with black
nlenepveu Dec 21, 2023
89cf5a9
test: typo in tests
nlenepveu Dec 21, 2023
1944f95
feat: change the option name for partitioning to leverage the TimePar…
nlenepveu Dec 21, 2023
056b49b
fix: TimePartitioning.field is optional
nlenepveu Dec 21, 2023
2ae57e5
chore: coding style
nlenepveu Dec 22, 2023
9ab5acb
test: fix system test with table option bigquery_require_partition_fi…
nlenepveu Dec 22, 2023
3c69236
feat: add support for experimental range_partitioning option
nlenepveu Dec 22, 2023
f039b17
test: fix system test with new bigquery_time_partitioning table option
nlenepveu Dec 22, 2023
27992e4
docs: update README with time_partitioning and range_partitioning
nlenepveu Dec 22, 2023
6fc0354
test: relevant comments in unit tests
nlenepveu Jan 4, 2024
71531a7
test: cover all error cases
nlenepveu Jan 4, 2024
995d1e5
chore: no magic numbers
nlenepveu Jan 9, 2024
a9b8d27
chore: consistency in docstrings
nlenepveu Jan 9, 2024
37c1eb0
chore: no magic number
nlenepveu Jan 9, 2024
badece4
chore: better error types
nlenepveu Jan 9, 2024
8184c38
chore: fix W605 invalid escape sequence
nlenepveu Jan 9, 2024
53 changes: 52 additions & 1 deletion README.rst
@@ -292,14 +292,65 @@ To add metadata to a table:

 .. code-block:: python

-    table = Table('mytable', ..., bigquery_description='my table description', bigquery_friendly_name='my table friendly name')
+    table = Table('mytable', ...,
+        bigquery_description='my table description',
+        bigquery_friendly_name='my table friendly name',
+        bigquery_default_rounding_mode="ROUND_HALF_EVEN",
+        bigquery_expiration_timestamp=datetime.datetime.fromisoformat("2038-01-01T00:00:00+00:00"),
+    )
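Note on the `bigquery_expiration_timestamp` value above: the README snippet relies on `import datetime`, which it does not show. The sketch below uses only the standard library to illustrate what that ISO 8601 literal parses to. The 90-day relative expiration is a hypothetical variation, not something the PR itself proposes.

```python
import datetime

# Same literal as in the README snippet. The "+00:00" suffix makes the
# parsed value timezone-aware (UTC) rather than naive.
expiration = datetime.datetime.fromisoformat("2038-01-01T00:00:00+00:00")

# Hypothetical alternative: an expiration computed relative to "now".
# The 90-day window is an illustrative assumption.
relative_expiration = (
    datetime.datetime.now(datetime.timezone.utc) + datetime.timedelta(days=90)
)

print(expiration.isoformat())
print(expiration.tzinfo is not None)
```

Either value could then be passed as `bigquery_expiration_timestamp=...` in the `Table(...)` call shown in the diff.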

 To add metadata to a column:

 .. code-block:: python

     Column('mycolumn', doc='my column description')
+
+To create a clustered table:
+
+.. code-block:: python
+
+    table = Table('mytable', ..., bigquery_clustering_fields=["a", "b", "c"])
+
+To create a time-unit column-partitioned table:
+
+.. code-block:: python
+
+    from google.cloud import bigquery
+
+    table = Table('mytable', ...,
+        bigquery_time_partitioning=bigquery.TimePartitioning(
+            field="mytimestamp",
+            type_="MONTH",
+            expiration_ms=1000 * 60 * 60 * 24 * 30 * 6, # 6 months
+        ),
+        bigquery_require_partition_filter=True,
+    )

+To create an ingestion-time partitioned table:
+
+.. code-block:: python
+
+    from google.cloud import bigquery
+
+    table = Table('mytable', ...,
+        bigquery_time_partitioning=bigquery.TimePartitioning(),
+        bigquery_require_partition_filter=True,
+    )
+
+To create an integer-range partitioned table
+
+.. code-block:: python
+
+    from google.cloud import bigquery
+
+    table = Table('mytable', ...,
+        bigquery_range_partitioning=bigquery.RangePartitioning(
+            field="zipcode",
+            range_=bigquery.PartitionRange(start=0, end=100000, interval=10),
+        ),
+        bigquery_require_partition_filter=True,
+    )
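Note on the `PartitionRange` values: the sketch below works through what `start=0, end=100000, interval=10` implies, assuming BigQuery's documented semantics that `start` is inclusive and `end` is exclusive. It is plain Python for illustration only. The helpers `partition_count` and `partition_start` are hypothetical and do not exist in `google-cloud-bigquery` or in this dialect.

```python
import math
from typing import Optional

# Values taken from the README snippet.
START, END, INTERVAL = 0, 100000, 10


def partition_count(start: int, end: int, interval: int) -> int:
    """Number of range partitions covering [start, end)."""
    return math.ceil((end - start) / interval)


def partition_start(value: int, start: int, end: int, interval: int) -> Optional[int]:
    """Lower bound of the partition holding `value`, or None if out of range."""
    if value < start or value >= end:
        # Assumption: out-of-range values are not placed in a range partition.
        return None
    return start + ((value - start) // interval) * interval


print(partition_count(START, END, INTERVAL))
print(partition_start(94107, START, END, INTERVAL))
```

With these values a zipcode such as 94107 falls in the partition beginning at 94100, and the range is split into 10,000 partitions in total, so it may be worth checking that count against the partition limits of the target project.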


Threading and Multiprocessing
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^