testing: fix tests against emulator and remove EmulatorCredentials hack #33
Conversation
The google.auth package provides an AnonymousCredentials class for use with emulators (and other systems that do not require oauth credentials). Use this rather than rolling our own. It seems a couple of people around the internet have copied this method for running code against the emulators. We should ensure that if people are going to copy these tests, they at least do something less hacky.
We remove the requirement for the GOOGLE_APPLICATION_CREDENTIALS environment variable if we detect that we should be running against the local datastore emulator.
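For reference, a minimal sketch of the intended pattern, assuming the emulator is reachable via the usual DATASTORE_EMULATOR_HOST variable (the project id below is illustrative):

```python
import os

from google.auth.credentials import AnonymousCredentials
from google.cloud import datastore

if os.environ.get("DATASTORE_EMULATOR_HOST"):
    # The emulator performs no auth, so anonymous credentials suffice and
    # GOOGLE_APPLICATION_CREDENTIALS does not need to be set.
    client = datastore.Client(
        project="test-project", credentials=AnonymousCredentials()
    )
else:
    # Outside the emulator, fall back to the normal default-credentials lookup.
    client = datastore.Client()
```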
This unit test was overwriting the datastore client in use with a default-constructed client. That caused the client to be constructed with the default credentials object, which requires the test to be run either in GCP or with the `GOOGLE_APPLICATION_CREDENTIALS` environment variable set.
Move away from assertTrue to a helper that outputs the actual value and the expected value when the test fails.
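The helper itself is not shown in this view; a sketch of what such a helper might look like (the name is illustrative):

```python
def assert_equal(expected, actual):
    # Unlike assertTrue(expected == actual), a failure here reports
    # both values instead of just "False is not true".
    assert expected == actual, "expected %r, got %r" % (expected, actual)
```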
The last test run in TestDatastoreQuery would fail because it tried to remove more than 500 entities at once. We break the deletes up into batches to work around this limitation of the Datastore API.
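A minimal sketch of the batched-delete workaround, assuming `client` is a `datastore.Client` and `keys` holds the keys to remove (names are illustrative):

```python
BATCH_SIZE = 500  # Datastore rejects commits containing more than 500 mutations.

def delete_in_batches(client, keys):
    for start in range(0, len(keys), BATCH_SIZE):
        client.delete_multi(keys[start:start + BATCH_SIZE])
```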
These tests assume that specific pieces of data exist within datastore, but nothing in the tests created that data. Also, the TestDatastoreQuery.test_query_offset_timestamp_keys test assumes there are >= 10k entities in the datastore, but the create function would only add 1k entities.
The datastore emulator does not check the size of response messages before trying to send them back to the client. For all of the tests that return large objects, this results in the response exceeding the ~4MiB grpc message limit (enforced in message_size_filter). The response size for these tests is ~10MiB. Since we do not need the actual contents of the entities, we can request just the keys via a keys_only query. Technically this changes the type of query, which may be considered testing a different thing; I will leave it up to the maintainers whether they want to merge this change.
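A minimal sketch of the keys-only variant, assuming `client` is a `datastore.Client` (the kind is illustrative):

```python
query = client.query(kind="LargeCharacter")
query.keys_only()  # the projection drops entity properties, keeping responses small
# Each result is an entity with a populated key and no other data.
keys = [entity.key for entity in query.fetch()]
```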
There is both a 500-mutations-per-commit limit and a ~4MiB request/response size limit. Update both the populate and clear libraries to stay below these limits.
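A minimal sketch of batching under both limits at once, along the lines of the hunks reviewed below; `_estimate_entity_size` stands in for the size-estimation helper this PR adds:

```python
BATCH_SIZE = 500           # mutation-count limit per commit
RPC_BYTES_LIMIT = 3 << 20  # stay comfortably under the ~4MiB grpc message limit

def put_in_batches(client, entities):
    while entities:
        batch, approx_rpc_bytes = [], 0
        while entities and len(batch) < BATCH_SIZE and approx_rpc_bytes < RPC_BYTES_LIMIT:
            entity = entities.pop()
            approx_rpc_bytes += _estimate_entity_size(entity)
            batch.append(entity)
        client.put_multi(batch)
```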
noxfile.py (Outdated)

```diff
@@ -30,7 +30,7 @@
 BLACK_PATHS.append("samples")


-@nox.session(python="3.7")
+@nox.session(python=["3.7", "3.8"])
```
Thank you for pushing these changes!
The thing is that noxfiles are auto-generated; you can see the disclaimer on line 17.
That means we should not make any changes manually, as they will be reverted on the next code generation.
On the other hand, adding such changes to the templates from which noxfiles are generated means that all Google Cloud clients will be affected. I suggest we wait for a responsible Googler to approve these changes first, as it sounds like a weighty decision.
I did not see this comment (sorry, I skimmed the file and then made a minimal change so that I could actually run the tests).
I am happy to wait for an answer from someone else. I can also remove this specific commit from the PR if you prefer (though once I do, I will be unable to rerun the tests if I am asked to make changes to other parts of this PR).
@tsukasa-au I backed out the Python-version-related changes in …
@tsukasa-au The unit test failures are due to #53: its fix is in PR #54, which …
Applying black
```diff
@@ -101,8 +101,10 @@ def system(session):
     system_test_path = os.path.join("tests", "system.py")
     system_test_folder_path = os.path.join("tests", "system")
     # Sanity check: Only run tests if the environment variable is set.
-    if not os.environ.get("GOOGLE_APPLICATION_CREDENTIALS", ""):
-        session.skip("Credentials must be set via environment variable")
+    is_emulator = 'DATASTORE_DATASET' in os.environ
```
Suggested change:

```diff
-    is_emulator = 'DATASTORE_DATASET' in os.environ
+    is_emulator = "DATASTORE_DATASET" in os.environ
```
```diff
+    has_credentials = bool(os.environ.get("GOOGLE_APPLICATION_CREDENTIALS", ""))
+    if not is_emulator and not has_credentials:
+        session.skip("Credentials must be set via environment variable for non-emulator tests")
```
session.skip("Credentials must be set via environment variable for non-emulator tests") | |
session.skip( | |
"Credentials must be set via environment variable for non-emulator tests" | |
) |
```diff
@@ -81,10 +81,24 @@ def remove_kind(kind, client):


 def remove_all_entities(client):
+    BATCH_SIZE = 500  # Datastore API only allows 500 mutations in a single call.
+    KEY_BYTES_LIMIT = 3 << 20  # grpc limit is ~4MiB, so use a 3MiB limit (to work around any encoding issues)
```
Suggested change:

```diff
-    KEY_BYTES_LIMIT = 3 << 20  # grpc limit is ~4MiB, so use a 3MiB limit (to work around any encoding issues)
+    KEY_BYTES_LIMIT = (
+        3 << 20
+    )  # grpc limit is ~4MiB, so use a 3MiB limit (to work around any encoding issues)
```
```diff
+        return _estimate_entity_size(value)
+    result = 0
+    for key, value in entity.items():
+        result += len(key)  # The number of runes is fine, no point forcing a utf-8 encoding here.
```
Suggested change:

```diff
-        result += len(key)  # The number of runes is fine, no point forcing a utf-8 encoding here.
+        result += len(
+            key
+        )  # The number of runes is fine, no point forcing a utf-8 encoding here.
```
```diff
 def add_large_character_entities(client=None):
     TOTAL_OBJECTS = 2500
     NAMESPACE = "LargeCharacterEntity"
     KIND = "LargeCharacter"
     MAX_STRING = (string.ascii_lowercase * 58)[:1500]

+    BATCH_SIZE = 500  # Datastore API only allows 500 mutations in a single call.
+    RPC_BYTES_LIMIT = 3 << 20  # grpc limit is ~4MiB, so use a 3MiB limit (to work around any encoding issues)
```
Suggested change:

```diff
-    RPC_BYTES_LIMIT = 3 << 20  # grpc limit is ~4MiB, so use a 3MiB limit (to work around any encoding issues)
+    RPC_BYTES_LIMIT = (
+        3 << 20
+    )  # grpc limit is ~4MiB, so use a 3MiB limit (to work around any encoding issues)
```
```diff
+    while entities:
+        approx_rpc_bytes = 0
+        batch = []
+        while entities and len(batch) < BATCH_SIZE and approx_rpc_bytes < RPC_BYTES_LIMIT:
```
Suggested change:

```diff
-        while entities and len(batch) < BATCH_SIZE and approx_rpc_bytes < RPC_BYTES_LIMIT:
+        while (
+            entities
+            and len(batch) < BATCH_SIZE
+            and approx_rpc_bytes < RPC_BYTES_LIMIT
+        ):
```
```diff
+        while len(batch) < BATCH_SIZE and key_bytes < RPC_BYTES_LIMIT and all_keys:
+            batch.append(all_keys.pop())
+            if batch[-1].name is None:
+                key_bytes += 9  # It takes 9 bytes for the largest varint encoded number
```
Suggested change:

```diff
-                key_bytes += 9  # It takes 9 bytes for the largest varint encoded number
+                key_bytes += (
+                    9  # It takes 9 bytes for the largest varint encoded number
+                )
```
```diff
+    batch_size = 500
+    assert num_batches * batch_size > 10000, 'test_query_offset_timestamp_keys requires at least 10k entries, otherwise it fails'
```
Suggested change:

```diff
-    assert num_batches * batch_size > 10000, 'test_query_offset_timestamp_keys requires at least 10k entries, otherwise it fails'
+    assert (
+        num_batches * batch_size > 10000
+    ), "test_query_offset_timestamp_keys requires at least 10k entries, otherwise it fails"
```
This PR contains many fixes for a myriad of issues with the unit tests for this project.
It started with me wanting to clean up the EmulatorCreds() hack (which appears to be partially removed now...), replacing it with AnonymousCredentials (which appears to be the correct solution). I then needed to perform a lot of work to make the unit tests pass against the emulator (since it was not clear how expensive this test suite would be to run against a real datastore instance). As you can see, there are/were many fixes required.
After all of these changes, I have also run these tests against datastore directly to ensure that I have not broken any tests.
NOTE: I would prefer that these changes are not squashed when they are merged.
Fixes #31 and #32