Record validated tests duration #12638
base: master
Conversation
tests/aws/services/lambda_/event_source_mapping/test_cfn_resource.validation.json
Looks good to me, tried it out and works like a charm ✨
Thanks for tackling this Misha. 👏🏼
LocalStack Community integration with Pro: 2 files ±0, 2 suites ±0, 1h 43m 37s ⏱️ +50s. Results for commit a1477d0d. ± Comparison against base commit 893018f. ♻️ This comment has been updated with latest results.
Love it ❤️ Thank you @tiurin for making this happen 🚀
I can confirm that time reporting works as expected, both in localstack and localstack-ext. I also tested that failing tests or non-AWS test executions are ignored.
if not is_aws_cloud() or outcome.excinfo:
    return
# For json.dump sorted test entries enable consistent diffs.
# But test execution data is more readable in insert order for each step (setup, call, teardown).
praise: neat attention to detail ✨
validation entry is generated with durations

TODO:
- don't sort keys in json.dumps
  - have phases ordered from first to last
  - tests should still be ordered by name (nodeid)
- format floats
- run on existing AWS tests
- remove dummy test
Alphabetic sorting between tests as before, but insert order for data inside.
Better readability.
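A minimal sketch of the ordering described above, with illustrative node ids and durations (not the PR's actual data): sort only the outer test entries for stable diffs, and keep each test's per-phase data in insertion order instead of passing sort_keys=True to json.dumps.

```python
import json

# Hypothetical entries; keys and values are placeholders.
validation_data = {
    "tests/test_b.py::test_two": {"setup": 1.2, "call": 10.5, "teardown": 0.3},
    "tests/test_a.py::test_one": {"setup": 0.9, "call": 4.1, "teardown": 0.2},
}

# Sort only the outer keys (test node ids) for consistent diffs;
# the inner phase dicts keep their insertion order (setup, call, teardown).
ordered = {nodeid: validation_data[nodeid] for nodeid in sorted(validation_data)}
print(json.dumps(ordered, indent=2))  # note: no sort_keys=True
```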
setup and teardown are always successful
Writing the file 3 different times, as before, could lead to inconsistencies if a test failed or was interrupted. Only write validation data once, and only when the test is known to have passed. Use the test item's stash mechanism to store data between phases.
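As a rough illustration of that approach (not the PR's actual implementation), a wrapper around pytest_runtest_makereport could accumulate per-phase durations in the item's stash and write them out only after a successful teardown. The file path, key names, and rounding below are assumptions, and pytest >= 7.4 is assumed for wrapper=True.

```python
import json

import pytest
from pytest import StashKey, TestReport

# Hypothetical stash key holding {phase: duration} across setup/call/teardown.
durations_key = StashKey[dict]()


@pytest.hookimpl(wrapper=True)
def pytest_runtest_makereport(item, call):
    report: TestReport = yield
    if report.passed:
        # Accumulate this phase's duration in the item's stash.
        item.stash.setdefault(durations_key, {})[report.when] = round(report.duration, 2)
    if report.when == "teardown":
        durations = item.stash.get(durations_key, {})
        # Write only once, and only if every phase passed.
        if set(durations) == {"setup", "call", "teardown"}:
            with open("example.validation.json", "w") as f:  # placeholder path
                json.dump({item.nodeid: durations}, f, indent=2)
    return report
```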
This reverts commit e2ced0a.
Another wrapper for the pytest_runtest_makereport hook is defined in localstack-snapshot using an "old-style" hookwrapper. Mixing new- and old-style wrappers in the same plugin is not recommended; see the note here: https://pluggy.readthedocs.io/en/latest/index.html#wrappers
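For reference, a sketch of the two wrapper styles that the pluggy note warns against mixing; the bodies are placeholders, and in practice each definition would live in its own plugin or conftest module.

```python
import pytest


# "Old-style" hookwrapper (the form used in localstack-snapshot, per the comment above):
@pytest.hookimpl(hookwrapper=True)
def pytest_runtest_makereport(item, call):
    outcome = yield              # a pluggy Result wrapper
    report = outcome.get_result()
    # ... inspect or adjust the report here ...


# "New-style" wrapper (pluggy >= 1.2 / pytest >= 7.4); shown here only for contrast:
@pytest.hookimpl(wrapper=True)
def pytest_runtest_makereport(item, call):  # noqa: F811
    report = yield               # the hook result itself
    # ... inspect or adjust the report here ...
    return report
```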
- Add time unit to key name
- Move total to the same object as phases
95dd620 to dd929d6 (Compare)
dd929d6 to a1477d0 (Compare)
Test Results - Preflight, Unit: 21 601 tests ±0, 19 955 ✅ ±0, 6m 53s ⏱️ +45s. Results for commit a1477d0d. ± Comparison against base commit 893018f.
Test Results (amd64) - Acceptance: 7 tests ±0, 5 ✅ ±0, 3m 7s ⏱️ -2s. Results for commit a1477d0d. ± Comparison against base commit 893018f.
Test Results - Alternative Providers: 597 tests, 419 ✅, 15m 3s ⏱️. Results for commit a1477d0d.
Test Results (amd64) - Integration, Bootstrap: 5 files, 5 suites, 2h 21m 21s ⏱️. Results for commit a1477d0d.
Gave it a quick glance and just had some minor suggestions.
Thanks for tackling this! One somewhat immediate improvement I could think of is having a pytest flag that would allow you to print statistics of these execution times or at least collect them for each known pytest node id (i.e. for each test). We could use this to track how far along we are in collecting execution data over the whole test suite as well. 🤔
Just some thoughts though, nothing immediately actionable besides the import comment 👍
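A rough sketch of the kind of flag suggested here, assuming a hypothetical --print-validation-durations option and that duration entries live in *.validation.json files under tests/ (neither is part of this PR):

```python
import json
from pathlib import Path


def pytest_addoption(parser):
    parser.addoption(
        "--print-validation-durations",
        action="store_true",
        default=False,
        help="Print recorded AWS validation durations per test node id.",
    )


def pytest_terminal_summary(terminalreporter, exitstatus, config):
    if not config.getoption("--print-validation-durations"):
        return
    # Collect every *.validation.json under tests/ and print its entries.
    for path in Path("tests").rglob("*.validation.json"):
        for nodeid, entry in json.loads(path.read_text()).items():
            terminalreporter.write_line(f"{nodeid}: {entry}")
```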
When a test runs successfully against AWS, its last validation date and duration are recorded in a corresponding *.validation.json file.
The validation date is recorded precisely, while test durations can vary between runs.
For example, test setup time may differ depending on whether a test runs in isolation or as part of a class test suite with class-level fixtures.
For example, test setup time may differ depending on whether a test runs in isolation or as part of a class test suite with class-level fixtures.
Can't we track that anyway and capture it? That way we would avoid flipping between potentially minutes of setup time and only a few microseconds otherwise 🤔
from _pytest.reports import TestReport
from _pytest.stash import StashKey
- from _pytest.reports import TestReport
- from _pytest.stash import StashKey
+ from pytest import TestReport, StashKey
these are exposed as public API
Motivation
Have more insight into how long it takes to run a parity test against AWS, and into each test execution phase.
Changes
- Record test durations in *.validation.json, along with the already existing last validation date.

Considerations
Machines on which tests are executed can vary significantly in processing power. However, in the case of AWS-validated tests, a significant amount of time is spent in I/O waiting for AWS endpoint responses, which can also be affected by geographic location. A quick test on a small sample showed under a 2% difference for the test in this PR across 3 different machines located in Spain and South Africa (all runs were executed against the us-east-1 region). Repeated runs at different times of day on one machine were similarly stable.
In any case, durations are meant to give an idea of how long a test runs rather than exact numbers.