JSON reporting #131

Conversation
This just tests which names exist without checking anything else. It also doesn't use hypothesis, so it should work even if the library doesn't have proper hypothesis support (assuming the test suite can even run at all in that case, I'm not completely sure).
That way it can be run even on libraries that fail the hypothesis sanity checks.
```
@@ -34,6 +34,10 @@
    f for n, f in inspect.getmembers(array, predicate=inspect.isfunction)
    if n != "__init__"  # probably exists for Sphinx
]
array_attributes = [
```
How about array_properties?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And in any case, if you could add this to __all__. Maybe this is just a weird side product of how I learnt Python, but I like using it to denote the internally-public API of a module.
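For readers unfamiliar with the convention, a tiny illustration (the names and values here are made up, not the actual module contents):

```python
# Minimal illustration of the __all__ convention: only the names listed in
# __all__ are considered the module's (internally) public API.
array_attributes = ["dtype", "ndim", "shape"]  # illustrative values only
_cache = {}  # private helper state, deliberately left out of __all__

__all__ = ["array_attributes"]
```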
pytest-json-report looks nifty!

@asmeurer nice! For review/discussion it would be useful if you added an example of what the JSON output looks like (maybe in a gist?) if you have one.
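For readers without the gist handy, a pytest-json-report file is roughly shaped like the following. This is a hand-written illustration based on that plugin's documented format, with made-up values, not output from an actual run:

```json
{
  "created": 1634000000.0,
  "duration": 42.5,
  "exitcode": 1,
  "environment": {"Python": "3.9.7", "Platform": "Linux"},
  "summary": {"passed": 10, "failed": 2, "collected": 12, "total": 12},
  "tests": [
    {
      "nodeid": "array_api_tests/test_has_names.py::test_has_names[abs]",
      "outcome": "passed",
      "metadata": {"note": "custom fields can be attached per test"}
    }
  ],
  "warnings": []
}
```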
…lizable

Otherwise it just shows up as a warning in the test results, which might easily be missed.
This is ready to be merged. I've extracted a lot of custom metadata from the tests into the report.

For hypothesis errors, for now it only extracts the error message (string). I could look into trying to extract the actual data itself and JSON-serializing it, but I think that will be harder, so for now I'm going to wait until it looks like we actually need that.

I have uploaded an example report for

If there is any other metadata you can think of that should be included, let me know.
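For reference, the mechanism pytest-json-report provides for this is its json_metadata fixture, a per-test dict whose contents end up in the report. A minimal sketch (the keys here are illustrative, not the ones this PR actually uses):

```python
# Sketch: attaching custom metadata via pytest-json-report's json_metadata fixture.
def test_example(json_metadata):
    json_metadata["array_module"] = "numpy.array_api"  # illustrative key/value
    json_metadata["hypothesis_error"] = None           # e.g. filled in on failure
    assert 1 + 1 == 2
```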
It was actually not enabling it at all ever.
The inputs could be things that aren't generically recognizable, e.g., for plain numpy, they could be numpy ufuncs.
Something to keep in mind when doing any further development on the test suite is to make sure that any inputs to test functions are JSON-serializable. If they aren't, the

Maybe using
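One defensive pattern for this, sketched under the assumption that the report only needs something human-readable (this is the general idea, not the PR's actual code), is to fall back to repr() for anything json can't handle:

```python
import json

def jsonable(value):
    """Return value unchanged if it is JSON-serializable, else fall back to repr().

    This keeps inputs like numpy ufuncs from breaking report generation.
    """
    try:
        json.dumps(value)
        return value
    except TypeError:
        return repr(value)
```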
This reverts commit f356ffd. It doesn't actually save that much space, so let's just use the default since it has more information and it keeps the code simpler.
OK, it turns out I was actually wrong here. The tracebacks do contribute some amount of data in general, but they're not actually the reason the plain numpy report is over half a gigabyte. That's apparently because of the warnings. There are over a million warnings emitted in the run, and they account for 98% of the JSON file size. Of those, 98% are the `np.bool` deprecation warning. Here's the full warning metadata that's repeated so many times:

```json
{
    "message": "`np.bool` is a deprecated alias for the builtin `bool`. To silence this warning, use `bool` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.bool_` here.\nDeprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations",
    "category": "DeprecationWarning",
    "when": "runtest",
    "filename": "/Users/aaronmeurer/anaconda3/envs/array-apis/lib/python3.9/site-packages/hypothesis/extra/array_api.py",
    "lineno": 119
}
```

@honno, any idea why this warning would be omitted every single time hypothesis calls

I did try switching from the default traceback type to native, but the savings are actually pretty small, so I'm going to revert it because the default type has more information, and because it keeps the code simpler.
(I assume you're testing the top-level

If you meant emitted, I'd guess it's to do with the pytest reporting steps. Hypothesis/pytest might "merge" warnings together as it does errors in the reporting stage, so pytest would only show one warning, but however we're capturing warnings here ignores that. Generally it makes sense that the warning is emitted many times, as
Note: the flaky CI failure is due to #125.
Like preferring array_attributes -> array_properties, personally I'd prefer test_has_names.py::test_has_names -> test_has_attributes.py::test_has_attributes.
conftest.py
```
@@ -7,8 +7,9 @@
from array_api_tests import _array_module as xp
from array_api_tests._array_module import _UndefinedStub

settings.register_profile("xp_default", deadline=800)
from reporting import pytest_metadata, add_extra_json_metadata  # noqa
```
Could you add a small comment explaining what's being imported here and why? I assume it's doing some magic when imported. Oh, import reporting is importing the file you've introduced. I'd lean towards just deleting reporting.py and putting those fixtures in conftest.py, but I see value in not bloating conftest.py.
Yes, the names have to be defined here so that pytest will see them as fixtures. I decided to keep the reporting in a separate file to keep things clean, since it's already a good bit of code and could potentially be more if we decide to add more stuff to the report in the future.
Your suggestion is to have two separate files for this? I don't really see the value in that, especially since the file is so short and uses the same logic for names and attributes.
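For anyone following along: pytest only picks up fixtures and hooks that are defined in (or imported into) conftest.py or a registered plugin, which is why the import is needed even though nothing in conftest.py calls those names directly. A sketch of the pattern, using the names from the diff above:

```python
# conftest.py (sketch)
# The reporting logic lives in reporting.py; importing its hook/fixtures here
# makes pytest discover them. The noqa silences the "unused import" warning.
from reporting import pytest_metadata, add_extra_json_metadata  # noqa: F401
```

An alternative would be registering reporting.py as a plugin via pytest_plugins = ["reporting"] in the top-level conftest.py, which avoids the unused-import look, though the effect is much the same.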
My point with the warnings is that by default, warnings are only emitted once per warning:

```python
>>> import numpy as np
>>> np.bool
<stdin>:1: DeprecationWarning: `np.bool` is a deprecated alias for the builtin `bool`. To silence this warning, use `bool` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.bool_` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
<class 'bool'>
>>> np.bool
<class 'bool'>
```

But this exact same warning is emitted every single time. So somewhere something is changing the warnings filter to enable them always. I did see some
Ah I was being lazy heh, I meant rename
Ooo, this feels like NumPy is doing some magic which pytest/Hypothesis isn't equipped for? Probably requires some hacking around to fix.
It shouldn't do anything fancy; the one thing is
NumPy's pytest.ini is irrelevant here. It looks like the pytest defaults are to store each warning invocation separately. Since the warning metadata doesn't even contain enough information to distinguish the different warnings, I think the simplest solution here is to post-process the JSON to deduplicate the warnings.

(Probably a lot of the stuff I'm doing here could be upstreamed to the pytest-json-report library.)
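A minimal sketch of what that post-processing could look like, assuming the report keeps its warnings in a top-level "warnings" list with entries shaped like the metadata shown earlier (this is not the PR's actual implementation):

```python
import json
from collections import Counter

def dedupe_warnings(report_path):
    """Collapse repeated warnings in a pytest-json-report file into unique
    entries with an added "count" field."""
    with open(report_path) as fp:
        report = json.load(fp)

    counts = Counter(
        (w["message"], w["category"], w["filename"], w["lineno"])
        for w in report.get("warnings", [])
    )
    report["warnings"] = [
        {"message": msg, "category": cat, "filename": fname,
         "lineno": lineno, "count": n}
        for (msg, cat, fname, lineno), n in counts.items()
    ]

    with open(report_path, "w") as fp:
        json.dump(report, fp)
```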
This was causing the report file sizes to be huge, especially for plain numpy.
OK, I deduplicated the warnings output. The plain numpy report is now 11 MB and numpy.array_api is 6.1 MB (both uncompressed). I've uploaded examples of each to the gist. I'd like to hear feedback from Athan on whether I need to try to reduce this file size any further. Again, these are uncompressed sizes, and they compress well (both are under 0.5 MB after basic zip compression).
Ah my confusion here is, say the following:

```python
# Using ipython
>>> class foopy_factory:
...     @property
...     def bool(self):
...         warn("don't use this", DeprecationWarning)
...         return bool
...
>>> foopy = foopy_factory()
>>> foopy.bool
DeprecationWarning: don't use this
bool
>>> foopy.bool
DeprecationWarning: don't use this  # warning is emitted again
bool
...
>>> np.bool
DeprecationWarning: `np.bool` is a deprecated alias for the builtin `bool`...
np.bool
>>> np.bool
DeprecationWarning: `np.bool` is a deprecated alias for the builtin `bool`...  # warning is emitted again
np.bool
```

the warning is emitted for each access of

I see now that the emit-once behaviour is seen using the default:

```python
# Using python
>>> class foopy_factory:
...     @property
...     def bool(self):
...         warn("don't use this", DeprecationWarning)
...         return bool
...
>>> foopy = foopy_factory()
>>> foopy.bool
DeprecationWarning: don't use this
bool
>>> foopy.bool
bool  # no warning this time
```

Seems like Aaron's found a solution anywho, but might be worth noting that environment things can affect this behaviour... or I might be missing something.
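For what it's worth, the difference honno is seeing comes down to which warnings filter is active, and environments like IPython or pytest can install their own. A small self-contained demonstration using only the standard library:

```python
import warnings

def noisy():
    # The warning is issued from this line, so "default" dedupes repeat calls.
    warnings.warn("don't use this", DeprecationWarning)

warnings.resetwarnings()
warnings.simplefilter("default")  # show only the first occurrence per issuing location
noisy()  # warning printed
noisy()  # suppressed: same message from the same location

warnings.simplefilter("always")   # what some environments effectively use
noisy()  # warning printed
noisy()  # warning printed again
```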
Need to have this merged so I can start testing it.
@asmeurer Looking through the sample JSON reports, they look fine to me. No need to reduce further, IMO. The raw files can be parsed and transformed for subsequent presentation on, e.g., a website. So not too concerned regarding initial report file sizes, particularly as we can store them on disk in compressed format.
One thing we should consider, however, is versioning the JSON format, e.g., using JSON Schema. This would allow versioning our backend server APIs to handle transformations of different versions of the JSON report. In which case, adding a
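A minimal sketch of stamping a version into the report via pytest-json-report's pytest_json_modifyreport hook (the hook name comes from that plugin; the schema_version field name and value are just a suggestion):

```python
# conftest.py (sketch)
REPORT_SCHEMA_VERSION = "1.0.0"  # suggested field; bump on breaking format changes

def pytest_json_modifyreport(json_report):
    # json_report is the dict that pytest-json-report is about to write out.
    json_report["schema_version"] = REPORT_SCHEMA_VERSION
```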
JSON reporting will be done with the pytest-json-report package. This produces pretty usable JSON and makes it easy to extend it with custom metadata at test run time.
The idea will be to take the JSON generated by pytest-json-report and translate it into something more usable by a reporting tool. To start with, I will focus on test_has_names, which has been reintroduced here, to list which spec names are or are not defined in the given package.
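As a rough illustration of that translation step, a downstream tool could pull the test_has_names results out of the report and list the failures. This sketch assumes test_has_names is parametrized per name so each spec name appears in the nodeid, which may not match the suite's exact layout:

```python
import json

def missing_names(report_path):
    """Collect spec names whose test_has_names case failed, assuming nodeids like
    'array_api_tests/test_has_names.py::test_has_names[<name>]'."""
    with open(report_path) as fp:
        report = json.load(fp)

    missing = []
    for test in report.get("tests", []):
        nodeid = test.get("nodeid", "")
        if "test_has_names[" in nodeid and test.get("outcome") == "failed":
            missing.append(nodeid.split("[", 1)[1].rstrip("]"))
    return missing
```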