Skip to content

gh-108834: Cleanup libregrtest #108858

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from
Closed

Conversation

vstinner
Copy link
Member

@vstinner vstinner commented Sep 4, 2023

Remove "global variables" from libregrtest: no longer pass 'ns' magic namespace which was used to get and set 'random' attributes. Instead, each class has its own attributes, and some classes are even read-only (frozen dataclass), like RunTests.

Reorganize files:

  • Rename runtest.py to single.py.
  • Rename runtest_mp.py to mp_runner.py.
  • Create findtests.py, result.py, results.py and worker.py.

Reorganize classes:

  • Copy command line options ('ns' namespace) to Regrtest and RunTests attributes: add many attributes.
  • Add Results class with methods: get_state(), display_result(), get_exitcode(), JUnit methods, etc.
  • Add Logger class: log(), display_progress(), system load tracker.
  • Remove WorkerJob class: the worker now gets a RunTests instance.

Move function and methods:

  • Convert Regrtest.list_cases() to a function.
  • Convert display_header(), display_sanitizers(), set_temp_dir() and create_temp_dir() methods of Regrtest to functions in utils.py. Rename set_temp_dir() to select_temp_dir(). Rename create_temp_dir() to make_temp_dir().
  • Move abs_module_name() and normalize_test_name() to utils.py.

cmdline.py changes:

  • Rename internal --worker-args command line option to --worker-json.
  • Rename ns.trace to ns.coverage.
  • No longer gets the number of CPUs: it's now done by Regrtest class.
  • Add missing attributes to Namespace: coverage, threshold, wait.

Misc:

  • Add test_parse_memlimit() and test_set_memlimit() to test_support.
  • Add some type annotations.

@vstinner
Copy link
Member Author

vstinner commented Sep 5, 2023

@AlexWaygood @serhiy-storchaka: I would like to merge this change as soon as possible since it's a large refactor, and it's likely that it will get conflicts if it waits for too long. This work should prepare libregrtest to get type annotations.

I would prefer to move most code at least: rename variables, functions, etc. Instead of having a serie of multiple commits.

As usual, I plan to backport this change to stable branches (3.11 and 3.12). Otherwise, backport is just impossible.

@AlexWaygood and me would like to add annotations and then run mypy on a CI, as already done for Argument Clinic.


The regrtest project has a long history. It was added in 1996 by commit 152494a. When it was created, it was 170 lines long and has 4 command line options: -v (verbose), -q (quiet), -g (generate) and-x (exclude). Slowly, it got more and more features:

  • Better command line interface with argparse (it used getopt at the begining)
  • Run tests in parallel with multiple processes (this code caused me a lot of headaches!)
  • Detect when the "environment" is altered: warnings filters, loggers, etc.
  • Re-run failed tests in verbose mode (now they are run in fresh processes)
  • Detect memory, reference and file descriptor leaks
  • Detect leaked files by creating a temporary directory for each test worker process
  • Best effort to restore the machine to its previous state: wait until threads and processes complete, remove temporary files, etc.
  • etc.

Some of these features were implemented in test.support + test.libregrtest.

A few years ago, I decided to split the giant mono regrtest.py file (for example, it was 2 200 lines of Python code in Python 2.7) into sub-files (!). To make it possible, I passed ns argument which is a bag of "global variables" (technically, it's a Namespace class, see cmdline.py).

The problem is that for type annotation, it's very unclear what a Namespace contains. It may or may not have arguments (see my commit message of this PR: Add missing attributes to Namespace: coverage, threshold, wait.), argument types are weakly defined, etc. Moreover, ns is not only used to "get" variables, but also to set variables! For example, find_tests() overrides ns.args. How is it possible to know which ns attributes are used? Are they "read-only"? We don't know just by reading a function prototype.

This large refactoring cleans up everything at once (!) to pass simple types like bool, str or tuple[str]. It's easier to guess the purpose of a function and its behavior just from its prototype.

I tried to create only short files, the longest is still sadly main.py with 891 lines.

$ wc -l *.py|sort -n
     2 __init__.py
    56 pgo.py
   124 win_utils.py
   159 setup.py
   202 refleak.py
   307 utils.py
   329 save_env.py
   451 cmdline.py
   575 runtest.py
   631 runtest_mp.py
   891 main.py
  3727 total

To understand where the ns magic bag of global variables, look at regrtest.py monster in Python 2.7. Its main() functions defines not less than 34 functions inside the main() function! Variables are defined in the main() prototype!

def main(tests=None, testdir=None, verbose=0, quiet=False,
         exclude=False, single=False, randomize=False, fromfile=None,
         findleaks=False, use_resources=None, trace=False, coverdir='coverage',
         runleaks=False, huntrleaks=False, verbose2=False, print_slow=False,
         random_seed=None, use_mp=None, verbose3=False, forever=False,
         header=False, pgo=False, failfast=False, match_tests=None):

I supposed that it was designed to be able to use regrtest as an API: pass parameters to the main() function, without having to use the command line interface. In the main branch, this feature is still supported, the magic **kwargs bag:

def main(tests=None, **kwargs):
    Regrtest().main(tests=tests, **kwargs)

Copy link
Member

@sobolevn sobolevn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very brief review, thanks!

FilterTuple = tuple[str]


# Avoid enum.Enum to reduce the number of imports when tests are run
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now when you import dataclasses you import half of the stdlib: it imports inspect and it imports enum.

Maybe this can be made into an StrEnum as it should be?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh no. That's bad :-( I should maybe rewrite the code to avoid this heavy dataclasses module. I didn't know that it has so many dependencies. In the past, I worked hard to strictly limit the number of modules imported by test.libregrtest.

Copy link
Member

@AlexWaygood AlexWaygood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

two tiny nits I noticed, but I've only skim-read through so far (will try to look more in-depth tomorrow)

@serhiy-storchaka serhiy-storchaka self-requested a review September 6, 2023 11:04
Remove "global variables" from libregrtest: no longer pass 'ns' magic
namespace which was used to get and set 'random' attributes. Instead,
each class has its own attributes, and some classes are even
read-only (frozen dataclass), like RunTests.

Reorganize files:

* Rename runtest.py to single.py.
* Rename runtest_mp.py to mp_runner.py.
* Create findtests.py, result.py, results.py and worker.py.

Reorganize classes:

* Copy command line options ('ns' namespace) to Regrtest
  and RunTests attributes: add many attributes.
* Add Results class with methods: get_state(), display_result(),
  get_exitcode(), JUnit methods, etc.
* Add Logger class: log(), display_progress(), system load tracker.
* Remove WorkerJob class: the worker now gets a RunTests instance.

Move function and methods:

* Convert Regrtest.list_cases() to a function.
* Convert display_header(), display_sanitizers(), set_temp_dir() and
  create_temp_dir() methods of Regrtest to functions in utils.py.
  Rename set_temp_dir() to select_temp_dir(). Rename
  create_temp_dir() to make_temp_dir().
* Move abs_module_name() and normalize_test_name() to utils.py.
* Move capture_output() to utils.py.
* Rename dash_R() to runtest_refleak()
* Merge display_sanitizers() code into display_header().

cmdline.py changes:

* Rename internal --worker-args command line option to --worker-json.
* Rename ns.trace to ns.coverage.
* No longer gets the number of CPUs: it's now done by Regrtest class.
* Add missing attributes to Namespace: coverage, threshold, wait.

Misc:

* Add test_parse_memlimit() and test_set_memlimit() to test_support.
* Add some type annotations.
@vstinner
Copy link
Member Author

vstinner commented Sep 8, 2023

PR rebased on main to retrieve test_support changes (fix a merge conflict).

@serhiy-storchaka
Copy link
Member

This change is so larger, that it is not realistic to expect a review. If you split it on several smaller PRs, you can get reviews. Do it if you have any doubts or want to get some advices. But if these changes look straightforward to you, and you are sure in them, you can merge this PR. I approve it blindly. I believe that you know what you do, and some mistakes, if they happen, can be fixed later. Just ensure that these changes will be backported to 3.12 and 3.11.

@vstinner
Copy link
Member Author

vstinner commented Sep 8, 2023

I close this PR. Instead I will split it in a long serie of small changes: #109162

@vstinner vstinner closed this Sep 8, 2023
@vstinner vstinner deleted the regrtest_results branch September 8, 2023 22:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants