Skip to content

[BE][tests] show local variables on failure in tests #131151

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 5 commits into from

Conversation

XuehaiPan
Copy link
Collaborator

@XuehaiPan XuehaiPan commented Jul 19, 2024

Stack from ghstack (oldest at bottom):


As per the title, add argument --locals for unittest and --showlocals --tb=long for pytest in CI.

Some failures cannot be reproduced on the local machine but exist on cloud CI. This change allows us to investigate the test failure more easily.

Example output: https://github.com/pytorch/pytorch/actions/runs/9961546996/job/27523888353?pr=130710#step:20:3361

/opt/conda/envs/py_3.8/lib/python3.8/site-packages/sympy/core/function.py:307: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

cls = FloorDiv, base = -1.00000000000000, divisor = -1.00000000000000

    @classmethod
    def eval(cls, base, divisor):
        # python test/test_dynamic_shapes.py -k TestDimConstraints.test_dim_constraints_solve_full
        # Assert triggered by inequality solver
        # assert base.is_integer, base
        # assert divisor.is_integer, divisor
    
        # We don't provide the same error message as in Python because SymPy
        # makes it difficult to check the types.
        if divisor.is_zero:
            raise ZeroDivisionError("division by zero")
        if base in (int_oo, -int_oo, sympy.oo, -sympy.oo) and divisor in (
            int_oo,
            -int_oo,
            sympy.oo,
            -sympy.oo,
        ):
            return sympy.nan
        if base is sympy.nan or divisor is sympy.nan:
            return sympy.nan
    
        if base.is_zero:
            return sympy.S.Zero
        if base.is_integer and divisor == 1:
            return base
        if base.is_integer and divisor == -1:
            return sympy.Mul(base, -1)
        if (
            isinstance(base, sympy.Number)
            and isinstance(divisor, sympy.Number)
            and (
                base in (int_oo, -int_oo, sympy.oo, -sympy.oo)
                or divisor in (int_oo, -int_oo, sympy.oo, -sympy.oo)
            )
        ):
            r = float(base) / float(divisor)
            if r == math.inf:
                return int_oo
            elif r == -math.inf:
                return -int_oo
            elif math.isnan(r):
                return sympy.nan
            else:
                return sympy.Integer(math.floor(r))
        if isinstance(base, sympy.Integer) and isinstance(divisor, sympy.Integer):
            return sympy.Integer(int(base) // int(divisor))
        if isinstance(base, FloorDiv):
            return FloorDiv(base.args[0], base.args[1] * divisor)
    
        # Expands (x + y) // b into x // b + y // b.
        # This only works if floor is an identity, i.e. x / b is an integer.
        for term in sympy.Add.make_args(base):
            quotient = term / divisor
            if quotient.is_integer and isinstance(divisor, sympy.Integer):
                # NB: this is correct even if the divisor is not an integer, but it
                # creates rational expressions that cause problems with dynamic
                # shapes.
                return FloorDiv(base - term, divisor) + quotient
    
        try:
            gcd = sympy.gcd(base, divisor)
            if gcd != 1:
>               return FloorDiv(
                    sympy.simplify(base / gcd), sympy.simplify(divisor / gcd)
                )

base       = -1.00000000000000
cls        = FloorDiv
divisor    = -1.00000000000000
gcd        = 1.00000000000000
quotient   = 1.00000000000000
term       = -1.00000000000000

/opt/conda/envs/py_3.8/lib/python3.8/site-packages/torch/utils/_sympy/functions.py:159: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

args = (FloorDiv, -1.00000000000000, -1.00000000000000), kwargs = {}

    @wraps(func)
    def wrapper(*args, **kwargs):
        try:
>           retval = cfunc(*args, **kwargs)
E           RecursionError: maximum recursion depth exceeded in comparison
E           
E           To execute this test, run the following from the base repo dir:
E               python test/test_sympy_utils.py -k TestValueRanges.test_binary_ref_fn_floordiv_dtype_float
E           
E           This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0

args       = (FloorDiv, -1.00000000000000, -1.00000000000000)
cfunc      = <functools._lru_cache_wrapper object at 0x7fc5303173a0>
func       = <function Function.__new__ at 0x7fc530317280>
kwargs     = {}

cc @seemethere @malfet @pytorch/pytorch-dev-infra @mruberry @ZainRizvi

[ghstack-poisoned]
@XuehaiPan XuehaiPan requested a review from a team as a code owner July 19, 2024 07:54
Copy link

pytorch-bot bot commented Jul 19, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/131151

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 6c64040 with merge base be3eba3 (image):
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@XuehaiPan XuehaiPan added module: ci Related to continuous integration module: tests Issues related to tests (not the torch.testing module) better-engineering Relatively self-contained tasks for better engineering contributors topic: not user facing topic category triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module labels Jul 19, 2024
XuehaiPan added a commit to XuehaiPan/pytorch that referenced this pull request Jul 19, 2024
@ezyang
Copy link
Contributor

ezyang commented Jul 25, 2024

I'll sign off on this, but I'd like someone from dev infra to be aware that this is going in. mmmmm @atalman I choose you!

@XuehaiPan
Copy link
Collaborator Author

@pytorchbot merge

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Jul 25, 2024
@pytorchmergebot
Copy link
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

[ghstack-poisoned]
[ghstack-poisoned]
@XuehaiPan
Copy link
Collaborator Author

I set the default to False on CI. I also add a new label to control the behavior of local variables in a new PR:

@XuehaiPan
Copy link
Collaborator Author

@pytorchbot merge

@pytorchmergebot
Copy link
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

XuehaiPan added a commit to XuehaiPan/pytorch that referenced this pull request Jul 27, 2024
pytorchmergebot added a commit that referenced this pull request Jul 29, 2024
…l test log verbosity (#131981)"

This reverts commit dfa18bf.

Reverted #131981 on behalf of https://github.com/atalman due to Sorry, need to revert bottom PR, which broke CI: #131151 ([comment](#131981 (comment)))
pytorchmergebot referenced this pull request Jul 29, 2024
…og verbosity (#131981)

Add a new label `ci-test-showlocals` and add it to test config filter.
If the PR is labeled with `ci-test-showlocals` or "ci-test-showlocals"
present in the PR comment, the test config filter will set a environment
variable `TEST_SHOWLOCALS`. Then `pytest` will show local variables on
failures for better debugging.

Pull Request resolved: #131981
Approved by: https://github.com/malfet
@atalman
Copy link
Contributor

atalman commented Jul 29, 2024

@pytorchmergebot revert -c nosignal -m "Broke CI: test_testing.py::TestTestingCUDA::test_cuda_assert_should_stop_common_device_type_test_suite_cuda GH job link HUD commit link"

@pytorchmergebot
Copy link
Collaborator

@pytorchbot successfully started a revert job. Check the current status here.
Questions? Feedback? Please reach out to the PyTorch DevX Team

@pytorchmergebot
Copy link
Collaborator

@XuehaiPan your PR has been successfully reverted.

pytorchmergebot added a commit that referenced this pull request Jul 29, 2024
This reverts commit 14158d8.

Reverted #131151 on behalf of https://github.com/atalman due to Broke CI: test_testing.py::TestTestingCUDA::test_cuda_assert_should_stop_common_device_type_test_suite_cuda [GH job link](https://github.com/pytorch/pytorch/actions/runs/10131415299/job/28014665693) [HUD commit link](https://hud.pytorch.org/pytorch/pytorch/commit/14158d892a2bd9b34edb5637f9a05217ea0330bd) ([comment](#131151 (comment)))
[ghstack-poisoned]
[ghstack-poisoned]
pytorchmergebot pushed a commit that referenced this pull request Jul 29, 2024
…og verbosity (#131981)

Add a new label `ci-test-showlocals` and add it to test config filter.
If the PR is labeled with `ci-test-showlocals` or "ci-test-showlocals"
present in the PR comment, the test config filter will set a environment
variable `TEST_SHOWLOCALS`. Then `pytest` will show local variables on
failures for better debugging.

Pull Request resolved: #131981
Approved by: https://github.com/malfet
ghstack dependencies: #131151
XuehaiPan added a commit to XuehaiPan/pytorch that referenced this pull request Jul 30, 2024
@github-actions github-actions bot deleted the gh/XuehaiPan/130/head branch August 29, 2024 02:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
better-engineering Relatively self-contained tasks for better engineering contributors ciflow/trunk Trigger trunk jobs on your pull request Merged module: ci Related to continuous integration module: tests Issues related to tests (not the torch.testing module) open source Reverted topic: not user facing topic category triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants