Stack trace from pytest is very far away and hard to find on some tests

### 🐛 Describe the bug

Sample:

```
2024-11-20T18:12:40.4550185Z =================================== FAILURES ===================================
2024-11-20T18:12:40.4550910Z _ DynamicShapesCppWrapperCpuTests.test_linear_with_pointwise_batch_size_384_in_features_196_out_features_385_bias_True_epilogue_hardsigmoid_cpu_bfloat16_dynamic_shapes_cpp_wrapper _
2024-11-20T18:12:40.4551025Z Traceback (most recent call last):
2024-11-20T18:12:40.4551509Z   File "/var/lib/jenkins/workspace/test/inductor/test_cpu_select_algorithm.py", line 321, in test_linear_with_pointwise
2024-11-20T18:12:40.4551789Z     self.assertEqual(counters["inductor"]["cpp_epilogue_fusion_counter"], 1)
2024-11-20T18:12:40.4552278Z   File "/opt/conda/envs/py_3.11/lib/python3.11/site-packages/torch/testing/_internal/common_utils.py", line 3977, in assertEqual
2024-11-20T18:12:40.4552413Z     raise error_metas.pop()[0].to_error(
2024-11-20T18:12:40.4552546Z AssertionError: Scalars are not equal!
2024-11-20T18:12:40.4552550Z 
2024-11-20T18:12:40.4552649Z Expected 1 but got 0.
2024-11-20T18:12:40.4552749Z Absolute difference: 1
2024-11-20T18:12:40.4552866Z Relative difference: 1.0
2024-11-20T18:12:40.4552872Z 
2024-11-20T18:12:40.4553073Z To execute this test, run the following from the base repo dir:
2024-11-20T18:12:40.4553991Z     python test/inductor/test_cpu_cpp_wrapper.py DynamicShapesCppWrapperCpuTests.test_linear_with_pointwise_batch_size_384_in_features_196_out_features_385_bias_True_epilogue_hardsigmoid_cpu_bfloat16_dynamic_shapes_cpp_wrapper
2024-11-20T18:12:40.4553998Z 
2024-11-20T18:12:40.4554243Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2024-11-20T18:12:40.4554452Z ----------------------------- Captured stdout call -----------------------------
2024-11-20T18:12:40.4554546Z inline_call []
2024-11-20T18:12:40.4554706Z stats [('calls_captured', 2), ('unique_graphs', 1)]
2024-11-20T18:12:40.4555855Z inductor [('pattern_matcher_nodes', 8), ('benchmarking.TritonBenchmarker.benchmark_cpu', 3), ('pattern_matcher_count', 2), ('fxgraph_cache_bypass', 1), ('select_algorithm_precompile', 1), ('benchmarking.TritonBenchmarker.benchmark', 1), ('select_algorithm_autotune', 1), ('cpp_epilogue_fusion_counter', 0)]
2024-11-20T18:12:40.4555980Z aot_autograd [('total', 1), ('ok', 1)]
2024-11-20T18:12:40.4556178Z ----------------------------- Captured stderr call -----------------------------
2024-11-20T18:12:40.4556307Z AUTOTUNE linear_unary(384x196, 385x196, 385)
2024-11-20T18:12:40.4556430Z   cpp_packed_gemm_0 0.2223 ms 100.0% 
2024-11-20T18:12:40.4556542Z   _linear_pointwise 650.3270 ms 0.0% 
2024-11-20T18:12:40.4556930Z SingleProcess AUTOTUNE benchmarking takes 0.3771 seconds and 3.6158 seconds precompiling for 2 choices
2024-11-20T18:12:40.4557206Z ----------------------------- Captured stdout call -----------------------------
2024-11-20T18:12:40.4557390Z inline_call []
2024-11-20T18:12:40.4557571Z stats [('calls_captured', 2), ('unique_graphs', 1)]
2024-11-20T18:12:40.4557685Z aot_autograd [('total', 1), ('ok', 1)]
2024-11-20T18:12:40.4558845Z inductor [('pattern_matcher_nodes', 8), ('benchmarking.TritonBenchmarker.benchmark_cpu', 3), ('pattern_matcher_count', 2), ('fxgraph_cache_bypass', 1), ('select_algorithm_precompile', 1), ('benchmarking.TritonBenchmarker.benchmark', 1), ('select_algorithm_autotune', 1), ('cpp_epilogue_fusion_counter', 0)]
2024-11-20T18:12:40.4559079Z ----------------------------- Captured stderr call -----------------------------
2024-11-20T18:12:40.4559851Z /opt/conda/envs/py_3.11/lib/python3.11/site-packages/torch/utils/_config_module.py:321: UserWarning: Skipping serialization of skipfiles_inline_module_allowlist value {}
2024-11-20T18:12:40.4559954Z   warnings.warn(
2024-11-20T18:12:40.4560095Z AUTOTUNE linear_unary(384x196, 385x196, 385)
2024-11-20T18:12:40.4560206Z   cpp_packed_gemm_1 0.2327 ms 100.0% 
2024-11-20T18:12:40.4560319Z   _linear_pointwise 646.5265 ms 0.0% 
2024-11-20T18:12:40.4560709Z SingleProcess AUTOTUNE benchmarking takes 0.3742 seconds and 3.5428 seconds precompiling for 2 choices
2024-11-20T18:12:40.4560909Z ----------------------------- Captured stdout call -----------------------------
2024-11-20T18:12:40.4561024Z inline_call []
2024-11-20T18:12:40.4561171Z stats [('calls_captured', 2), ('unique_graphs', 1)]
2024-11-20T18:12:40.4561301Z aot_autograd [('total', 1), ('ok', 1)]
2024-11-20T18:12:40.4562456Z inductor [('pattern_matcher_nodes', 8), ('benchmarking.TritonBenchmarker.benchmark_cpu', 3), ('pattern_matcher_count', 2), ('fxgraph_cache_bypass', 1), ('select_algorithm_precompile', 1), ('benchmarking.TritonBenchmarker.benchmark', 1), ('select_algorithm_autotune', 1), ('cpp_epilogue_fusion_counter', 0)]
2024-11-20T18:12:40.4562696Z ----------------------------- Captured stderr call -----------------------------
2024-11-20T18:12:40.4563440Z /opt/conda/envs/py_3.11/lib/python3.11/site-packages/torch/utils/_config_module.py:321: UserWarning: Skipping serialization of skipfiles_inline_module_allowlist value {}
2024-11-20T18:12:40.4563547Z   warnings.warn(
2024-11-20T18:12:40.4563693Z AUTOTUNE linear_unary(384x196, 385x196, 385)
2024-11-20T18:12:40.4563804Z   cpp_packed_gemm_2 0.2212 ms 100.0% 
2024-11-20T18:12:40.4563931Z   _linear_pointwise 345.2910 ms 0.1% 
2024-11-20T18:12:40.4564305Z SingleProcess AUTOTUNE benchmarking takes 0.3721 seconds and 3.5511 seconds precompiling for 2 choices
2024-11-20T18:12:40.4565008Z - generated xml file: /var/lib/jenkins/workspace/test/test-reports/python-pytest/inductor.test_cpu_cpp_wrapper/inductor.test_cpu_cpp_wrapper-9bab65762a70856a.xml -
2024-11-20T18:12:40.4565168Z =========================== short test summary info ============================
2024-11-20T18:12:40.4566272Z FAILED [12.0949s] inductor/test_cpu_cpp_wrapper.py::DynamicShapesCppWrapperCpuTests::test_linear_with_pointwise_batch_size_384_in_features_196_out_features_385_bias_True_epilogue_hardsigmoid_cpu_bfloat16_dynamic_shapes_cpp_wrapper - AssertionError: Scalars are not equal!
2024-11-20T18:12:40.4566278Z 
2024-11-20T18:12:40.4566378Z Expected 1 but got 0.
2024-11-20T18:12:40.4566482Z Absolute difference: 1
2024-11-20T18:12:40.4566601Z Relative difference: 1.0
2024-11-20T18:12:40.4566605Z 
2024-11-20T18:12:40.4566805Z To execute this test, run the following from the base repo dir:
2024-11-20T18:12:40.4567726Z     python test/inductor/test_cpu_cpp_wrapper.py DynamicShapesCppWrapperCpuTests.test_linear_with_pointwise_batch_size_384_in_features_196_out_features_385_bias_True_epilogue_hardsigmoid_cpu_bfloat16_dynamic_shapes_cpp_wrapper
2024-11-20T18:12:40.4567731Z 
2024-11-20T18:12:40.4567974Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
```

I'm not sure why there are so many captured stdout/stderr entries here, but they push the backtrace very far from the end of the log.
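As a stopgap while reading logs like the one above, the captured sections can be filtered out after the fact. The following is only a minimal sketch, not something the test runner does today: `strip_captured_sections` is a hypothetical helper, and the regular expressions are assumptions based on the header formats visible in this log. (pytest also has a `--show-capture=no` option that omits these sections at the source; whether the CI invocation can adopt it is not something I have checked.)

```python
import re

# Optional CI timestamp prefix, e.g. "2024-11-20T18:12:40.4554452Z ".
TIMESTAMP = re.compile(r"^\d{4}-\d{2}-\d{2}T[\d:.]+Z ?")
# pytest headers for captured output, e.g. "---- Captured stdout call ----".
CAPTURED = re.compile(r"^-+ Captured \w+ \w+ -+$")
# Lines assumed to end a captured section: "=" rulers, "_ test name _"
# headers, and the "- generated xml file: ... -" line.
SECTION_END = re.compile(r"^(=+ .* =+|_+ .* _+|- generated xml file: .* -)$")


def strip_captured_sections(lines):
    """Return the log lines with pytest 'Captured ...' sections removed."""
    kept = []
    skipping = False
    for line in lines:
        # Match against the line without its timestamp, but keep the
        # original line (timestamp included) in the output.
        text = TIMESTAMP.sub("", line.rstrip("\n"))
        if CAPTURED.match(text):
            skipping = True
            continue
        if skipping and SECTION_END.match(text):
            skipping = False
        if not skipping:
            kept.append(line)
    return kept
```

Calling `strip_captured_sections(open("ci.log"))` (with `ci.log` as a placeholder path) would leave the traceback directly above the "generated xml file" line and the short test summary. The heuristic can misfire if captured output itself contains a line that looks like a section header.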

### Versions

main

cc @seemethere @malfet @pytorch/pytorch-dev-infra @chauhang @penguinwu

Stack trace from pytest is very far away and hard to find on some tests #141204
