pprint with compact indent #112632

Open
hyzyla opened this issue Dec 2, 2023 · 8 comments
Labels
stdlib Python modules in the Lib dir type-feature A feature request or enhancement

Comments

@hyzyla
hyzyla commented Dec 2, 2023

Feature or enhancement

Proposal:

Add a new parameter to the pprint function that inserts a newline character (\n) after the opening bracket and sets the indentation of each nested level to level * indent. With this parameter, pprint will format nested objects in a manner similar to how formatters like Black or Ruff format them.

>>> t = {'a':2, 'b':{'x':3, 'y':{'t1': 4, 't2':5}}}
>>> pprint(t, indent=4, some_parameter=True)
{
    "a": 2,
    "b": {
        "x": 3,
        "y": {
            "t1": 4,
            "t2": 5
        }
    }
}

With this change, pprint will print in a more width-compact manner, making it easier to read the pretty-printed object and to copy-paste it into a codebase which follows similar formatting rules.
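
For illustration, the requested layout can be approximated with a small standalone function. This is only a minimal sketch, not how pprint works or how the feature would be implemented: block_format is a hypothetical name, it handles only plain dicts, lists and tuples, it ignores width entirely, and (like Black) it adds a trailing comma after every item. Keys and values are rendered with repr(), so strings come out single-quoted.

```python
import ast


def block_format(obj, indent=4, level=0):
    """Return a block-style repr of nested dicts, lists and tuples.

    Hypothetical helper for illustration only; it is not part of pprint.
    """
    pad = " " * (indent * (level + 1))  # indentation of the items
    close_pad = " " * (indent * level)  # indentation of the closing bracket

    if isinstance(obj, dict) and obj:
        items = [
            f"{pad}{key!r}: {block_format(value, indent, level + 1)},"
            for key, value in obj.items()
        ]
        return "{\n" + "\n".join(items) + "\n" + close_pad + "}"

    if isinstance(obj, (list, tuple)) and obj:
        open_, close = ("[", "]") if isinstance(obj, list) else ("(", ")")
        items = [f"{pad}{block_format(item, indent, level + 1)}," for item in obj]
        return open_ + "\n" + "\n".join(items) + "\n" + close_pad + close

    # Scalars and empty containers fall back to the ordinary repr.
    return repr(obj)


t = {'a': 2, 'b': {'x': 3, 'y': {'t1': 4, 't2': 5}}}
out = block_format(t)
print(out)

# The output is a valid Python literal, so it can be pasted back into code.
assert ast.literal_eval(out) == t
```

Because every nested level is indented by exactly level * indent, the width of the output depends only on nesting depth, not on the length of the keys that precede a nested container.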

This feature was heavily inspired by this question on Stack Overflow

Has this already been discussed elsewhere?

No response given

Links to previous discussion of this feature:

No response

Linked PRs

@hyzyla hyzyla added the type-feature A feature request or enhancement label Dec 2, 2023
@AlexWaygood AlexWaygood added the stdlib Python modules in the Lib dir label Dec 3, 2023
@harshit-singhania2000
Hey! I would love to contribute here. Can this issue be assigned to me?

@stodoran
Any updates on this? I'm left scratching my head wondering why this feature doesn't already exist. The current pprint output for a large nested dictionary is basically unusable and definitely not pretty, which is a shame because large dictionaries are probably the primary use case for pprint.

If it is a bandwidth thing I'd be happy to try and contribute here if somebody could give me some pointers.

@tomasr8
Member

tomasr8 commented Jan 16, 2025

@stodoran Please, feel free to work on this! You might find the devguide useful: https://devguide.python.org/
As for the implementation itself, you'll probably need to modify the PrettyPrinter class (and more specifically PrettyPrinter._pprint_dict) in Lib/pprint.py. Feel free to ask if you have any questions! :)

@StefanTodoran
@tomasr8 I did have two questions

  • What requirements are there in terms of style?
  • Modifying the existing PrettyPrinter to support this sort of block-style printing requires quite a few changes. I honestly think the code would be much simpler (and make more sense) if it were simply another class, perhaps BlockPrinter. Do you think I should try to hack PrettyPrinter to support block-style printing via an argument, or just create another class?

@picnixz
Member

picnixz commented Jan 24, 2025

I would recommend first opening a DPO (discuss.python.org) thread. There are alternatives, such as json.dumps (which may not be perfect) or third-party packages such as rich (https://rich.readthedocs.io/en/stable/pretty.html).
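
As a quick illustration of the json.dumps alternative: for data that is JSON-serializable, passing indent=4 reproduces the layout from the proposal at the top of the thread, double quotes included. The variable names below are only for the example.

```python
import json

t = {'a': 2, 'b': {'x': 3, 'y': {'t1': 4, 't2': 5}}}

# indent=4 puts every item on its own line, indented by nesting level.
text = json.dumps(t, indent=4)
print(text)

# The result is JSON, not a Python literal, so round-trip it with json.loads.
assert json.loads(text) == t
```

This is where the "may not be perfect" caveat applies: json.dumps raises TypeError for objects it cannot serialize (sets, for example), turns tuples into lists, converts non-string keys to strings, and writes true/false/null instead of True/False/None.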

I don't think pprint was meant to have lots of extensions, as demonstrated by its change history:

$ git log --format=ref --date=iso Lib/pprint.py
42d9bec98fd (gh-118761: Improve import time of `pprint` (#122725), 2024-08-07 22:46:54 +0300)
2a3c37c2737 ([pprint]: Add docstring about `PrettyPrinter.underscore_numbers` parameter (#112963), 2023-12-13 15:04:17 +0300)
c5140945c72 (gh-92546: Move pprint benchmark into pyperformance (GH-94613), 2022-07-25 21:30:13 +0300)
087f089e5e0 (bpo-45557: Fix underscore_numbers in pprint.pprint(). (GH-29129), 2021-10-21 16:42:55 -0400)
aab1899c9d7 (bpo-41546: make pprint (like print) not write to stdout when it is None (GH-26810), 2021-07-19 10:19:02 +0100)
11159d2c9d6 (bpo-43080: pprint for dataclass instances (GH-24389), 2021-04-14 00:59:24 +0100)
3ba3d513b1e (bpo-42914: add a pprint underscore_numbers option (GH-24864), 2021-03-24 09:23:20 +0100)
ff420f0e08a (bpo-28850: Fix PrettyPrinter.format overrides ignored for contents of small containers (GH-22120), 2020-11-23 13:31:31 +0000)
582f13786bb (bpo-39994: Fix pprint handling of dict subclasses that override __repr__ (GH-21892), 2020-08-30 18:29:53 +0100)
06a8916cf46 (bpo-37376: pprint support for SimpleNamespace (GH-14318), 2019-06-27 01:13:18 +0200)
96831c7fcf8 (bpo-30670: Add pp function to the pprint module (GH-11769), 2019-03-22 18:22:20 +0100)
8db5b544631 (bpo-35513, unittest: TextTestRunner uses time.perf_counter() (GH-11180), 2018-12-17 11:30:34 +0100)
6a7b3a77b4b (Issue #26778: Fixed "a/an/and" typos in code comment and documentation., 2016-04-17 08:32:47 +0300)
8eb1f077c2b (Issue #18682: Optimized pprint functions for builtin scalar types., 2015-05-16 21:38:05 +0300)
bedbf96e848 (Issue #23870: The pprint module now supports all standard collections except named tuples., 2015-05-12 13:35:48 +0300)
62aa7dc7c9b (Issue #22721: An order of multiline pprint output of set or dict containing orderable and non-orderable elements no longer depends on iteration order of set or dict., 2015-04-06 22:52:44 +0300)
aa4c36fbbb4 (Issue #23775: pprint() of OrderedDict now outputs the same representation as repr()., 2015-03-26 08:51:33 +0200)
f3fa308817a (Issue #23776: Removed asserts from pprint.PrettyPrinter constructor., 2015-03-26 08:43:21 +0200)
87eb482e30e (Issue #23502: The pprint module now supports mapping proxies. In particular the __dict__ attributes of building types., 2015-03-24 19:31:50 +0200)
022f20376a2 (Issue #17530: pprint now wraps long bytes objects and bytearrays., 2015-03-24 19:22:37 +0200)
8e2aa88a40c (Issue #23741: Slightly refactor the pprint module to make it a little more extesible.  No public API is added., 2015-03-24 18:45:23 +0200)
a750ce3325b (Issue #19105: pprint now more efficiently uses free space at the right., 2015-02-14 10:55:19 +0200)
fe3dc376fa7 (Issue #19104: pprint now produces evaluable output for wrapped strings., 2014-12-20 20:57:15 +0200)

The past 10 years have seen small improvements and changes. The same can be said for reprlib:

$ git log --format=ref --date=iso Lib/reprlib.py
04d6dd23e2d (gh-113570: reprlib.repr does not use builtin __repr__ for reshadowed builtins (GH-113577), 2024-10-17 17:34:37 +0100)
f65f9e80fe7 (gh-109818: `reprlib.recursive_repr` copies `__type_params__` (#109819), 2023-09-28 05:26:42 +0300)
4845b9712f2 (gh-107409: set `__wrapped__` attribute in `reprlib.recursive_repr` (#107410), 2023-08-10 11:55:49 +0500)
c06c001b308 (gh-92734: Add indentation feature to reprlib.Repr (GH-92735), 2022-09-08 20:51:44 +0200)
b6558d768f1 (gh-94343: Ease initialization of reprlib.Repr attributes (GH-94581), 2022-07-07 16:55:33 +0200)
8c21941ddaf (bpo-39549: reprlib.Repr uses a “fillvalue” attribute (GH-18343), 2021-09-22 16:45:58 -0400)
a6a4dc816d6 (bpo-31370: Remove support for threads-less builds (#3385), 2017-09-07 18:56:24 +0200)
b3b366d8032 (Issue #26634: recursive_repr() now sets __qualname__ of wrapper. Patch by Xiang Zhang., 2016-04-26 09:30:44 +0300)
a34cd0c781d (Issue #22824:  Simplify reprlib output format for empty arrays, 2014-11-15 10:58:58 -0800)
ffd842e1d67 (Issue #22824:  Updated reprlib output format for sets to use set literals., 2014-11-09 22:30:36 -0800)
0c937b3ed6d (Issue #22031: Reprs now always use hexadecimal format with the "0x" prefix when contain an id in form " at 0x..."., 2014-07-22 12:14:52 +0300)

We should first open a DPO thread before accepting such a feature, IMO. However, anyone can present a PoC so that we can estimate how complex the addition would be and whether we want to keep it in the standard library or not.

@tomasr8
Member

tomasr8 commented Jan 24, 2025

As a counterpoint, pprint can already do indents; this is just about adding newlines here and there, so I think the implementation shouldn't be that complex (famous last words 😅 )

@picnixz
Member

picnixz commented Jan 24, 2025

If it's not that complex, I wouldn't mind skipping the DPO discussion. However, it must really be simple! But now I see value in this because:

>>> pprint.pprint(t, indent=4, width=1, compact=0)
{   'a': 2,
    'b': {   'x': 3,
             'y': {   't1': 4,
                      't2': 5}}}

This one is really not a nice way to read stuff. Maybe we could change the meaning of width or add a balanced=True parameter to make the output balanced. But I would still be interested in what the implementation eventually looks like.

@StefanTodoran
Copy link

I may have been overthinking it, the solution was not that complex. Here is a basic implementation.

Please let me know how I can improve it; this is my first time contributing. Also, I discovered a bug during testing (the fix is not included in my PR): pprint throws an error if you try to print a Counter nested inside another Counter.


7 participants