Skip to content

[FEATURE/DISCUSSION] Representations of TG objects #2491

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
Bibo-Joshi opened this issue Apr 28, 2021 · 13 comments
Closed

[FEATURE/DISCUSSION] Representations of TG objects #2491

Bibo-Joshi opened this issue Apr 28, 2021 · 13 comments
Milestone

Comments

@Bibo-Joshi
Copy link
Member

Bibo-Joshi commented Apr 28, 2021

This came up in church and my comment on it is too lengthy to post a question in dev chat, so here we go. Also this is not urgent in any way …

Current state

Currently str(tg_objec) will give str(tg_object.to_dict()) and repr(tg_object) will give somthing like <TGObject at …>.

The first one can be confusing for newbies as they might thing that tg_object is a dictionary and they will start using subscription instead of dot-notation and so on …

The second one has

  • the downside that we don't know the content of the object
  • the upside that while debugging we can easily differentiate different objects without seeing the whole blob of data. Then again, if we need that, we can alsways call id(tg_object).

Desired behavior

In view of above description, I think that it would be desirable to have a more helpful representation of TG Objects. We should keep in mind that

  • __repr__ goal is to be unambiguous
  • __str__ goal is to be readable
  • str() calls __repr__, if __str__ is not defined

I think in our case "unambiguous" and "readable" largely collide. I.e. a representation of the form ClassName(argname=value, …) is both unambiguous and good to read. So this is what I propose we should do in detail:

Proposal:

  1. Add a method TO.pprint(linebreak: bool=False, indent: int=2). This will return ClassName(argname=value, …). When linebreak is True it will give f'ClassName(\n{indent*" "}argname=value,\n{indent*" "}…)', so making it something like pythons built-in pprint for dicts. I think this should be rather straight forward. only thing that could make trouble is attributes with list-values …
  2. Make TgO.__repr__ use TgO.pprint. Here we could argue, if you want to skip arguments that are not in _id_attrs, as _id_attrs alone can be argumed to be unambiguous. But I think we're better of just including everything
  3. Depending on the decision in 2., we could drop TgO.__str__, which would then fallback to TgO.__repr__

Questions:

  • What do you think of this?
  • Would this change be big enough to consider it breaking? I wouldn't say so, but being safe also doesn't hurt …
@Poolitzer
Copy link
Member

Poolitzer commented May 4, 2021

Then again, if we need that, we can alsways call id(tg_object)

But this just returns a number, thats not helpful at all when I want to know the type of the object. Did you mean type?

proposal

I like that idea, good to read while making clear its not a dict. And we can just let it fall back to repr, I would be fine with that.

breaking

no. its not.

@Bibo-Joshi
Copy link
Member Author

Then again, if we need that, we can alsways call id(tg_object)

But this just returns a number, thats not helpful at all when I want to know the type of the object. Did you mean type?

No, I mean that there are some cases like [object_1, object_2], where both are of the same type and have the same attribute values and you need to know if object_1 is object_2. In that case a representation like <TGObject at …> immediately gives the desired information. This info would be lost by prettyfying the representation, but if needed can be recovered by printing [id(object_1), id(object_2)].

@Poolitzer
Copy link
Member

oh, I see. They should just use an if clause then smh

@Bibo-Joshi
Copy link
Member Author

Bibo-Joshi commented Aug 12, 2021

Just a quick thought: v14 will drop py3.6, which means we can start using dataclasses at some point. dataclasses come with built-in support to print objects as ClassName(argname=value, …). And for py3.10+, the pprint module natively supports dataclasses as well, which would cover line breaks and indents. So we could

  • wait with a custom __repr__ until we use dataclasses
  • wait with a pprint method until we drop support for py3.9
  • do a custom __repr__ manually for now and replace it with the dataclass-builtin method once we use dataclasses
  • do a custom pprint manually for now/don't do it for now and replace it with the pprint module once we drop py3.9
  • do everything manually for ever :)

Haven't thought yet about which options I like best, just putting this here for reference

Edit: py3.9 reaches EOL ~October 2025

@Bibo-Joshi
Copy link
Member Author

Another idea: Instead of waiting for py3.10 to happen, we could just borrow the code from https://github.com/python/cpython/blob/3.10/Lib/pprint.py. See also python/cpython#24389. If we do this, we should double check how this works out with licenses, how we need to achknowledge python etc

@Poolitzer
Copy link
Member

Poolitzer commented Oct 2, 2021

so you want to override pprint in our library? Or how would you borrow it?

Copy link
Member Author

@Poolitzer Not really override, rather use a vendored version instead. But for it's just an idea. Noam also pointed out the pprint36 library. Maybe it's worth asking if they want to update to 3.10

@Poolitzer
Copy link
Member

okay. but vendoring it means the linebreaks and indention for dataclasses will only be shown whent he user uses pprint, and not print (which means one import more). So vendoring pprint or using pprint36 if they update to 3.10 means the dataclass representation will look a bit nicer automatically, if the user uses pprint.

I would argue thats not on us. The user can decide to do that, or to use 3.10. I think prints are often used as fast debug statements, so its not important that it looks perfect, it should just show everything needed.

So I will say that converting the objects to dataclasses is enough. We can add a note in the docs somewhere saying how to make the representation look even better.

@Bibo-Joshi
Copy link
Member Author

I thinks we're talking past each other. The original intention of this issue was to improve the __str__ and __repr__ methods of TelegramObject and maybe add a method TO.prrint (or other naming) that behaves like the built in pprint but doesn't necessarily use the built in pprint.

Now my ideas was that if we convert everything to data classes anyway, we could leverage that for this issue. Vendoring pprint from 3.10 or using an updated version of pprint36 would be solely implementation detail of TO.pprint/__str__/__repr__.

Converting to dataclass ofc has the positive side effect that in py 3.10+ the built in pprint works well on TO. But the main point of this issue is to provide helpful string representations of TO independent of how TO is implemented

@Poolitzer
Copy link
Member

Okay. But the converting of all objects to data classes already improves the representation, no?

@Bibo-Joshi
Copy link
Member Author

Right, I didn't put that correctly. Still, how exactly we get __repr__ to do what we want should be considered implementation detail IMO. If dataclasses do that for use, that's convenient, but if we decide against dataclasses or drop them at some point, we should still make sure that str(tg_object) gives something that is useful.

Having a method to.pprint as shortcut for py310.pprint(to) is still a nice touch IMO - at least until py3.9 reaches EOL.

@Poolitzer
Copy link
Member

alright, I can agree to that. I see that you switched stages, so nothing imminent anyway, and we can discuss the implementation detail once we get to the stage

@harshil21 harshil21 added this to the v14 milestone Oct 8, 2021
@harshil21 harshil21 modified the milestones: v20, v20.0a1 May 12, 2022
@Bibo-Joshi Bibo-Joshi modified the milestones: v20.0a1, v20.0 May 21, 2022
@Bibo-Joshi
Copy link
Member Author

Closed by #3234

@github-actions github-actions bot locked and limited conversation to collaborators Nov 7, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants