Skip to content

Refactor how InputMedia* subclasses handle inits #2573

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
Bibo-Joshi opened this issue Jun 25, 2021 · 36 comments · Fixed by #2717
Closed

Refactor how InputMedia* subclasses handle inits #2573

Bibo-Joshi opened this issue Jun 25, 2021 · 36 comments · Fixed by #2717
Assignees
Labels
ℹ️ good-first-issue information: good-first-issue 🛠 refactor change type: refactor
Milestone

Comments

@Bibo-Joshi
Copy link
Member

Currently all the subclasses handle the init completely on their own, while it would make more sense to move all attributes shared between the subclasses to InputMedia, i.e. call super().__init__ like we do for InlineQueryResult*. This includes the attributes caption_entities, parse_mode, type, media & caption

See https://t.me/c/1494805131/16382 + replies for discussion.

@Bibo-Joshi Bibo-Joshi added the 🛠 refactor change type: refactor label Jun 25, 2021
@Bibo-Joshi Bibo-Joshi added this to the v14 milestone Jun 25, 2021
@Bibo-Joshi Bibo-Joshi added ℹ️ good-first-issue information: good-first-issue hacktoberfest labels Sep 17, 2021
@eldbud
Copy link
Contributor

eldbud commented Sep 28, 2021

i'm on it

@eldbud
Copy link
Contributor

eldbud commented Sep 28, 2021

question regrading caption_entities - if i understand correctly, the InputMedia.caption_entities class variable is never actually used, because each subclass instantiates an instance variable under the same name, which is the one that is actually being accessed, so i believe InputMedia.caption_entities is in fact redundant. am i correct, or am i missing some usage of this base class variable?

@Bibo-Joshi
Copy link
Member Author

IIRC we added caption_entities to InputMedia so that InputMedia.to_dict would be guaranteed to work. But yes, this should be refactored to an instance variable that's set in InputMedia.__init__.

@eldbud
Copy link
Contributor

eldbud commented Sep 28, 2021

Why are the Video/Animation/Audio/Document/PhotoSize classes not inheritors of the File class?
If there is a reason, then shouldn't they still all be inheriting from a common base class? they all have many identical attributes as well.
In fact, when looking at each of the classes' de_json method, it seems that they were all copy-pasted from the PhotoSize class, because they all call PhotoSize.de_json, which provides a strong evidence to my above claim.

So i suggest to extend the scope of this task to creating a base class, or using the File class combined with some kind of
MediaFile TypeVar, to further narrow the expected types, which in turn will reflect in a much clearer and more concise type-hinting and documentation in the InputMedia class (and of course to provide a more OOP oriented inheritance hierarchy).

Copy link
Member

@eldbud A File is what's returned when get_File is called on Animation/Audio etc. This is why they are seperated. I guess one could make them inherite from a BaseClass though.

Copy link
Member Author

@Poolitzer Mh, I see how a common base class for all objects that have file_id & file_unique_id as well as optional file_size & thumb would save some code duplication - one could move the de_json and the get_file methods to the base class.

But IMO this class should be private and be considered an implementation detail, because
a) there is no justification for it from the official TG docs (in contrast to e.g. the InputMedia or ChatMember classes, which are documented)
b) I don't see how it would be of any real use for the user. E.g. if you want a base class for something that's "downloadable", this theoretic base class doesn't cover ChatPhoto. If you want a base class for a "message attachment", this doesn't cover Contact, Location, Venue or List[PhotoSize].

So if we want to do this, the docs of Document/Video(Note)/PhotoSize/Sticker/... should
a) not list the private base class as base class
b) explicitly show the docs for get_file (and probably even de_json) as they do now.

@Poolitzer can you agree with my thoughts or do you have a different opinion on this?

@eldbud
Copy link
Contributor

eldbud commented Sep 29, 2021

@Bibo-Joshi This is pretty much the EXACT implementation i have been working on since @Poolitzer 's comment.
There's a _medium.py module which contains the implementation of the new Medium class.
Of course not part of the API, merely a design improvement.

Medium's attributes do not include thumb because thumb is a PhotoSize instance, which itself will be a class deriving from Medium. If PhotoSize itself were to have a thumb attribute, which it does not, i could implement the Medium using the composition design pattern. But i feel this is somewhat of an overkill, allowing an endless chain of nested thumbnails 🙃

@Bibo-Joshi
Copy link
Member Author

Right, I forgot about PhotoSize not having a thumb. Still, it should be possible to move de_json to the base class by including a check a lá if 'thumb' in data:.
You'll have to change how the classes in sticker.py override de_json, then. Actually, if you do that anyway, you can just remove MaskPosition.de_json, as it doesn't do anything that TelegramObject.de_json does not do already.

Regarding the namings: I think the class name should be private rather then the module. Also IMO naming it e.g. _BaseMedium or _MediumObject makes it more clear that it's supposed to be used as base class for something …

@Poolitzer
Copy link
Member

Im good with that implementation jup

@eldbud
Copy link
Contributor

eldbud commented Sep 29, 2021

regarding module/class as protected - i made the module protected, because that way already on import it will raise a warning the a protected module is being imported.
By naming the class as protected , only when importing that class from the module the warning will be raised, which normally happens later while coding. just my two cents.

@Bibo-Joshi
Copy link
Member Author

I see your point. However, none of the existing modules are marked private even though the module structure is not considered to be part of the public API. If we want to change that, I suggest to do that coordinated in a single PR. For now, I'd vote to stick to the existing convention :)

@Bibo-Joshi
Copy link
Member Author

FYI: I've just openend #2680 ;)

@eldbud
Copy link
Contributor

eldbud commented Sep 30, 2021

oh i got my eyes on it alright. if no one will take it by then, i may give it a try after completing the current task.
Glad you liked the protected module approach!

@eldbud
Copy link
Contributor

eldbud commented Oct 1, 2021

question regarding usage of parse_file_input() in the inputfile module classes:

I see that in some of the subclasses the tg_type is provided, and in others it is not. For example in InputMediaPhoto and InputMediaDocument it provides the PhotoSize and Document class correspondingly, but in InputMediaAnimation InputMediaAudio and InputMediaVideo no tg_type is provided.
Is this intentional? if so, why so?

@Bibo-Joshi
Copy link
Member Author

IISC in the cases where it's not provided, this is already handled in another if clause, e.g. for InputMediaAnimation the first if-clause handles the case of Animation-type input:

if isinstance(media, Animation):
self.media: Union[str, InputFile] = media.file_id
self.width = media.width
self.height = media.height
self.duration = media.duration
else:
self.media = parse_file_input(media, attach=True, filename=filename)

the point is here that we can extract additional information from the TG-Type-input.

@eldbud
Copy link
Contributor

eldbud commented Oct 5, 2021

so as i mentioned in the Dataclasses discussion, i tried using mixin classes to refrain from repeating width height duration thumb file_name mime_type title over and over.
Planned on showing it to you after running tests.
I ran tests.
I assume you will not be surprised to hear it did not fare well.
So i will keep the boilerplate code, so it seems.

@Bibo-Joshi
Copy link
Member Author

Okay, I understand the idea 👍
There are some ways of making slots working with multiple inheritance (this SO answer is a very nice write up). But my gut tells that implementing workarounds for this probably is more complicated than adding a few lines of boilerplate :D If we go for dataclasses, this would even reduce the boiler plate lines.
I suggest to stick to boiler plate for now and keep the idea in mind :) It's nice to see how you are digging into the library, though! 😃

@eldbud
Copy link
Contributor

eldbud commented Oct 5, 2021

yep, boilerplate it will remain.
I did start investigating about how maybe i will after all be able to incorporate multiple inheritance with slots; and then i found the SO link you shared, which lead to the understanding that incorporating this workaround will be too big of a weapon for the needs at hand.
And thank you for the compliment 🙂 just a part of trying to comprehend the realm i am in and how i can help to improve it.

@Poolitzer
Copy link
Member

Since we are giving out compliments here, let me agree to that one: I noticed the PRs from you, well done. Its a great feeling if you get a new face in an OS project because they want to contribute. Thank you very much.

@eldbud
Copy link
Contributor

eldbud commented Oct 6, 2021

wow thank you very much @Poolitzer! We all benefit from better apps, and my added benefit is that i get to practice and learn a lot in the process of contributing, for which i get to thank everyone maintaining here 😃

@eldbud
Copy link
Contributor

eldbud commented Oct 6, 2021

going back to the subject, i'm still having some __slots__ issues:
When calling the de_json method of the _BaseMedium class, for example when initializing an Audio object, i am getting this error, upon the return cls(bot=bot, **data) line:
TypeError: __init__() missing 1 required positional argument: 'duration'

I checked, and the data dictionary does include the duration key. So as i understand it, the reason this exception arises, is that _BaseMedium.__init__() order of arguments is different from that of Audio. I am unable to change the order, because the duration argument is mandatory and has no default value, whereas arguments which go to _BaseMedium's __init__() (e.g., file_size) and should supposedly come before the subclass's attributes, is initialized with None, and thus cannot be moved to be before the subclass's attributes (such as the duration example i brought here.

I hope the articulation above is not too confusing...

Any suggestions as to how to approach this?

@Bibo-Joshi
Copy link
Member Author

I'm not sure if I can follow. Why do you want to call _BaseMedium.de_json? shouldn't you rather call Audio.de_json, which calls Audio.__init__(**data, bot=bot)? In Audio.__init__, you can call super().__init__(file_id=file_id, …) and by using kwargs the order shouldn't matter …

@eldbud
Copy link
Contributor

eldbud commented Oct 6, 2021

i'm not calling either, de_json calls cls(bot, **data).
But yes, i was missing the naming of the arguments in super().__init__(). i believe that should fix it. thanks!

@Bibo-Joshi
Copy link
Member Author

I'm a bit confused on which class this method is in and where it's called from. Maybe you can publish the changes that you have so far?

@eldbud
Copy link
Contributor

eldbud commented Oct 6, 2021

nevermind joshi, i found my error, just deleted my last comment after realizing my mistake... sorry for the hassle, i'm on it.

@eldbud
Copy link
Contributor

eldbud commented Oct 6, 2021

@Bibo-Joshi when i try to commit i am getting this message:

telegram\bot.py:231: error: unused 'type: ignore' comment
telegram\bot.py:232: error: unused 'type: ignore' comment

which is not code i wrote, not even a module i touched. not sure what to do about this or how to commit, any idea?
btw, when i remove this comment, the error persists!

@Bibo-Joshi
Copy link
Member Author

Mh, the author of #2700 has a similar situation … You can commit with git commit … --no-verify to skip the pre-commit step. THen we'll have a look at the automated tests and see how they behave :)

@eldbud
Copy link
Contributor

eldbud commented Oct 7, 2021

getting this:

git commit . --no-verify
error: 'telegram/vendor/ptb_urllib3' does not have a commit checked out
fatal: updating files failed

@Bibo-Joshi
Copy link
Member Author

Sounds like something went wrong with that submodule … Try running $ git submodule update --init --recursive. If you keep having problems, feel free to ping me on TG (@BiboJoshi)

@eldbud
Copy link
Contributor

eldbud commented Oct 7, 2021

yeah still same issue, i'll try to reach you later on on TG

eldbud added a commit to eldbud/python-telegram-bot that referenced this issue Oct 7, 2021
eldbud added a commit to eldbud/python-telegram-bot that referenced this issue Oct 7, 2021
@eldbud
Copy link
Contributor

eldbud commented Oct 7, 2021

could use a bit of help, i am getting failure in all test_get_file_instance_method tests of the various media types.
This is the error stdout:

 def check_shortcut_signature(
        shortcut: Callable,
        bot_method: Callable,
        shortcut_kwargs: List[str],
        additional_kwargs: List[str],
    ) -> bool:
        """
        Checks that the signature of a shortcut matches the signature of the underlying bot method.

        Args:
            shortcut: The shortcut, e.g. :meth:`telegram.Message.reply_text`
            bot_method: The bot method, e.g. :meth:`telegram.Bot.send_message`
            shortcut_kwargs: The kwargs passed by the shortcut directly, e.g. ``chat_id``
            additional_kwargs: Additional kwargs of the shortcut that the bot method doesn't have, e.g.
                ``quote``.

        Returns:
            :obj:`bool`: Whether or not the signature matches.
        """
        shortcut_sig = inspect.signature(shortcut)
        effective_shortcut_args = set(shortcut_sig.parameters.keys()).difference(additional_kwargs)
        effective_shortcut_args.discard('self')

        bot_sig = inspect.signature(bot_method)
        expected_args = set(bot_sig.parameters.keys()).difference(shortcut_kwargs)
        expected_args.discard('self')

        args_check = expected_args == effective_shortcut_args
        if not args_check:
            raise Exception(f'Expected arguments {expected_args}, got {effective_shortcut_args}')

        # TODO: Also check annotation of return type. Would currently be a hassle b/c typing doesn't
        # resolve `ForwardRef('Type')` to `Type`. For now we rely on MyPy, which probably allows the
        # shortcuts to return more specific types than the bot method, but it's only annotations after
        # all
        for kwarg in effective_shortcut_args:
            if bot_sig.parameters[kwarg].annotation != shortcut_sig.parameters[kwarg].annotation:
                if isinstance(bot_sig.parameters[kwarg].annotation, type):
                    if bot_sig.parameters[kwarg].annotation.__name__ != str(
                        shortcut_sig.parameters[kwarg].annotation
                    ):
                        raise Exception(
                            f'For argument {kwarg} I expected {bot_sig.parameters[kwarg].annotation}, '
                            f'but got {shortcut_sig.parameters[kwarg].annotation}'
                        )
                else:
>                   raise Exception(
                        f'For argument {kwarg} I expected {bot_sig.parameters[kwarg].annotation}, but '
                        f'got {shortcut_sig.parameters[kwarg].annotation}'
                    )
E                   Exception: For argument api_kwargs I expected typing.Dict[str, typing.Any], but got JSONDict

@Bibo-Joshi
Copy link
Member Author

This tells you that one of the shortcut method has a signature that looks like

def get_file(…, api_kwargs: JSONDict, …):

while it should look like

def get_file(…, api_kwargs: Dict[str, Any], …):

This methods makes sure that the arguments of shortcut methods have the same type hints as the underlying bot methods (in this case Bot.get_file)

@eldbud
Copy link
Contributor

eldbud commented Oct 7, 2021

ok, this is strange, because i did not change the signature of get_file. When i check in the v14 branch, same signature of get_file as in my branch, i.e

def get_file(self, timeout: ODVInput[float] = DEFAULT_NONE, api_kwargs: JSONDict = None) -> File:

what could be the reason for this error then?
fyi, get_file now exists only in the base class, and all subclasses use the base class's method. could that be connected to the error?
also, isn't JsonDict in fact a Dict[str, Any]?

@Bibo-Joshi
Copy link
Member Author

Mh, maybe it was a bit late for me yesterday 😬 You're right, api_kwargs should be annotated as JSONDict and JSONDict is just an alias for Dict[str, Any].
Then I'm not sure what's happening here and I would probably need to see the changes & run them myself to dig into it. I'd recommend to finish the changes up to this problem and PR. If the failing tests annoy you, you can decorate them with @pytest.xfail for now

@eldbud
Copy link
Contributor

eldbud commented Oct 8, 2021

if it helps in any way, this is the failing line in check_shortcut_signature:
if isinstance(bot_sig.parameters[kwarg].annotation, type): (conftest.py, line 445).
And upon checking what is the type of bot_sig.parameters[kwarg].annotation - it is <class 'typing._UnionGenericAlias'>
I think i'll submit a pr with the failing tests, maybe you'd be able to figure out what i broke...

@eldbud
Copy link
Contributor

eldbud commented Oct 11, 2021

how can i link this issue to the pr? there is "closes #2573" text in the pr description, but the link is not made

also now added commit with message as per instructions, but does not link.

eldbud added a commit to eldbud/python-telegram-bot that referenced this issue Oct 11, 2021
eldbud added a commit to eldbud/python-telegram-bot that referenced this issue Oct 11, 2021
@Bibo-Joshi Bibo-Joshi linked a pull request Oct 12, 2021 that will close this issue
@github-actions github-actions bot locked and limited conversation to collaborators Oct 16, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
ℹ️ good-first-issue information: good-first-issue 🛠 refactor change type: refactor
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants