Skip to content

New Mechanism for Avoiding Flood Limits #3018

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
Bibo-Joshi opened this issue May 9, 2022 · 4 comments · Fixed by #3148
Closed

New Mechanism for Avoiding Flood Limits #3018

Bibo-Joshi opened this issue May 9, 2022 · 4 comments · Fixed by #3148
Milestone

Comments

@Bibo-Joshi
Copy link
Member

supercedes & closes #2139

Below I write down the different thoughts that I had on how we can introduce a new rate limiting mechanism. Pretty much everything is up for discussion. What do you want to have designed differently? What did I miss?

Feature/Mechanism-Wishlist

Integration into PTB

One of the major shortcomings of MessageQueue was that it was not properly integrated with the framework. IMO that's one of the main points that should be considered. I image that the setup be such that something like

ExtBot(…, rate_limiter=RateLimiter(…))

or

Application.bulder().token('token').rate_limiter(RateLimiter(…)).build()

suffices. That is, ExtBot would have a (protected?) attribute holding the rate limiter and all requests would automatically go through that.
Note that I wrote ExtBot - this functionality should be purely for telegram.ext and not in telegram.

getUpdates should not get passed through the rate limiter.

Abstract Interface

I'd like to make this a pluggable interface similar to BaseRequest. that way we give users a chance to implement custom, elaborate logic on their own. Such an interface class will probably just have once method - maybe process_request or so that Bot will call.

However, I'm somewhat unsure what this method should do/return. I see two options:

  1. Return the actual return value of the api request. In this case, ExtBot would probably do something like await self._rate_limiter.process_request(self._post(…)) and in process_update one could do somethling like async with self._semaphore: await coroutine
  2. Alternatively, procless_update could return an async context manager, such that ExtBot would do async with self._rate_limiter.process_request(…): await self._post(…)

I guess the second one might be a bit harder for python-novices to implement. Then again, doing that is a rather advanced task anyway.
The first one may also allow for a bit more flexibility in the implementations. OTOH, the context manager approach separates the Bot from the RateLimiter a bit more clearly, which I potentially like.

Open question: Retries

One thing that one can think of is whether the rate limiter should be allowed/expected to retry a request in case a RetryAfter exception is raised. My gut feeling is that it shouldn't, as it violates the chain of command in the PTB framework. But I'm not completely sure.
In case we want to allow for retries, we'd have to provide way for that to be possible, e.g. by passing a tuple (method, kwargs) such that the rate limiter can call method(**kwargs) as often as needed.

Per-Request Decisions/Priorities

IMO it's important to try and allow for requests to be processed differently depending on

  • the endpoint (answerInlineQuery, sendMessage, etc)
  • the parameters passed to that endpoint
  • maybe even additional arguments that can be passed on a per-call basis.

This would allow to e.g. give answering inline queries a higher priority than sending messages and giving batch-notifications a lower priority than other messages (there even was a Discussions entry requesting something like this at some point, but I can't find it now)

So a signature of process_request could look like

async def process_request(self, (coroutine ,)endpoint: str, data: Dict[str, Any], rate_limit_kwargs: Dict[str, Any]):
    ...

Here IMO the data should not be the RequestData that we pass to BaseRequest but basically the plain parameters that are passed to the bot method (after inserting DefaultValues.

Naming

Maybe BaseRateLimiter?

Possible built-in implementation

We could provide a basic implementation out of the box. For that we could use a 3rd party dependency like aiolimiter or pyrate-limiter. Both are published under the MIT license, so that shouldn't be a problem. aiolimiter has only one stable release but to me it looks bit bit cleaner, with a small, well defined feature set - though that may just be a first impression.
I imagine that such an implementation could be relatively close to what MessageQueue formerly did, i.e. enforce a global delay and a per-group/chat delay. One could think of providing that in a priority-queue style fashion such that user can specify a priority though the rate_limit_kwargs.

Implementation

The main task of calling process_request in the right spot should not be too difficult. The main issue that I see is to keep the logic away from tg.Bot. For that I have a few ideas:

Part 1

For passing endpoint and data it should suffice to split up tg.Bot._post a bit so that ExtBot has a chance to link into the spot between inserting DefaultValues and calling Bot.request._post.

Part 2

For passing rate_limit_kwargs, that's a bit more tricky. We'd have to add a new parameter to all methods of ExtBot, but we don't want to add them to Bot.

One straightforward option would be to split up Bot.send_message into Bot._send_message and Bot.send_message, where only the protected one has the additional parameter. Bot.send_message would then just pass everything along to _send_message and ExtBot.send_message would call Bot._send_message directly. This would result in many additional loc in tg.Bot. Additionally, this doesn't play well with shortcut methods.

The other idea that is to piggyback on api_kwargs: We add a parameter rate_limit_kwargs to ExtBot.send_message, but call super().send_message(…, api_kwargs=api_kwargs |= {secret_key: rate_limit_kwargs). Then before passing everything to the rate limiter, we extract them as rate_limit_kwargs = api_kwargs.pop(secret_key, {}). This looks like a dirty hack at first glance, but the longer I think about it, the more I actually like it:

  1. api_kwargs doesn't interfere with any of the other parameters, so it's rather safe to use for that
  2. keeps Bot and ExtBot apart without introducing endless boilerplate
  3. allows the shortcut methods to also have a paremeter rate_limit_kwargs - we can just throw an error if they are used with a Bot instead of an ExtBot
@Bibo-Joshi Bibo-Joshi added this to the v20.0a1 milestone May 9, 2022
@Bibo-Joshi
Copy link
Member Author

@harshil21 and I just talked about this a bit. let me try to summarize the key results, so that others can agree or disagree with them :D

  • naming (Base)RateLimiter is okay
  • rate_limit_kwargs via the piggiback-on-api_kwargs sounds okay, but still introduces quite an amout of boilerplate to ExtBot b/c we have to override every api method from tg.Bot. We could skip this in a first version and see how great the demand for a feature like this actually is
  • for a built-in implementation: aiolimiter apparently has some issues with httpx, so that may not be the best choice (maybe harshil can provide a link to the corresponding issue)
  • regarding retrying on RetryAfter: We can provide the necessary means for that but skip it in a built-in implementation + document that depending on the use case it might not be the best design choice. If providing the means for retrying is hard (can coroutines be re-build or similar?), we can skip this feature, or postpone it until users actually request it.
  • for the return value of process_request there was no clear consensus, but a tendencey that context managers will probably be harder to handle for more inticate logic like assigning priorities to the requests or retrying on RetryAfter

@harshil21
Copy link
Member

provide a link to the corresponding issue

mjpieters/aiolimiter#73 a big discussion, haven't yet seen what parts could apply to us too.

can coroutines be re-build

found the inspect function which helps with that - getcoroutinelocals

so its something like:

>>> async def x(a, b, c):
...     this = 2
...     return this
...
>>> f = x(1, 2, 3)
>>> inspect.getcoroutinelocals(f)
{'a': 1, 'b': 2, 'c': 3}

@Bibo-Joshi
Copy link
Member Author

found the inspect function which helps with that - getcoroutinelocals

so its something like:

ah, nice 👍 it doesn't give you access to x though, right?

@harshil21
Copy link
Member

ah, nice 👍 it doesn't give you access to x though, right?

nope

@Bibo-Joshi Bibo-Joshi modified the milestones: v20.0a1, v20.0a2 Jun 8, 2022
@github-actions github-actions bot locked and limited conversation to collaborators Sep 2, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants