Possible race condition in throttling #5181

Closed
5 of 6 tasks
bitcity opened this issue May 26, 2017 · 15 comments

Comments

@bitcity

bitcity commented May 26, 2017

The throttle_success method of SimpleRateThrottle updates the cache with a new value of history. Under concurrent requests this is a race condition: one request's update to history can be overwritten by another request writing back stale data.
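
For context, the logic in question is roughly the following read-modify-write sequence (a simplified paraphrase of SimpleRateThrottle, not the verbatim DRF source). Two workers can both perform the read before either performs the write, and whichever writes last silently discards the other's timestamp:

```python
# Simplified paraphrase of SimpleRateThrottle (not verbatim DRF source).
# The race: two workers can both run step 1 before either runs step 3,
# so one worker's timestamp is lost when the other writes its stale list back.

def allow_request(self, request, view):
    self.history = self.cache.get(self.key, [])   # 1. read history from cache
    self.now = self.timer()
    while self.history and self.history[-1] <= self.now - self.duration:
        self.history.pop()                         # drop expired entries
    if len(self.history) >= self.num_requests:
        return self.throttle_failure()
    return self.throttle_success()

def throttle_success(self):
    self.history.insert(0, self.now)                        # 2. modify in memory
    self.cache.set(self.key, self.history, self.duration)   # 3. write back (may clobber)
    return True
```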

Checklist

  • I have verified that the issue exists against the master branch of Django REST framework.
  • I have searched for similar issues in both open and closed tickets and cannot find a duplicate.
  • This is not a usage question. (Those should be directed to the discussion group instead.)
  • This cannot be dealt with as a third party library. (We prefer new functionality to be in the form of third party libraries where possible.)
  • I have reduced the issue to the simplest possible case.
  • I have included a failing test as a pull request. (If you are unable to do so we can still accept the issue.)

Steps to reproduce

  • Able to reproduce the issue with LocMemCache and memcached
  • Enable throttling on a simple view and set the rate to, say, 500/hour (see the settings sketch after this list)
  • Send concurrent requests with ab, e.g.
    Setup 1: ab -l -c 20 -n 500 http://127.0.0.1:8000
    Setup 2: ab -l -c 1 -n 500 http://127.0.0.1:8000
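
For reference, the configuration used for the reproduction might look like the following sketch. The throttle class and scope shown here are assumptions; only the 500/hour rate comes from the report:

```python
# settings.py – hypothetical throttle configuration for the reproduction
REST_FRAMEWORK = {
    'DEFAULT_THROTTLE_CLASSES': [
        'rest_framework.throttling.AnonRateThrottle',
    ],
    'DEFAULT_THROTTLE_RATES': {
        'anon': '500/hour',
    },
}
```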

Expected behavior

  • In both setups (1 and 2), any further request should be throttled, because ab has already sent the full 500 requests (once with 20 concurrent connections, once serially).

Actual behavior

  • Setup 1: The view is not throttled yet (roughly another 20–50 requests can still be served; the exact number varies each run).
  • Setup 2: The view is throttled for any subsequent request. With a concurrency of 1, the history is recorded correctly in the cache.

It appears as though some concurrent requests are being ignored by the cache.

@tomchristie
Member

Okay, I'd consider fixes for this on their own merit, but I'm not going to consider it problematic otherwise, given the transient nature of throttling.

@bitcity
Author

bitcity commented May 26, 2017

At the moment I haven't been able to come up with a quick fix. However, I feel this might be a good place for people to discuss and propose possible solutions. Wouldn't keeping it open facilitate that?

@xordoquy
Collaborator

We'll redirect people willing to work on it here.
We tend to close issues like this because otherwise they usually stay open forever and only add more noise and more work to our maintenance.
Given that you are the first to mention it, and given our experience with the issue tracker, it's unfortunately unlikely that others will work on it.

@jturmel

jturmel commented Jun 21, 2017

@bitcity you're probably going to have to fix this by writing your own throttle using a cache backend that allows you to do atomic list commands (push/append); something like Redis would probably work.
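
A rough sketch of that idea, assuming redis-py and a local Redis instance (the class name, key scheme, and limits are illustrative, not an existing DRF or Redis API):

```python
import time
import uuid

import redis
from rest_framework.throttling import BaseThrottle

r = redis.Redis()  # assumed local Redis connection


class RedisSlidingWindowThrottle(BaseThrottle):
    """Sliding-window throttle backed by a Redis sorted set."""

    num_requests = 500
    duration = 3600  # seconds

    def allow_request(self, request, view):
        key = 'throttle:%s' % self.get_ident(request)
        now = time.time()
        member = '%f:%s' % (now, uuid.uuid4().hex)  # unique member per request
        pipe = r.pipeline()  # MULTI/EXEC: the four commands apply atomically
        pipe.zremrangebyscore(key, 0, now - self.duration)  # drop expired entries
        pipe.zadd(key, {member: now})                       # record this request
        pipe.zcard(key)                                      # count the window
        pipe.expire(key, self.duration)
        _, _, count, _ = pipe.execute()
        return count <= self.num_requests
```

Note that this records a request before counting it, so rejected requests still add entries to the window; that's an acceptable simplification for a sketch, but another reason a fully general fix is awkward.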

@bitcity
Author

bitcity commented Jun 21, 2017

@jturmel Thanks for the tip. I ended up writing a custom throttle class that takes care of the race condition (I use the atomic incr operation and keep counters instead of timestamps). The solution works for my use case, but unfortunately it's not generic enough to be used as a patch.
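
The counter-based approach described above might look roughly like this (my reconstruction, not the actual code from that project; the fixed-window bucketing is an assumption):

```python
import time

from django.core.cache import cache
from rest_framework.throttling import BaseThrottle


class CounterThrottle(BaseThrottle):
    """Fixed-window throttle using an atomic counter instead of a history list."""

    num_requests = 500
    duration = 3600  # seconds

    def allow_request(self, request, view):
        window = int(time.time() // self.duration)
        key = 'counter-throttle:%s:%s' % (self.get_ident(request), window)
        # add() only sets the key if it does not exist yet; incr() is atomic on
        # backends such as memcached and Redis, so concurrent requests cannot
        # overwrite each other the way the shared history list can.
        cache.add(key, 0, timeout=self.duration)
        count = cache.incr(key)
        return count <= self.num_requests
```

The trade-off is a fixed window rather than DRF's sliding history, which is presumably why it works for one use case but isn't generic enough to land as a patch.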

@MattBlack85

We've seen this problem over the last 3 days. After a quick look at the throttling code I came to the same conclusions already discussed here and got to the root cause, and I'll probably implement my own throttling using a Redis distributed lock. One question: I'd like to contribute to this, but I'm not sure there is a "common" way of solving the problem, given that we're going through the cache abstraction layer and every backend would need its own specific method to get around the issue. Maybe an additional call to the cache before inserting the data could mitigate it a bit, but that doesn't sound like a proper fix. Does anybody have ideas on the best way to fix this?
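
One possible shape for the lock-based mitigation mentioned above, assuming redis-py and wrapping DRF's existing UserRateThrottle (the lock key and timeout are arbitrary; this is a sketch, not a proposed fix):

```python
import redis
from rest_framework.throttling import UserRateThrottle

r = redis.Redis()  # assumed local Redis connection


class LockedRateThrottle(UserRateThrottle):
    """Serialize the read-modify-write per throttle key with a Redis lock."""

    def allow_request(self, request, view):
        key = self.get_cache_key(request, view)
        if key is None:
            return True  # no cache key means no throttling, as in DRF itself
        # Only one worker at a time may run the get()/set() sequence for this
        # key, so concurrent requests can no longer clobber each other's history.
        with r.lock('throttle-lock:%s' % key, timeout=2):
            return super().allow_request(request, view)
```

The obvious cost is an extra round trip plus contention on hot keys, and it only helps when a Redis-style lock is available, which is part of why there's no obvious backend-agnostic fix.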

@dmitry-mukhin
Contributor

@tomchristie this is ridiculous.
Nobody will want to submit any more issues with the approach of
"if there's no PR, this is not an issue, and it doesn't happen often on my projects anyway".

@xordoquy
Collaborator

@dmitry-mukhin there's no point in being aggressive.
Besides, we didn't say this was not an issue.
We said we'd consider a PR about it. This is a community project. Core team members contribute on a regular basis, but they also manage a hopefully growing stream of community contributions.
If this is a serious issue, take some time and write a PR. You can also make it a third-party package, so anyone is free to drive it on their own.

@dmitry-mukhin
Contributor

dmitry-mukhin commented Feb 12, 2018

@xordoquy
I understand all of this except the part where closing an issue counts as acknowledging it rather than sweeping it under the rug.
It seems like keeping the issue count low for the sake of the stats.

Community member @bitcity did a great job of filing and describing the issue.
And all he got was "we won't consider this, please do more work".

If the issue (which is real, by the way) stays open, somebody in the community can pick it up from here and prepare a PR.
If the issue is closed, the project misleads the community:

  • there are no race condition issues in throttling
  • anybody who stumbles on it again will most likely have to file it again
  • there is no point in submitting issues if you don't have a solution for them

I'm sorry if I come across as aggressive. I'm trying to point out that this will decrease community contributions and/or waste other people's time doing duplicate work.

@rpkilby
Member

rpkilby commented Feb 12, 2018

I've added the "help wanted" label to this issue. This is useful for tracking issues that are currently out of scope for the maintainers, but for which contributions would be welcome.

@dmitry-mukhin
Contributor

dmitry-mukhin commented Feb 12, 2018

@rpkilby I believe that labels on closed issues are ignored by the vast majority of GitHub users ¯\_(ツ)_/¯

@rpkilby
Member

rpkilby commented Feb 12, 2018

Agreed, we should probably add a section to the contributing docs about the label.

@xordoquy
Collaborator

We already have about 100 open issues, so anyone interested in contributing already has plenty of choice.
This issue isn't easy; otherwise someone would already have submitted a PR.
Don't forget that http://www.commitstrip.com/en/2014/05/07/the-truth-behind-open-source-apps/ is pretty accurate.

@tomchristie
Member

Honestly, I don't especially consider it an outstanding, addressable bug. Application-level throttling will only get you so far, and throttle rates that are slightly off are just a constraint of the system. Fuzziness in distributed systems can be okay in some cases. We'll point at third-party packages if anyone wants to implement something stricter.

@MattBlack85

I partially agree with that, but "slightly off" really depends on the network, which is by nature unreliable. Nothing prevents the first request of a series from being the last one to update the cache in the current implementation, simply because of the network itself. I agree, though, that a proper fix is beyond the scope of the framework, but we could add a line to the docs describing the limits of the standard implementation.

akx added a commit to akx/django-rest-framework that referenced this issue Sep 26, 2019
akx added a commit to akx/django-rest-framework that referenced this issue Oct 1, 2019
adamchainz added a commit that referenced this issue Apr 24, 2022
Refs #5181

Co-authored-by: Adam Johnson <me@adamj.eu>
sigvef pushed a commit to sigvef/django-rest-framework that referenced this issue Dec 3, 2022
Refs encode#5181

Co-authored-by: Adam Johnson <me@adamj.eu>