Skip to content

fix usage analytics to properly register and not count if disabled #11848

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Nov 18, 2024

Conversation

bentsku
Copy link
Contributor

@bentsku bentsku commented Nov 14, 2024

Motivation

While looking to implement the usage analytics for the API Gateway service, I've noticed three things:

  • we were not sending usage analytics anymore since remove legacy runtime in localstack/service/infra.py #11202, as we removed the code to send it, but didn't register the hook back
  • the usage counters were still counting even though analytics would be disabled, and would only skip sending the data over
  • the usage counters are using lists that are growing unbounded, possibly using large amount of memory for very long lasting sessions (we're aware of 20+ hours sessions)

This PR mainly does 2 things:

  • re-register the hook so that we send analytics over
  • it does not count if analytics is disabled
  • refactor the counter a bit to avoid growing a list, instead keeping the "set" entries and counting with itertools.count which is thread safe in CPython

I don't really like the boolean check for every call to increment and such, we could probably refactor our counters to be no-op classes if the analytics are disabled? But this at least prevents the memory leaks for users with DISABLE_EVENTS=1 as a quick fix before v4, while re-enabling sending the analytics over.

As a sanity check, I've run some small benchmarks with timeit:
Setting 1 000 000 values in the state:
Performance:

  • 0.236s for the new counter
  • 0.199s for the old counter with append

Memory:

  • 384 bytes for the new counter
  • 8448728 bytes for the old counter with append

Changes

  • register aggregate_and_send with @hooks.on_infra_shutdown(should_load=lambda: not config.DISABLE_EVENTS)
  • don't count if DISABLE_EVENTS is true
  • small refactor of the UsageSetCounter
  • small change: added the count aggregation method
  • fix the median calculation, as it would have been incorrect without sorting the list first

Follow-up

I'm following with #11854 which might contain more controversial changes, related to our counters. I believe the UsageCounter could be simplified into a really simple counter for less complex use case with no aggregation (like lambda:hotreload), and rename the current counter into a somewhat "timing" related counter with aggregation for more complex use cases.

Multithreading testing:

With this small test, we would always end up with the right amount of counting, just for sanity check.

iter_counter = UsageSetCounterIter("namespace")

def incr_iter(*args, **kwargs):
    iter_counter.record(random.choice(["test", "test2", "test3", "test4"]))

with ThreadPoolExecutor(max_workers=1000) as tp:
    p = tp.map(incr_iter, range(100000))

    for _ in p:
        pass

    print(sum(iter_counter.state.values()))
    > always prints out 100000

\cc @cloutierMat thanks for the help and pairing on this! 🙏

@bentsku bentsku added the semver: patch Non-breaking changes which can be included in patch releases label Nov 14, 2024
@bentsku bentsku self-assigned this Nov 14, 2024
Copy link

github-actions bot commented Nov 14, 2024

LocalStack Community integration with Pro

    2 files  ±0      2 suites  ±0   1h 42m 46s ⏱️ - 1m 16s
3 526 tests ±0  3 133 ✅ ±0  393 💤 ±0  0 ❌ ±0 
3 528 runs  ±0  3 133 ✅ ±0  395 💤 ±0  0 ❌ ±0 

Results for commit 41fd489. ± Comparison against base commit 939b99d.

♻️ This comment has been updated with latest results.

@bentsku bentsku added this to the 4.0 milestone Nov 15, 2024
@bentsku bentsku marked this pull request as ready for review November 15, 2024 14:15
@bentsku bentsku requested a review from thrau as a code owner November 15, 2024 14:15
Copy link
Member

@thrau thrau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! nice catch, can't believe i missed that one!

Comment on lines 115 to 109
@hooks.on_infra_shutdown(should_load=lambda: not config.DISABLE_EVENTS)
def aggregate_and_send():
"""
Aggregates data from all registered usage trackers and immediately sends the aggregated result to the analytics service.
"""
if config.DISABLE_EVENTS:
return
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wow thanks for adding this, can't believe i missed this 😅

i personally probably would have kept the guard in the method and not use should_load, simply because it makes it a bit easier to debug in case something's up. it also allows us to change config.DISABLE_EVENTS after the runtime has initiated, though we really don't have a case for that now it would be a bit more consistent with being able to set certain config values at runtime via the config endpoint.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, yes, I thought about this as well! Will update.

Only thing is that with the current logic to avoid "counting" if not enabled, even if we update at runtime, we will still avoid counting for now, or I should update the code to use the config variable directly. We can always update later for that part. 👍

@bentsku bentsku merged commit 5b44747 into master Nov 18, 2024
31 checks passed
@bentsku bentsku deleted the fix-usage-counters branch November 18, 2024 13:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
semver: patch Non-breaking changes which can be included in patch releases
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants