-
-
Notifications
You must be signed in to change notification settings - Fork 4.2k
Introduce usage counters #8116
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introduce usage counters #8116
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Conceptually this looks pretty good to me! We can start collecting these metrics at shutdown for now and maybe in the future think about how we can make it a bit more robust against ungraceful termination (but first investigate whether that's even worth it).
the UsageSetCounter
seems pretty solid. The UsageLogger
i'm not too sure about.
documenting the internal API to get a better grip on how the constructs are supposed to be used would be great!
15cab04
to
02c0339
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good now! a bit of documentation couldn't hurt, but other than that: let's give this a go :-) cc @ackdav
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🚀 thx for pushing this from prototype to integrating it. I'm sure this enables some really interesting insights!
By the way - is this respecting the DISABLE_EVENTS
config flag?
one nit - I would like to introduce some namespacing in the analytics events. Something like product:event_name
- this will make sifting through data much easier in the future. (also calling it "usage" isn't really telling us much 😉). So could we rename this event ls:usage_analytics
? (happy for better ideas - naming is hard and we're not enforcing naming schemes yet)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking forward to these insights 🔮 🚀
main comment: potential demo hotreload
instrumentation in list_functions
and tests 🔴
otherwise just nits
@@ -336,6 +338,8 @@ def start(self, env_vars: dict[str, str]) -> None: | |||
for source, target in container_config.copy_folders: | |||
CONTAINER_CLIENT.copy_into_container(self.container_name, source, target) | |||
|
|||
end_time = time.perf_counter() | |||
usage.initduration.record_value(end_time - start_time) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess that's the easiest way to reliably measure container initialization overhead. Terminology-wise, this is something completely different compared to what AWS refers to INIT
https://docs.aws.amazon.com/lambda/latest/dg/lambda-runtime-environment.html
In the config, we use LAMBDA_RUNTIME_ENVIRONMENT_TIMEOUT
. Potential alternatives are container_init_duration
or more generic env_init_duration
.
I understand initduration
; just wanted to mention the different meaning compared to AWS terminology so we are aware of it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually that should be removed right now, since I also removed the initduration
counter in a previous commit 😬
But your point is still valid 👍 We can think about adding this back at a later date
efbc7dd
to
eeaabc3
Compare
Based upon the analytics logger, @ackdav and I prototyped a fairly simple usage logger a few weeks ago.
One of the constraints for the prototype was to not add further pressure on the analytics backend, so we aggregate all captured data & "flush the logs" only on shutdown. This can be somewhat unreliable though, since in some cases the LocalStack instance might not be properly terminated and thus never wait for the shutdown to actually execute.
A few examples of what kind of information we wanted to collect:
Most of this data can also be captured via structured logging and log aggregation but for our purposes (forwarding to the analytics backend) this seemed impractical.