Add "global tracer"/backend objects (#15) #29

Oberon00 · 2019-06-25T17:26:38Z

~~I've also gone ahead and added pytest-based unit tests, but I'll gladly remove them/put them in another PR until #28 is resolved.~~

The environment variable stuff is useful for something like system-wide configuration. It could e.g. be used to (re)configure the SDK or return a different implementation.

opentelemetry-api/opentelemetry/backend.py

c24t · 2019-06-25T22:32:47Z

opentelemetry-api/opentelemetry/backend.py

+    try:
+        # Note: We use such a long name to avoid calling a function that is not intended for this
+        # API.
+        backend_fn: Callable[[Type[_T]], object] = getattr(backend_mod, 'get_opentelemetry_backend_impl')


Why this function instead of just having the env var point to the module that implements Tracer?

It's just more flexible. E.g. the tracer implementation's constructor could require some arguments but when using get_opentelemetry_backend_impl, a tracer could be constructed that is configured by reading environment variables or configuration files, etc.
Also, with this we allow the tracer to be named differently e.g. MyVendorTracer instead of just Tracer.

The class naming is a good point, but my complaint is about requiring this specific function in code to get a tracer.

@reyang did some work in OC to configure tracing (and etc.) from static config where the config looked a lot like executable python code: census-instrumentation/opencensus-python#621 (comment).

Since we're going to have to do the same thing for multiple components I think there's good reason to do something like this. That is the env var could point to a config file instead of pointing to a file that has a factory function to create the component.

The default SDK loader can still access different environment variables or load some config files in its get_opentelemetry_backend_impl function. I wanted to keep the basic loader as simple as possible, but maybe it does make sense to have a more complex logic that allows for easier configuration.

I could imagine a config file that looks like this (JSON):

{ "trace": { "constructor": "my.vendor.trace:load_from_config", // Note the colon to separate module and function name; leave it out to use the SDK implementation "config": // Any JSON. The parsed object (dict, list, whatever) is passed on as the single argument to the constructor function } }

Or like this:

{ "vendor": "my.vendor" "trace": // Any JSON. The parsed object (dict, list, whatever) is passed on as the second argument to the constructor function, which must be named `my.vendor.configparser.load_from_config`. The first argument would be the API's tracer type object. }

I thought we might want to be able to use the same config file between the different language clients, but after talking to @songy23, @mayurkale22, and @rghetia it looks like there's likely to be enough language-specific config that this won't be feasible. We have some flexibility in the static config format.

c24t · 2019-06-25T22:54:17Z

opentelemetry-api/opentelemetry/backend.py

+        raise ValueError('The object does not implement the required base class.')
+    globals()['_' + api_type.__name__.lower()] = impl_object
+
+### Public code (basically copy & paste for each type) {{{1


I think it would make sense to separate the loader and "backend" modules -- the first to get a SDK class given an API class based on env vars, the second to hold the global Tracer, Meter, etc.

I also assumed the global tracer would go in the trace package, as e.g. opentelemetry.trace.tracer. Is there an argument for keeping the tracer and meter (and etc.) in the same module instead?

The academic argument against putting the global objects alongside the API class definitions is that this makes these modules contain global mutable state. I don't know if that would be a real problem in practice. Now we have the problem that the backend module needs to implement each API module for which a global instance should be provided.

One idea I have that solves both problems but is probably over-engineered is the following:

Move the existing immutable modules to an api subpackage.

Create new modules with the name of the old ones that do a wildcard import from the api submodule and additionally provide the global instance.

That way, we have immutable API modules and conveniently accessible global tracers. The dependencies would be (with the example of tracer):

trace depends on api.trace and the backend/loader module

api.trace depends on nothing (nothing backend-related at least)

The loader module depends on nothing (nothing api related at least)

As far as quarantining the global mutable state, I think it would be fine to use a singleton (like java's OpenTelemetry) or dedicated container modules (e.g. opentelemetry.trace.tracer.tracer, where the first tracer is the module and the second is the object), but I think we have different intuitions about how "bad" it is to put the global tracer in the same module as the API class. One problem I can imagine is that it would prevent us from creating the tracer from config when the module is first imported; we'd have to modify it after the module was loaded since the implementation would need to import it first. What do you see as other problems?

The wildcard import is interesting, but it's not clear that the extra complexity is less bad than having an ugly import path for the tracer.

What do you see as other problems?

I haven't been able to think of any particular ones tbh, that's why I wrote "academic". 😃 Let's try putting the tracer getter in the api module. The "create from config" problem you mention doesn't apply now because we would create the tracer when the getter is first called.

Actually I have found an annoyance: See comment in unit test.

c24t · 2019-06-25T22:54:44Z

opentelemetry-api/opentelemetry/backend.py

+        return _tracer
+    return _selectimpl(Tracer)
+
+def set_tracer(tracer_implementation: Tracer) -> None:


What's the use case for swapping out the global tracer?

I'm fine with disallowing that after a Tracer has been initially selected. It is mostly there to explicitly set a particular instance (maybe configured in an app-specific way) of the Tracer.

I now only provide a setter for the factory function.

Actually I like offering both a Tracer and a factory, as long as they cannot be re-set.

Another thing is that usually code has higher priority than environment variables (hence, if the user were to use this function explicitly, it would have precedence over the rest). Opinions on this?

opentelemetry-api/opentelemetry/backend.py

opentelemetry-api/tests/test_backend.py

opentelemetry-api/opentelemetry/backend.py

carlosalberto · 2019-06-27T14:41:06Z

Hey hey, sorry for the delayed feedback.

One thing that I don't completely agree with is the backend naming - based on https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/tracing-api.md I'd be more inclined to use something like registry, or global registry:

Tracer provider is an internal class used by the global registry

Another thing is that, at least in Java, the used Tracer goes like this:

Try to use the user-specified Tracer (through an external variable/setting).
Fallback to the Default (no-op) implementation.

That is simply, but works great IHMO. In that regard, I feel we shouldn't try to load the default SDK as a fallback case.

c24t · 2019-06-28T01:00:42Z

Try to use the user-specified Tracer (through an external variable/setting).

Fallback to the Default (no-op) implementation.

That sounds like a good approach to me too; if your custom implementation fails to load I think it would be more surprising to get a different implementation than no implementation at all.

Oberon00 · 2019-06-28T13:53:50Z

if your custom implementation fails to load I think it would be more surprising to get a different implementation than no implementation at all

The current loader implementation will use the no-op/API implementation in that case (and has always done so).

carlosalberto · 2019-07-03T14:42:37Z

The current loader implementation will use the no-op/API implementation in that case (and has always done so).

Sorry, meant to say that if there's NO explicit setting of a Tracer at all, we go directly to the no-op Tracer. The SDK (the artifact/package) registers itself explicitly for this case.

c24t

One open question we need to resolve before merging this is whether config in code should take precedence over env vars.

The loading mechanism here still seems awfully complex, and I'd still like to avoid 'get_opentelemetry_implementation', but I don't have a good alternative in mind.

c24t · 2019-07-03T19:15:45Z

opentelemetry-api/opentelemetry/loader.py

+
+_UntrustedImplFactory = Callable[[Type[_T]], object]
+
+#ImplementationFactory = Callable[[Type[_T]], _T]


Do you mean to leave this in? And why bother calling the type "untrusted"?

It's also not clear why we need object instead of the generic type here, but I'm willing to believe that we do. FWIW I don't think we should use type annotations outside of the overridable API components unless we have reason to believe that that bit of code in particular would benefit from type checks.

It's called untrusted and returns an object because we don't trust it to return the type we expect, since it is set by code that is out of our control. Using object forces us to check the type (as object has basically no methods, is not convertible to anything, etc.).

ImplementationFactory was meant for the other side, the set_preferred_tracer_implementation*_impl functions. But due to bug python/mypy#7092, I had to repeat the definition in the setter.

I'd actually prefer to leave it here as documentation, but I should add a comment explaining this.

I added comments, is that OK?

FWIW I don't think we should use type annotations outside of the overridable API components

Presently we run mypy with disallow_untyped_defs = True so it wouldn't even pass travis. And personally I like static typing.

And personally I like static typing

No accounting for taste :P

c24t · 2019-07-03T19:20:23Z

opentelemetry-api/opentelemetry/trace/__init__.py

+    This function may not be called after a tracer is already loaded.
+
+    Args:
+            factory: A function that, when called with the :class:`Tracer` type


Why two levels of indentation here? Also I think it's fine to omit the types in this line since they're in the signature.

I fixed the indentation. I also made the description more terse, is that what you meant?

Yup, looks good.

c24t · 2019-07-03T19:26:35Z

opentelemetry-api/opentelemetry/trace/__init__.py

+        raise RuntimeError("Tracer already loaded.")
+
+    #pylint:disable=protected-access
+    loader._CALLBACKS_BY_TYPE[Tracer] = factory


Setting the tracer factory in the trace class modifies the callbacks in loader? I'd expect it to do something like set a module-level constant instead and only fall back to the loader if that was never set. But that only works if we treat configuration in code as higher priority than env vars.

Since the tracer is set lazily it's possible to set this factory multiple times, the last call would win. Is that intentional?

Actually a good point. Thinking about it, the dictionary in loader.py is uselessly complicated. The only advantage is that one can call loader._load_impl directly instead of using trace.tracer() but I guess that isn't really buying us much.

I removed the dictionary.

Regarding calling the setter multiple times: I have no strong opinion on that. I thought maybe some vendor may call set_preferred_impl themselves and then the application would have the final say if it called that setter again. Besides, the current behavior is the simplest to implement. 😄

opentelemetry-api/setup.py

Oberon00 · 2019-07-05T15:10:55Z

@c24t:

One open question we need to resolve before merging this is whether config in code should take precedence over env vars.

I choose this precedence because everyone that uses a non-SDK vendor (or really everyone depending on the resolution of open-telemetry/opentelemetry-specification#171) will set it from code which would make the environment variables effectively useless. We could add another setter that overrides this (maybe a set_tracer that directly sets the tracer object, since in that case the indirection over a factory function is useless), but that runs counter to the next point:

The loading mechanism here still seems awfully complex, and I'd still like to avoid 'get_opentelemetry_implementation', but I don't have a good alternative in mind.

I'm open for suggestions 😃. We could of course remove features, e.g. instead of the default/ distinction, provide only one. Or change the code to trust the externally set factory functions, which would remove some checks an exception handling. What do you think? (@ all)

I was too "clever", the getter needs to be rewritten

We should be able to ensure clean envvars for the test run.

Gets rid of the dictionary in loader.py.

c24t

Thanks for addressing all my comments @Oberon00, this LGTM. It'll be easier to make judgements about the loader behavior once we've got an actual SDK to load.

Oberon00 · 2019-07-10T08:48:59Z

OK, then I'm merging this and I'll create issues for the open questions, which seem to be related to setter behavior:

Should it be allowed to set the factory multiple times
Should the environment variable override the factory setter?

Also I noticed one more problem regarding the loader behavior which I will also move to a new issue:

The current lazy-loading behavior makes it easy to accidentally create a no-op tracer that sticks around. E.g. if the configured API implementation is decided after making an HTTP request and the used HTTP library is instrumented, the no-op tracer would be installed by the loader (and it's not exchangeable).

But there is one public API that hopefully stays the way it is now: The opentelemetry.trace.tracer() getter. And that getter should unblock a lot of other things.

Oberon00 requested review from c24t, carlosalberto and reyang as code owners June 25, 2019 17:26

reyang reviewed Jun 25, 2019

View reviewed changes

opentelemetry-api/opentelemetry/backend.py Outdated Show resolved Hide resolved

reyang reviewed Jun 25, 2019

View reviewed changes

opentelemetry-api/opentelemetry/backend.py Outdated Show resolved Hide resolved

reyang reviewed Jun 25, 2019

View reviewed changes

opentelemetry-api/opentelemetry/backend.py Outdated Show resolved Hide resolved

reyang reviewed Jun 25, 2019

View reviewed changes

opentelemetry-api/opentelemetry/backend.py Outdated Show resolved Hide resolved

c24t reviewed Jun 25, 2019

View reviewed changes

This was referenced Jun 27, 2019

Set pylint's line length to 79 chars #35

Merged

Standardize import style #36

Closed

carlosalberto reviewed Jun 27, 2019

View reviewed changes

opentelemetry-api/opentelemetry/backend.py Outdated Show resolved Hide resolved

Oberon00 requested review from carlosalberto, c24t and reyang July 2, 2019 06:25

carlosalberto mentioned this pull request Jul 3, 2019

Tracer resolution default open-telemetry/opentelemetry-specification#171

Closed

c24t reviewed Jul 3, 2019

View reviewed changes

reyang reviewed Jul 3, 2019

View reviewed changes

opentelemetry-api/setup.py Outdated Show resolved Hide resolved

Oberon00 and others added 7 commits July 8, 2019 13:09

WIP

85c4543

WIP

c952681

I was too "clever", the getter needs to be rewritten

Global tracer registry: First somewhat satisfying iteration.

719df6e

Add API docs for backend module

8631647

Fix documentation (and overlong lines within).

893a449

Sort imports.

ea42a79

Remove _UNIT_TEST_IGNORE_ENV.

f0ab57f

We should be able to ensure clean envvars for the test run.

Oberon00 added 11 commits July 8, 2019 13:09

Misc cleanup.

5fa65eb

Fix backend.py not being included in wheel.

1d5c219

Rewrite backend/loader.

2804f9f

Rewrite backend/loader more, fix pylint, mypy.

1d5d1b7

Ditch 'default' in _load_default_impl, it's just wrong now.

dc08af9

Fix docs.

b983466

Apply new package structure.

e0f6ca0

Remove unit tests (for now).

8ebf94a

Document the factory type aliases.

c5dbd03

Store factory functions in respective modules.

e4a1dd3

Gets rid of the dictionary in loader.py.

Fix factory return type (Optional) & improve docs.

3f9a6cd

Oberon00 force-pushed the iss15-global-tracer branch from 81c816a to 3f9a6cd Compare July 8, 2019 14:21

Revert accidental changes to setup.py.

a8c1da1

This was referenced Jul 8, 2019

Unit tests #42

Merged

More lint #43

Merged

c24t approved these changes Jul 9, 2019

View reviewed changes

This was referenced Jul 10, 2019

[loader] Decide setter/envvar interaction & behavior with multiple setter calls #44

Closed

[loader] Too easy to configure no-op implementation by mistake #45

Closed

Remove unused global.

0fb5d52

Oberon00 merged commit 9f0527d into open-telemetry:master Jul 10, 2019

Oberon00 mentioned this pull request Sep 6, 2019

Global object for configuration #123

Closed

c24t mentioned this pull request Sep 24, 2019

Metrics Implementation #160

Merged


		_UntrustedImplFactory = Callable[[Type[_T]], object]

		#ImplementationFactory = Callable[[Type[_T]], _T]

Add "global tracer"/backend objects (#15) #29

Add "global tracer"/backend objects (#15) #29

Uh oh!

Conversation

Oberon00 commented Jun 25, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Oberon00 Jun 26, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Oberon00 Jun 27, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

carlosalberto commented Jun 27, 2019

Uh oh!

c24t commented Jun 28, 2019

Uh oh!

Oberon00 commented Jun 28, 2019

Uh oh!

carlosalberto commented Jul 3, 2019

Uh oh!

c24t left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Oberon00 commented Jun 25, 2019 •

edited

Loading

Oberon00 Jun 26, 2019 •

edited

Loading

Oberon00 Jun 27, 2019 •

edited

Loading

Oberon00 commented Jul 5, 2019 •

edited

Loading