Skip to content

feat(tracing): Handle tracestate HTTP headers/correlation context envelope headers #971

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

lobsterkatie
Copy link
Member

@lobsterkatie lobsterkatie commented Jan 13, 2021

This PR is an initial implementation of the official tracestate spec (detailed here), which specifies a multi-vendor, vendor-neutral method of propagating trace data between services. The basic format of a tracestate header is vendor1=value1,vendor2=value2,vendor3=value3,....

Key Behaviors:

  • When starting a transaction representing an incoming request*, the request's tracestate header is parsed, and Sentry and third-party portions are stored separately on the transaction
  • If no Sentry portion is found, or if a transaction is not being started from headers, a new Sentry tracestate value is calculated and stored on the transaction
  • Outgoing requests made while the transaction is open contain a tracestate header of the form sentry=sentry_value,vendor1=value1,vendor2=value2 (in other words, wherever the Sentry data was in the incoming list, it's moved to be the first entry)
  • Transactions (and errors with attachments** sent while a transaction is open) include data from Sentry's tracestate value in their envelope headers

* Now that there's more than one header, the from_traceparent method mentioned in an older version of our tracing spec (since replaced by from_sentrytrace in the spec, but not the SDK) is deprecated in favor of two new helper functions called from continue_from_headers.
** Error events with attachments just because those are the only ones currently sent in envelopes.

Sentry's tracestate Value:

  • The value in the HTTP header is the result of taking data of the form { "trace_id": ... , "release": ... , "environment": ... , "public_key": ... }, base64-encoding it, and stripping off any trailing =. (release and environment reflect the values set in sentry_sdk.init(), and public_key comes from the part of the DSN between the https:// and the @xxx.ingest.sentry.io.)
  • The value in the envelope header is that same data, but without any encoding.

Known Limitations:

Compared to the spec, this does not handle:

  • Multiple incoming tracestate headers
  • Incoming third-party values which are empty (not not-there, but rather empty like the second entry in x=y,,z=w)
  • Whitespace in the incoming tracestate header value
  • Limiting the length of either the Sentry tracestate value or the overall tracestate header

It also does not consume, create, or propagate traceparent headers. (We continue to use our own similar sentry-trace header.)

Follow-up Tasks

  • Decide if we like the new continue_from_headers/helper function model, and update dev docs
  • Decide which, if any, of the known limitations listed above are worth fixing
  • Update JS SDK to consume incoming tracestate headers and handle third-party tracestate values
  • Sort out whether and how to include end-user data

@lobsterkatie lobsterkatie force-pushed the kmclb-trace-context-in-envelope-header branch 3 times, most recently from a352af7 to 4f188d3 Compare January 15, 2021 02:33
Copy link
Contributor

@RaduW RaduW left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice implementation.
I'm not really familiar with the SDK so I might miss more subtle points.
I think you have a small bug on extracting the elements from the tracestate header.

Since it is a bit tricky I would add a parameterized test for extract_tracestate_data in which I would pass a few plausible headers to make sure it does what we need.
I would also do a separate test for extract_sentrytrace_data to make sure I'm not missing anything

Copy link
Contributor

@RaduW RaduW left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.

I would move the massaging of '=' out of capture_event and into from_base64 but it's you call.

@lobsterkatie lobsterkatie force-pushed the kmclb-trace-context-poc branch from 3389284 to 2377371 Compare January 25, 2021 16:45
@lobsterkatie lobsterkatie force-pushed the kmclb-trace-context-in-envelope-header branch 2 times, most recently from 9a878fc to 41dc24a Compare January 25, 2021 16:49
@lobsterkatie lobsterkatie force-pushed the kmclb-trace-context-poc branch from 2377371 to eed2ca1 Compare January 25, 2021 22:02
@lobsterkatie lobsterkatie force-pushed the kmclb-trace-context-in-envelope-header branch from 41dc24a to 91cfd62 Compare January 25, 2021 22:08
@lobsterkatie lobsterkatie marked this pull request as ready for review January 26, 2021 02:48
@lobsterkatie lobsterkatie changed the title [WIP] feat(tracing): Add tracestate correlation context to envelope headers feat(tracing): Handle tracestate HTTP headers/correlation context envelope headers Jan 26, 2021
@lobsterkatie lobsterkatie force-pushed the kmclb-trace-context-in-envelope-header branch from 91cfd62 to 464886e Compare January 26, 2021 03:08
@dashed
Copy link
Member

dashed commented Feb 19, 2021

@lobsterkatie This is off-topic to this PR; would it be possible to pass around parent transaction name within tracestate? This could help unlock some product features such as knowing the next transaction(s) from a given transaction.

For example, from /issues transaction, query (and filter on) the breakdown of transactions whose parent transaction is /issues.

@untitaker
Copy link
Member

@dashed we don't persist tracestate in the db. let's discuss this somewhere else.

@lobsterkatie lobsterkatie force-pushed the kmclb-trace-context-poc branch from eed2ca1 to 84b225d Compare February 26, 2021 01:45
@lobsterkatie lobsterkatie force-pushed the kmclb-trace-context-in-envelope-header branch from 464886e to 97a4aaf Compare February 26, 2021 01:46
@lobsterkatie lobsterkatie merged commit c118a5b into kmclb-trace-context-poc Feb 26, 2021
@lobsterkatie lobsterkatie deleted the kmclb-trace-context-in-envelope-header branch February 26, 2021 04:28
lobsterkatie added a commit that referenced this pull request Mar 2, 2021
lobsterkatie added a commit that referenced this pull request Mar 23, 2021
lobsterkatie added a commit that referenced this pull request Aug 23, 2021
lobsterkatie added a commit that referenced this pull request Sep 2, 2021
lobsterkatie added a commit that referenced this pull request Sep 13, 2021
lobsterkatie added a commit that referenced this pull request Sep 16, 2021
This introduces handling of the `tracestate` header, as described in the W3C Trace Context spec[1] and our own corresponding spec[2].

 Key features:

- Deprecation of `from_traceparent` in favor of `continue_from_headers`, which now propagates both incoming `sentry-trace` and incoming `tracestate` headers.
- Propagation of `tracestate` value as a header on outgoing HTTP requests when they're made during a transaction.
- Addition of `tracestate` data to transaction envelope headers.

Supporting changes:

- New utility methods for converting strings to and from base64.
- Some refactoring vis-à-vis the links between transactions, span recorders, and spans. See #1173 and #1184.
- Moving of some tracing code to a separate `tracing_utils` file.

Note: `tracestate` handling  is currently feature-gated by the flag `propagate_tracestate` in the `_experiments` SDK option. 

More details can be found in the main PR on this branch, #971.

[1] https://www.w3.org/TR/trace-context/#tracestate-header
[2] https://develop.sentry.dev/sdk/performance/trace-context/
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants