OTel Correlation: DBM #30825
…blue/otel-correlation
Co-authored-by: Jade Guiton <jade.guiton@datadoghq.com>
✅ Documentation Team Review: The documentation team has approved this pull request. Thank you for your contribution!
…blue/otel-correlate-dbm
## Overview

Correlate backend traces to detailed database performance data in Datadog Database Monitoring (DBM). This allows you to link spans from your OpenTelemetry-instrumented application directly to query metrics and execution plans to identify the exact queries that are slowing down your application.
I might soften the wording here. Saying "directly" sounds like we can link from a span to specific samples / execution plans, which we can't do currently for OTel.
Conceptually, we can link from the aggregate query in APM to the aggregate query in DBM - think of it as a one-to-many link from span to DBM query executions.
I might say "... to link spans from your OpenTelemetry-instrumented application to related query metrics and execution plans." Just trying to properly set expectations with customers.
Good idea! Reworded per your suggestion.
3. In the trace's flame graph, select a database span (for example, a span with `span.type: sql`).
4. In the details panel, click the **SQL Queries** tab. You should see the host metrics, like CPU and memory utilization, from the host that executed that part of the request.

## Further reading
Do we want to have a list of some known gotchas at this point? The correlation here is best-effort and not guaranteed, so I want to make sure people are aware of that.
Good point. Added a troubleshooting section at the bottom that we can add to for common scenarios.
One tiny language suggestion, LGTM otherwise!
#### Auto instrumentation

If you are using an OpenTelemetry auto-instrumentation library, you can add required attributes without changing your application code. Most OpenTelemetry auto-instrumentation libraries already add `db.system` and `db.statement`. For DBM correlation, you typically only need to add the Datadog-specific `span.type` attribute. You can do this by using the OpenTelemetry Collector's `attributes` processor to enrich your spans.
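
For reference, a minimal sketch of the Collector enrichment this paragraph describes. It assumes a traces pipeline whose `otlp` receiver and `datadog` exporter are defined elsewhere in the config; the processor label `attributes/dbm` and the `db.system` filter are illustrative choices, not part of the PR. The review comment on this paragraph indicates that `span.type` is normally derived automatically, so this step may be unnecessary in practice.

```yaml
processors:
  # Hypothetical label; any "attributes/<name>" is accepted.
  attributes/dbm:
    # Assumption: only enrich spans that already carry a db.system attribute.
    include:
      match_type: regexp
      attributes:
        - key: db.system
          value: ".+"
    actions:
      # "insert" adds the key only when the span does not already have it.
      - key: span.type
        value: sql
        action: insert

service:
  pipelines:
    traces:
      # Assumes "otlp" and "datadog" are configured under receivers/exporters.
      receivers: [otlp]
      processors: [attributes/dbm]
      exporters: [datadog]
```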
`span.type` shouldn't need to be manually set; we derive it based on the other values in the OTel SDK, or in the Agent if that's being used.
I think we should also stress that when auto-instrumentation is being used, you shouldn't have to set any attributes, and only the agent configuration is necessary.
Updated the description of the `span.type` attribute and clarified its automatic setting by OpenTelemetry SDK or Datadog Agent. Revised sections on auto and manual instrumentation for better clarity.
What does this PR do? What is the motivation?
Preview Link
See #30354. This PR pulls DBM content into a separate branch to avoid blocking ready-to-merge content.
Merge instructions
Merge readiness:
For Datadog employees:
Your branch name MUST follow the `<name>/<description>` convention and include the forward slash (`/`). Without this format, your pull request will not pass CI, the GitLab pipeline will not run, and you won't get a branch preview. Getting a branch preview makes it easier for us to check any issues with your PR, such as broken links.

If your branch doesn't follow this format, rename it or create a new branch and PR.
[6/5/2025] Merge queue has been disabled on the documentation repo. If you have write access to the repo, the PR has been reviewed by a Documentation team member, and all of the required checks have passed, you can use the Squash and Merge button to merge the PR. If you don't have write access, or you need help, reach out in the #documentation channel in Slack.
Additional notes