feat: Move from datadog to generic otel #1567

Merged
merged 10 commits into from
May 19, 2022
add telemetry pkg
f0ssel committed May 18, 2022
commit e0e433ccb17a53d6e8bfcbc48a0398ed8ffda77c
40 changes: 40 additions & 0 deletions telemetry/exporter.go
@@ -0,0 +1,40 @@
package telemetry

import (
	"context"

	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/exporters/otlp/otlptrace"
	"go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc"
	"go.opentelemetry.io/otel/sdk/resource"
	sdktrace "go.opentelemetry.io/otel/sdk/trace"
	semconv "go.opentelemetry.io/otel/semconv/v1.4.0"
	"golang.org/x/xerrors"
)

// Exporter creates a gRPC OTLP exporter, wraps it in a tracer provider, and
// registers that provider as the global tracer provider.
// The caller is responsible for calling the returned function so that all
// buffered data is flushed.
func Exporter(ctx context.Context, service string) (func(), error) {
	res, err := resource.New(ctx,
		resource.WithAttributes(
			// the service name used to display traces in backends
			semconv.ServiceNameKey.String(service),
		),
	)
	if err != nil {
		return nil, xerrors.Errorf("creating otlp resource: %w", err)
	}

	// No endpoint is set here: the gRPC client reads it from the standard
	// OTEL_EXPORTER_OTLP_* environment variables.
	exporter, err := otlptrace.New(ctx, otlptracegrpc.NewClient())
	if err != nil {
		return nil, xerrors.Errorf("creating otlp exporter: %w", err)
	}

	tracerProvider := sdktrace.NewTracerProvider(
		sdktrace.WithBatcher(exporter),
		sdktrace.WithResource(res),
	)
Member

Where does this actually send data?

Contributor Author

I can add a comment, but traces get shipped to a local OTel collector by default. You can change where they get shipped with the standard OTel env vars, such as OTEL_EXPORTER_OTLP_ENDPOINT, so we don't need to write any code to support the configuration. The Datadog agent we have deployed on dev.coder.com has a collector built in, so this will automatically start sending the OTel data there.

Member

Ahh I see! Even just an external link would help. I was expecting --trace to accept an endpoint to write traces.

Contributor Author

done

	otel.SetTracerProvider(tracerProvider)

	return func() {
		_ = tracerProvider.Shutdown(ctx)
	}, nil
}
52 changes: 52 additions & 0 deletions telemetry/httpmw.go
@@ -0,0 +1,52 @@
package telemetry
Member

I think the name telemetry might be confusing; tracing would be very reasonable. We might want to place this in coderd/tracing, because we'll probably only trace there for now.

Contributor Author

I was on the fence between the two

Contributor Author

I put this at a higher level because I didn't want to exclude the provisioner apps. Those could reasonably be traced as well, and those distributed traces would be connected in otel with the right glue code.

That being said, we are not currently using any tracing in the other apps, so I'm also fine to move to coderd if we want to isolate it for now.

Member

I think we should isolate it for now. Nothing else has an HTTP server, and if that's our primary point of tracing I'd be hesitant to say we'll do it for other things soon.

Contributor Author

done


import (
	"fmt"
	"net/http"

	"github.com/go-chi/chi/v5"
	"github.com/go-chi/chi/v5/middleware"
	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/attribute"
	"go.opentelemetry.io/otel/codes"
)

// HTTPMW adds tracing to http routes.
func HTTPMW(tracer string) func(http.Handler) http.Handler {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(rw http.ResponseWriter, r *http.Request) {
			// start span with default span name. Span name will be updated once request finishes
			ctx, span := otel.Tracer(tracer).Start(r.Context(), "http.request")
			defer span.End()
			// attach the span's context to the request so downstream handlers can create child spans
			r = r.WithContext(ctx)

			wrw := middleware.NewWrapResponseWriter(rw, r.ProtoMajor)

			// serve the request to the next middleware through the wrapped writer,
			// otherwise the status code is never recorded
			next.ServeHTTP(wrw, r)

			// set the resource name as we get it only once the handler is executed
			resourceName := "unknown"
			// the route context is nil when the middleware runs outside a chi router
			if rctx := chi.RouteContext(r.Context()); rctx != nil && rctx.RoutePattern() != "" {
				resourceName = rctx.RoutePattern()
			}
			resourceName = r.Method + " " + resourceName
			span.SetName(resourceName)

			// set the status code
			status := wrw.Status()
			// 0 status means one has not yet been sent in which case net/http library will write StatusOK
			if status == 0 {
				status = http.StatusOK
			}
			span.SetAttributes(attribute.Int("http.status_code", status))

			// if 5XX we set the span to "error" status
			if status >= 500 {
				span.SetStatus(codes.Error, fmt.Sprintf("%d: %s", status, http.StatusText(status)))
			}
		})
	}
}