Trace HTTPClient request execution #320

slashmo · 2020-12-05T17:08:05Z

Motivation:

Context Propagation

In order to instrument distributed systems, metadata such as trace ids
must be propagated across network boundaries through HTTP headers.
As HTTPClient operates at one such boundary, it should take care of
injecting metadata into HTTP headers automatically using the configured
instrument.

Built-in tracing

Furthermore, HTTPClient should create a Span for executed requests
under the hood, so that users benefit from tracing effortlessly.

Modifications:

Propagate service context to server via HTTP request headers
Create Span for executed HTTP request

Result:

AsyncHTTPClient automatically creates a Distributed Tracing span per request
AsyncHTTPClient propagates service context to server, making server spans children of client spans
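
For illustration only, here is a minimal sketch of what header-based context propagation could look like. It assumes stand-in types so that it compiles on its own: `SketchInjector` mirrors the shape of the `Injector` protocol from the Instrumentation module, while `SketchHeaders`, `SketchHeadersInjector`, and `propagate(traceID:spanID:into:)` are hypothetical names that do not appear in this PR. The W3C `traceparent` header is used as one possible wire format, not necessarily the one the configured instrument would choose.

```swift
// Stand-in for the Injector protocol from the Instrumentation module,
// redeclared here so the sketch has no package dependencies.
protocol SketchInjector {
    associatedtype Carrier
    func inject(_ value: String, forKey key: String, into carrier: inout Carrier)
}

// Hypothetical carrier: a list of header name/value pairs standing in for HTTP headers.
typealias SketchHeaders = [(name: String, value: String)]

struct SketchHeadersInjector: SketchInjector {
    func inject(_ value: String, forKey key: String, into headers: inout SketchHeaders) {
        // Drop any existing value first so the request carries one header per key.
        headers.removeAll { $0.name.lowercased() == key.lowercased() }
        headers.append((name: key, value: value))
    }
}

// Hypothetical instrument: writes the ids into a W3C-style "traceparent" header
// (version "00", trace id, parent span id, flags "01" meaning sampled).
func propagate(traceID: String, spanID: String, into headers: inout SketchHeaders) {
    let injector = SketchHeadersInjector()
    injector.inject("00-\(traceID)-\(spanID)-01", forKey: "traceparent", into: &headers)
}

var headers: SketchHeaders = [(name: "Accept", value: "*/*")]
propagate(traceID: "0af7651916cd43dd8448eb211c80319c", spanID: "b7ad6b7169203331", into: &headers)
```

A server that extracts the same header can then start its span as a child of the client span.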

swift-server-bot · 2020-12-05T17:08:07Z

Can one of the admins verify this patch?

slashmo · 2020-12-05T17:11:00Z

I chatted with @ktoso earlier about manual context propagation. We agreed that we probably shouldn't deprecate the "old" API that accepts a Logger in each request overload: we don't want to push users too far toward manual context passing, because ideally that won't be necessary once the mentioned language changes have been made: https://github.com/apple/swift-distributed-tracing#important-note-on-adoption

ktoso

So since technically we're 0.1 and something may change... how do we want to tackle adoption here?

I was thinking we could kick off a branch like tracing for now, so we can polish things up there and merge into mainline once we're all confident. We could also tag those tracing releases; they'd follow the normal releases, e.g. 1.2.2-tracing.

I don't really expect anything breaking in the core APIs, but the OpenTelemetry support, which we may want to use here, could still fluctuate a little until it's final.

Sources/AsyncHTTPClient/HTTPHeadersInjector.swift

Sources/AsyncHTTPClient/HTTPClient.swift

ktoso · 2020-12-07T00:31:03Z

Package.swift

    ],
    targets: [
        .target(
            name: "AsyncHTTPClient",
            dependencies: ["NIO", "NIOHTTP1", "NIOSSL", "NIOConcurrencyHelpers", "NIOHTTPCompression",
-                           "NIOFoundationCompat", "NIOTransportServices", "Logging"]
+                           "NIOFoundationCompat", "NIOTransportServices", "Logging", "Instrumentation"]


Can we right away go with Tracing and do the full thing in a single PR?

That's my intention. I've added a checklist to the PR including creating a Span. I first wanted to get the instrumentation part down and then continue with tracing, but all inside this PR.

Sources/AsyncHTTPClient/HTTPClient.swift

Lukasa · 2020-12-08T09:15:54Z

@swift-server-bot add to whitelist

Lukasa · 2020-12-08T09:16:40Z

I'd like to punt this to a side-branch for iterative development if we can.

slashmo · 2020-12-08T09:18:42Z

@Lukasa wrote: "I'd like to punt this to a side-branch for iterative development if we can."

Sure, sounds like a good approach. I can change the target branch once it's created.

Lukasa · 2020-12-08T09:20:19Z

I've opened up the tracing-development branch.

slashmo · 2020-12-08T09:25:33Z

@ktoso The CI seems to fail because the Baggage repo cannot be cloned through the Git URL. Should we pin Tracing to 0.1.1 here in order to get the fix? (apple/swift-distributed-tracing/pull/25)

ktoso · 2020-12-08T09:26:48Z

No, we need to tag a 0.1.1, I'll do that in a moment.

ktoso · 2020-12-08T10:39:57Z

0.1.1 is tagged; please depend on that.

Thanks Cory for the development branch, sounds good 👍

ktoso · 2020-12-08T10:55:49Z

@swift-server-bot test this please

ktoso · 2020-12-08T10:56:13Z

Can drafts get CI validation? 🤔

Lukasa · 2020-12-08T11:17:04Z

Yes, they can: I think the CI isn't targeting that branch at the moment.

czechboy0 · 2025-02-10T13:58:46Z

Sources/AsyncHTTPClient/AsyncAwait/HTTPClient+execute.swift

@@ -30,18 +32,20 @@ extension HTTPClient {
    public func execute(
        _ request: HTTPClientRequest,
        deadline: NIODeadline,
-        logger: Logger? = nil
+        logger: Logger? = nil,
+        context: ServiceContext? = nil


Adding a parameter is technically breaking, so we'll need one more variant of this function with the original signature and @_disfavoredOverload and a deprecation warning, calling out to this new method.
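
A minimal sketch of that pattern, assuming simplified stand-in types: `SketchClient` and its `String`-based parameters are hypothetical and only show how the two overloads relate. The old signature stays available for existing function references, while `@_disfavoredOverload` steers ordinary calls to the new method.

```swift
struct SketchClient {
    // New signature: gains a defaulted `context` parameter.
    func execute(_ request: String, logger: String? = nil, context: [String: String]? = nil) -> String {
        let traced = context == nil ? "untraced" : "traced"
        return "\(request) (\(traced))"
    }

    // Original signature, kept so that code referring to execute(_:logger:) still compiles.
    // It is disfavored, so a call such as execute("GET /") resolves to the method above.
    @available(*, deprecated, message: "Use execute(_:logger:context:) instead")
    @_disfavoredOverload
    func execute(_ request: String, logger: String? = nil) -> String {
        execute(request, logger: logger, context: nil)
    }
}

let sketchClient = SketchClient()
let withoutContext = sketchClient.execute("GET /")
let withContext = sketchClient.execute("GET /", context: ["trace": "1"])
```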

czechboy0 · 2025-02-10T13:59:18Z

Sources/AsyncHTTPClient/AsyncAwait/HTTPClient+execute.swift

@@ -55,12 +59,14 @@ extension HTTPClient {
    public func execute(
        _ request: HTTPClientRequest,
        timeout: TimeAmount,
-        logger: Logger? = nil
+        logger: Logger? = nil,
+        context: ServiceContext? = nil


Same here: we should re-add the previous variant for backwards compatibility.

czechboy0 · 2025-02-10T14:00:47Z

Sources/AsyncHTTPClient/AsyncAwait/HTTPClient+execute.swift

+            var request = request
+            request.head.headers.propagate(span.context)
+            span.updateAttributes { attributes in
+                attributes["http.request.method"] = request.head.method.rawValue


Do we also attach the path as a separate attribute, or is url.full the only place it's expected to show up?

Path may be nice tbh... it's not a "standardized" convention but I could see it being useful... Up to y'all if we want to duplicate; url.full seems to be the "expected one" to set OTel-wise...

url.path seems to be recommended for the server span only here: https://opentelemetry.io/docs/specs/semconv/http/http-spans/#http-server-semantic-conventions? But here it's simply listed, without that server guidance: https://opentelemetry.io/docs/specs/semconv/attributes-registry/url/#url-path

I guess it wouldn't hurt to add it, but the client is expected not to have the full URL broken out into components, whereas the server is.

I'm happy to keep it as-is, with just url.full.

Yeah, I think it's fine to just include url.full for now. That being said, something I haven't yet implemented in the PR is redaction as recommended (although experimental) in OTel SemConv:

https://opentelemetry.io/docs/specs/semconv/http/http-spans/#http-client

[4] url.full: For network calls, URL usually has scheme://host[:port][path][?query][#fragment] format, where the fragment is not transmitted over HTTP, but if it is known, it SHOULD be included nevertheless.

url.full MUST NOT contain credentials passed via URL in form of https://username:password@www.example.com/. In such case username and password SHOULD be redacted and attribute’s value SHOULD be https://REDACTED:REDACTED@www.example.com/.
Sensitive content provided in url.full SHOULD be scrubbed when instrumentations can identify it.

Experimental Query string values for the following keys SHOULD be redacted by default and replaced by the value REDACTED:
AWSAccessKeyId / Signature / sig / X-Goog-Signature
This list is subject to change over time.

When a query string value is redacted, the query string key SHOULD still be preserved, e.g. https://www.example.com/path?color=blue&sig=REDACTED.

What do you think about adding these? I'm a bit torn but I don't think we'd want to include these specific keys in AsyncHTTPClient. Perhaps we could skip adding them for now and only redact the basic auth parameters, and tackle the other redaction in swift-otel/swift-otel-semantic-conventions and eventually add a dependency on it to AsyncHTTPClient.
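
As a rough sketch of what such redaction could look like, using only Foundation's `URLComponents`: the function name `redactedFullURL` and its `sensitiveQueryKeys` parameter are hypothetical and not part of this PR. The default key list is copied from the experimental SemConv text quoted above; passing an empty set gives the "basic auth only" variant.

```swift
import Foundation

// Hypothetical helper: returns a copy of `urlString` that is safe to record as url.full,
// or nil if the string cannot be parsed as a URL.
func redactedFullURL(
    _ urlString: String,
    sensitiveQueryKeys: Set<String> = ["AWSAccessKeyId", "Signature", "sig", "X-Goog-Signature"]
) -> String? {
    guard var components = URLComponents(string: urlString) else { return nil }

    // Replace credentials embedded in the URL, keeping the rest of the authority intact.
    if components.user != nil { components.user = "REDACTED" }
    if components.password != nil { components.password = "REDACTED" }

    // Keep every query key, but replace the values of keys known to be sensitive.
    if let items = components.queryItems {
        components.queryItems = items.map { item in
            sensitiveQueryKeys.contains(item.name)
                ? URLQueryItem(name: item.name, value: "REDACTED")
                : item
        }
    }
    return components.string
}

let redactedCredentials = redactedFullURL("https://username:password@www.example.com/")
let redactedQuery = redactedFullURL("https://www.example.com/path?color=blue&sig=abc123")
```

Returning nil for unparseable input is a deliberate choice in this sketch, so that a value that cannot be inspected is never recorded verbatim.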

czechboy0 · 2025-02-10T14:02:07Z

Sources/AsyncHTTPClient/AsyncAwait/HTTPClient+execute.swift

+            } catch {
+                span.setStatus(.init(code: .error))
+                span.attributes["error.type"] = "\(type(of: error))"
+                throw error


Does this mean the span doesn't include the error description, and it's up to user code to attach it higher up the stack?

Comes back to the entire "can we log errors" (not really since no idea what they contain) discussion we were having elsewhere recently. 😅

The OTel guidance is the same tbh: https://opentelemetry.io/docs/specs/semconv/general/recording-errors/#recording-errors-on-spans but you'll notice it gets hand-wavy about exception messages (Java-influenced wording there):

SHOULD set the span status code to Error

SHOULD set the error.type attribute

SHOULD set the span status description when it has additional information about the error which is not expected to contain sensitive details and aligns with Span Status Description definition.

It’s NOT RECOMMENDED to duplicate status code or error.type in span status description.

When the operation fails with an exception, the span status description SHOULD be set to the exception message.

So that leaves us back at step 1, where we need to decide what is safe: the "exception message" is not a thing in Swift after all, and "\(error)" may or may not be safe...

Unless we can confirm that all errors that can be thrown here are "safe to be logged", come only from inside the HTTP infra, and don't include request details, it doesn't seem like we should log the "message" (that being "\(error)" in Swift). I'm not sure we're 100% confident about that here?

Unsure, but I wouldn't want this PR to be blocked on that, I'm happy with the first iteration not including the message here.
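
To make the agreed first iteration concrete, here is a small sketch that records only the error status and `error.type`, and deliberately leaves the status description unset. `SketchSpan`, `SketchTimeoutError`, and `recordingFailure(on:_:)` are stand-ins invented for this example, not the Tracing API.

```swift
// Stand-in for a span: just enough state to show what gets recorded.
struct SketchSpan {
    var statusIsError = false
    var attributes: [String: String] = [:]
}

// Hypothetical error type used to exercise the failure path.
struct SketchTimeoutError: Error {}

func recordingFailure<T>(on span: inout SketchSpan, _ operation: () throws -> T) throws -> T {
    do {
        return try operation()
    } catch {
        span.statusIsError = true
        // Only the type name is recorded; "\(error)" is left out because it
        // may contain request details that are not safe to export.
        span.attributes["error.type"] = "\(type(of: error))"
        throw error
    }
}

var sketchSpan = SketchSpan()
let failedResult: Int? = try? recordingFailure(on: &sketchSpan) { () throws -> Int in
    throw SketchTimeoutError()
}
```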

ktoso · 2025-02-10T14:18:16Z

Sources/AsyncHTTPClient/AsyncAwait/HTTPClient+execute.swift

-            },
-            onCancel: {
-                cancelHandler.cancel(reason: .taskCanceled)
+        return try await withSpan(request.head.method.rawValue, context: context, ofKind: .client) { span in


LGTM on using just the method as low cardinality name (matches otel well) 👍

czechboy0 · 2025-02-10T14:32:54Z

Sources/AsyncHTTPClient/AsyncAwait/HTTPClient+execute.swift

-            },
-            onCancel: {
-                cancelHandler.cancel(reason: .taskCanceled)
+        return try await withSpan(request.head.method.rawValue, context: context, ofKind: .client) { span in


Should the Tracer be a property of the client, to allow for easier unit testing? And just default it to the current process-bootstrapped tracer in the initializer, unless it's overridden?

We could do that, yeah. Elsewhere we just use InstrumentationSystem.bootstrapInternal in unit tests which also works fine but I'd be happy either way.

Unless there are downsides, I prefer the explicit dependency injection. Easy enough to debug and reason about.

Sounds good to me as long as we don't expose this initializer publicly.

What's the downside of adding it to the public initializer?

It goes against the way Swift Distributed Tracing usually works, where a single tracer is chosen for the system and used transparently everywhere. Having said that, Logging also allows passing specific Loggers around and still use LoggingSystem. I'm curious to hear what @ktoso thinks about this.
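
A sketch of the dependency-injection option under discussion, assuming stand-in types throughout: `SketchTracer`, `NoOpSketchTracer`, `RecordingSketchTracer`, and `SketchHTTPClient` are hypothetical. In the real client the default would presumably be the process-bootstrapped tracer; a no-op default is used here only to keep the example free of global state.

```swift
// Stand-in for a tracer; not the swift-distributed-tracing protocol.
protocol SketchTracer {
    func startSpan(named name: String)
}

// Default used when nothing is injected.
struct NoOpSketchTracer: SketchTracer {
    func startSpan(named name: String) {}
}

// Test double that remembers the span names it was asked to start.
final class RecordingSketchTracer: SketchTracer {
    private(set) var spanNames: [String] = []
    func startSpan(named name: String) { spanNames.append(name) }
}

struct SketchHTTPClient {
    private let tracer: any SketchTracer

    // Internal (not public) initializer, so the override is only reachable from tests.
    init(tracer: any SketchTracer = NoOpSketchTracer()) {
        self.tracer = tracer
    }

    func execute(method: String, url: String) {
        // Low-cardinality span name: the HTTP method only, never the URL.
        tracer.startSpan(named: method)
    }
}

let recordingTracer = RecordingSketchTracer()
let tracedClient = SketchHTTPClient(tracer: recordingTracer)
tracedClient.execute(method: "GET", url: "https://www.example.com/a")
tracedClient.execute(method: "POST", url: "https://www.example.com/b")
```

Keeping the initializer internal leaves the "one tracer per process" model intact for users while still giving unit tests an explicit seam.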

slashmo marked this pull request as draft December 5, 2020 17:11

ktoso reviewed Dec 7, 2020

slashmo force-pushed the feature/tracing branch 2 times, most recently from 047fbb0 to 87085d9 Compare December 7, 2020 17:05

slashmo changed the base branch from main to tracing-development December 8, 2020 09:21

slashmo force-pushed the feature/tracing branch from 87085d9 to ae7268d Compare December 8, 2020 10:43

slashmo force-pushed the feature/tracing branch from ae7268d to 329522c Compare December 8, 2020 17:06

slashmo force-pushed the feature/tracing branch from 329522c to d68cb8f Compare April 27, 2021 14:33

slashmo force-pushed the feature/tracing branch from d68cb8f to ddc5304 Compare May 25, 2021 08:44

[WIP] Implement Distributed Tracing

5e7ddf1

slashmo force-pushed the feature/tracing branch from e6e48ef to 5e7ddf1 Compare December 30, 2024 21:31

slashmo changed the base branch from tracing-development to main December 30, 2024 21:32

czechboy0 reviewed Feb 10, 2025

ktoso reviewed Feb 10, 2025

czechboy0 reviewed Feb 10, 2025
