-
Notifications
You must be signed in to change notification settings - Fork 132
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add data semantics #82
Conversation
|
||
As an example, consider the common case of a HTTP-based application server. The URL of an incoming request that the application is handling is often useful for diagnostics, as well as the HTTP verb and the resultant status code. An instrumentor could choose to report the URL in a tag named `URL`, or perhaps named `http.url`--either would be valid from the pure API perspective. But if the Tracer wishes to add intelligence, such as indexing on the URL value or sampling proactively for requests to a particular endpoint, it must know where to look for relevant data. In short, when tag names and other instrumentor-provided values are used consistently, the tracers on the other side of the API can employ more intelligence. | ||
|
||
The guidelines contained describe a common ground on which instrumentors and tracer authors can build beyond pure data collection. Adherence to the guidelines is optional but highly recommended for instrumetors. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"contained here", and typo in the last word "instrumetors"
Overall LGTM, a few minor typos, and a couple questions. I think it makes sense to include in the top level menu, under Semantic Spec link. |
|
||
### Span Naming | ||
|
||
Spans carry tags, logs, and attributes, but they also have a top-level **operation name**. This should be a low-cardinality string value representing the type of work being done in the span. Typically, this is the local type of work being done. Examples include `GET` or `POST` for a HTTP span, `SELECT` or `INSERT` for a SQL span, and so on. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/attributes/baggage/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see the need for GET
/SELECT
/etc as operation names when nothing else is available, but in an ideal world the operation name would have more semantic meaning... e.g., in an RPC-based system it would be the name of the RPC (thrift/grpc/whatever) method. Maybe that would be a better example?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, that would be a better example, especially as a guideline. In practice, when implicit instrumentation is used, it's nearly impossible to come up with business-specific names, but if an RPC framework itself is instrumented, it often does know the name of the endpoint it is calling.
We had a situation where we had a thick client for a storage service, which was communicating to the server via http. The implicit instrumentation picked the default get/post
names at the http level spans, but the explicit instrumentation in the thick client was wrapping them into better-named spans.
Looking great! I just made a bunch of nits. |
(@dkuebric friendly ping on this) |
@yurishkuro @bensigelman thanks for all the thoughtful comments (and close reading). I've taken most of your suggestions directly. I'm hoping this is mergeable right now, as I've heard some others had changes they wanted to start merging in. Here's my thoughts on some of the less obvious changes I made:
|
@@ -32,7 +32,7 @@ Span structure is also important: what do spans represent, and what relationship | |||
|
|||
## Span Tag Use-Cases | |||
|
|||
The following tags are recommended for instrumentors who are trying to represent a particular type of data. Tag names follow a general structure of namespacing; the values are determined purely on a tag-by-tag basis. | |||
The following tags are recommended for instrumentors who are trying to represent a particular type of data. Tag names follow a general structure of namespacing. | |||
|
|||
The recommended tags below are accompanied by `const` values included in an `ext` module for each opentracing implementation. These `ext` values should be used in place of the strings below, as tracers may choose to use different underlying representations for these common concepts. The symbols are similar in each implementation: ([Go](https://github.com/opentracing/opentracing-go/blob/master/ext/tags.go), [Python](https://github.com/opentracing/opentracing-python/blob/master/opentracing/ext/tags.py), etc.) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just FYI (i.e., no need to change this PR), #76 seems like a very wise idea and – once realized – would be something to reference here... a canonical YAML file and codegen seems like the only sane way to go in any case.
LGTM
Already there, but as c/s. We can change that, since there tags are normally only used in-memory. https://github.com/opentracing/opentracing-go/blob/master/ext/tags.go#L43 |
(LGTM as well) |
### Exception | ||
|
||
* Event: `exception` | ||
* Payload - map |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think map
is too language specific (i.e. python). If this was Java/JavaScript, I would simply do span.log("exception", exc)
and let the tracer figure out how to record it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree that languages would benefit from some convenience methods that automatically extract features from commonly-used error/exception objects. However, would that be implemented via a log_event overload, or via a log_error or similar?
I suppose secretly I've been assuming that payloads should are maps/arrays/objects in the way that spans are. (An event might have an arbitrary set of data to report with it.) If not, instrumentors are going to have to figure out what types of payloads might be valid for various tracer implementations, I guess?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
However, would that be implemented via a log_event overload, or via a log_error or similar?
No, I was thinking just plain old log
method: span.log("exception", exc)
If not, instrumentors are going to have to figure out what types of payloads might be valid for various tracer implementations, I guess?
If we were shooting for maps, they I would suggest changing the API to take a map rather than interface{}
/ Object
. But it feels that people would question this choice; after all, an instance of Exception is a natural representation of the exception, why make more work for people to break it apart into a map?
We have stayed away from defining semantics of the payload so far. It's definitely something that we need to collectively address and be more prescriptive, to clarify the contract between instrumentors and implementors. But I would suggest a different issue/PR for that. Until then, an opaque Object
is the least restrictive form.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW, I have generally thought of the payload
first and foremost as a place for the OpenTracing caller to stick "FYI"-level structured data that might help explain or contextualize the trace.
@dkuebric can you merge this soon? |
@bensigelman Made changes per discussions of other week. I don't have write access but I think the PR is ready, pending a quick review. |
@dkuebric oh dear, sorry! I'll review and likely merge shortly. |
👍 |
This is a cleaned-up version of #61 to continue the path towards merge, after discussion there and in #75.
I haven't incorporated the auto-generation suggested in #76 yet but that would be a possible next step.
One question in my mind is whether this document should get listed in the top-level nav? For now, I've left it out and referenced from the
Log
andSpan
sections ofspec.md
.