Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Explicit histogram boundaries do not match the otel spec #3669

Closed
artdent opened this issue Mar 13, 2023 · 14 comments · Fixed by #3893
Closed

Explicit histogram boundaries do not match the otel spec #3669

artdent opened this issue Mar 13, 2023 · 14 comments · Fixed by #3893
Assignees
Labels
bug Something isn't working good first issue Good for newcomers priority:p4 Bugs and spec inconsistencies which do not fall into a higher prioritization spec-noncompliant An existing feature incorrectly or incompletely implements the OTel spec. May or may not be a bug

Comments

@artdent
Copy link

artdent commented Mar 13, 2023

What happened?

Steps to Reproduce

Initialize sdk-metrics and create a histogram with the default settings. Observe the chosen histogram buckets.

Expected Result

The chosen boundaries should match the metrics spec:

The Default Value represents the following buckets (heavily influenced by the default buckets of Prometheus clients, e.g. Java and Go):
(-∞, 0], (0, 5.0], (5.0, 10.0], (10.0, 25.0], (25.0, 50.0], (50.0, 75.0], (75.0, 100.0], (100.0, 250.0], (250.0, 500.0], (500.0, 750.0], (750.0, 1000.0], (1000.0, 2500.0], (2500.0, 5000.0], (5000.0, 7500.0], (7500.0, 10000.0], (10000.0, +∞). SDKs SHOULD use the default value when boundaries are not explicitly provided, unless they have good reasons to use something different (e.g. for backward compatibility reasons in a stable SDK release).

Specifically, the default boundaries should be [0, 5, 10, 25, 50, 75, 100, 250, 500, 750, 1000, 2500, 5000, 7500, 10000].

Actual Result

The default boundaries in the node metrics sdk instead are [0, 5, 10, 25, 50, 75, 100, 250, 500, 750, 1000], which is not as useful for e.g. bucketing slow RPCs.

Additional Details

It looks like the defaults are configured here in the sdk-metrics package.

OpenTelemetry Setup Code

No response

package.json

{
  "dependencies": {
    "@opentelemetry/api": "1.3.0",
    "@opentelemetry/sdk-node": "0.34.0",
    "@opentelemetry/sdk-trace-base": "1.8.0",
  }
}

Relevant log output

No response

@artdent artdent added bug Something isn't working triage labels Mar 13, 2023
@pichlermarc pichlermarc added spec-noncompliant An existing feature incorrectly or incompletely implements the OTel spec. May or may not be a bug priority:p4 Bugs and spec inconsistencies which do not fall into a higher prioritization good first issue Good for newcomers up-for-grabs Good for taking. Extra help will be provided by maintainers and removed triage labels Mar 14, 2023
@pichlermarc
Copy link
Member

Thanks for reaching out, looks like this spec change flew under the radar for us. 😓

AFAIK adding these buckets is not a breaking change, and other SDKs made this change after GA as well, so adding these should be safe. 🙂

@Bharath-Ganesh
Copy link

@pichlermarc Can I take this up? I'm reading the documentation to get a better understanding.
I'll revert back to you with my findings.

@pichlermarc
Copy link
Member

@pichlermarc Can I take this up? I'm reading the documentation to get a better understanding. I'll revert back to you with my findings.

It's yours. 🎉 Thanks for picking this up. 🙂
If you have any questions, feel free to reach out here or on the CNCF Slack 🙂

@Bharath-Ganesh
Copy link

Great, I'll do that.
By using the default settings, I'm attempting to make a histogram.
https://opentelemetry.io/docs/reference/specification/metrics/api/#histogram-creation

While attempting to execute the npm test, I encountered a problem.
If I can't fix the problem below, I'll consult with the Slack team.
image

@pichlermarc
Copy link
Member

@Bharath-Ganesh were you able to resolve the issue? 🤔
I did spin up a Windows VM and installed Node 14 and it looks like there seem to be some issues indeed, however I could not reproduce the exact problem from your screenshot.

Could you try if the same problem persists with Node 16? 🤔 Maybe there's a problem with some of our tooling.

@Bharath-Ganesh
Copy link

I tried to spin up a Ubuntu VM to run the test, and it's still failing.
node version: 18
image

@pichlermarc
Copy link
Member

Interesting, could you include the actual error message in the logs? 🤔
With node 18, you'll need to run the tests using NODE_OPTIONS=--openssl-legacy-provider as we're using an older version of webpack that needs this flag.

@pichlermarc pichlermarc removed the up-for-grabs Good for taking. Extra help will be provided by maintainers label Apr 5, 2023
@ashutosh887
Copy link

I would like to work on this @pichlermarc @artdent

@pichlermarc
Copy link
Member

@ashutosh887 I wonder if @Bharath-Ganesh is still working on this, it may be worth syncing up with them to avoid some duplicate work. Feel free to open a PR then. 🙂

@Bharath-Ganesh
Copy link

@ashutosh887 I'm working on this. I got caught up with some work, sorry for the delay.

@ashutosh887
Copy link

Thanks @pichlermarc
Sure @Bharath-Ganesh let me know if you need any help in this

@ashutosh887
Copy link

Hey @Bharath-Ganesh are you still working on this?

cc: @pichlermarc

@Bharath-Ganesh
Copy link

@ashutosh887 I'm pretty swamped. You could take this up.

@ashutosh887
Copy link

Thanks @Bharath-Ganesh
@pichlermarc Please assign this issue to me

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working good first issue Good for newcomers priority:p4 Bugs and spec inconsistencies which do not fall into a higher prioritization spec-noncompliant An existing feature incorrectly or incompletely implements the OTel spec. May or may not be a bug
Projects
None yet
4 participants