[feat] Store traces in ClickHouse based on Jaeger V2 #6725
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #6725 +/- ##
==========================================
- Coverage 96.05% 95.91% -0.14%
==========================================
Files 366 376 +10
Lines 20750 21370 +620
==========================================
+ Hits 19932 20498 +566
- Misses 624 667 +43
- Partials 194 205 +11
@yurishkuro The ClickHouse integration test is not working in CI. Is there anything I might have missed?
I would like to implement all basic features, ensure that the integration tests pass, and then return to address other low-priority tasks.
Before moving code around, I suggest you read and understand the comments, then propose a new directory structure that we can agree on. This will reduce the churn.
@yurishkuro The key here is the design of the schema:
#auto run DDL script
auto: true
client:
database: jaeger
username: default
password: default
#ch-go
writer:
address: "127.0.0.1:9200"
pool:
max_connection_lifetime: 3600000000000
max_connection_idle_time: 1800000000000
#CPU Core number
min_connections: 4
#CPU Core number * 2
max_connections: 8
health_check_period: 60000000000
#clickhouse-go
reader:
#no cluster just a different field here.
addresses: ["node00:9200","node01:9200","node02:9200"]
The directory structure has been adjusted as follows:
@yurishkuro In the implementation of
Code pointer?
jaeger/internal/storage/v1/cassandra/spanstore/reader.go, lines 150 to 154 in 84212d2
And only gRPC really uses them all: jaeger/internal/storage/v1/grpc/shared/grpc_client.go, lines 72 to 76 in 84212d2
Also, do you think the new code structure I proposed is suitable?
The timestamps in the GetTraces request were introduced at the request of third-party implementations (e.g. Tempo). None of the internally supported backends need these parameters because of how their DB schemas are organized, but it was expected that the timestamps could be useful for ClickHouse.
PR Overview
This PR introduces ClickHouse as a storage backend for Jaeger traces, including a new clickhouse client implementation, test infrastructure, and CI configuration.
- Adds GitHub Actions workflows and docker-compose configurations for ClickHouse e2e tests.
- Implements new client, connection, and pool configurations along with generated mocks and their tests.
- Updates integration tests to support ClickHouse alongside existing storage backends.
Reviewed Changes
File | Description |
---|---|
.github/workflows/ci-e2e-clickhouse.yml | Adds CI workflow for running ClickHouse integration tests; note a potential variable reference error in the job name. |
internal/storage/v2/clickhouse/client/mocks/Conn.go | Generated mock for connection interface; no issues found. |
internal/storage/v2/clickhouse/client/mocks/Rows.go | Generated mock for rows interface; no issues found. |
internal/storage/integration/clickhouse_test.go | Integration tests for ClickHouse storage functionality. |
internal/storage/v2/clickhouse/client/pool/config_test.go | Tests for default pool configuration; variable naming typo observed. |
internal/storage/v2/clickhouse/client/conn/config_test.go | Tests for default connection configuration; variable naming typo observed. |
internal/storage/v2/clickhouse/config/config.go | Configuration for ClickHouse storage validated and aligned with new client components. |
internal/storage/v2/clickhouse/factory.go | Factory initialization for trace writer creation and resource cleanup. |
docker-compose/clickhouse/docker-compose.yml | Docker compose file to set up ClickHouse for local/integration testing. |
internal/storage/v2/clickhouse/client/conn/config.go | Implementation of connection configuration using the ClickHouse driver. |
internal/storage/v2/clickhouse/client/pool/config.go | Pool configuration implementation using the ClickHouse pool driver. |
.mockery.yaml | Updated to generate mocks for new client interfaces. |
internal/storage/v2/clickhouse/client/client.go | Defines client interfaces for connection, pool, and rows. |
internal/storage/integration/package_test.go | Enhancements in leak testing to handle multiple storage backends. |
.github/workflows/ci-e2e-all.yml | Updated CI pipeline to include ClickHouse integration tests. |
Copilot reviewed 33 out of 33 changed files in this pull request and generated 3 comments.
The current implementation has numerous design flaws. My plan is to first implement the basic functionality: saving traces to the backend, retrieving them, and verifying them through integration testing. Then, I will enhance the implementation and add supplementary test cases.
@yurishkuro I have refactored the code structure and implemented basic functions for writing and reading traces, as you suggested.
fail-fast: false
matrix:
  clickhouse-version: ["25.x"]
  create-schema: [manual, auto]
If we can create schema automatically why do we need to support manual?
Yes, we should use create-schema: [auto] only.
	span.References = []model.SpanRef{}
}
if span.Tags == nil {
	span.Tags = []model.KeyValue{}
Please create a separate PR for this change; no need to bundle.
Got it, please see #6798.
Actually, if I don’t make this change, the test will fail because it compares an empty struct with nil in the assertion. However, after I rebased the branch, this issue seems to be resolved.
id: test-execution
run: bash scripts/e2e/clickhouse.sh ${{ matrix.clickhouse-version }}-${{ matrix.create-schema }}
env:
  SKIP_APPLY_SCHEMA: ${{ matrix.create-schema == 'auto' && true || false }}
Should we simply use SKIP_APPLY_SCHEMA: false here?
fail-fast: false
matrix:
  clickhouse-version: ["25.x"]
  create-schema: [auto]
I meant: why have the parameter in the first place? You seem to have copied the Cassandra behavior with manual and auto schema init, but that was a legacy state. If we can always auto-create the schema, we don't need to support manual schema creation at all.
I have updated it and added more unit tests.
Signed-off-by: zzzk1 <[email protected]> Signed-off-by: zhengkezhou1 <[email protected]>
Signed-off-by: zzzk1 <[email protected]> Signed-off-by: zhengkezhou1 <[email protected]>
Signed-off-by: zzzk1 <[email protected]> Signed-off-by: zhengkezhou1 <[email protected]>
Signed-off-by: zhengkezhou1 [email protected] Signed-off-by: zhengkezhou1 <[email protected]>
@yurishkuro Can you please take a look at my new improvements and ideas?
type Model struct {
	Timestamp                time.Time
	TraceId                  string
	SpanId                   string
	ParentSpanId             string
	TraceState               string
	SpanName                 string
	SpanKind                 string
	ServiceName              string
	ResourceAttributesKeys   []string `ch:"ResourceAttributes.keys"`
	ResourceAttributesValues []string `ch:"ResourceAttributes.values"`
	ScopeName                string
	ScopeVersion             string
	SpanAttributesKeys       []string `ch:"SpanAttributes.keys"`
	SpanAttributesValues     []string `ch:"SpanAttributes.values"`
	Duration                 uint64
	StatusCode               string
	StatusMessage            string
	EventsTimestamp          []time.Time         `ch:"Events.Timestamp"`
	EventsName               []string            `ch:"Events.Name"`
	EventsAttributes         []map[string]string `ch:"Events.Attributes"`
	LinksTraceId             []string            `ch:"Links.TraceId"`
	LinksSpanId              []string            `ch:"Links.SpanId"`
	LinksTraceState          []string            `ch:"Links.TraceState"`
	LinksAttributes          []map[string]string `ch:"Links.Attributes"`
}
Should they be split as follows?
type Model struct {
Trace Trace
Span Span
Scope Scope
Events Events
Links Links
}
type Trace struct {
Timestamp time.Time
Id string
State string
ServiceName string
Duration uint64
}
type Span struct {
Id string
ParentId string
Name string
Kind string
AttributesKeys []string `ch:"SpanAttributes.keys"`
AttributesValues []string `ch:"SpanAttributes.values"`
StatusCode string
StatusMessage string
}
type Scope struct {
Name string
Version string
ResourceAttributesKeys []string `ch:"ResourceAttributes.keys"`
ResourceAttributesValues []string `ch:"ResourceAttributes.values"`
}
type Events struct {
Timestamp []time.Time `ch:"Events.Timestamp"`
Name []string `ch:"Events.Name"`
Attributes []map[string]string `ch:"Events.Attributes"`
}
type Links struct {
TraceId []string `ch:"Links.TraceId"`
SpanId []string `ch:"Links.SpanId"`
TraceState []string `ch:"Links.TraceState"`
Attributes []map[string]string `ch:"Links.Attributes"`
}
}

// ConvertToTraces convert the db model read from clickhouse to OTel Traces.
func (m Model) ConvertToTraces() (ptrace.Traces, error) {
Maybe we can perform the conversion based on the split structure above.
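As a rough illustration of per-struct conversion, each sub-struct could own the logic for its columns. The sketch below covers only the span attributes, which the model stores as parallel key and value arrays; `zipAttributes` and the `Attributes` method are hypothetical names, and a real implementation would write into the OTel pdata types rather than return a plain map.

```go
package main

import "fmt"

// Span mirrors the attribute fields of the proposed Span sub-struct.
type Span struct {
	AttributesKeys   []string
	AttributesValues []string
}

// zipAttributes pairs keys[i] with values[i]. It rejects arrays of different
// lengths instead of silently dropping or misaligning data.
func zipAttributes(keys, values []string) (map[string]string, error) {
	if len(keys) != len(values) {
		return nil, fmt.Errorf("attribute keys/values length mismatch: %d != %d", len(keys), len(values))
	}
	attrs := make(map[string]string, len(keys))
	for i, k := range keys {
		attrs[k] = values[i]
	}
	return attrs, nil
}

// Attributes is a hypothetical per-struct conversion step: the Span type
// converts its own columns, so Model.ConvertToTraces only composes results.
func (s Span) Attributes() (map[string]string, error) {
	return zipAttributes(s.AttributesKeys, s.AttributesValues)
}
```

The same helper would apply to the resource attributes on `Scope`, which keeps the length check in one place.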
	"go.opentelemetry.io/collector/pdata/ptrace"
)

func TestConvertToTraces(t *testing.T) {
Table-driven tests should be useful here.
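For reference, a table-driven version of one of these tests might look like the sketch below. The `spanKind` enum, `spanKindFromString`, and the stored string values ("Internal", "Server", "Client") are stand-ins invented for this example so that it has no external dependencies; the PR's real tests would use the `ptrace` types.

```go
package main

import "testing"

// spanKind is a local stand-in for the OTel span kind type.
type spanKind int

const (
	kindUnspecified spanKind = iota
	kindInternal
	kindServer
	kindClient
)

// spanKindFromString maps the stored string to an enum value; unknown input
// falls back to kindUnspecified. The string values are assumptions.
func spanKindFromString(s string) spanKind {
	switch s {
	case "Internal":
		return kindInternal
	case "Server":
		return kindServer
	case "Client":
		return kindClient
	default:
		return kindUnspecified
	}
}

// spanKindCases is the table: one row per scenario.
var spanKindCases = []struct {
	name  string
	input string
	want  spanKind
}{
	{name: "internal", input: "Internal", want: kindInternal},
	{name: "server", input: "Server", want: kindServer},
	{name: "client", input: "Client", want: kindClient},
	{name: "unknown falls back", input: "bogus", want: kindUnspecified},
	{name: "empty falls back", input: "", want: kindUnspecified},
}

func TestSpanKind(t *testing.T) {
	for _, tc := range spanKindCases {
		t.Run(tc.name, func(t *testing.T) {
			if got := spanKindFromString(tc.input); got != tc.want {
				t.Errorf("spanKindFromString(%q) = %v, want %v", tc.input, got, tc.want)
			}
		})
	}
}
```

Adding a scenario then means adding one row to the table, and each row shows up as its own named subtest in the output.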
	})
}

func TestConvertLink(t *testing.T) {
same.
	})
}

func TestStatusCode(t *testing.T) {
same.
	})
}

func TestSpanKind(t *testing.T) {
same.
Which problem is this PR solving?
Design doc: Jaeger V2: Support for ClickHouse as Storage Backend
Part of #5058
Description of the changes
How was this change tested?
Checklist
jaeger: make lint test
jaeger-ui: npm run lint and npm run test