-
Notifications
You must be signed in to change notification settings - Fork 122
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Report watcher metrics related to the latest revision #1094
base: main
Are you sure you want to change the base?
Conversation
client/java/src/main/java/com/linecorp/centraldogma/client/AbstractCentralDogmaBuilder.java
Show resolved
Hide resolved
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1094 +/- ##
============================================
+ Coverage 70.07% 70.12% +0.05%
- Complexity 4486 4496 +10
============================================
Files 453 453
Lines 18161 18212 +51
Branches 2008 2015 +7
============================================
+ Hits 12727 12772 +45
- Misses 4345 4347 +2
- Partials 1089 1093 +4 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
...ria/src/main/java/com/linecorp/centraldogma/internal/client/armeria/ArmeriaCentralDogma.java
Show resolved
Hide resolved
client/java/src/main/java/com/linecorp/centraldogma/client/AbstractCentralDogmaBuilder.java
Show resolved
Hide resolved
client/java/src/main/java/com/linecorp/centraldogma/client/AbstractCentralDogmaBuilder.java
Show resolved
Hide resolved
private static final String LATEST_REVISION_METER_NAME = "centraldogma.watcher.latest.revision"; | ||
private static final String LATEST_COMMIT_TIME_METER_NAME = "centraldogma.watcher.latest.commit.time"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
private static final String LATEST_REVISION_METER_NAME = "centraldogma.watcher.latest.revision"; | |
private static final String LATEST_COMMIT_TIME_METER_NAME = "centraldogma.watcher.latest.commit.time"; | |
private static final String LATEST_REVISION_METER_NAME = "centraldogma.client.watcher.latest.revision"; | |
private static final String LATEST_COMMIT_TIME_METER_NAME = "centraldogma.client.watcher.latest.commit.time"; |
Some people may want to use a different prefix and add custom tags for their metrics.
What do you think of adding a customization point for that?
I realized that the centraldogma-client
module has no dependency on Armeria. Actually, we have decided to add Armeria dependency to centraldogma-client
but didn't put it in action.
To conclude, let's add Armeria dependency and use MeterIdPrefix
, which I proposed in the issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also wrote this in the README, but since this is a relatively unique metric, I couldn’t think of many use cases for customization at this point.
I considered using this, but copying and pasting it didn’t seem ideal, so I decided not to for now.
(That said, it's also not a good idea for centraldogma-client to depend on Armeria.)
MeterIdPrefix in Armeria
And if we want to rename or retag metrics, we can transform them using a meter filter.
Micrometer Meter Filters Documentation
...How about?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. How about just adding an infix .client
? I want to distinguish it from the server metric. We may add additional metrics under centraldogma.client
namespace.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I overlooked the client infix remark. You're absolutely right. I'll fix it!
( Hmm... For some reason, the CI is failing on Windows... Why...? |
@@ -124,6 +150,13 @@ void start() { | |||
if (state.compareAndSet(State.INIT, State.STARTED)) { | |||
scheduleWatch(0); | |||
} | |||
if (meterRegistry != null) { | |||
// emit metrics once the values are ready | |||
initialValueFuture.thenAccept(latest -> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question) What do you think of just watching this.latest.revision()
while setting -1 if not set?
I think the current approach introduces the additional complexity of maintaining an additional state.
e.g. Assuming:
val watcher = dogma.watcher()...
watcher.start()
watcher.close()
// `initialValueFuture` is completed after the close and a gauge is registered
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also struggled with this.
Since I judged that it wouldn't become too complex, I wanted to provide as accurate metrics as possible, which is why I implemented it this way.
However, if the Central Dogma team determines that it would make maintenance more complex, I think it's fine to set it to -1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternatively, Double.NaN
could also be used
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you prefer precise metrics, should we use a lock to register metrics, close the watcher and remove metrics?
- The lock will solve the problem that jrhee17 mentioned.
- The metric registration could be placed in one method.
ReentrantLock lock = ...
if (initialValueFuture.complete(newLatest)) {
updateWatchMetrics();
}
void updateWatchMetrics() {
lock.lock();
if (isStopped()) {
return;
}
if (this.latest == null) {
// register two metrics
}
// update the last received time
lock.unlock();
}
void close() {
lock.lock();
....
lock.unlock();
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
...Thinking about this topic again, the -1
seems fine.
if we don't care about -1
, we can simply write ... >= 0
in PromQL.
(I prefer -1
because NaN has some quirks in handling.)
"pathPattern", pathPattern, | ||
// There is a possibility of using the same watcher for the same project, repo, and pathPattern. | ||
// Therefore, the watcher’s hash code should be included as a tag. | ||
"watcher_hash", String.valueOf(System.identityHashCode(this)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Optional) What do you think of assigning an ID instead of a hashcode on the off-chance there is a conflict? It can also help users check how many watchers there are in a JVM easily.
e.g.
private static final AtomicLong WATCHER_ID = new AtomicLong();
...
"watcher_id", String.valueOf(WATCHER_ID.incrementAndGet())
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, that sounds good too. If others prefer this approach, I’ll consider changing it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about allowing users to set the name of a Watcher
? If it is absent, we may create a name automatically.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I think adding a method that allows passing a name would be a good approach.
Reference
...Personally, I feel that cases where one explicitly wants to specify a name (i.e., creating multiple watchers with the same project, repo, and path) are rare.
So, I prefer implementing it with @jrhee17 nim approach first, and if there is demand, we can add support for explicitly specifying a name.
// noinspection ConstantValue | ||
if (latestCommit.getAndSet(commit) == null) { | ||
// emit metrics once the values are ready | ||
meterRegistry.gauge(LATEST_COMMIT_TIME_METER_NAME, tags, latestCommit, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question) I understand the usefulness of last commit time since it may be more convenient to check when a file was last committed as opposed to checking the revision numbers.
Having said this, this commit time may not accurate represent the current watched object since it is done in a separate IO.
What do you think of handling commit time separately by embedding commit time information in the watch response? This will involve server-side changes as well, so it may be better to handle in a separate PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you think of handling commit time separately by embedding commit time information in the watch response? This will involve server-side changes as well, so it may be better to handle in a separate PR.
Yes, I actually wanted this as well. However, since the API interface on the Central Dogma server side will also change, there seem to be many things that need to be discussed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn’t it more important when the client receives a new revision for a watch?
If you think the commit time recorded in the server is meaningful, I prefer @jrhee17's approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahhh, now that you mention it, that makes sense.
It seems better to simply record the value at the time the watcher receives it, without calling the get history API.
I'll make that change. It should be much simpler.
e.g. centraldogma.client.watcher.latest.received.time
I have fixed it. |
PTAL |
|
||
if (latestReceivedTime.getAndSet(Instant.now().getEpochSecond()) == 0) { | ||
// emit metrics once the values are ready | ||
meterRegistry.gauge(LATEST_RECEIVED_TIME_METER_NAME, tags, latestReceivedTime, AtomicLong::get); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question) Will this actually register latestReceivedTime
subsequently after the first registration since the Meter.Id
is already registered? Would it make more sense to register the gauge once and define the update function to fetch AbstractWatcher#latestReceivedTime
instead?
@@ -124,6 +150,13 @@ void start() { | |||
if (state.compareAndSet(State.INIT, State.STARTED)) { | |||
scheduleWatch(0); | |||
} | |||
if (meterRegistry != null) { | |||
// emit metrics once the values are ready | |||
initialValueFuture.thenAccept(latest -> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternatively, Double.NaN
could also be used
client/java/src/main/java/com/linecorp/centraldogma/client/AbstractWatcher.java
Show resolved
Hide resolved
client/java/src/main/java/com/linecorp/centraldogma/client/AbstractWatcher.java
Outdated
Show resolved
Hide resolved
@@ -124,6 +150,13 @@ void start() { | |||
if (state.compareAndSet(State.INIT, State.STARTED)) { | |||
scheduleWatch(0); | |||
} | |||
if (meterRegistry != null) { | |||
// emit metrics once the values are ready | |||
initialValueFuture.thenAccept(latest -> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you prefer precise metrics, should we use a lock to register metrics, close the watcher and remove metrics?
- The lock will solve the problem that jrhee17 mentioned.
- The metric registration could be placed in one method.
ReentrantLock lock = ...
if (initialValueFuture.complete(newLatest)) {
updateWatchMetrics();
}
void updateWatchMetrics() {
lock.lock();
if (isStopped()) {
return;
}
if (this.latest == null) {
// register two metrics
}
// update the last received time
lock.unlock();
}
void close() {
lock.lock();
....
lock.unlock();
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! 🙏🚀
@@ -247,6 +296,7 @@ private void doWatch(int numAttemptsSoFar) { | |||
logger.debug("watcher noticed updated file {}/{}{}: rev={}", | |||
projectName, repositoryName, pathPattern, newLatest.revision()); | |||
notifyListeners(newLatest); | |||
latestReceivedTimeSeconds.set(Instant.now().getEpochSecond()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Should we use volatile long
instead? For now, we don't get any advantage from AtomicLong
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, that's right.
watcher -> Optional.ofNullable(watcher.latest) | ||
.map(it -> it.revision().major()) | ||
.orElse(-1)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Micro optimization) Should we use if ( != null)
instead because we prefer to minimize object allocations even though it is trivial?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, fixed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great. 👍 Left small suggestions. 😉
if (watcher.latest == null) { | ||
return -1; | ||
} else { | ||
return watcher.latest.revision().major(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit:
if (watcher.latest == null) { | |
return -1; | |
} else { | |
return watcher.latest.revision().major(); | |
final Latest<T> latest = watcher.latest; | |
if (latest == null) { | |
return -1; | |
} else { | |
return latest.revision().major(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I see. In that case, it’s good since it also removes the warning in the IDE.
fixed: 1d7dd3b
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, it also doen't access the volatile field twice. 😉
@@ -142,6 +188,12 @@ public void close() { | |||
if (currentWatchFuture != null && !currentWatchFuture.isDone()) { | |||
currentWatchFuture.cancel(false); | |||
} | |||
|
|||
if (meterRegistry != null) { | |||
meterRegistry.remove(new Id(LATEST_REVISION_METER_NAME, tags, null, null, Type.GAUGE)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if it's a good idea to remove this gauge since it takes long to remove it when there are a lot of meters in the map.
Anyway, how about creating the gauge in start()
method and remove it here?
this.gauge = Gauge.builder(LATEST_REVISION_METER_NAME, this, watcher -> {
final Latest<T> latest = watcher.latest;
if (latest == null) {
return -1;
} else {
return latest.revision().major();
}
}).tags(tags).register(meterRegistry);
...
meterRegistry.remove(gauge);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove this gauge since it takes long to remove
The performance regression has been resolved, so removal operations will no longer take a long time. micrometer-metrics/micrometer#5750 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to be sure, I checked the implementation, and it seems that using remove(Meter meter)
results in the same behavior.
https://github.com/micrometer-metrics/micrometer/blob/9b1d4042579ec0691c7b235478e7fe50ce5d8eab/micrometer-core/src/main/java/io/micrometer/core/instrument/MeterRegistry.java#L748
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The performance regression has been resolved, so removal operations will no longer take a long time. micrometer-metrics/micrometer#5750 (comment)
Thanks for the link. I didn't know they fixed it. 😉
it seems that using remove(Meter meter) results in the same behavior.
Yeah, it simply delegates the call to the remove
method that takes a meter.
I left this comment while considering a scenario where watcher.close()
is called without calling watcher.start()
.
In that case, it would attempt to remove unregistered meters.
I understand this is unlikely in most cases, but I wanted to clarify it since modifying the code wouldn’t be difficult. 😉
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot for adding this nice metric, @be-hase! 😉
close #1092
Additional note
I can't think of a use case for customizing the meter name prefix, so I won't support it at this time.
(In the case of Armeria, there are occasional use cases when using both gRPC and Thrift simultaneously, though...)
Tasks