-
-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Replace Guava cache with Caffeine cache for vertex cache #3188
Replace Guava cache with Caffeine cache for vertex cache #3188
Conversation
|
Thank you for the contribution @ministat , @clovertrail !
After that login into GitHub with
After that login into GitHub with Please, let me know if you have any problems with signing EasyCLA, and I will help you with signing it. |
7ffd610
to
e479200
Compare
@porunov Thanks for your suggestion. I have passed the CLA checked. Appreciated! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Checking the dependency tree I see that we get Caffeine 2.3.1 from TinkerPop dependency (org.apache.tinkerpop:gremlin-groovy:3.5.4
).
I'm just thinking if we should use the latest Caffeine 3.1.1 or should we stick to the one provided by TinkerPop. Do you know if there were any breaking changes? If no, then I would prefer trying out the latest Caffeine dependency 3.1.1.
UPDATE: I see in the documentation (https://github.com/ben-manes/caffeine#download) that for Java 11 we should use 3.x Caffeine but for anything else we should use 2.x Caffeine, so, I think using Caffeine 2.9.3 should be safe.
@porunov The main reason for the major version bump was that v3.x requires Java 11 or higher, whereas previously it was Java 8. This allowed the implementation code to switch from sun.misc.Unsafe to VarHandles (fully removed in 3.0.2). Otherwise the changes were minor clean ups that had little impact on users (see release notes). |
Thank you for the clarification! JanusGraph supports Java 11 but we still write everything in Java 8 to support clients on Java 8, so I think in this case it should be safe to use Caffeine 2.9.3. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The PR looks good but I have small nitpicks below.
Could you please add explicit Caffeine dependency to root pom.xml
like the next:
<dependency>
<groupId>com.github.ben-manes.caffeine</groupId>
<artifactId>caffeine</artifactId>
<version>${caffeine.version}</version>
</dependency>
Also, specify <caffeine.version>2.9.3</caffeine.version>
.
volatileVertices = new NonBlockingHashMapLong<>(initialDirtySize); | ||
log.debug("Created dirty vertex map with initial size {}", initialDirtySize); | ||
|
||
cache = Caffeine.newBuilder().maximumSize(maxCacheSize) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(not an issue, just leaving the information below)
For other reviewers who are interested why there is no concurrencyLevel
option in Caffeine, I found this great explanation by @ben-manes https://groups.google.com/g/guava-discuss/c/BnoN9q9fDgw/m/6kF6H2aLAgAJ
Moreover, notice that the concurencyLevel
here was always 1
.
...h-core/src/main/java/org/janusgraph/graphdb/transaction/vertexcache/CaffeineVertexCache.java
Outdated
Show resolved
Hide resolved
e479200
to
11f130f
Compare
Add explicit Caffeine dependency, and excludes the lower version of org.checkerframework:checker-qual from guava, otherwise, there are many dependency convergence errors like the following:
|
@porunov Thanks for you comments, I have updated my patch. |
...h-core/src/main/java/org/janusgraph/graphdb/transaction/vertexcache/CaffeineVertexCache.java
Outdated
Show resolved
Hide resolved
11f130f
to
d838d08
Compare
@ministat I fixed HBase tests issue in the latest commit in
|
@ministat it also looks like the benchmark class is often stuck in the first or second iteration for long time. I'm not sure but it could be flaky. I was waiting for 10 minutes for the second iteration to pass but it didn't finish. Sometimes it hangs on the warmup iteration. Do you know what is a correct way to run the benchmark you included? I.e.
30 minutes later - nothing changed. Iteration 1 is still running. No timeouts are thrown. Very strange. Update: I did recompile the project and the issue was somehow resolved. Below are my benchmarks.
Running Caffeine before Guava produces the same results. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
d838d08
to
f79bdea
Compare
@porunov, I have rebased per your suggestion. |
Thanks! Now to merge this PR we need to either wait for another commiter to approve this PR or wait 7 days to merge this PR with lazy consensus. |
@porunov I found the build "mvn clean package" failed in any janusgraph-xxx folder, for example, janusgraph-core.
|
After I "mvn install -DskipTests" from the root, the "mvn clean package" from janusgraph-core is successful. So, I think the previous failure is caused by referencing the wrong library. |
I see that in |
I think we should add |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your contribution! I just have some nitpicks:
@@ -319,7 +319,7 @@ public void close() { | |||
config.getVertexCacheSize(), effectiveVertexCacheSize, MIN_VERTEX_CACHE_SIZE); | |||
} | |||
|
|||
vertexCache = new GuavaVertexCache(effectiveVertexCacheSize,concurrencyLevel,config.getDirtyVertexSize()); | |||
vertexCache = new CaffeineVertexCache(effectiveVertexCacheSize,config.getDirtyVertexSize()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you also remove GuavaVertexCache
class since it's not being used anymore?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I understand it is used in the benchmark test class. I guess we could move that class into testing module then
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess we can remove that class with the benchmark test class in the follow up PRs but I'm OK either way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually I think it's better to leave the benchmark test class in this PR for historical reasons. We can move GuavaVertexCache
to test module and then later after the full transition to Caffeine cache is done we can remove those test classes with the benchmark classes. It's just good if those benchmark classes are in git history
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
later after the full transition to Caffeine cache is done we can remove those test classes with the benchmark classes
That sounds good to me!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree to leave GuavaVertexCache there. I'm ok to compose several following PRs to clean it up.
Meanwhile, I found there are some other Guava cache, like https://github.com/JanusGraph/janusgraph/blob/master/janusgraph-core/src/main/java/org/janusgraph/graphdb/transaction/subquerycache/GuavaSubqueryCache.java, and it is supposed to be replaced by Caffeine cache also, but I have not yet benchmark it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree to leave GuavaVertexCache there. I'm ok to compose several following PRs to clean it up. Meanwhile, I found there are some other Guava cache, like https://github.com/JanusGraph/janusgraph/blob/master/janusgraph-core/src/main/java/org/janusgraph/graphdb/transaction/subquerycache/GuavaSubqueryCache.java, and it is supposed to be replaced by Caffeine cache also, but I have not yet benchmark it.
There are 8 caches in JanusGraph actually and all of them need to be replaced by Caffeine (not in the current PR but in follow up PRs). I posted those places here: #871 (comment)
...h-core/src/main/java/org/janusgraph/graphdb/transaction/vertexcache/CaffeineVertexCache.java
Outdated
Show resolved
Hide resolved
...h-core/src/main/java/org/janusgraph/graphdb/transaction/vertexcache/CaffeineVertexCache.java
Outdated
Show resolved
Hide resolved
...h-core/src/main/java/org/janusgraph/graphdb/transaction/vertexcache/CaffeineVertexCache.java
Outdated
Show resolved
Hide resolved
f79bdea
to
e94d940
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Neat! I am okay with the failing codecov.
Fix the issue JanusGraph#3185 Signed-off-by: Hongjiang Zhang <[email protected]>
e94d940
to
9a22a1a
Compare
This is a follow-up commit to commit 9a22a1a Move the rest (7) caches from Guava implementation to Caffeine implementation This commit also removes a previously added Benchmark class with the obsolate GuavaVertexCache.class. The benchmark is provided in the next GitHub PR: JanusGraph#3188 which shows that Caffeine has better performance characteristics in all scenarious. ConcurencyLevel is not provided for cache implementation. The reason for this can be found by the following link https://groups.google.com/g/guava-discuss/c/BnoN9q9fDgw/m/6kF6H2aLAgAJ Related to JanusGraph#3188 and JanusGraph#3185 Partially related to JanusGraph#2033 Fixes JanusGraph#871
This is a follow-up commit to commit 9a22a1a Move the rest (7) caches from Guava implementation to Caffeine implementation This commit also removes a previously added Benchmark class with the obsolate GuavaVertexCache.class. The benchmark is provided in the next GitHub PR: JanusGraph#3188 which shows that Caffeine has better performance characteristics in all scenarious. ConcurencyLevel is not provided for cache implementation. The reason for this can be found by the following link https://groups.google.com/g/guava-discuss/c/BnoN9q9fDgw/m/6kF6H2aLAgAJ Related to JanusGraph#3188 and JanusGraph#3185 Partially related to JanusGraph#2033 Fixes JanusGraph#871 Signed-off-by: Oleksandr Porunov <[email protected]>
This is a follow-up commit to the 9a22a1a commit Migrates the rest 7 caches from Guava implementation to Caffeine implementation This commit also removes a previously added Benchmark class with the obsolate GuavaVertexCache.class. The benchmark is provided in the next GitHub PR: JanusGraph#3188 which shows that Caffeine has better performance characteristics in all scenarious. ConcurencyLevel is not provided for the cache implementation anymore. The reason for this can be found by the following link https://groups.google.com/g/guava-discuss/c/BnoN9q9fDgw/m/6kF6H2aLAgAJ Related to JanusGraph#3188 and JanusGraph#3185 Partially related to JanusGraph#2033 Fixes JanusGraph#871 Signed-off-by: Oleksandr Porunov <[email protected]>
This is a follow-up commit to the 9a22a1a commit Migrates the rest 7 caches from Guava implementation to Caffeine implementation This commit also removes a previously added Benchmark class with the obsolate GuavaVertexCache.class. The benchmark is provided in the next GitHub PR: #3188 which shows that Caffeine has better performance characteristics in all scenarious. ConcurencyLevel is not provided for the cache implementation anymore. The reason for this can be found by the following link https://groups.google.com/g/guava-discuss/c/BnoN9q9fDgw/m/6kF6H2aLAgAJ Related to #3188 and #3185 Partially related to #2033 Fixes #871 Signed-off-by: Oleksandr Porunov <[email protected]>
private static final Logger log = | ||
LoggerFactory.getLogger(CaffeineVertexCache.class); | ||
|
||
private final ConcurrentMap<Long, InternalVertex> volatileVertices; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Caffeine has special feature for avoid this extra field - ben-manes/caffeine#264
Implementation could be looks like https://gist.github.com/mad/9119f48812000a03dea16c7485c621fc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Caffeine has special feature for avoid this extra field - ben-manes/caffeine#264
Implementation could be looks like https://gist.github.com/mad/9119f48812000a03dea16c7485c621fc
That's interesting! I would prefer refactoring this class to eliminate this extra field as you proposed. That said, as it is a performance related part of the code we should probably execute benhmarking for any improvement we make here, just to make sure we don't have a regression. Other than that your proposal is much cleaner, so that's great.
Should we open a separate ticket to track this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, a related transitioning PR which replaces other Guava caches with Caffeine cache is: #3198
After that transitioning is merged I believe we need to refactor CaffeineVertexCache.java and ExpirationKCVSCache.java.
This is a follow-up commit to the 9a22a1a commit Migrates the rest 7 caches from Guava implementation to Caffeine implementation This commit also removes a previously added Benchmark class with the obsolate GuavaVertexCache.class. The benchmark is provided in the next GitHub PR: #3188 which shows that Caffeine has better performance characteristics in all scenarious. ConcurencyLevel is not provided for the cache implementation anymore. The reason for this can be found by the following link https://groups.google.com/g/guava-discuss/c/BnoN9q9fDgw/m/6kF6H2aLAgAJ Related to #3188 and #3185 Partially related to #2033 Fixes #871 Signed-off-by: Oleksandr Porunov <[email protected]>
Fix the issue #3185
Guava cache has performance bug in 100% read scenario. The solution is to use Caffeine cache which is also suggested by Guava community. See google/guava#2408 for more details. Meanwhile, Caffeine cache beats Guava cache in all scenarios.
The following result is for read scenario.
Benchmark (cacheType) Mode Cnt Score Error Units
VertexCacheBenchmark.run guava avgt 5 0.001 ± 0.001 ms/op
VertexCacheBenchmark.run caffeine avgt 5 ≈ 10⁻⁴ ms/op
Thank you for contributing to JanusGraph!
In order to streamline the review of the contribution we ask you
to ensure the following steps have been taken:
For all changes:
master
)?For code changes:
For documentation related changes: