Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Misson Control Store: Improve performance #8549

Merged
merged 5 commits into from
Jun 19, 2024

Conversation

matheusd
Copy link
Contributor

The current implementation of the Misson Control Store has a one second ticker that trims the DB, ensuring only the latest 1000 entries are kept.

That ticker was added in #5515 as a way to improve performance by batching and de-duplicating the updates to the db.

However, its current implementation can be significantly improved by doing a few pointed changes. In particular:

  1. There is no need to open a db tx when there is no work.
  2. The ticker does not need to run continuously, only when there are items in the queue that need to be persisted.
  3. The current implementation duplicates the entire in-memory cache of the store, which is not ideal when there are only a few entries that need to be added

This PR improves the situation iteratively on each commit. A set of benchmarks is added to verify the performance improvement.

This is the final result (using a bbolt MC store):

MissionControlStoreFlushing/no_additional_results-7      221µs ±15%       0µs ± 2%   -99.86%  (p=0.000 n=10+10)
MissionControlStoreFlushing/one_additional_result-7      359µs ±40%      44µs ± 8%   -87.87%  (p=0.000 n=10+10)
MissionControlStoreFlushing/ten_additional_results-7     344µs ± 9%     129µs ±29%   -62.61%  (p=0.000 n=9+10)
MissionControlStoreFlushing/100_additional_results-7     987µs ±28%     621µs ±22%   -37.06%  (p=0.000 n=10+10)
MissionControlStoreFlushing/250_additional_results-7    2.02ms ±37%    1.53ms ±41%   -24.62%  (p=0.005 n=10+10)
MissionControlStoreFlushing/500_additional_results-7    2.88ms ±11%    2.84ms ±18%      ~     (p=0.780 n=9+10)

name                                                  old alloc/op   new alloc/op   delta
MissionControlStoreFlushing/no_additional_results-7      139kB ± 0%       0kB       -100.00%  (p=0.000 n=10+10)
MissionControlStoreFlushing/one_additional_result-7      153kB ± 0%      19kB ± 0%   -87.37%  (p=0.000 n=10+10)
MissionControlStoreFlushing/ten_additional_results-7     175kB ± 0%      41kB ± 0%   -76.35%  (p=0.000 n=8+8)
MissionControlStoreFlushing/100_additional_results-7     350kB ± 0%     222kB ± 0%   -36.63%  (p=0.000 n=10+10)
MissionControlStoreFlushing/250_additional_results-7     651kB ± 0%     535kB ± 0%   -17.75%  (p=0.000 n=8+10)
MissionControlStoreFlushing/500_additional_results-7    1.14MB ± 0%    1.04MB ± 0%    -8.57%  (p=0.000 n=10+10)

name                                                  old allocs/op  new allocs/op  delta
MissionControlStoreFlushing/no_additional_results-7      1.06k ± 0%     0.00k       -100.00%  (p=0.000 n=10+10)
MissionControlStoreFlushing/one_additional_result-7      1.15k ± 0%     0.12k ± 0%   -90.02%  (p=0.000 n=10+10)
MissionControlStoreFlushing/ten_additional_results-7     1.49k ± 0%     0.46k ± 0%   -69.36%  (p=0.000 n=10+10)
MissionControlStoreFlushing/100_additional_results-7     4.90k ± 0%     3.87k ± 0%   -21.07%  (p=0.000 n=10+10)
MissionControlStoreFlushing/250_additional_results-7     10.6k ± 0%      9.6k ± 0%    -9.75%  (p=0.000 n=10+10)
MissionControlStoreFlushing/500_additional_results-7     20.1k ± 0%     19.1k ± 0%    -5.15%  (p=0.000 n=9+10)

As can be seen, there is a significant improvement in cpu and memory usage when up to 100 additional results are added (i.e. up to 100 txs/second flowing through the node).

The improvement is specially significant when no new results are added to a full MC store, which is relevant for nodes that have a low or only episodic volument of payments (e.g. a mostly end user node or mobile node).

Copy link
Contributor

coderabbitai bot commented Mar 13, 2024

Important

Review skipped

Auto reviews are limited to specific labels.

Labels to auto review (1)
  • llm-review

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

Share
Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai generate interesting stats about this repository and render them as a table.
    • @coderabbitai show all the console.log statements in this repository.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (invoked as PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Additionally, you can add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.

CodeRabbit Configration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@guggero guggero self-requested a review March 14, 2024 07:42
@matheusd matheusd force-pushed the mc-store-perf-improv branch from 6bf04bf to 38ba47d Compare March 14, 2024 12:04
@matheusd
Copy link
Contributor Author

Going through the lint/CI failures now, might need some refactoring to deal with the type checks required from using non-generic *list.List

@matheusd matheusd force-pushed the mc-store-perf-improv branch 2 times, most recently from ab6590f to 3ee9bfa Compare March 14, 2024 14:08
@matheusd
Copy link
Contributor Author

Lint issues are fixed, I believe the other CI failures are just flakes.

Copy link
Collaborator

@guggero guggero left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR. Did a first round, have a couple of questions.

go.mod Outdated
@@ -4,6 +4,7 @@ require (
github.com/NebulousLabs/go-upnp v0.0.0-20180202185039-29b680b06c82
github.com/Yawning/aez v0.0.0-20211027044916-e49e68abd344
github.com/andybalholm/brotli v1.0.3
github.com/bahlo/generic-list-go v0.2.0
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we'll want to pull in an external library for this but instead add the same functionality to the new fn package.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough, though I should note that this specific dependency is very small (the generic list impl is the only thing in it), it's a drop-in replacement, it's based on stdlib code, it's complete, fully tested has no open issues and hasn't had a need for any new code in over 2 years, so it's about as good a dependency as anyone could hope for :p

To bring this into the fn package, I'd just copy the subset of the code that was used here over (code is BSD licensed). Would that be ok?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would make sense to inline this library (just list.go and list_test.go and perhaps LICENSE?) into the fn package. But happy to hear other opinions, cc @Roasbeef.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would agree, perhaps the change is also not necessary, the improvement is only noticeable for 250 results:

                                                     │ with_type_assertions │                after                 │
                                                     │        sec/op        │    sec/op     vs base                │
MissionControlStoreFlushing/no_additional_results-4             216.3n ± 2%   218.7n ±  8%        ~ (p=0.190 n=10)
MissionControlStoreFlushing/one_additional_result-4             29.32µ ± 0%   29.19µ ± 12%        ~ (p=0.645 n=10)
MissionControlStoreFlushing/ten_additional_results-4            64.87µ ± 0%   63.99µ ±  0%   -1.37% (p=0.000 n=10)
MissionControlStoreFlushing/100_additional_results-4            385.4µ ± 6%   370.6µ ±  0%   -3.84% (p=0.000 n=10)
MissionControlStoreFlushing/250_additional_results-4           1016.6µ ± 2%   908.8µ ±  1%  -10.60% (p=0.000 n=10)
MissionControlStoreFlushing/500_additional_results-4            1.945m ± 3%   1.780m ±  1%   -8.49% (p=0.000 n=10)
geomean                                                         82.42µ        79.10µ         -4.03%

I'd hope that this package would make it into the standard library at some point :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What were the changes you did on this benchmark? Because the no_additional_results case is significantly different from the original benchmark in the PR:

MissionControlStoreFlushing/no_additional_results-7      221µs ±15%       0µs ± 2%   -99.86%  (p=0.000 n=10+10)

@matheusd matheusd force-pushed the mc-store-perf-improv branch from 3ee9bfa to 169f934 Compare March 19, 2024 14:53
@bitromortac bitromortac self-requested a review March 22, 2024 07:57
Copy link
Collaborator

@bitromortac bitromortac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice improvements in relative terms. In absolute terms, on my machine this reduces execution time from 30 ms to 30 µs for adding a single result to 100000 preexisting results (a value some people use if they want a more complete picture of the graph). I'm still thinking about the added de-duplication code complexity, perhaps you could give more motivation for the change (maybe benchmarks on weaker hardware)?

go.mod Outdated
@@ -4,6 +4,7 @@ require (
github.com/NebulousLabs/go-upnp v0.0.0-20180202185039-29b680b06c82
github.com/Yawning/aez v0.0.0-20211027044916-e49e68abd344
github.com/andybalholm/brotli v1.0.3
github.com/bahlo/generic-list-go v0.2.0
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would agree, perhaps the change is also not necessary, the improvement is only noticeable for 250 results:

                                                     │ with_type_assertions │                after                 │
                                                     │        sec/op        │    sec/op     vs base                │
MissionControlStoreFlushing/no_additional_results-4             216.3n ± 2%   218.7n ±  8%        ~ (p=0.190 n=10)
MissionControlStoreFlushing/one_additional_result-4             29.32µ ± 0%   29.19µ ± 12%        ~ (p=0.645 n=10)
MissionControlStoreFlushing/ten_additional_results-4            64.87µ ± 0%   63.99µ ±  0%   -1.37% (p=0.000 n=10)
MissionControlStoreFlushing/100_additional_results-4            385.4µ ± 6%   370.6µ ±  0%   -3.84% (p=0.000 n=10)
MissionControlStoreFlushing/250_additional_results-4           1016.6µ ± 2%   908.8µ ±  1%  -10.60% (p=0.000 n=10)
MissionControlStoreFlushing/500_additional_results-4            1.945m ± 3%   1.780m ±  1%   -8.49% (p=0.000 n=10)
geomean                                                         82.42µ        79.10µ         -4.03%

I'd hope that this package would make it into the standard library at some point :)

@matheusd matheusd force-pushed the mc-store-perf-improv branch from fb43678 to 413b871 Compare April 16, 2024 12:13
@matheusd
Copy link
Contributor Author

@bitromortac sent commit fb43678 to clean up the mission control tests and address your comments.

Also rebased to address the conflict.

@bitromortac bitromortac self-requested a review April 16, 2024 12:27
Copy link
Collaborator

@ellemouton ellemouton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice improvements! Thanks for this 🙏

Main suggestion is about how to how to use sync.Cond instead of a channel for signalling & responding to new items.

Then more generally, some commit structure updates are needed: looks like some review comments were addressed in an additional commit rather than squashing those into the commit in question.

@matheusd matheusd force-pushed the mc-store-perf-improv branch 2 times, most recently from 96efd14 to 39f51e1 Compare May 9, 2024 12:42
@matheusd
Copy link
Contributor Author

matheusd commented May 9, 2024

@ellemouton Thank you for the thorough review! I believe I addressed all of your points.

Also, rebased against current master.

Not sure what to do about the release notes, given the v0.18 RCs are already out (move to v0.19 or v0.18.1 or something else).

@ellemouton
Copy link
Collaborator

thanks for the quick turn around! I'll check it out soon.

im also not sure re release notes.... cc @saubyk for that :)

@saubyk saubyk added this to the v0.18.1 milestone May 9, 2024
@saubyk
Copy link
Collaborator

saubyk commented May 9, 2024

tagged for 18.1

@saubyk
Copy link
Collaborator

saubyk commented May 9, 2024

tagged for 18.1

On a second thought, 18.1 is already getting bloated and this doesn't seem urgent so moving to 0.19

@saubyk saubyk modified the milestones: v0.18.1, 0.19.0 May 9, 2024
@ellemouton ellemouton self-requested a review May 10, 2024 09:01
Copy link
Collaborator

@ellemouton ellemouton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good!

From my tests I also dont see a huge difference between the with-type-assertion vs no-type-assertion diffs so I guess I'd also vote for not pulling in the external dependency for this

@matheusd matheusd force-pushed the mc-store-perf-improv branch from 39f51e1 to 151522a Compare May 16, 2024 18:24
@ellemouton
Copy link
Collaborator

@matheusd - note the failing linter check

Copy link
Collaborator

@bitromortac bitromortac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 🎉 (some nits and the linter fix left)

@matheusd matheusd force-pushed the mc-store-perf-improv branch 3 times, most recently from 141ea9a to a100051 Compare May 28, 2024 10:46
Copy link
Collaborator

@ellemouton ellemouton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

almost there - main thing is to please remove the use of panic 🙏

@matheusd matheusd force-pushed the mc-store-perf-improv branch from a100051 to 457fcb0 Compare May 29, 2024 11:41
Copy link
Collaborator

@ellemouton ellemouton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM once final nits have been addressed (note: I think you may have missed 2 from the last review)

@matheusd matheusd force-pushed the mc-store-perf-improv branch from 457fcb0 to 93f6edd Compare May 30, 2024 12:01
@matheusd
Copy link
Contributor Author

Sorry for the misses. Updated.

@lightninglabs-deploy
Copy link

@matheusd, remember to re-request review from reviewers when ready

@ellemouton
Copy link
Collaborator

@saubyk - happy for this to be merged? I think it is g2g

@saubyk saubyk modified the milestones: 0.19.0, v0.18.1 Jun 13, 2024
@saubyk
Copy link
Collaborator

saubyk commented Jun 13, 2024

Tagged for 18.1. Release notes need update

@ellemouton
Copy link
Collaborator

@matheusd - thanks for your patience 🙏 I think we wanna move this to 0.18.1 instead of 0.19 so do you mind doing one last update and moving the release notes to the 0.18.1 doc pls?

@matheusd matheusd force-pushed the mc-store-perf-improv branch 2 times, most recently from e33d572 to 8f3dbdc Compare June 13, 2024 16:49
@matheusd
Copy link
Contributor Author

Updated

Copy link
Collaborator

@saubyk saubyk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack

@ellemouton
Copy link
Collaborator

oh no @matheusd - looks like there is a merge conflict 🙈 one more rebase pls 🙏 will then merge this asap so that this can be the last one!

matheusd added 5 commits June 19, 2024 07:33
These will be useful in the next commits.
This modifies the mission control store to avoid doing any work when no
new payment result entries are in the queue to be processed.

The mission control store maintains keeps the latest N (in production:
1000) entries in its DB, evicting older entries when new ones are added.
Currently, its implementation is somewhat less performant than it could
be.

This commit adds an early return to the storeResults function to avoid
doing any DB or memory operations when its outstanding queue is empty,
improving the performance during quiescent periods of the LN node's
execution.
This modifies the mission control store to avoid running the one second
ticker for flushing data when there is no work to be done.

This improves performance of a quiscent LN node by avoiding a one second
interval busy loop that does nothing when there are no payments flowing
through the node.
This removes duplication of in-memory data during the periodic flushing
stage of the mission control store.

The existing code entirely duplicates the in-memory cache of the store,
which is very wasteful when only a few additional results are being
rotated into the store.

This has a significant performance penalty specially for wallets that
remain online for a long time with a low volume of payments. The worst
case scenario are wallets that see at most 1 new payment a second, where
the entire in-memory cache is recreated every second.

This commit improves the situation by determining what will be the
actual changes that need to be committed before initiating the db
transaction and only keeping track of these to update the in-memory
cache if the db tx is successful.
@matheusd matheusd force-pushed the mc-store-perf-improv branch from 8f3dbdc to f39edaa Compare June 19, 2024 10:35
@ellemouton ellemouton merged commit 2477bd7 into lightningnetwork:master Jun 19, 2024
27 of 34 checks passed
@matheusd matheusd deleted the mc-store-perf-improv branch June 19, 2024 17:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants