Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

routing: store missioncontrol state in blocks and eliminate cursor use #5515

Merged
merged 3 commits into from
Jul 26, 2021

Conversation

bhandras
Copy link
Collaborator

This PR changes missioncontrol's store update from per payment to
every second. Updating the missioncontrol store on every payment caused
gradual slowdown when using etcd.
We also completely eliminate the use of the cursor, further reducing
the performance bottleneck. One more Get could be removed if we don't care
about idempotency check (we really shouldn't as it's not efficient to do
so with a remote DB).

Using the itest=aync_payments_benchmark dbbackend=etcd:

=== RUN   TestLightningNetworkDaemon
=== RUN   TestLightningNetworkDaemon/37-of-82/btcd/async_payments_benchmark
    test_harness.go:120: 	Benchmark info: Elapsed time:  5.808059726s
    test_harness.go:120: 	Benchmark info: TPS:  83.16030185396204
--- PASS: TestLightningNetworkDaemon (15.93s)
    --- PASS: TestLightningNetworkDaemon/37-of-82/btcd/async_payments_benchmark (15.25s)
PASS

Without the fix:

=== RUN   TestLightningNetworkDaemon
=== RUN   TestLightningNetworkDaemon/37-of-82/btcd/async_payments_benchmark
    test_harness.go:120: 	Benchmark info: Elapsed time:  6.222378347s
    test_harness.go:120: 	Benchmark info: TPS:  77.62305232256877
--- PASS: TestLightningNetworkDaemon (16.93s)

Copy link
Collaborator

@guggero guggero left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great optimization! This should definitely make payments more efficient.

I wonder what a good value for that ticker would be. Too long and it becomes sort of a "stop-the-world" batch task. Too short and we get small stutters. But definitely worth doing it in batches and not for each attempt individually!

Copy link
Collaborator Author

@bhandras bhandras left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR also fixes an existing data race in MC store:

test_harness.go:116: lnd finished with error (stderr):
        exit status 66
        2021-07-19 18:08:27.889101 I | WARNING: proto: file "rpc.proto" is already registered
        A future release will panic on registration conflicts. See:
        https://developers.google.com/protocol-buffers/docs/reference/go/faq#namespace-conflict

        ==================
        WARNING: DATA RACE
        Read at 0x00c001f16ad8 by goroutine 861:
          github.com/lightningnetwork/lnd/routing.(*missionControlStore).AddResult.func1()
              /home/bhandras/work/lnd/routing/missioncontrol_store.go:251 +0x3aa
          github.com/lightningnetwork/lnd/kvdb/etcd.(*db).Update.func1()
              /home/bhandras/work/lnd/vendor/github.com/lightningnetwork/lnd/kvdb/etcd/db.go:229 +0x18f
          github.com/lightningnetwork/lnd/kvdb/etcd.runSTM()
              /home/bhandras/work/lnd/vendor/github.com/lightningnetwork/lnd/kvdb/etcd/stm.go:279 +0x2d6
          github.com/lightningnetwork/lnd/kvdb/etcd.RunSTM()
              /home/bhandras/work/lnd/vendor/github.com/lightningnetwork/lnd/kvdb/etcd/stm.go:193 +0x84
          github.com/lightningnetwork/lnd/kvdb/etcd.(*db).Update()
              /home/bhandras/work/lnd/vendor/github.com/lightningnetwork/lnd/kvdb/etcd/db.go:232 +0x1a4
          github.com/lightningnetwork/lnd/channeldb.(*DB).Update()
              /home/bhandras/work/lnd/channeldb/db.go:225 +0x93
          github.com/lightningnetwork/lnd/kvdb.Update()
              /home/bhandras/work/lnd/vendor/github.com/lightningnetwork/lnd/kvdb/interface.go:18 +0x81
          github.com/lightningnetwork/lnd/routing.(*missionControlStore).AddResult()
              /home/bhandras/work/lnd/routing/missioncontrol_store.go:225 +0x104
          github.com/lightningnetwork/lnd/routing.(*MissionControl).processPaymentResult()
              /home/bhandras/work/lnd/routing/missioncontrol.go:432 +0x86
          github.com/lightningnetwork/lnd/routing.(*MissionControl).ReportPaymentSuccess()
              /home/bhandras/work/lnd/routing/missioncontrol.go:422 +0x1d1
          github.com/lightningnetwork/lnd/routing.(*shardHandler).collectResult()
              /home/bhandras/work/lnd/routing/payment_lifecycle.go:627 +0xb9b
          github.com/lightningnetwork/lnd/routing.(*shardHandler).collectResultAsync.func2()
              /home/bhandras/work/lnd/routing/payment_lifecycle.go:496 +0x8d

        Previous write at 0x00c001f16ad8 by goroutine 1535:
          github.com/lightningnetwork/lnd/routing.(*missionControlStore).AddResult.func1()
              /home/bhandras/work/lnd/routing/missioncontrol_store.go:251 +0x3cc
          github.com/lightningnetwork/lnd/kvdb/etcd.(*db).Update.func1()
              /home/bhandras/work/lnd/vendor/github.com/lightningnetwork/lnd/kvdb/etcd/db.go:229 +0x18f
          github.com/lightningnetwork/lnd/kvdb/etcd.runSTM()
              /home/bhandras/work/lnd/vendor/github.com/lightningnetwork/lnd/kvdb/etcd/stm.go:279 +0x2d6
          github.com/lightningnetwork/lnd/kvdb/etcd.RunSTM()
              /home/bhandras/work/lnd/vendor/github.com/lightningnetwork/lnd/kvdb/etcd/stm.go:193 +0x84
          github.com/lightningnetwork/lnd/kvdb/etcd.(*db).Update()
              /home/bhandras/work/lnd/vendor/github.com/lightningnetwork/lnd/kvdb/etcd/db.go:232 +0x1a4
          github.com/lightningnetwork/lnd/channeldb.(*DB).Update()
              /home/bhandras/work/lnd/channeldb/db.go:225 +0x93
          github.com/lightningnetwork/lnd/kvdb.Update()
              /home/bhandras/work/lnd/vendor/github.com/lightningnetwork/lnd/kvdb/interface.go:18 +0x81
          github.com/lightningnetwork/lnd/routing.(*missionControlStore).AddResult()
              /home/bhandras/work/lnd/routing/missioncontrol_store.go:225 +0x104
          github.com/lightningnetwork/lnd/routing.(*MissionControl).processPaymentResult()
              /home/bhandras/work/lnd/routing/missioncontrol.go:432 +0x86
          github.com/lightningnetwork/lnd/routing.(*MissionControl).ReportPaymentSuccess()
              /home/bhandras/work/lnd/routing/missioncontrol.go:422 +0x1d1
          github.com/lightningnetwork/lnd/routing.(*shardHandler).collectResult()
              /home/bhandras/work/lnd/routing/payment_lifecycle.go:627 +0xb9b
          github.com/lightningnetwork/lnd/routing.(*shardHandler).collectResultAsync.func2()
              /home/bhandras/work/lnd/routing/payment_lifecycle.go:496 +0x8d

        Goroutine 861 (running) created at:
          github.com/lightningnetwork/lnd/routing.(*shardHandler).collectResultAsync()
              /home/bhandras/work/lnd/routing/payment_lifecycle.go:492 +0x159
          github.com/lightningnetwork/lnd/routing.(*paymentLifecycle).resumePayment()
              /home/bhandras/work/lnd/routing/payment_lifecycle.go:328 +0x10c7
          github.com/lightningnetwork/lnd/routing.(*ChannelRouter).sendPayment()
              /home/bhandras/work/lnd/routing/router.go:2238 +0x1ac
          github.com/lightningnetwork/lnd/routing.(*ChannelRouter).SendPaymentAsync.func1()
              /home/bhandras/work/lnd/routing/router.go:1958 +0x304

        Goroutine 1535 (running) created at:
          github.com/lightningnetwork/lnd/routing.(*shardHandler).collectResultAsync()
              /home/bhandras/work/lnd/routing/payment_lifecycle.go:492 +0x159
          github.com/lightningnetwork/lnd/routing.(*paymentLifecycle).resumePayment()
              /home/bhandras/work/lnd/routing/payment_lifecycle.go:328 +0x10c7
          github.com/lightningnetwork/lnd/routing.(*ChannelRouter).sendPayment()
              /home/bhandras/work/lnd/routing/router.go:2238 +0x1ac
          github.com/lightningnetwork/lnd/routing.(*ChannelRouter).SendPaymentAsync.func1()
              /home/bhandras/work/lnd/routing/router.go:1958 +0x304
        ==================
        Found 1 data race(s)```

@bhandras bhandras force-pushed the mc_store_optimization branch from 8579fb8 to 0dd81e7 Compare July 19, 2021 16:43
@bhandras
Copy link
Collaborator Author

PR rebased on #5542 to allow testing that the PR indeed fixes the MC store data race when using etcd.

@bhandras bhandras force-pushed the mc_store_optimization branch 2 times, most recently from 209e255 to 4fc71d4 Compare July 19, 2021 17:08
@bhandras bhandras requested review from guggero and Crypt-iQ July 19, 2021 17:09
@bhandras
Copy link
Collaborator Author

@Crypt-iQ please feel free to unassign yourself if you don't have capacity for review.

@Crypt-iQ Crypt-iQ removed their request for review July 19, 2021 17:29
@guggero guggero requested review from joostjager and removed request for joostjager July 20, 2021 07:57
Copy link
Collaborator

@guggero guggero left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 💯

@bhandras bhandras requested a review from carlaKC July 20, 2021 08:54
@bhandras
Copy link
Collaborator Author

bhandras commented Jul 20, 2021

@carlaKC feel free to unassign yourself if you don't have free capacity. Only the last 3 commits are relevant, the Makefile changes are part of a separate PR(#5542) I rebased on top of.

@bhandras bhandras force-pushed the mc_store_optimization branch 3 times, most recently from 91cdb0f to 9305fb9 Compare July 20, 2021 12:06
@carlaKC carlaKC self-requested a review July 22, 2021 11:46
Copy link
Collaborator

@carlaKC carlaKC left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few questions from me!

@Roasbeef Roasbeef added this to the v0.14.0 milestone Jul 22, 2021
@bhandras bhandras force-pushed the mc_store_optimization branch from 9305fb9 to 93367e3 Compare July 23, 2021 12:41
@bhandras
Copy link
Collaborator Author

Just a few questions from me!

Thanks for the review! Everything should be fixed now.

Copy link
Collaborator

@carlaKC carlaKC left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Second pass looks good! Just a few comments that need updating now that we're switching to batches.

Just gotta sure we don't run into deadlock territory by locking before db calls.

@bhandras bhandras force-pushed the mc_store_optimization branch from 93367e3 to bd33735 Compare July 23, 2021 14:13
This commit changes missioncontrol's store update from per payment to
every second. Updating the missioncontrol store on every payment caused
gradual slowdown when using etcd.
We also completely eliminate the use of the cursor, further reducing
the performance bottleneck.
@bhandras bhandras force-pushed the mc_store_optimization branch from bd33735 to cfa3f70 Compare July 26, 2021 15:02
@bhandras bhandras merged commit 5de6af7 into lightningnetwork:master Jul 26, 2021
@bhandras bhandras deleted the mc_store_optimization branch September 12, 2023 15:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants