
Partkey unblocked #3034

Merged 6 commits into algorand:feature/partkey on Oct 13, 2021

Conversation

brianolson (Contributor)

Summary

for #2878

Test Plan

brianolson requested a review from winder on October 11, 2021 15:07
codecov-commenter commented Oct 11, 2021

Codecov Report

Merging #3034 (bea6f9f) into feature/partkey (59b5f28) will increase coverage by 0.03%.
The diff coverage is 86.13%.


@@                 Coverage Diff                 @@
##           feature/partkey    #3034      +/-   ##
===================================================
+ Coverage            43.95%   43.98%   +0.03%     
===================================================
  Files                  394      394              
  Lines                87242    87309      +67     
===================================================
+ Hits                 38350    38406      +56     
- Misses               42804    42812       +8     
- Partials              6088     6091       +3     
Impacted Files                           Coverage Δ
data/account/participationRegistry.go    79.69% <86.13%> (+3.72%) ⬆️
catchup/peerSelector.go                  98.95% <0.00%> (-1.05%) ⬇️
data/abi/abi_type.go                     90.90% <0.00%> (-0.91%) ⬇️
data/transactions/verify/txn.go          43.42% <0.00%> (-0.88%) ⬇️
network/wsPeer.go                        73.57% <0.00%> (-0.26%) ⬇️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 59b5f28...bea6f9f.

brianolson changed the base branch from master to feature/partkey on October 11, 2021 16:32
winder (Contributor) left a comment:

This looks much simpler than the pattern we had before. Could you apply it to Record as well and delete dirty/Flush?

One complication is that we're also planning to call this from the REST API. Especially when installing a key, I think we need a way to verify that the write completed. Is there a good way to do that with the write thread?

brianolson (Author):

I think things are updated now. See the changes in the tests: a few of them required writes to be applied immediately, which is now not guaranteed until .Flush().
.Flush() blocks until all pending I/O operations are done, then returns any error accumulated from the async I/O.
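
To make the discussion concrete, here is a minimal sketch of that pattern: a Flush that blocks on a condition variable until the write thread drains the queue. The names participationDB, partDBWriteRecord, flushDone, asyncErr, and writeThread come from this thread; writes, flushesPending, and the flush-marker field are assumptions for illustration, not the actual go-algorand code.

```go
// Sketch only, not the real implementation: a Flush that waits on a
// sync.Cond until the async write thread has applied all queued writes.
package partdb

import "sync"

// partDBWriteRecord stands in for a queued write; a record with flush set
// is used as a flush marker (an assumption for this sketch).
type partDBWriteRecord struct {
	flush bool
	// ... fields describing a normal write would go here ...
}

type participationDB struct {
	mutex          sync.Mutex
	flushDone      *sync.Cond // created with sync.NewCond(&db.mutex)
	writes         chan partDBWriteRecord
	flushesPending int   // flush markers enqueued but not yet processed
	asyncErr       error // first error recorded by the async writer
}

func newParticipationDB() *participationDB {
	db := &participationDB{writes: make(chan partDBWriteRecord, 16)}
	db.flushDone = sync.NewCond(&db.mutex)
	// go db.writeThread() // consumer side sketched later in the thread
	return db
}

// Flush blocks until every write queued before it has been applied by the
// write thread, then returns any error the writer accumulated.
func (db *participationDB) Flush() error {
	db.mutex.Lock()
	db.flushesPending++
	db.mutex.Unlock()

	// Enqueue the marker without holding the mutex, so the write thread
	// is free to lock db.mutex while it drains the queue.
	db.writes <- partDBWriteRecord{flush: true}

	db.mutex.Lock()
	defer db.mutex.Unlock()
	for db.flushesPending > 0 {
		// Wait releases db.mutex while blocked and re-acquires it on wakeup.
		db.flushDone.Wait()
	}
	return db.asyncErr
}
```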

@@ -691,42 +767,14 @@ func (db *participationDB) Flush() error {
	db.mutex.Lock()
	defer db.mutex.Unlock()
winder (Contributor) commented:

Is the mutex still required? It seems like this would cause a deadlock when the writeThread attempts to Lock.

brianolson (Author) replied:

The flushDone condition variable temporarily releases its lock (&db.mutex) inside .Wait(), so the write thread can acquire the mutex while Flush is blocked; that .Wait() is the receiving end of db.flushDone.Broadcast() in writeThread().
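
A sketch of the other side of that hand-off, continuing the hypothetical type from the sketch above (applyWrite and the loop shape are assumptions, and shutdown handling is omitted). Broadcast is called while holding db.mutex, which is safe precisely because the Wait() inside Flush has released that same mutex.

```go
// Illustrative consumer that wakes Flush callers as soon as it reaches a
// flush marker; the excerpted code instead drains the queue via a
// needsFlush flag, as discussed further down the thread.
func (db *participationDB) writeThread() {
	for wr := range db.writes {
		if wr.flush {
			db.mutex.Lock()
			db.flushesPending--
			db.flushDone.Broadcast() // wakes every Flush blocked in Wait()
			db.mutex.Unlock()
			continue
		}
		if err := db.applyWrite(wr); err != nil {
			db.mutex.Lock()
			if db.asyncErr == nil {
				db.asyncErr = err // keep the first error for Flush to report
			}
			db.mutex.Unlock()
		}
	}
}

// applyWrite stands in for the actual database operation.
func (db *participationDB) applyWrite(wr partDBWriteRecord) error {
	// ... perform the write ...
	return nil
}
```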

winder (Contributor) replied:

Ahh, that's pretty cool. I found a couple of cases that I thought were race conditions, but they all seem to be covered by the pattern (multiple calls to Flush, and the window between sending the flush event and calling Wait).

One potential issue I see...

If an error is generated while multiple Flush() calls are on the queue, only the first Flush returns the error. This might happen if we use Flush in the REST API and are inserting keys back-to-back.

I think we can solve that by counting the number of pending flushes and only clearing asyncErr when the pending count changes from 0 -> 1.
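
A hedged sketch of what that suggestion might look like, as a variant of the hypothetical Flush above: asyncErr is cleared only on the 0 -> 1 transition of the pending-flush count, so every Flush already waiting still sees the accumulated error.

```go
// Variant of the earlier Flush sketch implementing the suggestion above
// (illustrative only, not the actual code): clear asyncErr only when the
// pending count goes from 0 to 1, so all queued Flush calls observe the
// same error.
func (db *participationDB) Flush() error {
	db.mutex.Lock()
	if db.flushesPending == 0 {
		db.asyncErr = nil // first flush of a new batch starts clean
	}
	db.flushesPending++
	db.mutex.Unlock()

	db.writes <- partDBWriteRecord{flush: true}

	db.mutex.Lock()
	defer db.mutex.Unlock()
	for db.flushesPending > 0 {
		db.flushDone.Wait()
	}
	// Deliberately not cleared here: later Flush calls in the same batch
	// return the same accumulated error.
	return db.asyncErr
}
```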

brianolson marked this pull request as ready for review on October 13, 2021 12:00
winder (Contributor) left a comment:

Still looking good


for {
	var wr partDBWriteRecord
	var chanOk bool
	if needsFlush {
winder (Contributor) commented:

I think this deserves a comment. It looks like the code can be simplified but there's a lot going on and all of it is important.

// The flush command locks the mutex, then sends a flush event to unblock the queue read.
// When everything is flushed, the default is blocked until flush calls flushDone.Wait.
// Finally flushDone.Broadcast notifies all calls to Flush that the queue is now empty.
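
For reference, a sketch of the loop structure that comment describes, continuing the hypothetical fields from the earlier sketches (shutdown/Close handling omitted). Here the broadcast happens only once a non-blocking read shows the queue has drained, rather than on the flush marker itself.

```go
// Illustrative variant of the writeThread sketch above, shaped like the
// excerpted code: a flush record only sets needsFlush, and waiters are
// woken once the queue is observed to be empty.
func (db *participationDB) writeThread() {
	needsFlush := false
	for {
		var wr partDBWriteRecord
		var chanOk bool

		if needsFlush {
			// Drain without blocking; an empty queue means every write
			// enqueued before the flush request has been applied.
			select {
			case wr, chanOk = <-db.writes:
			default:
				db.mutex.Lock()
				db.flushesPending = 0
				db.flushDone.Broadcast() // queue empty: wake all Flush callers
				db.mutex.Unlock()
				needsFlush = false
				continue
			}
		} else {
			// Nothing to flush: block until the next record arrives.
			wr, chanOk = <-db.writes
		}
		if !chanOk {
			return // channel closed; shutdown handling omitted in this sketch
		}

		if wr.flush {
			needsFlush = true
			continue
		}
		if err := db.applyWrite(wr); err != nil {
			db.mutex.Lock()
			if db.asyncErr == nil {
				db.asyncErr = err
			}
			db.mutex.Unlock()
		}
	}
}
```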

brianolson (Author) replied:

Added a few comments around the Flush writeThread interaction.

winder (Contributor) left a comment:

LGTM

winder merged commit 0b371a3 into algorand:feature/partkey on Oct 13, 2021
winder linked an issue on Oct 13, 2021 that may be closed by this pull request
Development

Successfully merging this pull request may close these issues.

Partkey Interface: Avoid mutex and DB operation in Register/Record
3 participants