build: compile with golang 1.16 #3803

Merged: 17 commits merged from tsachi/go116upgrade into algorand:master on Mar 21, 2022

Conversation

@tsachiherman (Contributor) commented Mar 19, 2022

Summary

This PR upgrades the go-algorand repository to go 1.16 while adding workarounds for golang/go#44343.

Test Plan

Use performance testing to verify no regressions.

@codecov-commenter commented Mar 19, 2022

Codecov Report

Merging #3803 (6399921) into master (35be745) will increase coverage by 0.12%.
The diff coverage is 30.00%.

@@            Coverage Diff             @@
##           master    #3803      +/-   ##
==========================================
+ Coverage   49.88%   50.01%   +0.12%     
==========================================
  Files         392      392              
  Lines       68722    68504     -218     
==========================================
- Hits        34284    34259      -25     
+ Misses      30682    30491     -191     
+ Partials     3756     3754       -2     
Impacted Files Coverage Δ
network/dialer.go 64.00% <0.00%> (ø)
network/rateLimitingTransport.go 48.14% <0.00%> (-1.86%) ⬇️
network/wsNetwork.go 62.99% <100.00%> (ø)
util/condvar/timedwait.go 100.00% <100.00%> (ø)
agreement/cryptoVerifier.go 70.21% <0.00%> (-4.97%) ⬇️
ledger/blockqueue.go 82.18% <0.00%> (-2.88%) ⬇️
node/node.go 23.26% <0.00%> (-1.85%) ⬇️
cmd/tealdbg/debugger.go 71.42% <0.00%> (-0.99%) ⬇️
data/abi/abi_type.go 87.67% <0.00%> (-0.95%) ⬇️
ledger/acctupdates.go 68.51% <0.00%> (-0.93%) ⬇️
... and 11 more

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@algojack (Contributor) previously approved these changes Mar 21, 2022 and left a comment:
lgtm

@tolikzinovyev (Contributor) left a comment:
We should also consider upgrading (in a separate PR) to go 1.17, since 1.18 is out and 1.16 is therefore deprecated.

@@ -882,7 +883,7 @@ func (pps *WorkerState) sendFromTo(
timeCredit -= took
if timeCredit > 0 {
time.Sleep(timeCredit)
Contributor:
Why not use NanoSleep() here?

Contributor (Author):
No need to be super-accurate here, since the timeCredit mechanism would compensate by dropping sleeps on subsequent iterations.
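
For readers following along, a minimal, hypothetical sketch of the credit-based pacing idea being discussed (function and variable names are illustrative, not the actual pingpong code): each iteration earns a budget of 1/tps, the work and any sleep overshoot are charged against that credit, and a negative credit simply shrinks or skips later sleeps.

```go
package main

import "time"

// paceLoop is a hypothetical sketch of credit-based transaction pacing.
func paceLoop(tps, iterations int, doWork func()) {
	budget := time.Second / time.Duration(tps)
	timeCredit := time.Duration(0)
	for i := 0; i < iterations; i++ {
		start := time.Now()
		doWork()
		timeCredit += budget - time.Since(start) // earn the per-iteration budget, pay for the work
		if timeCredit > 0 {
			sleepStart := time.Now()
			time.Sleep(timeCredit)
			// Charge the actual sleep time, not the requested one: if the sleep
			// overshoots (e.g. coarse timer granularity), the credit goes negative
			// and future sleeps are dropped, keeping the long-run rate near tps.
			timeCredit -= time.Since(sleepStart)
		}
	}
}

func main() {
	// Aim for roughly 1000 tps across 5000 no-op "transactions".
	paceLoop(1000, 5000, func() {})
}
```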

@@ -882,7 +883,7 @@ func (pps *WorkerState) sendFromTo(
timeCredit -= took
if timeCredit > 0 {
time.Sleep(timeCredit)
timeCredit = time.Duration(0)
timeCredit -= time.Since(now)
Contributor:
This seems better, but still not as precise as #3478.

Contributor (Author):
#3478 removed the credits logic, which would make things worse if we were to generate > 1000 tps.

Contributor:
why exactly?

@tsachiherman (Author) commented Mar 21, 2022:

In #3478, you were calculating the next timestamp based on the current one by adding 1/tps. This is generally correct. However, if we want 1000 tps and half of the iterations take 0.5 ms while the other half take 1.5 ms, the effective rate would be 500 x 1 ms + 500 x 1.5 ms = 1250 ms per 1000 txn, i.e. 800 tps. That's why we're using the credit algorithm.
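
For contrast, a minimal, hypothetical sketch of the deadline-style pacing as described in this comment (names are illustrative, and whether this matches #3478 is debated below). Because the next deadline is derived from the current time, an iteration that overruns its slot is never paid back, which is where the 800 tps figure above comes from. It reuses the package main / import "time" scaffolding of the paceLoop sketch above.

```go
// deadlinePace is a hypothetical sketch of "next timestamp = now + 1/tps"
// pacing without catch-up: slow iterations lose their sleep, but the lost
// time is never compensated, so the achieved rate drops below the target.
func deadlinePace(tps, iterations int, doWork func()) {
	slot := time.Second / time.Duration(tps)
	for i := 0; i < iterations; i++ {
		next := time.Now().Add(slot) // deadline recomputed from the current time
		doWork()
		if d := time.Until(next); d > 0 {
			time.Sleep(d) // overruns get no sleep and no compensation
		}
	}
}
```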

Contributor:
This is not what #3478 does. I think the PR description in #3478 explains how it works.

Contributor (Author):
The description in #3478 does match the implementation, but the goal was not to fix it for the 115 tps use case. The goal was to fix it so it would work correctly with the go 1.16 bug. If you take your implementation and try to use it for more than 1K tps from a single pingpong on go 1.16, it wouldn't work. With this implementation, it would (we need that in order to be able to test 10K TPS).

Contributor:
Why will my implementation not work?

Contributor (Author):
Let's say we set the tps to 2000. In that case, each sleep should take 0.5ms, but because of this bug it would take 1ms. That would limit the tps to 1000 (assuming that the iteration itself is free).

Contributor (Author):
Btw, I do like the simplicity of your implementation, but I think we should work on it a bit more. Let's rebase your branch on top of this one and do some further testing at high TPS. I want to see that it works correctly both with random iteration timing and at high TPS.

Contributor:

> Let's say we set the tps to 2000. In that case, each sleep should take 0.5ms, but because of this bug it would take 1ms. That would limit the tps to 1000 (assuming that the iteration itself is free).

No, because some sleeps will be skipped. But I'm definitely not against using nanosleep for more precision.

> Btw, I do like the simplicity of your implementation, but I think we should work on it a bit more. Let's rebase your branch on top of this one and do some further testing at high TPS. I want to see that it works correctly both with random iteration timing and at high TPS.

Sure!

@@ -358,7 +358,7 @@ func TestRewardRateRecalculation(t *testing.T) {
r.NoError(fixture.WaitForRoundWithTimeout(rewardRecalcRound - 1))
balanceOfRewardsPool, roundQueried := fixture.GetBalanceAndRound(rewardsAccount)
if roundQueried != rewardRecalcRound-1 {
r.FailNow("got rewards pool balance on round %d but wanted the balance on round %d, failing out", rewardRecalcRound-1, roundQueried)
r.FailNow("", "got rewards pool balance on round %d but wanted the balance on round %d, failing out", rewardRecalcRound-1, roundQueried)
Contributor:
what does this do?

Contributor (Author):
The FailNow parameters were just wrong: instead of failing with the desired error message, it would fail inside FailNow while trying to format the parameters.
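
For reference, a small hypothetical test showing the two call shapes (the test name and round values are made up, and the exact failure mode of the old form depends on the testify version). testify's require API declares FailNow(failureMessage string, msgAndArgs ...interface{}), and when msgAndArgs has more than one element its first element is expected to be the format string.

```go
package example

import (
	"testing"

	"github.com/stretchr/testify/require"
)

// TestFailNowFormatting is a hypothetical illustration of the change above.
func TestFailNowFormatting(t *testing.T) {
	r := require.New(t)
	got, want := 42, 42 // placeholder round numbers (equal, so the test passes)

	if got != want {
		// Old form (wrong): the format string lands in failureMessage, so the %d
		// placeholders are never filled, and testify tries to use the first extra
		// argument (an int) as a format string, tripping inside FailNow instead
		// of printing the intended message:
		//   r.FailNow("got balance on round %d but wanted round %d", got, want)

		// New form (correct): empty failureMessage, then the format string and
		// its arguments together in msgAndArgs.
		r.FailNow("", "got balance on round %d but wanted round %d", got, want)
	}
}
```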

@@ -42,7 +44,7 @@ func TimedWait(c *sync.Cond, timeout time.Duration) {
// thread hasn't gotten around to calling c.Wait()
// yet, so the c.Broadcast() did not wake it up.
// Sleep for a second and check again.
<-time.After(time.Second)
time.Sleep(time.Second)
Contributor:
Why not use util.NanoSleep()?

Contributor (Author):
I used NanoSleep only in places that are time-critical. Here, we clearly don't care whether it takes exactly 1 second or 1 second and 1 ms.

//
// You should have received a copy of the GNU Affero General Public License
// along with go-algorand. If not, see <https://www.gnu.org/licenses/>.

Contributor:
I think this needs // +build linux.

Contributor (Author):
No need; it's implied by the "_linux" suffix in the filename.
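
To illustrate, a minimal, hypothetical sketch of a Linux-only nanosleep helper (the file name, package, and use of golang.org/x/sys/unix are assumptions, not necessarily the exact go-algorand code). Because the file name ends in "_linux.go", the Go toolchain already limits it to GOOS=linux and no explicit build constraint is needed.

```go
// sleep_linux.go (hypothetical file name): the "_linux" suffix alone restricts
// this file to GOOS=linux builds, so no "// +build linux" line is required.
package util

import (
	"time"

	"golang.org/x/sys/unix"
)

// NanoSleep sleeps for the requested duration via the nanosleep(2) syscall,
// sidestepping the coarser time.Sleep granularity discussed above for go 1.16
// (golang/go#44343). If a signal interrupts the sleep, it resumes with the
// remaining time.
func NanoSleep(d time.Duration) {
	req := unix.NsecToTimespec(d.Nanoseconds())
	for {
		var rem unix.Timespec
		if err := unix.Nanosleep(&req, &rem); err != unix.EINTR {
			return
		}
		req = rem // interrupted; sleep for whatever time is left
	}
}
```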

@tsachiherman (Author):

> We should also consider upgrading (in a separate PR) to go 1.17, since 1.18 is out and 1.16 is therefore deprecated.

Agree, but as you've suggested, that belongs in a separate PR.

@tsachiherman merged commit bc19540 into algorand:master on Mar 21, 2022
@tsachiherman deleted the tsachi/go116upgrade branch on Mar 21, 2022 at 21:34