All validators broadcast commit vote messages #12059

Merged: vusirikala merged 13 commits into main from satya/commit-vote-broadcast on Feb 21, 2024

@vusirikala vusirikala commented Feb 15, 2024

Description

Currently, every validator sends its commit vote to the block proposer, which aggregates the votes and broadcasts the commit decision. To save one hop of latency, this PR has every validator broadcast its commit vote directly.
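
As a rough illustration of what direct broadcast implies on the receiving side, the sketch below shows a validator counting incoming commit votes until a quorum is reached. All names here (`ValidatorId`, `CommitVoteAggregator`) are hypothetical and not taken from aptos-core, and the quorum rule assumes equal-weight validators; signature checks and stake weighting are left out.

```rust
use std::collections::HashSet;

/// Hypothetical validator identifier; not a type from aptos-core.
type ValidatorId = u32;

/// Collects commit votes for a single block on one validator.
struct CommitVoteAggregator {
    /// Total number of validators in the set.
    num_validators: usize,
    /// Validators whose vote has been seen; duplicates are ignored.
    voters: HashSet<ValidatorId>,
}

impl CommitVoteAggregator {
    fn new(num_validators: usize) -> Self {
        Self { num_validators, voters: HashSet::new() }
    }

    /// Assumed quorum rule: strictly more than two thirds of the validators.
    fn quorum(&self) -> usize {
        self.num_validators * 2 / 3 + 1
    }

    /// Records a vote and returns true once the quorum has been reached.
    fn add_vote(&mut self, from: ValidatorId) -> bool {
        self.voters.insert(from);
        self.voters.len() >= self.quorum()
    }
}
```

With four validators this sketch needs three distinct votes; a repeated vote from the same validator does not count twice.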

Test Plan

The goal of this PR is to reduce latency. To compare, we ran the Forge test with 100 nodes on this PR and on the main branch.

Forge run on this PR: https://github.com/aptos-labs/aptos-core/actions/runs/7981466686
Forge run on main: https://github.com/aptos-labs/aptos-core/actions/runs/7965951082

trunk-io bot commented Feb 15, 2024

⏱️ 51h 18m total CI duration on this PR
| Job | Cumulative Duration | Recent Runs |
| --- | --- | --- |
| rust-unit-tests | 10h 20m | 🟥🟩🟩 (+12 more) |
| windows-build | 5h 24m | 🟩🟩🟩🟩🟩 (+12 more) |
| rust-unit-coverage | 5h 9m | 🟩 |
| execution-performance / single-node-performance | 4h 56m | 🟩🟥🟩🟩🟩 (+10 more) |
| rust-smoke-tests | 4h 53m | 🟩🟩🟩 (+9 more) |
| rust-smoke-coverage | 4h 14m | 🟩 |
| forge-e2e-test / forge | 2h 41m | 🟥🟥🟩🟩🟩 (+9 more) |
| rust-images / rust-all | 2h 27m | 🟩🟩🟩 (+11 more) |
| forge-compat-test / forge | 2h 5m | 🟩🟩🟩🟩🟩 (+6 more) |
| cli-e2e-tests / run-cli-tests | 1h 47m | 🟥🟥🟥🟥🟥 (+7 more) |
| run-tests-main-branch | 1h 36m | 🟥🟥🟥🟥 (+12 more) |
| rust-lints | 1h 24m | 🟥🟩🟩 (+12 more) |
| adhoc-forge-test / forge | 1h 21m | 🟩🟩 |
| check | 1h 2m | 🟩🟩🟩 (+12 more) |
| general-lints | 35m | 🟩🟩🟩🟩 (+12 more) |
| check-dynamic-deps | 34m | 🟩🟩🟩🟩🟩 (+12 more) |
| indexer-grpc-e2e-tests / test-indexer-grpc-docker-compose | 16m | 🟩🟥🟩🟩🟥 (+5 more) |
| node-api-compatibility-tests / node-api-compatibility-tests | 9m | 🟩🟩🟩🟩🟩 (+5 more) |
| semgrep/ci | 7m | 🟩🟩🟩🟩🟩 (+12 more) |
| file_change_determinator | 3m | 🟩🟩🟩🟩🟩 (+12 more) |
| file_change_determinator | 3m | 🟩🟩🟩🟩🟩 (+12 more) |
| file_change_determinator | 3m | 🟩🟩🟩🟩🟩 (+11 more) |
| execution-performance / file_change_determinator | 2m | 🟩🟩🟩🟩🟩 (+9 more) |
| permission-check | 1m | 🟩🟩🟩🟩🟩 (+12 more) |
| permission-check | 52s | 🟩🟩🟩🟩🟩 (+12 more) |
| determine-docker-build-metadata | 49s | 🟩🟩🟩🟩🟩 (+11 more) |
| permission-check | 49s | 🟩🟩🟩🟩🟩 (+12 more) |
| permission-check | 43s | 🟩🟩🟩🟩🟩 (+11 more) |
| permission-check | 41s | 🟩🟩🟩🟩🟩 (+12 more) |
| determine-forge-run-metadata | 12s | 🟩🟩 |
| upload-to-codecov | 12s | 🟩 |

@vusirikala added the CICD:run-e2e-tests label (when this label is present, GitHub Actions will run all land-blocking e2e tests from the PR) and removed the CICD:experimental-forge and CICD:run-forge-e2e-perf (run the e2e perf forge only) labels on Feb 16, 2024


codecov bot commented Feb 16, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (ea133dc) 71.4% compared to head (60744f1) 69.9%.
Report is 21 commits behind head on main.

❗ Current head 60744f1 differs from pull request most recent head cb38636. Consider uploading reports for the commit cb38636 to get more accurate results

Additional details and impacted files
@@             Coverage Diff             @@
##             main   #12059       +/-   ##
===========================================
- Coverage    71.4%    69.9%     -1.5%     
===========================================
  Files         802     2222     +1420     
  Lines      184373   419549   +235176     
===========================================
+ Hits       131737   293517   +161780     
- Misses      52636   126032    +73396     
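
The figures in the diff above can be cross-checked with a little arithmetic. The sketch below recomputes the coverage percentages from the hits and lines counts; `coverage_pct` is a hypothetical helper, and truncating to one decimal place is an inference from these particular numbers, not documented Codecov behavior.

```rust
/// Coverage percentage truncated (not rounded) to one decimal place.
/// Returns None when there are no lines to cover.
fn coverage_pct(hits: u64, lines: u64) -> Option<f64> {
    if lines == 0 {
        return None;
    }
    // Integer division drops everything below a tenth of a percent.
    let tenths = hits * 1000 / lines;
    Some(tenths as f64 / 10.0)
}
```

Applied to the table, 131737 of 184373 lines gives 71.4 for main and 293517 of 419549 lines gives 69.9 for the PR head; in both columns hits plus misses equals the line count.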


@ibalajiarun

You need to adjust the timeout here to make the test work for 100 nodes.

let timeout_duration = Duration::from_secs(30);

Bump it to 5 minutes or so in your PR.
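
A minimal sketch of the suggested change, assuming the timeout is built with `std::time::Duration` as in the quoted line. The function name `forge_timeout` is hypothetical; in the real test the value is assigned to `timeout_duration`.

```rust
use std::time::Duration;

/// Timeout for the 100-node run, following the "5 minutes or so" suggestion.
fn forge_timeout() -> Duration {
    // Previously: Duration::from_secs(30)
    Duration::from_secs(5 * 60)
}
```

Writing the value as `5 * 60` keeps the unit conversion visible at the call site.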


@sitalkedia sitalkedia left a comment

Can you add a TODO to remove the proposer's commit decision broadcast in the following release?
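
For readers unfamiliar with the transition being discussed, the sketch below shows one way the interim behavior could look: every validator broadcasts its commit vote, and the proposer additionally keeps broadcasting the commit decision for backward compatibility. The enum and function names are hypothetical and do not come from aptos-core.

```rust
/// Hypothetical message kinds; the names are illustrative only.
#[derive(Debug, PartialEq)]
enum Outgoing {
    CommitVote,
    CommitDecision,
}

/// Messages a validator broadcasts for a block during the transition period.
fn messages_to_broadcast(is_proposer: bool) -> Vec<Outgoing> {
    // Every validator now broadcasts its own commit vote.
    let mut out = vec![Outgoing::CommitVote];
    if is_proposer {
        // TODO: remove this broadcast in a following release, once all
        // validators understand broadcast commit votes.
        out.push(Outgoing::CommitDecision);
    }
    out
}
```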

@vusirikala vusirikala enabled auto-merge (squash) February 21, 2024 20:14


✅ Forge suite compat success on aptos-node-v1.9.5 ==> cb386360e848aba2bc230f60d10718e9c49f77df

Compatibility test results for aptos-node-v1.9.5 ==> cb386360e848aba2bc230f60d10718e9c49f77df (PR)
1. Check liveness of validators at old version: aptos-node-v1.9.5
compatibility::simple-validator-upgrade::liveness-check : committed: 6761 txn/s, latency: 4943 ms, (p50: 4800 ms, p90: 8100 ms, p99: 9600 ms), latency samples: 236660
2. Upgrading first Validator to new version: cb386360e848aba2bc230f60d10718e9c49f77df
compatibility::simple-validator-upgrade::single-validator-upgrade : committed: 1669 txn/s, latency: 16700 ms, (p50: 17700 ms, p90: 21700 ms, p99: 22200 ms), latency samples: 85160
3. Upgrading rest of first batch to new version: cb386360e848aba2bc230f60d10718e9c49f77df
compatibility::simple-validator-upgrade::half-validator-upgrade : committed: 109 txn/s, submitted: 327 txn/s, expired: 218 txn/s, latency: 24339 ms, (p50: 7900 ms, p90: 90200 ms, p99: 90300 ms), latency samples: 16034
4. upgrading second batch to new version: cb386360e848aba2bc230f60d10718e9c49f77df
compatibility::simple-validator-upgrade::rest-validator-upgrade : committed: 2609 txn/s, latency: 11366 ms, (p50: 12100 ms, p90: 17500 ms, p99: 17800 ms), latency samples: 117420
5. check swarm health
Compatibility test for aptos-node-v1.9.5 ==> cb386360e848aba2bc230f60d10718e9c49f77df passed
Test Ok
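
To compare runs, the numbers in these summary lines can be pulled out programmatically. The sketch below is a small, hypothetical helper (`metric` is not part of Forge) that reads the integer following a `key: ` label; it assumes the line format shown above and nothing more.

```rust
/// Returns the integer that follows "`key`: " in a Forge summary line,
/// or None if the key is absent or not followed by digits.
fn metric(line: &str, key: &str) -> Option<u64> {
    let needle = format!("{}: ", key);
    let start = line.find(&needle)? + needle.len();
    let digits: String = line[start..]
        .chars()
        .take_while(|c| c.is_ascii_digit())
        .collect();
    digits.parse().ok()
}
```

On the half-validator-upgrade line, for example, this yields 109 committed, 327 submitted, and 218 expired txn/s, so committed plus expired accounts for everything submitted.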


✅ Forge suite realistic_env_max_load success on cb386360e848aba2bc230f60d10718e9c49f77df

two traffics test: inner traffic : committed: 7628 txn/s, latency: 4997 ms, (p50: 4500 ms, p90: 6200 ms, p99: 11400 ms), latency samples: 3295620
two traffics test : committed: 100 txn/s, latency: 2198 ms, (p50: 2000 ms, p90: 2300 ms, p99: 7000 ms), latency samples: 1760
Latency breakdown for phase 0: ["QsBatchToPos: max: 0.246, avg: 0.206", "QsPosToProposal: max: 0.184, avg: 0.155", "ConsensusProposalToOrdered: max: 0.548, avg: 0.533", "ConsensusOrderedToCommit: max: 0.395, avg: 0.376", "ConsensusProposalToCommit: max: 0.942, avg: 0.909"]
Max round gap was 1 [limit 4] at version 873472. Max no progress secs was 4.821868 [limit 15] at version 3261394.
Test Ok
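
The latency breakdown above is internally consistent: the average for ConsensusProposalToOrdered (0.533) plus the average for ConsensusOrderedToCommit (0.376) equals the average for ConsensusProposalToCommit (0.909). The sketch below checks that sum and the two reported limits. The helper names are hypothetical, and treating each limit as an inclusive upper bound is an assumption.

```rust
/// True when the stage latencies add up to the end-to-end latency
/// within the given tolerance (all values in the same unit).
fn sums_to(parts: &[f64], total: f64, tolerance: f64) -> bool {
    let sum: f64 = parts.iter().sum();
    (sum - total).abs() <= tolerance
}

/// True when an observed value does not exceed its limit.
/// Assumption: the limit is an inclusive upper bound.
fn within_limit(observed: f64, limit: f64) -> bool {
    observed <= limit
}
```

The maxima need not add up the same way (0.548 + 0.395 is 0.943, against a reported 0.942), since the per-stage maxima may come from different blocks.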

@vusirikala vusirikala merged commit 76d8532 into main Feb 21, 2024
65 of 81 checks passed
@vusirikala vusirikala deleted the satya/commit-vote-broadcast branch February 21, 2024 20:44
zjma added a commit that referenced this pull request Feb 22, 2024
* Fix `iss`-related bug in Groth16 path & refactor (#12017)

Co-authored-by: Oliver <[email protected]>

* [aptosvm] Simplify VM flows (#11888)

* Duplicated logic for creating the gas meter for view functions has been removed.
* Duplicated logic for calculating gas used for view functions has been removed.
* There was unreachable code in failure transaction cleanup, where the discarded
status has been returned immediately, but then re-checked again. The first check
is shifted inside.
* No more default transaction metadata.
* Scripts are now validated consistently.
* Simplifies transaction execution function signature to avoid `Option<String>`.
* Removes duplicated features from `AptosVM` and keeps them in `MoveVMExt`.
* Fixes a bug when script hash was not computed for `RunOnAbort`.

Related tests are moved  to `move-e2e-tests`.

* [Compiler V2] Critical edge elimination (#11894)

Implement a pass to eliminate critical edges by splitting them with empty blocks

* [consensus configs] reduce sending block size from 2500 to 1900 (#12091)

### Description

The block output limit is no longer hit with p2p txns.

### Test Plan

Forge `realistic_env_max_load` TPS improves.

* [Indexer-grpc] Add profiling support. (#12034)

* Minor aggregator cleanup (#12013)

* Minor aggregator cleanup

* Addressing PR comments

* [move] rotate_authentication_key_call should not modify OriginatingAddress (#12108)

Co-authored-by: Alin Tomescu <[email protected]>

* [Data Streaming Service] Add dynamic prefetching support

* [Data Streaming Service] Add dynamic prefetching unit tests.

* [Data Streaming Service] Update existing integration tests.

* [State Sync] Add backpressure to fast sync receiver.

* Update perf baseline for gas charging coverage improvements (reducing throughput) (#12124)

* Reduce latency of cloning network sender using Arc pointers (#12103)

* Avoid cloning network sender using Arc pointers

* Removing a clone

* 100 node sweep test

* Removing a few clone operations

* reset forge test

* Removing some clones

* Removing clones

* adopt AIP-61 terminology for consistency (#12123)

adopt AIP-61 terminology for consistency

* [Consensus] Remove non-decoupled execution and refactor for cleaner interfaces (#12104)

* fix jwk key logging (#12090)

* remove spurious error lines (#12137)

* randomness #1: types update from randomnet (#12106)

* types update from randomnet

* update

* lint

* lint

* All validators broadcast commit vote messages (#12059)

* All validators broadcast commit messages

* Forge testing

* Increase timeout for forge

* test forge realistic_env_workload_sweep_test

* run realistic_env_workload_sweep_test

* run realistic_env_workload_sweep_test

* run sweep test

* increase forge runner duration

* forge testing

* Letting the proposer also broadcast commit decision for backward compatibility

* removing forge changes

* Added a TODO

* [vm] Resource access control: runtime engine (#10544)

* [vm] Resource access control: runtime engine

Implements the runtime engine for resource access control:

- a representation of access control specifiers in `loaded_data::runtime_access_specifiers`.
- a loader for access specifiers in `runtime::loader::access_specifier_loader`.
- a new stateful object representing the access control logic in `runtime::access_control`.
- finally the use of the `AccessControlState` in `runtime::interpreter`.

* Addressing reviewer comments.

* Addressing reviewer comments.

* typo: PTLA -> PTAL

* Rebasing: adjusting to upstream changes

* Rebasing

* ObjectCodeDeployment API cleanup update (#12133)

* ObjectCodeDeployment API cleanup update (#12141)

* [Compiler-v2] porting more V1 unit tests to V2 (#12085)

* update tests

* fix bug

* fix-12116

* fix missing space

* add expected got

* remove live-var tests

* fix had_erros

* fix

* Enable the max object nesting check (#12129)

* Resolved the warning for unused variable (#12157)

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* Squashed commit of the following:

commit a50ffec
Author: Zhoujun Ma <[email protected]>
Date:   Thu Feb 22 21:10:12 2024 +0000

    lint

commit 388350f
Author: zhoujun.ma <[email protected]>
Date:   Thu Feb 22 13:04:28 2024 -0800

    update

commit 76f7eca
Author: zhoujun.ma <[email protected]>
Date:   Thu Feb 22 12:56:04 2024 -0800

    update

commit a663542
Author: zhoujun.ma <[email protected]>
Date:   Thu Feb 22 12:54:18 2024 -0800

    update

commit b439449
Author: zhoujun.ma <[email protected]>
Date:   Thu Feb 22 12:34:14 2024 -0800

    update

commit 3378ceb
Author: zhoujun.ma <[email protected]>
Date:   Thu Feb 22 12:17:06 2024 -0800

    update

commit 6cd6685
Author: zhoujun.ma <[email protected]>
Date:   Thu Feb 22 12:15:05 2024 -0800

    update

commit 6d89f37
Author: zhoujun.ma <[email protected]>
Date:   Thu Feb 22 12:13:51 2024 -0800

    update

commit 980f257
Author: zhoujun.ma <[email protected]>
Date:   Thu Feb 22 12:12:04 2024 -0800

    update

commit 16e9349
Author: Zhoujun Ma <[email protected]>
Date:   Thu Feb 22 18:25:08 2024 +0000

    lint

---------

Co-authored-by: Alin Tomescu <[email protected]>
Co-authored-by: Oliver <[email protected]>
Co-authored-by: George Mitenkov <[email protected]>
Co-authored-by: Zekun Wang <[email protected]>
Co-authored-by: Brian (Sunghoon) Cho <[email protected]>
Co-authored-by: Guoteng Rao <[email protected]>
Co-authored-by: Satya Vusirikala <[email protected]>
Co-authored-by: David Wolinsky <[email protected]>
Co-authored-by: Josh Lind <[email protected]>
Co-authored-by: igor-aptos <[email protected]>
Co-authored-by: Sital Kedia <[email protected]>
Co-authored-by: Wolfgang Grieskamp <[email protected]>
Co-authored-by: Teng Zhang <[email protected]>
Co-authored-by: Junkil Park <[email protected]>
vusirikala added a commit that referenced this pull request Feb 23, 2024
* All validators broadcast commit messages

* Forge testing

* Increase timeout for forge

* test forge realistic_env_workload_sweep_test

* run realistic_env_workload_sweep_test

* run realistic_env_workload_sweep_test

* run sweep test

* increase forge runner duration

* forge testing

* Letting the proposer also broadcast commit decision for backward compatibility

* removing forge changes

* Added a TODO
vusirikala added a commit that referenced this pull request Mar 26, 2024
Labels
CICD:run-e2e-tests when this label is present github actions will run all land-blocking e2e tests from the PR
5 participants