[Consensus] Remove non-decoupled execution and refactor for cleaner interfaces #12104

Merged
sitalkedia merged 2 commits into main from public_remove_decoupled_execution on Feb 21, 2024

Conversation

sitalkedia
Contributor

Description

  1. Decoupled execution is now the default, so this PR simplifies the code by removing the non-decoupled execution code path.
  2. Remove OrderedStateComputer and replace it with a narrower interface, ExecutionClient, which gives consensus a cleaner API for interacting with pipelined execution.
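
For readers unfamiliar with the pattern, here is a minimal sketch of what a narrow execution-facing interface can look like. It is not the trait added in this PR: `OrderedBlock`, `QueueingClient`, `order_rounds`, and the method names are hypothetical and chosen only for illustration. The idea is that consensus hands ordered blocks to a trait object and returns immediately, while execution and commit drain the queue as a separate stage.

```rust
use std::collections::VecDeque;

/// Hypothetical stand-in for a block that consensus has ordered.
#[derive(Debug, Clone, PartialEq)]
pub struct OrderedBlock {
    pub round: u64,
    pub payload: Vec<u8>,
}

/// Hypothetical narrow interface between consensus and execution.
/// Method names are illustrative, not the trait from this PR.
pub trait ExecutionClient {
    /// Hand ordered blocks to the pipeline without waiting for execution.
    fn finalize_order(&mut self, blocks: Vec<OrderedBlock>) -> Result<(), String>;
    /// Drop all pending (not yet committed) work, e.g. at an epoch boundary.
    fn reset(&mut self);
}

/// Toy implementation: ordering enqueues, a separate step commits.
#[derive(Default)]
pub struct QueueingClient {
    pending: VecDeque<OrderedBlock>,
    committed_rounds: Vec<u64>,
}

impl QueueingClient {
    /// Simulates the execution/commit stage draining one block.
    pub fn commit_next(&mut self) -> Option<u64> {
        let block = self.pending.pop_front()?;
        self.committed_rounds.push(block.round);
        Some(block.round)
    }

    pub fn pending_len(&self) -> usize {
        self.pending.len()
    }

    pub fn committed_rounds(&self) -> &[u64] {
        &self.committed_rounds
    }
}

impl ExecutionClient for QueueingClient {
    fn finalize_order(&mut self, blocks: Vec<OrderedBlock>) -> Result<(), String> {
        // Rounds must strictly increase relative to the newest known block.
        let mut last = self
            .pending
            .back()
            .map(|b| b.round)
            .or(self.committed_rounds.last().copied());
        for block in &blocks {
            if let Some(prev) = last {
                if block.round <= prev {
                    return Err(format!("round {} is not after round {}", block.round, prev));
                }
            }
            last = Some(block.round);
        }
        self.pending.extend(blocks);
        Ok(())
    }

    fn reset(&mut self) {
        self.pending.clear();
    }
}

/// Consensus-side code depends only on the trait, not on a concrete pipeline.
pub fn order_rounds(client: &mut dyn ExecutionClient, rounds: &[u64]) -> Result<(), String> {
    let blocks = rounds
        .iter()
        .map(|&round| OrderedBlock { round, payload: Vec::new() })
        .collect();
    client.finalize_order(blocks)
}
```

Because the caller only sees `dyn ExecutionClient`, a mock can be swapped in for tests without touching the consensus-side code.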

Test Plan

Forge tests and unit tests.

trunk-io bot commented Feb 19, 2024

⏱️ 15h 13m total CI duration on this PR
| Job | Cumulative Duration | Recent Runs |
|---|---|---|
| rust-unit-coverage | 4h 17m | 🟩 |
| rust-smoke-coverage | 3h 16m | 🟩 |
| windows-build | 1h 25m | 🟩🟩🟩🟩 |
| execution-performance / single-node-performance | 1h 4m | 🟩🟩🟩 |
| rust-unit-tests | 1h 2m | 🟩🟩 |
| rust-smoke-tests | 1h 2m | 🟩🟩 |
| rust-images / rust-all | 35m | 🟩🟩 |
| forge-compat-test / forge | 29m | 🟩🟩 |
| forge-e2e-test / forge | 27m | 🟩🟩 |
| rust-lints | 21m | 🟩🟩 |
| run-tests-main-branch | 20m | 🟥🟥🟥 |
| cli-e2e-tests / run-cli-tests | 18m | 🟥🟥 |
| check | 11m | 🟩🟩 |
| check-dynamic-deps | 8m | 🟩🟩🟩🟩 |
| general-lints | 6m | 🟩🟩 |
| indexer-grpc-e2e-tests / test-indexer-grpc-docker-compose | 4m | 🟩🟩 |
| node-api-compatibility-tests / node-api-compatibility-tests | 2m | 🟩🟩 |
| semgrep/ci | 2m | 🟩🟩🟩🟩 |
| file_change_determinator | 52s | 🟩🟩🟩🟩🟩 |
| file_change_determinator | 35s | 🟩🟩🟩 |
| file_change_determinator | 35s | 🟩🟩🟩🟩 |
| execution-performance / file_change_determinator | 28s | 🟩🟩🟩 |
| permission-check | 15s | 🟩🟩🟩🟩🟩 |
| permission-check | 14s | 🟩🟩🟩🟩🟩 |
| determine-docker-build-metadata | 11s | 🟩🟩🟩🟩 |
| upload-to-codecov | 11s | 🟩 |
| permission-check | 11s | 🟩🟩🟩🟩🟩 |
| permission-check | 9s | 🟩🟩🟩🟩 |
| permission-check | 8s | 🟩🟩🟩🟩🟩 |

🚨 1 job on the last run was significantly faster/slower than expected

| Job | Duration | vs 7d avg | Delta |
|---|---|---|---|
| rust-images / rust-all | 16m | 13m | +22% |

@sitalkedia sitalkedia added the CICD:run-e2e-tests label (when this label is present, GitHub Actions will run all land-blocking e2e tests from the PR) on Feb 19, 2024

codecov bot commented Feb 20, 2024

Codecov Report

Attention: 273 lines in your changes are missing coverage. Please review.

Comparison is base (d771cec) 71.4% compared to head (dc83ecc) 69.8%.
Report is 6 commits behind head on main.

❗ Current head dc83ecc differs from pull request most recent head 26d59a2. Consider uploading reports for the commit 26d59a2 to get more accurate results

| Files | Patch % | Lines |
|---|---|---|
| consensus/src/pipeline/execution_client.rs | 4.8% | 217 Missing ⚠️ |
| consensus/src/epoch_manager.rs | 47.0% | 18 Missing ⚠️ |
| consensus/src/test_utils/mock_execution_client.rs | 87.0% | 13 Missing ⚠️ |
| consensus/src/consensus_provider.rs | 0.0% | 12 Missing ⚠️ |
| consensus/src/test_utils/mock_state_computer.rs | 57.1% | 6 Missing ⚠️ |
| consensus/src/recovery_manager.rs | 0.0% | 3 Missing ⚠️ |
| consensus/src/block_storage/block_store.rs | 94.5% | 2 Missing ⚠️ |
| consensus/src/dag/bootstrap.rs | 80.0% | 1 Missing ⚠️ |
| consensus/src/twins/twins_node.rs | 75.0% | 1 Missing ⚠️ |
Additional details and impacted files
@@             Coverage Diff             @@
##             main   #12104       +/-   ##
===========================================
- Coverage    71.4%    69.8%     -1.6%     
===========================================
  Files         810     2231     +1421     
  Lines      184821   419640   +234819     
===========================================
+ Hits       132005   293316   +161311     
- Misses      52816   126324    +73508     

@bchocho bchocho changed the title from "[Consensus] Remove decoupled execution and refactor for cleaner interfaces" to "[Consensus] Remove non-decoupled execution and refactor for cleaner interfaces" on Feb 20, 2024
@sitalkedia sitalkedia force-pushed the public_remove_decoupled_execution branch from 542963a to 26d59a2 Compare February 21, 2024 19:39
@sitalkedia sitalkedia enabled auto-merge (squash) February 21, 2024 19:39

✅ Forge suite compat success on aptos-node-v1.9.5 ==> 26d59a2f601ec96bf89e5cde0c13122b200e25ba

Compatibility test results for aptos-node-v1.9.5 ==> 26d59a2f601ec96bf89e5cde0c13122b200e25ba (PR)
1. Check liveness of validators at old version: aptos-node-v1.9.5
compatibility::simple-validator-upgrade::liveness-check : committed: 6578 txn/s, latency: 5067 ms, (p50: 4800 ms, p90: 8900 ms, p99: 10100 ms), latency samples: 230240
2. Upgrading first Validator to new version: 26d59a2f601ec96bf89e5cde0c13122b200e25ba
compatibility::simple-validator-upgrade::single-validator-upgrade : committed: 1613 txn/s, latency: 16943 ms, (p50: 17500 ms, p90: 22600 ms, p99: 23500 ms), latency samples: 83900
3. Upgrading rest of first batch to new version: 26d59a2f601ec96bf89e5cde0c13122b200e25ba
compatibility::simple-validator-upgrade::half-validator-upgrade : committed: 336 txn/s, submitted: 684 txn/s, expired: 347 txn/s, latency: 22568 ms, (p50: 13900 ms, p90: 54200 ms, p99: 57500 ms), latency samples: 27897
4. upgrading second batch to new version: 26d59a2f601ec96bf89e5cde0c13122b200e25ba
compatibility::simple-validator-upgrade::rest-validator-upgrade : committed: 2974 txn/s, latency: 10194 ms, (p50: 9600 ms, p90: 16800 ms, p99: 17800 ms), latency samples: 130880
5. check swarm health
Compatibility test for aptos-node-v1.9.5 ==> 26d59a2f601ec96bf89e5cde0c13122b200e25ba passed
Test Ok

✅ Forge suite realistic_env_max_load success on 26d59a2f601ec96bf89e5cde0c13122b200e25ba

two traffics test: inner traffic : committed: 7546 txn/s, latency: 5194 ms, (p50: 5000 ms, p90: 6300 ms, p99: 10800 ms), latency samples: 3260220
two traffics test : committed: 100 txn/s, latency: 2133 ms, (p50: 2000 ms, p90: 2400 ms, p99: 3700 ms), latency samples: 1780
Latency breakdown for phase 0: ["QsBatchToPos: max: 0.219, avg: 0.202", "QsPosToProposal: max: 0.447, avg: 0.399", "ConsensusProposalToOrdered: max: 0.567, avg: 0.522", "ConsensusOrderedToCommit: max: 0.463, avg: 0.443", "ConsensusProposalToCommit: max: 0.999, avg: 0.966"]
Max round gap was 1 [limit 4] at version 1504644. Max no progress secs was 6.137769 [limit 15] at version 1504644.
Test Ok

@sitalkedia sitalkedia merged commit 055683a into main Feb 21, 2024
62 of 81 checks passed
@sitalkedia sitalkedia deleted the public_remove_decoupled_execution branch February 21, 2024 20:10
zjma added a commit that referenced this pull request Feb 22, 2024
* Fix `iss`-related bug in Groth16 path & refactor (#12017)

Co-authored-by: Oliver <[email protected]>

* [aptosvm] Simplify VM flows (#11888)

* Duplicated logic for creating the gas meter for view functions has been removed.
* Duplicated logic for calculating gas used for view functions has been removed.
* Failure transaction cleanup contained unreachable code: the discarded
status was returned immediately but then checked again. The first check
has been moved inside.
* No more default transaction metadata.
* Scripts are now validated consistently.
* Simplifies transaction execution function signature to avoid `Option<String>`.
* Removes duplicated features from `AptosVM` and keeps them in `MoveVMExt`.
* Fixes a bug when script hash was not computed for `RunOnAbort`.

Related tests are moved to `move-e2e-tests`.

* [Compiler V2] Critical edge elimination (#11894)

Implement a pass to eliminate critical edges by splitting them with empty blocks

* [consensus configs] reduce sending block size from 2500 to 1900 (#12091)

### Description

The block output limit is no longer hit with p2p txns.

### Test Plan

Forge `realistic_env_max_load` TPS improves.

* [Indexer-grpc] Add profiling support. (#12034)

* Minor aggregator cleanup (#12013)

* Minor aggregator cleanup

* Addressing PR comments

* [move] rotate_authentication_key_call should not modify OriginatingAddress (#12108)

Co-authored-by: Alin Tomescu <[email protected]>

* [Data Streaming Service] Add dynamic prefetching support

* [Data Streaming Service] Add dynamic prefetching unit tests.

* [Data Streaming Service] Update existing integration tests.

* [State Sync] Add backpressure to fast sync receiver.

* Update perf baseline for gas charging coverage improvements (reducing throughput) (#12124)

* Reduce latency of cloning network sender using Arc pointers (#12103)

* Avoid cloning network sender using Arc pointers

* Removing a clone

* 100 node sweep test

* Removing a few clone operations

* reset forge test

* Removing some clones

* Removing clones

* adopt AIP-61 terminology for consistency (#12123)

adopt AIP-61 terminology for consistency

* [Consensus] Remove non-decoupled execution and refactor for cleaner interfaces (#12104)

* fix jwk key logging (#12090)

* remove spurious error lines (#12137)

* randomness #1: types update from randomnet (#12106)

* types update from randomnet

* update

* lint

* lint

* All validators broadcast commit vote messages (#12059)

* All validators broadcast commit messages

* Forge testing

* Increase timeout for forge

* test forge realistic_env_workload_sweep_test

* run realistic_env_workload_sweep_test

* run realistic_env_workload_sweep_test

* run sweep test

* increase forge runner duration

* forge testing

* Letting the proposer also broadcast commit decision for backward compatibility

* removing forge changes

* Added a TODO

* [vm] Resource access control: runtime engine (#10544)

* [vm] Resource access control: runtime engine

Implements the runtime engine for resource access control:

- a representation of access control specifiers in `loaded_data::runtime_access_specifiers`.
- a loader for access specifiers in `runtime::loader::access_specifier_loader`.
- a new stateful object representing the access control logic in `runtime::access_control`.
- finally the use of the `AccessControlState` in `runtime::interpreter`.

* Addressing reviewer comments.

* Addressing reviewer comments.

* typo: PTLA -> PTAL

* Rebasing: adjusting to upstream changes

* Rebasing

* ObjectCodeDeployment API cleanup update (#12133)

* ObjectCodeDeployment API cleanup update (#12141)

* [Compiler-v2] porting more V1 unit tests to V2 (#12085)

* update tests

* fix bug

* fix-12116

* fix missing space

* add expected got

* remove live-var tests

* fix had_erros

* fix

* Enable the max object nesting check (#12129)

* Resolved the warning for unused variable (#12157)

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* Squashed commit of the following:

commit a50ffec
Author: Zhoujun Ma <[email protected]>
Date:   Thu Feb 22 21:10:12 2024 +0000

    lint

commit 388350f
Author: zhoujun.ma <[email protected]>
Date:   Thu Feb 22 13:04:28 2024 -0800

    update

commit 76f7eca
Author: zhoujun.ma <[email protected]>
Date:   Thu Feb 22 12:56:04 2024 -0800

    update

commit a663542
Author: zhoujun.ma <[email protected]>
Date:   Thu Feb 22 12:54:18 2024 -0800

    update

commit b439449
Author: zhoujun.ma <[email protected]>
Date:   Thu Feb 22 12:34:14 2024 -0800

    update

commit 3378ceb
Author: zhoujun.ma <[email protected]>
Date:   Thu Feb 22 12:17:06 2024 -0800

    update

commit 6cd6685
Author: zhoujun.ma <[email protected]>
Date:   Thu Feb 22 12:15:05 2024 -0800

    update

commit 6d89f37
Author: zhoujun.ma <[email protected]>
Date:   Thu Feb 22 12:13:51 2024 -0800

    update

commit 980f257
Author: zhoujun.ma <[email protected]>
Date:   Thu Feb 22 12:12:04 2024 -0800

    update

commit 16e9349
Author: Zhoujun Ma <[email protected]>
Date:   Thu Feb 22 18:25:08 2024 +0000

    lint

---------

Co-authored-by: Alin Tomescu <[email protected]>
Co-authored-by: Oliver <[email protected]>
Co-authored-by: George Mitenkov <[email protected]>
Co-authored-by: Zekun Wang <[email protected]>
Co-authored-by: Brian (Sunghoon) Cho <[email protected]>
Co-authored-by: Guoteng Rao <[email protected]>
Co-authored-by: Satya Vusirikala <[email protected]>
Co-authored-by: David Wolinsky <[email protected]>
Co-authored-by: Josh Lind <[email protected]>
Co-authored-by: igor-aptos <[email protected]>
Co-authored-by: Sital Kedia <[email protected]>
Co-authored-by: Wolfgang Grieskamp <[email protected]>
Co-authored-by: Teng Zhang <[email protected]>
Co-authored-by: Junkil Park <[email protected]>

3 participants