Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Fix](partial update) Persist partial_update_info in RocksDB in case of BE restart after a partial update has commited #38331

Merged
merged 3 commits into from
Aug 7, 2024

Conversation

bobhan1
Copy link
Contributor

@bobhan1 bobhan1 commented Jul 24, 2024

Proposed changes

If a partial update has conflict with another load during publish phase, it should combine the two load's data into one to get the corrrect result. This procedure needs partial update info. But If BE crashed after the partial update load has committed, the partial update info will be missing becasuse it's not persisted and will not be restored in DataDir::load(). This PR persists partial update info in RocksDB before the txn is commited and remove it after the publish phase.
Before #25147, partial update info is persisted with tablet_schema in RocksDB. #25147 split partial update info from tablet schema but forget to handle the persistence logic.

branch-2.1-pick: #39035
branch-2.0-pick: #39078

@doris-robot
Copy link

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR

Since 2024-03-18, the Document has been moved to doris-website.
See Doris Document.

@bobhan1 bobhan1 force-pushed the persist-partial-update-info branch from 236a525 to e79f107 Compare July 24, 2024 14:57
Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-tidy made some suggestions


#include "olap/partial_update_info.h"

#include <gen_cpp/olap_file.pb.h>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: 'gen_cpp/olap_file.pb.h' file not found [clang-diagnostic-error]

#include <gen_cpp/olap_file.pb.h>
         ^


struct PartialUpdateInfo {
void init(const TabletSchema& tablet_schema, bool partial_update,
const std::set<string>& partial_update_cols, bool is_strict_mode,
const std::set<std::string>& partial_update_cols, bool is_strict_mode,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: no template named 'set' in namespace 'std' [clang-diagnostic-error]

              const std::set<std::string>& partial_update_cols, bool is_strict_mode,
                         ^

@@ -18,6 +18,8 @@
#ifndef DORIS_BE_SRC_OLAP_ROWSET_ROWSET_META_MANAGER_H
#define DORIS_BE_SRC_OLAP_ROWSET_ROWSET_META_MANAGER_H

#include <gen_cpp/olap_file.pb.h>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: 'gen_cpp/olap_file.pb.h' file not found [clang-diagnostic-error]

#include <gen_cpp/olap_file.pb.h>
         ^

@bobhan1
Copy link
Contributor Author

bobhan1 commented Jul 25, 2024

run buildall

1 similar comment
@bobhan1
Copy link
Contributor Author

bobhan1 commented Jul 25, 2024

run buildall

@bobhan1 bobhan1 force-pushed the persist-partial-update-info branch from ad2d9a9 to efe2ea6 Compare July 25, 2024 11:54
@bobhan1
Copy link
Contributor Author

bobhan1 commented Jul 25, 2024

run buildall

@bobhan1 bobhan1 force-pushed the persist-partial-update-info branch from efe2ea6 to 762e3ef Compare July 29, 2024 02:15
@bobhan1
Copy link
Contributor Author

bobhan1 commented Jul 29, 2024

run buildall

@bobhan1 bobhan1 force-pushed the persist-partial-update-info branch 2 times, most recently from 68d404b to 4095139 Compare July 31, 2024 03:32
@dataroaring
Copy link
Contributor

run buildall

@bobhan1 bobhan1 force-pushed the persist-partial-update-info branch from 4095139 to f9af921 Compare August 1, 2024 05:54
@github-actions github-actions bot added the doing label Aug 1, 2024
tmp

tmp

tmp

tmp

finish

finish case

tmp

fix

fix complie

tmp

tmp

tmp

tmp
@bobhan1 bobhan1 force-pushed the persist-partial-update-info branch from f9af921 to d4b8f3b Compare August 1, 2024 06:08
zhannngchen
zhannngchen previously approved these changes Aug 1, 2024
Copy link
Contributor

@zhannngchen zhannngchen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Aug 1, 2024
Copy link
Contributor

github-actions bot commented Aug 1, 2024

PR approved by at least one committer and no changes requested.

Copy link
Contributor

github-actions bot commented Aug 1, 2024

PR approved by anyone and no changes requested.

@github-actions github-actions bot removed the approved Indicates a PR has been approved by one committer. label Aug 1, 2024
@bobhan1 bobhan1 force-pushed the persist-partial-update-info branch from be9266f to 628fb44 Compare August 1, 2024 06:33
@bobhan1
Copy link
Contributor Author

bobhan1 commented Aug 1, 2024

run buildall

zhannngchen
zhannngchen previously approved these changes Aug 1, 2024
Copy link
Contributor

@zhannngchen zhannngchen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Aug 1, 2024
Copy link
Contributor

@zhannngchen zhannngchen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Aug 2, 2024
Copy link
Contributor

github-actions bot commented Aug 2, 2024

PR approved by at least one committer and no changes requested.

Copy link
Contributor

@hust-hhb hust-hhb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@dataroaring dataroaring merged commit 0b834be into apache:master Aug 7, 2024
31 of 33 checks passed
zhannngchen pushed a commit that referenced this pull request Aug 8, 2024
… in RocksDB in case of BE restart after a partial update has commited #38331" (#39035)

picks #38331 and
#39066
zhannngchen pushed a commit that referenced this pull request Aug 8, 2024
dataroaring pushed a commit that referenced this pull request Aug 9, 2024
… in RocksDB in case of BE restart after a partial update has commited #38331" (#39078)

picks #38331 and
#39066
dataroaring pushed a commit that referenced this pull request Aug 11, 2024
…of BE restart after a partial update has commited (#38331)

## Proposed changes
If a partial update has conflict with another load during publish phase,
it should combine the two load's data into one to get the corrrect
result. This procedure needs partial update info. But If BE crashed
after the partial update load has committed, the partial update info
will be missing becasuse it's not persisted and will not be restored in
`DataDir::load()`. This PR persists partial update info in RocksDB
before the txn is commited and remove it after the publish phase.
Before #25147, partial update info
is persisted with tablet_schema in RocksDB.
#25147 split partial update info
from tablet schema but forget to handle the persistence logic.
dataroaring pushed a commit that referenced this pull request Aug 11, 2024
wyxxxcat pushed a commit to wyxxxcat/doris that referenced this pull request Aug 14, 2024
…of BE restart after a partial update has commited (apache#38331)

## Proposed changes
If a partial update has conflict with another load during publish phase,
it should combine the two load's data into one to get the corrrect
result. This procedure needs partial update info. But If BE crashed
after the partial update load has committed, the partial update info
will be missing becasuse it's not persisted and will not be restored in
`DataDir::load()`. This PR persists partial update info in RocksDB
before the txn is commited and remove it after the publish phase.
Before apache#25147, partial update info
is persisted with tablet_schema in RocksDB.
apache#25147 split partial update info
from tablet schema but forget to handle the persistence logic.
wyxxxcat pushed a commit to wyxxxcat/doris that referenced this pull request Aug 14, 2024
dataroaring pushed a commit that referenced this pull request Aug 16, 2024
…of BE restart after a partial update has commited (#38331)

## Proposed changes
If a partial update has conflict with another load during publish phase,
it should combine the two load's data into one to get the corrrect
result. This procedure needs partial update info. But If BE crashed
after the partial update load has committed, the partial update info
will be missing becasuse it's not persisted and will not be restored in
`DataDir::load()`. This PR persists partial update info in RocksDB
before the txn is commited and remove it after the publish phase.
Before #25147, partial update info
is persisted with tablet_schema in RocksDB.
#25147 split partial update info
from tablet schema but forget to handle the persistence logic.
dataroaring pushed a commit that referenced this pull request Aug 16, 2024
GoGoWen pushed a commit to GoGoWen/incubator-doris that referenced this pull request Aug 27, 2024
… in RocksDB in case of BE restart after a partial update has commited apache#38331" (apache#39078)

picks apache#38331 and
apache#39066
bobhan1 added a commit to bobhan1/doris that referenced this pull request Jan 21, 2025
… in RocksDB in case of BE restart after a partial update has commited apache#38331" (apache#39078)

picks apache#38331 and
apache#39066
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants