Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[opt](http) enable auth token with BE http request #41994

Merged
merged 11 commits into from
Nov 12, 2024

Conversation

morningman
Copy link
Contributor

@morningman morningman commented Oct 16, 2024

What problem does this PR solve?

Related PR: #39577

Problem Summary:
In #39577, we added the auth check for all HTTP API on FE side.
But it introduced an issue that when enable_all_http_auth, the internal http request
will fail due to lack of authentication info.
For example, when cloning replica from one BE to another, it use HTTP API.

This PR mainly changes:

  1. Unify the token generation and checking logic

    Move TokenManager from load package to Env, as a global mgr.

    It is responsible for generating tokens at fix interval.
    And the token will be sent to BE via heartbeat.

    BE will save last 2 tokens, and use the latest token in HTTP request.

    All HTTP request sent by BE will add a header Auth-Token,
    and BE's HTTP server will check if this token in header is same as token
    from FE heartbeat.

  2. Add a new class ClusterInfo on BE side to replace TMasterInfo.

    TMasterInfo is a thrift object used to save master info and pass them
    from FE to BE via heartbeat.
    So it should only be a message payload, we should get info from it and
    save it in another structure: ClusterInfo.

Check List (For Committer)

  • Test

    • Regression test
    • Unit Test
    • Manual test
      I created a cluster with 2 BE, set enable_all_http_auth=true. And create a table with 1 replica,
      and then modify the replica num to 2. The clone task run success.
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No colde files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.
  • Release note

    None

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@doris-robot
Copy link

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR

Since 2024-03-18, the Document has been moved to doris-website.
See Doris Document.

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-tidy made some suggestions

Comment on lines 282 to 289
} else if (_master_info->curr_auth_token != master_info.auth_token)
_master_info->__set_last_auth_token(_master_info->curr_auth_token);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: statement should be inside braces [readability-braces-around-statements]

Suggested change
} else if (_master_info->curr_auth_token != master_info.auth_token)
_master_info->__set_last_auth_token(_master_info->curr_auth_token);
} else if (_master_info->curr_auth_token != master_info.auth_token) {
_master_info->__set_last_auth_token(_master_info->curr_auth_token);
}

@@ -94,5 +94,6 @@ const char* HttpHeaders::WEBSOCKET_PROTOCOL = "WebSocket-Protocol";
const char* HttpHeaders::WWW_AUTHENTICATE = "WWW-Authenticate";

const std::string HttpHeaders::JsonType = "application/json";
const std::string HttpHeaders::AUTH_TOKEN = "Auth-Token";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: no member named 'AUTH_TOKEN' in 'doris::HttpHeaders' [clang-diagnostic-error]

const std::string HttpHeaders::AUTH_TOKEN = "Auth-Token";
                               ^

@morningman morningman force-pushed the auth_token branch 2 times, most recently from 2f02060 to 00880a0 Compare November 6, 2024 09:51
Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-tidy made some suggestions

uint32_t worker_thread_num, TMasterInfo* local_master_info) {
HeartbeatServer* heartbeat_server = new HeartbeatServer(local_master_info);
uint32_t worker_thread_num, ClusterInfo* cluster_info) {
HeartbeatServer* heartbeat_server = new HeartbeatServer(cluster_info);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: use auto when initializing with new to avoid duplicating the type name [modernize-use-auto]

Suggested change
HeartbeatServer* heartbeat_server = new HeartbeatServer(cluster_info);
auto* heartbeat_server = new HeartbeatServer(cluster_info);

@morningman
Copy link
Contributor Author

run buildall

@morningman
Copy link
Contributor Author

run buildall

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-tidy made some suggestions

@morningman morningman marked this pull request as ready for review November 7, 2024 16:16
@morningman
Copy link
Contributor Author

run buildall

@apache apache deleted a comment from github-actions bot Nov 7, 2024
// This class is used to save the cluster info
// like cluster id, epoch, cloud_unique_id, etc.
// These info are usually in heartbeat from Master FE.
class ClusterInfo {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Struct?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

class and struct is same?

int64_t backend_id = 0;

// Auth token for internal authentication
std::string curr_auth_token = "";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the relationship and connection between these "tokens"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added comment

@@ -43,6 +43,8 @@ struct TMasterInfo {
11: optional string cloud_unique_id;
// See configuration item Config.java rehash_tablet_after_be_dead_seconds for meaning
12: optional i64 tablet_report_inactive_duration_ms;
13: optional string last_auth_token;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems FE does not set it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed

@gavinchou
Copy link
Contributor

how do we get it tested?

@doris-robot
Copy link

TeamCity be ut coverage result:
Function Coverage: 37.90% (9865/26028)
Line Coverage: 29.06% (82054/282361)
Region Coverage: 28.29% (42269/149435)
Branch Coverage: 24.86% (21441/86252)
Coverage Report: http://coverage.selectdb-in.cc/coverage/4c3f88246c1136d8e7d8e273f326aaaa0ec208d4_4c3f88246c1136d8e7d8e273f326aaaa0ec208d4/report/index.html

@morningman
Copy link
Contributor Author

run buildall

@morningman
Copy link
Contributor Author

how do we get it tested?

already described in PR description. But we also need test it in a multi node env

@morningman morningman marked this pull request as draft November 8, 2024 00:01
Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-tidy made some suggestions


#pragma once

#include <gen_cpp/Types_types.h>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: 'gen_cpp/Types_types.h' file not found [clang-diagnostic-error]

#include <gen_cpp/Types_types.h>
         ^

@morningman
Copy link
Contributor Author

run buildall

@doris-robot
Copy link

TeamCity be ut coverage result:
Function Coverage: 37.91% (9868/26029)
Line Coverage: 29.11% (82315/282803)
Region Coverage: 28.25% (42359/149942)
Branch Coverage: 24.80% (21451/86500)
Coverage Report: http://coverage.selectdb-in.cc/coverage/8fe9ba30c4c51d151a4081261a22d46bc2caac5a_8fe9ba30c4c51d151a4081261a22d46bc2caac5a/report/index.html

@morningman morningman marked this pull request as ready for review November 11, 2024 08:31
Copy link
Contributor

PR approved by anyone and no changes requested.

Copy link
Contributor

@kaka11chen kaka11chen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

PR approved by at least one committer and no changes requested.

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Nov 12, 2024
@morningman morningman merged commit 7d290e1 into apache:master Nov 12, 2024
28 of 30 checks passed
github-actions bot pushed a commit that referenced this pull request Nov 12, 2024
### What problem does this PR solve?
Related PR: #39577

Problem Summary:
In #39577, we added the auth check for all HTTP API on FE side.
But it introduced an issue that when `enable_all_http_auth`, the
internal http request
will fail due to lack of authentication info.
For example, when cloning replica from one BE to another, it use HTTP
API.

This PR mainly changes:

1. Unify the token generation and checking logic

	Move `TokenManager` from `load` package to `Env`, as a global mgr.

	It is responsible for generating tokens at fix interval.
	And the token will be sent to BE via heartbeat.

	BE will save last 2 tokens, and use the latest token in HTTP request.

	All HTTP request sent by BE will add a header `Auth-Token`,
and BE's HTTP server will check if this token in header is same as token
	from FE heartbeat.

2. Add a new class `ClusterInfo` on BE side to replace `TMasterInfo`.

	`TMasterInfo` is a thrift object used to save master info and pass them
	from FE to BE via heartbeat.
	So it should only be a message payload, we should get info from it and
	save it in another structure: `ClusterInfo`.

Co-authored-by: morningman <[email protected]>
zzzxl1993 pushed a commit to zzzxl1993/doris that referenced this pull request Nov 12, 2024
### What problem does this PR solve?
Related PR: apache#39577

Problem Summary:
In apache#39577, we added the auth check for all HTTP API on FE side.
But it introduced an issue that when `enable_all_http_auth`, the
internal http request
will fail due to lack of authentication info.
For example, when cloning replica from one BE to another, it use HTTP
API.

This PR mainly changes:

1. Unify the token generation and checking logic

	Move `TokenManager` from `load` package to `Env`, as a global mgr.

	It is responsible for generating tokens at fix interval.
	And the token will be sent to BE via heartbeat.

	BE will save last 2 tokens, and use the latest token in HTTP request.

	All HTTP request sent by BE will add a header `Auth-Token`,
and BE's HTTP server will check if this token in header is same as token
	from FE heartbeat.

2. Add a new class `ClusterInfo` on BE side to replace `TMasterInfo`.

	`TMasterInfo` is a thrift object used to save master info and pass them
	from FE to BE via heartbeat.
	So it should only be a message payload, we should get info from it and
	save it in another structure: `ClusterInfo`.

Co-authored-by: morningman <[email protected]>
morningman added a commit that referenced this pull request Nov 12, 2024
Cherry-picked from #41994

Co-authored-by: Mingyu Chen (Rayner) <[email protected]>
Co-authored-by: morningman <[email protected]>
py023 pushed a commit to py023/doris that referenced this pull request Nov 13, 2024
### What problem does this PR solve?
Related PR: apache#39577

Problem Summary:
In apache#39577, we added the auth check for all HTTP API on FE side.
But it introduced an issue that when `enable_all_http_auth`, the
internal http request
will fail due to lack of authentication info.
For example, when cloning replica from one BE to another, it use HTTP
API.

This PR mainly changes:

1. Unify the token generation and checking logic

	Move `TokenManager` from `load` package to `Env`, as a global mgr.

	It is responsible for generating tokens at fix interval.
	And the token will be sent to BE via heartbeat.

	BE will save last 2 tokens, and use the latest token in HTTP request.

	All HTTP request sent by BE will add a header `Auth-Token`,
and BE's HTTP server will check if this token in header is same as token
	from FE heartbeat.

2. Add a new class `ClusterInfo` on BE side to replace `TMasterInfo`.

	`TMasterInfo` is a thrift object used to save master info and pass them
	from FE to BE via heartbeat.
	So it should only be a message payload, we should get info from it and
	save it in another structure: `ClusterInfo`.

Co-authored-by: morningman <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by one committer. dev/3.0.3-merged kind/behavior-changed reviewed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants