[Improve](inverted index) improve match performance without index #24751
run buildall
clang-tidy review says "All clean, LGTM! 👍"
TeamCity be ut coverage result:
f692b6d to a6dde5e
run buildall
clang-tidy review says "All clean, LGTM! 👍"
a6dde5e to 97feec2
run buildall
clang-tidy review says "All clean, LGTM! 👍"
LGTM
97feec2 to 352610b
run buildall
clang-tidy made some suggestions
@@ -282,18 +294,18 @@ Status FullTextIndexReader::query(OlapReaderStatistics* stats, RuntimeState* run
    if (query_type == InvertedIndexQueryType::MATCH_PHRASE_QUERY ||
        query_type == InvertedIndexQueryType::MATCH_ALL_QUERY ||
        query_type == InvertedIndexQueryType::EQUAL_QUERY) {
        std::wstring wstr_tokens;
        std::string str_tokens;
warning: variable 'str_tokens' is not initialized [cppcoreguidelines-init-variables]
Suggested change:
    std::string str_tokens;
    std::string str_tokens = 0;
    auto reader = doris::segment_v2::InvertedIndexReader::create_reader(
            &inverted_index_ctx, tokenize_str.to_string());

    std::vector<std::string> query_tokens =
warning: variable 'query_tokens' is not initialized [cppcoreguidelines-init-variables]
Suggested change:
    std::vector<std::string> query_tokens =
    std::vector<std::string> query_tokens = 0 =
TeamCity be ut coverage result:
run buildall
TeamCity be ut coverage result:
(From new machine) TeamCity pipeline, clickbench performance test result:
LGTM
Proposed changes
Issue Number: close #xxx
Improves performance of match queries that run without an inverted index: on the httplogs dataset, query time drops from 2 min 34.54 s to 9.16 s.
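To put those two timings in proportion, the speedup can be worked out directly. This is a minimal C++ sketch of that arithmetic only; the helper names `to_seconds` and `speedup` and the constants are illustrative and are not part of the PR or of Doris.

```cpp
// Illustrative helpers (hypothetical, not from the PR): convert a
// "minutes + seconds" timing to seconds and compute the speedup ratio.
constexpr double to_seconds(int minutes, double seconds) {
    return minutes * 60.0 + seconds;
}

constexpr double speedup(double before_s, double after_s) {
    return before_s / after_s;
}

// Timings quoted in the PR description (httplogs dataset).
constexpr double kBefore = to_seconds(2, 34.54);  // 2 min 34.54 s = 154.54 s
constexpr double kAfter = 9.16;                    // 9.16 s
constexpr double kSpeedup = speedup(kBefore, kAfter);
```

With the figures from the description, `kSpeedup` comes out at roughly 16.9, so the match query without an index runs about 17 times faster after the change.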
Further comments
If this is a relatively large or complex change, kick off the discussion on the project's developer mailing list by explaining why you chose the solution you did and what alternatives you considered.