[fix](schema scan) Fix invalid pointer access #48313

Merged 1 commit on Feb 26, 2025

@Gabriel39 Gabriel39 commented Feb 25, 2025

What problem does this PR solve?

The schema scanner runs on a separate thread and executes asynchronously. We must make sure that none of the context it uses is freed once the task has been scheduled.

ERROR: AddressSanitizer: heap-buffer-overflow on address 0x613002f33eb2 at pc 0x55e085dccbe3 bp 0x7f345c0e1f10 sp 0x7f345c0e1f08
READ of size 1 at 0x613002f33eb2 thread T2776 (FragmentMgrAsyn)
#0 0x55e085dccbe2 in std::__atomic_base::load(std::memory_order) const /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/atomic_base.h:481:9
#1 0x55e085dccbe2 in std::atomic::operator bool() const /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/atomic:87:22
#2 0x55e085dccbe2 in doris::SchemaScanner::get_next_block_async(doris::RuntimeState*)::$_0::operator()() const /home/zcp/repo_center/doris_master/doris/be/src/exec/schema_scanner.cpp:118:5
#3 0x55e085dccbe2 in void std::__invoke_impl(std::__invoke_other, doris::SchemaScanner::get_next_block_async(doris::RuntimeState*)::$_0&) /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:61:14
#4 0x55e085dccbe2 in std::enable_if, void>::type std::__invoke_r(doris::SchemaScanner::get_next_block_async(doris::RuntimeState*)::$_0&) /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:111:2
#5 0x55e085dccbe2 in std::_Function_handler::_M_invoke(std::_Any_data const&) /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/std_function.h:291:9
#6 0x55e050f081ca in doris::ThreadPool::dispatch_thread() /home/zcp/repo_center/doris_master/doris/be/src/util/threadpool.cpp:608:24
#7 0x55e050ede467 in doris::Thread::supervise_thread(void*) /home/zcp/repo_center/doris_master/doris/be/src/util/thread.cpp:498:5
#8 0x7f376ef5aac2 in start_thread nptl/pthread_create.c:442:8
#9 0x7f376efec84f misc/../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
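
The diff itself is not shown on this page, so the following is only a minimal sketch of the general technique the description points at, not the actual Doris change. It assumes the state read by the asynchronous task can live behind a `std::shared_ptr` that the task captures by value, so the state stays valid even if the object that scheduled the task is destroyed first. `ScanContext`, `Scanner`, `make_async_task`, `observe` and `run_demo` are hypothetical names introduced for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <memory>
#include <thread>

// Hypothetical stand-in for the state an asynchronous scan task reads.
struct ScanContext {
    std::atomic<bool> eos{false};
    int blocks_produced = 0;
};

// Hypothetical scanner that hands out work to be run on another thread.
class Scanner {
public:
    Scanner() : _ctx(std::make_shared<ScanContext>()) {}

    // The returned task captures a shared_ptr copy instead of `this`,
    // so the context stays alive for as long as the task exists.
    std::function<void()> make_async_task() {
        std::shared_ptr<ScanContext> ctx = _ctx;
        return [ctx]() {
            if (!ctx->eos.load()) {
                ++ctx->blocks_produced;
                ctx->eos.store(true);
            }
        };
    }

    // Non-owning handle, used here only to inspect the result afterwards.
    std::weak_ptr<ScanContext> observe() const { return _ctx; }

private:
    std::shared_ptr<ScanContext> _ctx;
};

// Destroys the scanner before the task runs, then runs the task on a thread.
int run_demo() {
    std::function<void()> task;
    std::weak_ptr<ScanContext> watcher;
    {
        Scanner scanner;
        task = scanner.make_async_task();
        watcher = scanner.observe();
    }  // scanner is gone here; the task still owns the context

    std::thread worker(task);
    worker.join();

    std::shared_ptr<ScanContext> ctx = watcher.lock();
    assert(ctx && "context must outlive the scanner while a task holds it");
    return ctx->blocks_produced;
}
```

Had the lambda captured `this` or a raw pointer to a member instead, the read of `eos` after the scanner's destruction would be the kind of invalid access AddressSanitizer reports above. Whether shared ownership or a wait-for-completion step on teardown is the better fit depends on the real code's lifetime rules.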

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

Thearas commented Feb 25, 2025

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@Gabriel39 (Contributor, Author) commented

run buildall

@doris-robot commented

TPC-H: Total hot run time: 31691 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit d058fbb575cd83598b9159051f33dcc31cacf28d, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	17630	5178	5094	5094
q2	2052	305	173	173
q3	10460	1330	713	713
q4	10228	1027	521	521
q5	7537	2408	2325	2325
q6	194	170	136	136
q7	902	735	611	611
q8	9315	1284	1011	1011
q9	5276	4846	4781	4781
q10	6822	2329	1902	1902
q11	478	282	256	256
q12	349	352	219	219
q13	17762	3664	3120	3120
q14	240	248	214	214
q15	513	470	476	470
q16	625	607	584	584
q17	558	847	348	348
q18	7049	6217	6308	6217
q19	1204	945	552	552
q20	316	327	194	194
q21	2891	2193	1954	1954
q22	373	349	296	296
Total cold run time: 102774 ms
Total hot run time: 31691 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	5125	5120	5101	5101
q2	236	337	232	232
q3	2170	2684	2298	2298
q4	1423	1807	1332	1332
q5	4263	4086	4185	4086
q6	202	164	123	123
q7	1852	1840	1699	1699
q8	2638	2620	2604	2604
q9	7306	7090	7147	7090
q10	2990	3186	2766	2766
q11	585	508	488	488
q12	684	746	616	616
q13	3536	3985	3236	3236
q14	297	316	277	277
q15	517	462	460	460
q16	640	697	650	650
q17	1162	1559	1347	1347
q18	7677	7278	7174	7174
q19	786	805	951	805
q20	1973	1991	1882	1882
q21	5439	4880	4985	4880
q22	626	594	515	515
Total cold run time: 52127 ms
Total hot run time: 49661 ms

@doris-robot commented

TPC-DS: Total hot run time: 183570 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit d058fbb575cd83598b9159051f33dcc31cacf28d, data reload: false

query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
query1	994	387	381	381
query2	6536	1880	1860	1860
query3	6788	216	206	206
query4	26721	23264	23700	23264
query5	4380	676	521	521
query6	320	223	185	185
query7	4613	506	295	295
query8	303	245	245	245
query9	8632	2511	2520	2511
query10	482	328	257	257
query11	15394	15209	14765	14765
query12	162	107	107	107
query13	1685	553	402	402
query14	9098	6947	6399	6399
query15	219	198	179	179
query16	7368	665	493	493
query17	1213	722	564	564
query18	1969	415	311	311
query19	198	191	163	163
query20	123	119	118	118
query21	208	126	112	112
query22	4187	4200	4312	4200
query23	34200	33006	32829	32829
query24	7745	2387	2419	2387
query25	521	451	387	387
query26	1223	268	158	158
query27	2111	483	325	325
query28	3902	2409	2377	2377
query29	752	544	437	437
query30	233	186	158	158
query31	912	873	816	816
query32	74	68	61	61
query33	567	348	310	310
query34	810	860	502	502
query35	797	815	762	762
query36	978	991	906	906
query37	114	97	75	75
query38	4091	4089	4226	4089
query39	1443	1431	1380	1380
query40	204	113	105	105
query41	55	54	50	50
query42	127	101	105	101
query43	494	522	489	489
query44	1275	795	789	789
query45	184	170	158	158
query46	869	1033	638	638
query47	1770	1800	1703	1703
query48	379	403	291	291
query49	807	533	417	417
query50	661	746	421	421
query51	4143	4188	4180	4180
query52	107	109	93	93
query53	228	251	187	187
query54	493	478	406	406
query55	80	74	83	74
query56	262	269	258	258
query57	1160	1153	1082	1082
query58	259	231	242	231
query59	2587	2737	2719	2719
query60	288	263	258	258
query61	124	123	121	121
query62	823	719	693	693
query63	234	189	192	189
query64	4355	1024	666	666
query65	3235	3114	3163	3114
query66	1138	416	301	301
query67	15704	15554	15162	15162
query68	8242	875	512	512
query69	470	314	275	275
query70	1206	1130	1019	1019
query71	466	307	286	286
query72	5353	3499	3798	3499
query73	788	741	349	349
query74	8915	9100	8676	8676
query75	3839	3178	2723	2723
query76	3709	1180	771	771
query77	784	365	282	282
query78	9980	10386	9177	9177
query79	2260	873	593	593
query80	610	524	436	436
query81	499	286	246	246
query82	637	130	102	102
query83	180	171	156	156
query84	238	90	79	79
query85	822	370	314	314
query86	337	293	283	283
query87	4617	4585	4404	4404
query88	3203	2211	2193	2193
query89	395	325	289	289
query90	1904	202	195	195
query91	142	140	110	110
query92	81	61	61	61
query93	1166	1065	576	576
query94	671	413	302	302
query95	355	269	257	257
query96	483	549	265	265
query97	3323	3404	3288	3288
query98	244	208	197	197
query99	1338	1425	1262	1262
Total cold run time: 271368 ms
Total hot run time: 183570 ms

@doris-robot commented

ClickBench: Total hot run time: 31.06 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit d058fbb575cd83598b9159051f33dcc31cacf28d, data reload: false

query	cold (s)	hot 1 (s)	hot 2 (s)
query1	0.04	0.03	0.03
query2	0.07	0.03	0.03
query3	0.24	0.07	0.06
query4	1.60	0.10	0.10
query5	0.56	0.57	0.56
query6	1.18	0.71	0.73
query7	0.02	0.02	0.01
query8	0.04	0.04	0.03
query9	0.59	0.53	0.53
query10	0.57	0.58	0.57
query11	0.16	0.10	0.11
query12	0.15	0.11	0.11
query13	0.64	0.60	0.59
query14	2.67	2.70	2.82
query15	0.93	0.84	0.86
query16	0.39	0.38	0.38
query17	1.05	1.03	1.03
query18	0.21	0.19	0.20
query19	1.92	2.00	1.82
query20	0.01	0.01	0.01
query21	15.38	0.91	0.54
query22	0.76	1.30	0.79
query23	14.78	1.40	0.63
query24	7.69	0.92	1.12
query25	0.51	0.23	0.08
query26	0.64	0.16	0.13
query27	0.05	0.05	0.04
query28	9.06	0.88	0.44
query29	12.55	4.02	3.30
query30	0.24	0.09	0.07
query31	2.82	0.58	0.39
query32	3.23	0.55	0.46
query33	3.08	3.01	2.99
query34	15.74	5.17	4.52
query35	4.58	4.58	4.54
query36	0.65	0.50	0.48
query37	0.10	0.07	0.07
query38	0.06	0.05	0.04
query39	0.03	0.02	0.03
query40	0.17	0.14	0.13
query41	0.08	0.03	0.03
query42	0.04	0.02	0.03
query43	0.04	0.04	0.03
Total cold run time: 105.32 s
Total hot run time: 31.06 s

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Feb 26, 2025

PR approved by at least one committer and no changes requested.

PR approved by anyone and no changes requested.

@Gabriel39 Gabriel39 merged commit 961a844 into apache:master Feb 26, 2025
31 of 33 checks passed
github-actions bot pushed a commit that referenced this pull request Feb 26, 2025
github-actions bot pushed a commit that referenced this pull request Feb 26, 2025
@Gabriel39 Gabriel39 added the p0_c label Feb 26, 2025
yiguolei pushed a commit that referenced this pull request Feb 26, 2025
dataroaring pushed a commit that referenced this pull request Feb 27, 2025
zhiqiang-hhhh pushed a commit to zhiqiang-hhhh/doris that referenced this pull request Feb 27, 2025
seawinde pushed a commit to seawinde/doris that referenced this pull request Feb 28, 2025
Labels: approved, dev/2.1.9-merged, dev/3.0.5-merged, p0_c, reviewed