Skip to content

Conversation

@bobhan1
Copy link
Contributor

@bobhan1 bobhan1 commented Jan 5, 2026

What problem does this PR solve?

Problem Summary:

Exception::to_string() may by accessed concurrently in the following situation:

  • thread A and thread B access the same DorisCallOnce object
  • An Exception e is throw when thread A is calling DorisCallOnce::call and e is thrown to thread A
  • e will be also thrown to thread B later by DorisCallOnce
  • thread A and thread B access Exception::to_string() concurrently

This may cause use-after-free due to the assignment of _cache_string in Exception::to_string

Considering that Exception should not be frequently used, this PR construct _cache_string in constructor Exception::Exception rather than lazily creating it. This can avoid additional unnessary synchronazation.

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@Thearas
Copy link
Contributor

Thearas commented Jan 5, 2026

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@bobhan1 bobhan1 force-pushed the remove-cached-string-in-Exception branch from 0d90f21 to df5240f Compare January 5, 2026 06:12
@bobhan1 bobhan1 force-pushed the remove-cached-string-in-Exception branch from df5240f to 394140e Compare January 5, 2026 06:16
@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Jan 5, 2026
@github-actions
Copy link
Contributor

github-actions bot commented Jan 5, 2026

PR approved by at least one committer and no changes requested.

@bobhan1
Copy link
Contributor Author

bobhan1 commented Jan 5, 2026

run buildall

@github-actions
Copy link
Contributor

github-actions bot commented Jan 5, 2026

PR approved by anyone and no changes requested.

@doris-robot
Copy link

Cloud UT Coverage Report

Increment line coverage 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 79.53% (1776/2233)
Line Coverage 64.83% (31447/48508)
Region Coverage 65.36% (15634/23918)
Branch Coverage 56.04% (8324/14854)

@doris-robot
Copy link

TPC-H: Total hot run time: 31449 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 394140e4bf3f364298be96741323e0d8b4fc3484, data reload: false

------ Round 1 ----------------------------------
q1	17601	4389	4089	4089
q2	2053	362	249	249
q3	10134	1276	720	720
q4	10208	830	315	315
q5	8157	2060	1896	1896
q6	194	176	145	145
q7	932	798	648	648
q8	9286	1395	1178	1178
q9	5051	4630	4510	4510
q10	6848	1811	1429	1429
q11	540	300	275	275
q12	749	733	595	595
q13	17810	3808	3069	3069
q14	286	299	288	288
q15	591	518	510	510
q16	701	686	629	629
q17	681	806	527	527
q18	6801	6491	6352	6352
q19	1099	974	610	610
q20	395	347	246	246
q21	3038	2472	2230	2230
q22	1079	1025	939	939
Total cold run time: 104234 ms
Total hot run time: 31449 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4126	4078	4063	4063
q2	317	388	325	325
q3	2121	2612	2238	2238
q4	1336	1766	1309	1309
q5	4163	4129	4406	4129
q6	194	168	134	134
q7	1861	1781	2068	1781
q8	2574	2377	2427	2377
q9	7403	7058	7195	7058
q10	2468	2713	2338	2338
q11	571	453	473	453
q12	746	758	597	597
q13	3573	4102	3363	3363
q14	316	344	415	344
q15	609	503	506	503
q16	642	675	641	641
q17	1119	1274	1349	1274
q18	8116	8125	7635	7635
q19	823	854	867	854
q20	2002	2037	1920	1920
q21	4835	4402	4111	4111
q22	1073	1008	1026	1008
Total cold run time: 50988 ms
Total hot run time: 48455 ms

@doris-robot
Copy link

TPC-DS: Total hot run time: 173109 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 394140e4bf3f364298be96741323e0d8b4fc3484, data reload: false

query5	4398	571	427	427
query6	321	223	211	211
query7	4213	462	266	266
query8	324	238	244	238
query9	8744	2593	2585	2585
query10	521	380	322	322
query11	15291	15001	15002	15001
query12	181	116	115	115
query13	1258	500	380	380
query14	5956	2968	2739	2739
query14_1	2654	2612	2656	2612
query15	204	190	178	178
query16	989	482	469	469
query17	1109	676	583	583
query18	2435	439	340	340
query19	223	224	202	202
query20	125	117	118	117
query21	221	143	117	117
query22	4107	4233	4079	4079
query23	16392	15470	15192	15192
query23_1	15561	15400	15452	15400
query24	7383	1540	1167	1167
query24_1	1192	1190	1180	1180
query25	549	484	425	425
query26	1237	273	162	162
query27	2757	450	292	292
query28	4509	2126	2110	2110
query29	812	547	456	456
query30	313	245	211	211
query31	798	648	558	558
query32	82	73	72	72
query33	538	343	301	301
query34	928	899	529	529
query35	757	783	709	709
query36	838	869	838	838
query37	134	138	73	73
query38	2700	2681	2654	2654
query39	767	770	745	745
query39_1	712	728	713	713
query40	218	130	116	116
query41	67	60	62	60
query42	105	102	102	102
query43	475	449	418	418
query44	1282	716	704	704
query45	186	185	175	175
query46	921	957	586	586
query47	1443	1504	1321	1321
query48	323	313	244	244
query49	612	412	334	334
query50	630	269	207	207
query51	3795	3798	3703	3703
query52	103	107	95	95
query53	298	327	274	274
query54	285	251	263	251
query55	82	75	73	73
query56	287	297	295	295
query57	1024	988	890	890
query58	263	255	249	249
query59	1976	2266	2206	2206
query60	322	317	294	294
query61	201	160	164	160
query62	400	342	304	304
query63	298	266	268	266
query64	5022	1339	1005	1005
query65	3765	3724	3691	3691
query66	1449	419	297	297
query67	15492	15419	15614	15419
query68	7260	975	704	704
query69	514	350	308	308
query70	1001	940	936	936
query71	360	296	278	278
query72	6064	3456	3371	3371
query73	773	729	297	297
query74	8830	8807	8574	8574
query75	2842	2811	2438	2438
query76	3888	1077	625	625
query77	532	373	290	290
query78	9853	9935	9159	9159
query79	1632	848	619	619
query80	675	581	476	476
query81	531	259	228	228
query82	414	153	111	111
query83	274	252	237	237
query84	264	117	100	100
query85	918	560	463	463
query86	397	294	321	294
query87	2895	2849	2737	2737
query88	3208	2203	2184	2184
query89	386	353	328	328
query90	2003	154	152	152
query91	173	166	143	143
query92	73	72	61	61
query93	1290	915	531	531
query94	584	304	289	289
query95	576	339	376	339
query96	583	445	204	204
query97	2336	2399	2288	2288
query98	237	199	195	195
query99	632	581	492	492
Total cold run time: 254646 ms
Total hot run time: 173109 ms

@doris-robot
Copy link

ClickBench: Total hot run time: 26.93 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 394140e4bf3f364298be96741323e0d8b4fc3484, data reload: false

query1	0.06	0.05	0.05
query2	0.09	0.05	0.05
query3	0.25	0.09	0.09
query4	1.61	0.13	0.11
query5	0.29	0.26	0.25
query6	1.15	0.66	0.65
query7	0.03	0.03	0.02
query8	0.05	0.04	0.04
query9	0.58	0.49	0.48
query10	0.54	0.56	0.55
query11	0.14	0.09	0.10
query12	0.15	0.11	0.11
query13	0.61	0.59	0.59
query14	0.95	0.95	0.94
query15	0.79	0.79	0.78
query16	0.37	0.41	0.39
query17	1.05	1.04	0.99
query18	0.23	0.21	0.22
query19	1.97	1.83	1.80
query20	0.02	0.02	0.01
query21	15.46	0.29	0.14
query22	5.08	0.05	0.04
query23	15.88	0.27	0.10
query24	1.18	0.44	0.51
query25	0.11	0.07	0.06
query26	0.15	0.13	0.13
query27	0.05	0.05	0.08
query28	5.12	1.04	0.88
query29	12.60	3.98	3.21
query30	0.28	0.13	0.13
query31	2.83	0.63	0.39
query32	3.23	0.56	0.47
query33	3.05	3.05	3.00
query34	16.87	5.08	4.41
query35	4.51	4.55	4.44
query36	0.64	0.50	0.48
query37	0.10	0.06	0.06
query38	0.08	0.04	0.04
query39	0.04	0.02	0.03
query40	0.17	0.14	0.14
query41	0.08	0.04	0.03
query42	0.04	0.03	0.04
query43	0.04	0.04	0.04
Total cold run time: 98.52 s
Total hot run time: 26.93 s

@doris-robot
Copy link

BE UT Coverage Report

Increment line coverage 100.00% (8/8) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 53.25% (18971/35626)
Line Coverage 39.21% (175961/448738)
Region Coverage 33.77% (136213/403378)
Branch Coverage 34.72% (58796/169354)

@hello-stephen
Copy link
Contributor

BE Regression && UT Coverage Report

Increment line coverage 100.00% (8/8) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 72.14% (25126/34830)
Line Coverage 58.82% (263246/447552)
Region Coverage 53.63% (218557/407564)
Branch Coverage 55.21% (93817/169920)

Copy link
Member

@mrhhsg mrhhsg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@bobhan1
Copy link
Contributor Author

bobhan1 commented Jan 6, 2026

run cloud_p0

@bobhan1
Copy link
Contributor Author

bobhan1 commented Jan 6, 2026

run nonConcurrent

@hello-stephen
Copy link
Contributor

BE Regression && UT Coverage Report

Increment line coverage 100.00% (8/8) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 72.14% (25126/34830)
Line Coverage 58.82% (263246/447552)
Region Coverage 53.63% (218557/407564)
Branch Coverage 55.21% (93817/169920)

@yiguolei yiguolei merged commit 0395b50 into apache:master Jan 6, 2026
28 of 30 checks passed
github-actions bot pushed a commit that referenced this pull request Jan 6, 2026
…string` is not thread safe (#59558)

### What problem does this PR solve?

Problem Summary:

`Exception::to_string()` may by accessed concurrently in the following
situation:
- thread A and thread B access the same `DorisCallOnce` object
- An Exception `e` is throw when thread A is calling
`DorisCallOnce::call` and `e` is thrown to thread A
- `e` will be also thrown to thread B later by DorisCallOnce
- thread A and thread B access `Exception::to_string()` concurrently

This may cause use-after-free due to the assignment of `_cache_string`
in `Exception::to_string`

Considering that `Exception` should not be frequently used, this PR
construct `_cache_string` in constructor `Exception::Exception` rather
than lazily creating it. This can avoid additional unnessary
synchronazation.

### Release note

None

### Check List (For Author)

- Test <!-- At least one of them must be included. -->
    - [ ] Regression test
    - [ ] Unit Test
    - [ ] Manual test (add detailed scripts or steps below)
    - [ ] No need to test or manual test. Explain why:
- [ ] This is a refactor/code format and no logic has been changed.
        - [ ] Previous test can cover this change.
        - [ ] No code files have been changed.
        - [ ] Other reason <!-- Add your reason?  -->

- Behavior changed:
    - [ ] No.
    - [ ] Yes. <!-- Explain the behavior change -->

- Does this need documentation?
    - [ ] No.
- [ ] Yes. <!-- Add document PR link here. eg:
apache/doris-website#1214 -->

### Check List (For Reviewer who merge this PR)

- [ ] Confirm the release note
- [ ] Confirm test cases
- [ ] Confirm document
- [ ] Add branch pick label <!-- Add branch pick label that this PR
should merge into -->
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by one committer. dev/4.0.x reviewed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants