You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@doris.apache.org by "BiteTheDDDDt (via GitHub)" <gi...@apache.org> on 2024/01/10 09:40:35 UTC

[PR] [Improvement](function) optimization for substr with ascii string [doris]

BiteTheDDDDt opened a new pull request, #29799:
URL: https://github.com/apache/doris/pull/29799

   ## Proposed changes
   optimization for substr with ascii string
   select count(substr(l_comment, 1, 10)) from lineitem;
   32s ->7s
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at [dev@doris.apache.org](mailto:dev@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


Re: [PR] [Improvement](function) optimization for substr with ascii string [doris]

Posted by "doris-robot (via GitHub)" <gi...@apache.org>.
doris-robot commented on PR #29799:
URL: https://github.com/apache/doris/pull/29799#issuecomment-1884976844

   
   <details>
   <summary>TPC-H: <b>Total hot run time: 38933 ms</b></summary>
   
   ```
   machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
   Tpch sf100 test result on commit 1aaebfa423e117854be6701404451afd7ea5f9a3, data reload: false
   
   ------ Round 1 ----------------------------------
   q1	17941	5012	5006	5006
   q2	2035	146	145	145
   q3	10666	1112	1110	1110
   q4	10272	854	817	817
   q5	7834	3244	3136	3136
   q6	205	125	126	125
   q7	892	521	494	494
   q8	9279	2016	2034	2016
   q9	6679	6523	6516	6516
   q10	8257	3077	3101	3077
   q11	418	219	216	216
   q12	359	194	199	194
   q13	18099	3469	3449	3449
   q14	238	214	217	214
   q15	558	520	520	520
   q16	443	385	401	385
   q17	957	504	547	504
   q18	7375	6753	6649	6649
   q19	1582	1462	1383	1383
   q20	542	294	294	294
   q21	2821	2381	2414	2381
   q22	347	302	307	302
   Total cold run time: 107799 ms
   Total hot run time: 38933 ms
   
   ----- Round 2, with runtime_filter_mode=off -----
   q1	4998	4965	5044	4965
   q2	305	202	201	201
   q3	3335	3308	3295	3295
   q4	2246	2268	2262	2262
   q5	5829	5810	5803	5803
   q6	193	120	118	118
   q7	2331	1870	1870	1870
   q8	3458	3609	3604	3604
   q9	8883	8850	8854	8850
   q10	3728	3845	3836	3836
   q11	550	433	416	416
   q12	792	604	600	600
   q13	6313	3234	3167	3167
   q14	297	258	273	258
   q15	576	513	511	511
   q16	513	469	461	461
   q17	2104	2021	2009	2009
   q18	8788	8364	8284	8284
   q19	1624	1646	1629	1629
   q20	2197	1944	1917	1917
   q21	6101	5709	5829	5709
   q22	556	479	459	459
   Total cold run time: 65717 ms
   Total hot run time: 60224 ms
   ```
   </details>
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


Re: [PR] [Improvement](function) optimization for substr with ascii string [doris]

Posted by "doris-robot (via GitHub)" <gi...@apache.org>.
doris-robot commented on PR #29799:
URL: https://github.com/apache/doris/pull/29799#issuecomment-1885004126

   (From new machine)TeamCity pipeline, clickbench performance test result:
    the sum of best hot time: 45.82 seconds
    stream load tsv:          566 seconds loaded 74807831229 Bytes, about 126 MB/s
    stream load json:         19 seconds loaded 2358488459 Bytes, about 118 MB/s
    stream load orc:          66 seconds loaded 1101869774 Bytes, about 15 MB/s
    stream load parquet:          32 seconds loaded 861443392 Bytes, about 25 MB/s
    insert into select:          21.2 seconds inserted 10000000 Rows, about 471K ops/s
    storage size: 17183830963 Bytes


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


Re: [PR] [Improvement](function) optimization for substr with ascii string [doris]

Posted by "doris-robot (via GitHub)" <gi...@apache.org>.
doris-robot commented on PR #29799:
URL: https://github.com/apache/doris/pull/29799#issuecomment-1885014616

   
   Load test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   ```
   Load test result on commit 1aaebfa423e117854be6701404451afd7ea5f9a3 with default session variables
   Stream load json:         19 seconds loaded 2358488459 Bytes, about 118 MB/s
   Stream load orc:          59 seconds loaded 1101869774 Bytes, about 17 MB/s
   Stream load parquet:      33 seconds loaded 861443392 Bytes, about 24 MB/s
   Insert into select:       12.4 seconds inserted 10000000 Rows, about 806K ops/s
   ```
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


Re: [PR] [Improvement](function) optimization for substr with ascii string [doris]

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #29799:
URL: https://github.com/apache/doris/pull/29799#issuecomment-1884513909

   clang-tidy review says "All clean, LGTM! :+1:"


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


Re: [PR] [Improvement](function) optimization for substr with ascii string [doris]

Posted by "doris-robot (via GitHub)" <gi...@apache.org>.
doris-robot commented on PR #29799:
URL: https://github.com/apache/doris/pull/29799#issuecomment-1885008933

   
   <details>
   <summary>ClickBench: <b>Total hot run time: 30.97 s</b></summary>
   
   ```
   machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
   ClickBench test result on commit 1aaebfa423e117854be6701404451afd7ea5f9a3, data reload: false
   
   query1	0.09	0.06	0.06
   query2	0.06	0.03	0.03
   query3	0.25	0.11	0.12
   query4	1.74	0.13	0.12
   query5	0.52	0.51	0.51
   query6	1.33	0.66	0.64
   query7	0.02	0.02	0.01
   query8	0.04	0.02	0.02
   query9	0.56	0.49	0.50
   query10	0.54	0.54	0.53
   query11	0.14	0.09	0.09
   query12	0.12	0.09	0.09
   query13	0.60	0.60	0.60
   query14	0.77	0.81	0.79
   query15	0.81	0.80	0.78
   query16	0.34	0.35	0.35
   query17	0.99	0.96	1.01
   query18	0.27	0.25	0.25
   query19	1.84	1.77	1.78
   query20	0.02	0.01	0.01
   query21	15.40	0.57	0.55
   query22	2.63	2.52	2.27
   query23	17.40	0.92	0.77
   query24	2.51	0.60	5.59
   query25	1.80	0.15	0.15
   query26	0.14	0.14	0.12
   query27	0.16	0.14	0.14
   query28	9.38	0.80	0.82
   query29	12.57	3.20	3.28
   query30	0.54	0.52	0.52
   query31	2.78	0.35	0.37
   query32	3.34	0.49	0.48
   query33	3.21	3.26	3.21
   query34	15.71	4.16	4.14
   query35	4.16	4.16	4.16
   query36	1.10	1.05	1.04
   query37	0.07	0.05	0.05
   query38	0.04	0.03	0.03
   query39	0.02	0.02	0.02
   query40	0.16	0.14	0.12
   query41	0.07	0.02	0.01
   query42	0.02	0.01	0.02
   query43	0.02	0.02	0.02
   Total cold run time: 104.28 s
   Total hot run time: 30.97 s
   ```
   </details>
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


Re: [PR] [Improvement](function) optimization for substr with ascii string [doris]

Posted by "BiteTheDDDDt (via GitHub)" <gi...@apache.org>.
BiteTheDDDDt commented on PR #29799:
URL: https://github.com/apache/doris/pull/29799#issuecomment-1884865782

   run buildall


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


Re: [PR] [Improvement](function) optimization for substr with ascii string [doris]

Posted by "BiteTheDDDDt (via GitHub)" <gi...@apache.org>.
BiteTheDDDDt commented on PR #29799:
URL: https://github.com/apache/doris/pull/29799#issuecomment-1884501198

   run buildall


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


Re: [PR] [Improvement](function) optimization for substr with ascii string [doris]

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #29799:
URL: https://github.com/apache/doris/pull/29799#issuecomment-1884881437

   PR approved by at least one committer and no changes requested.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


Re: [PR] [Improvement](function) optimization for substr with ascii string [doris]

Posted by "doris-robot (via GitHub)" <gi...@apache.org>.
doris-robot commented on PR #29799:
URL: https://github.com/apache/doris/pull/29799#issuecomment-1885009293

   TeamCity be ut coverage result:
    Function Coverage: 36.69% (8654/23584) 
    Line Coverage: 28.73% (70472/245297)
    Region Coverage: 27.67% (36432/131678)
    Branch Coverage: 24.37% (18635/76482)
    Coverage Report: http://coverage.selectdb-in.cc/coverage/1aaebfa423e117854be6701404451afd7ea5f9a3_1aaebfa423e117854be6701404451afd7ea5f9a3/report/index.html


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


Re: [PR] [Improvement](function) optimization for substr with ascii string [doris]

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #29799:
URL: https://github.com/apache/doris/pull/29799#issuecomment-1884878309

   clang-tidy review says "All clean, LGTM! :+1:"


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


Re: [PR] [Improvement](function) optimization for substr with ascii string [doris]

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #29799:
URL: https://github.com/apache/doris/pull/29799#issuecomment-1884881514

   PR approved by anyone and no changes requested.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


Re: [PR] [Improvement](function) optimization for substr with ascii string [doris]

Posted by "HappenLee (via GitHub)" <gi...@apache.org>.
HappenLee merged PR #29799:
URL: https://github.com/apache/doris/pull/29799


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


Re: [PR] [Improvement](function) optimization for substr with ascii string [doris]

Posted by "doris-robot (via GitHub)" <gi...@apache.org>.
doris-robot commented on PR #29799:
URL: https://github.com/apache/doris/pull/29799#issuecomment-1884998577

   
   <details>
   <summary>TPC-DS: <b>Total hot run time: 178536 ms</b></summary>
   
   ```
   machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
   TPC-DS sf100 test result on commit 1aaebfa423e117854be6701404451afd7ea5f9a3, data reload: false
   
   query1	937	340	337	337
   query2	6716	2060	1824	1824
   query3	6705	209	206	206
   query4	26249	22234	22215	22215
   query5	5546	485	520	485
   query6	276	199	183	183
   query7	4619	275	273	273
   query8	221	204	199	199
   query9	9175	2832	2743	2743
   query10	609	271	243	243
   query11	16226	15444	15376	15376
   query12	137	82	77	77
   query13	1650	350	329	329
   query14	12148	7283	7275	7275
   query15	238	192	203	192
   query16	6536	271	274	271
   query17	1930	489	492	489
   query18	1952	286	266	266
   query19	291	143	135	135
   query20	87	84	77	77
   query21	191	92	92	92
   query22	5184	4811	4906	4811
   query23	31709	31026	30977	30977
   query24	12166	2833	2845	2833
   query25	577	348	337	337
   query26	1757	157	150	150
   query27	2853	278	286	278
   query28	6826	1935	1925	1925
   query29	2088	391	386	386
   query30	289	148	145	145
   query31	963	771	792	771
   query32	95	69	60	60
   query33	752	272	284	272
   query34	901	448	437	437
   query35	928	756	749	749
   query36	1263	1237	1306	1237
   query37	186	69	74	69
   query38	3414	3273	3280	3273
   query39	1333	1290	1299	1290
   query40	302	98	90	90
   query41	39	36	35	35
   query42	100	96	92	92
   query43	543	506	513	506
   query44	1092	709	739	709
   query45	211	183	178	178
   query46	1045	656	674	656
   query47	1717	1583	1615	1583
   query48	391	320	316	316
   query49	1277	314	316	314
   query50	742	328	320	320
   query51	5425	5257	5358	5257
   query52	100	83	90	83
   query53	227	155	155	155
   query54	1422	580	584	580
   query55	100	92	89	89
   query56	221	197	208	197
   query57	1020	978	971	971
   query58	244	215	216	215
   query59	2711	2514	2502	2502
   query60	242	233	248	233
   query61	83	83	82	82
   query62	602	419	418	418
   query63	179	163	155	155
   query64	5857	1722	1626	1626
   query65	3354	3263	3266	3263
   query66	1326	342	334	334
   query67	15707	15262	15216	15216
   query68	11378	533	523	523
   query69	577	279	283	279
   query70	1685	1560	1438	1438
   query71	516	211	238	211
   query72	4854	2858	2875	2858
   query73	2059	326	321	321
   query74	7000	6463	6429	6429
   query75	4908	2298	2277	2277
   query76	6319	1057	1136	1057
   query77	694	281	272	272
   query78	9670	8755	8682	8682
   query79	1043	520	497	497
   query80	562	352	349	349
   query81	455	212	211	211
   query82	208	88	94	88
   query83	164	140	134	134
   query84	270	59	64	59
   query85	961	299	296	296
   query86	385	401	366	366
   query87	3544	3457	3458	3457
   query88	2779	2225	2246	2225
   query89	339	260	253	253
   query90	1862	193	206	193
   query91	176	128	141	128
   query92	62	56	52	52
   query93	1013	434	428	428
   query94	764	191	193	191
   query95	463	426	418	418
   query96	615	318	313	313
   query97	4277	4153	4210	4153
   query98	207	203	182	182
   query99	1024	756	734	734
   Total cold run time: 292835 ms
   Total hot run time: 178536 ms
   ```
   </details>
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org