Posted to commits@doris.apache.org by "kaka11chen (via GitHub)" <gi...@apache.org> on 2024/01/04 08:51:21 UTC

[PR] [Opt](parquet-reader) Opt `ColumnSelectVector::set_run_length_null_map()` in parquet reader. [doris]

kaka11chen opened a new pull request, #29527:
URL: https://github.com/apache/doris/pull/29527

   ## Proposed changes
   
   Optimize `ColumnSelectVector::set_run_length_null_map()` in the Parquet reader.
   
   ### Test Result:
   TPC-H SF500, 2 nodes:
   ```
   select count(l_orderkey), count(l_extendedprice), count(l_discount), count(l_shipdate) from lineitem 
   where l_shipdate > '1995-03-15';
   ```
   Query time dropped from 9 s to 7.9 s (about 12% less).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


Re: [PR] [Opt](parquet-reader) Opt `ColumnSelectVector::set_run_length_null_map()` in parquet reader. [doris]

Posted by "yiguolei (via GitHub)" <gi...@apache.org>.
yiguolei merged PR #29527:
URL: https://github.com/apache/doris/pull/29527




Re: [PR] [Opt](parquet-reader) Opt `ColumnSelectVector::set_run_length_null_map()` in parquet reader. [doris]

Posted by "doris-robot (via GitHub)" <gi...@apache.org>.
doris-robot commented on PR #29527:
URL: https://github.com/apache/doris/pull/29527#issuecomment-1876763147

   TeamCity BE UT (backend unit test) coverage result:
    Function Coverage: 36.64% (8616/23517) 
    Line Coverage: 28.67% (70017/244175)
    Region Coverage: 27.65% (36238/131073)
    Branch Coverage: 24.34% (18507/76040)
    Coverage Report: http://coverage.selectdb-in.cc/coverage/b525de8bef59660d0cbc6d636eb1455fca8e9d26_b525de8bef59660d0cbc6d636eb1455fca8e9d26/report/index.html




Re: [PR] [Opt](parquet-reader) Opt `ColumnSelectVector::set_run_length_null_map()` in parquet reader. [doris]

Posted by "doris-robot (via GitHub)" <gi...@apache.org>.
doris-robot commented on PR #29527:
URL: https://github.com/apache/doris/pull/29527#issuecomment-1876764271

   
   TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G', run with scripts in https://github.com/apache/doris/tree/master/tools/tpch-tools
   ```
   Tpch sf100 test result on commit b525de8bef59660d0cbc6d636eb1455fca8e9d26, data reload: false
   
   ------ Round 1 ----------------------------------
   q1	17670	5180	5097	5097
   q2	2013	152	143	143
   q3	10546	1078	1109	1078
   q4	10177	787	841	787
   q5	7796	2941	2848	2848
   q6	225	141	137	137
   q7	906	558	527	527
   q8	9275	2009	2030	2009
   q9	6781	6386	6353	6353
   q10	8202	3057	3057	3057
   q11	437	232	205	205
   q12	390	232	227	227
   q13	18021	3608	3619	3608
   q14	238	224	208	208
   q15	550	517	515	515
   q16	453	394	396	394
   q17	979	530	478	478
   q18	7256	6705	6686	6686
   q19	1609	1436	1306	1306
   q20	717	350	332	332
   q21	2838	2396	2444	2396
   q22	384	346	327	327
   Total cold run time: 107463 ms
   Total hot run time: 38718 ms
   
   ----- Round 2, with runtime_filter_mode=off -----
   q1	5117	5105	5091	5091
   q2	333	230	233	230
   q3	3291	3261	3244	3244
   q4	2109	2020	2018	2018
   q5	5773	5772	5773	5772
   q6	210	126	127	126
   q7	2346	1865	1875	1865
   q8	3358	3453	3448	3448
   q9	8770	8716	8733	8716
   q10	3774	3853	3831	3831
   q11	584	480	479	479
   q12	794	638	619	619
   q13	7996	3193	3171	3171
   q14	319	262	272	262
   q15	607	515	518	515
   q16	561	492	506	492
   q17	1942	1767	1749	1749
   q18	8623	8334	8211	8211
   q19	1648	1624	1621	1621
   q20	2172	1954	1939	1939
   q21	5608	5300	5208	5208
   q22	525	472	496	472
   Total cold run time: 66460 ms
   Total hot run time: 59079 ms
   ```
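
The column meanings in the table above are not labelled in the bot output. A minimal sketch, assuming each row is `query, cold run, hot run 1, hot run 2, best hot run` (all in ms), shows that this reading reproduces both reported totals for Round 1. The function names `parse_rows` and `summarize` are illustrative and not part of the Doris tpch-tools scripts.

```python
# Round 1 rows copied from the report above: query name, then four timings in ms.
ROUND_1 = """\
q1 17670 5180 5097 5097
q2 2013 152 143 143
q3 10546 1078 1109 1078
q4 10177 787 841 787
q5 7796 2941 2848 2848
q6 225 141 137 137
q7 906 558 527 527
q8 9275 2009 2030 2009
q9 6781 6386 6353 6353
q10 8202 3057 3057 3057
q11 437 232 205 205
q12 390 232 227 227
q13 18021 3608 3619 3608
q14 238 224 208 208
q15 550 517 515 515
q16 453 394 396 394
q17 979 530 478 478
q18 7256 6705 6686 6686
q19 1609 1436 1306 1306
q20 717 350 332 332
q21 2838 2396 2444 2396
q22 384 346 327 327
"""


def parse_rows(text):
    """Return a list of (query, [cold, hot1, hot2, best]) tuples."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 5:
            continue  # skip blank lines and anything that is not a data row
        name, *timings = fields
        rows.append((name, [int(t) for t in timings]))
    return rows


def summarize(rows):
    """Sum the cold runs, and the faster of the two hot runs, per query."""
    cold_total = sum(t[0] for _, t in rows)
    hot_total = sum(min(t[1], t[2]) for _, t in rows)
    return cold_total, hot_total


rows = parse_rows(ROUND_1)
cold_total, hot_total = summarize(rows)
print(f"Total cold run time: {cold_total} ms")
print(f"Total hot run time: {hot_total} ms")
```

Under this assumption the last column is redundant (it equals the smaller of the two hot runs), so the sketch recomputes it rather than trusting it.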
   




Re: [PR] [Opt](parquet-reader) Opt `ColumnSelectVector::set_run_length_null_map()` in parquet reader. [doris]

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #29527:
URL: https://github.com/apache/doris/pull/29527#issuecomment-1876732210

   clang-tidy review says "All clean, LGTM! :+1:"




Re: [PR] [Opt](parquet-reader) Opt `ColumnSelectVector::set_run_length_null_map()` in parquet reader. [doris]

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #29527:
URL: https://github.com/apache/doris/pull/29527#issuecomment-1878024782

   PR approved by at least one committer and no changes requested.




Re: [PR] [Opt](parquet-reader) Opt `ColumnSelectVector::set_run_length_null_map()` in parquet reader. [doris]

Posted by "doris-robot (via GitHub)" <gi...@apache.org>.
doris-robot commented on PR #29527:
URL: https://github.com/apache/doris/pull/29527#issuecomment-1876780374

   
   TPC-DS test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G', run with scripts in https://github.com/apache/doris/tree/master/tools/tpcds-tools
   ```
   TPC-DS sf100 test result on commit b525de8bef59660d0cbc6d636eb1455fca8e9d26, data reload: false
   
   run tpcds-sf100 query with default conf and session variables
   query1	953	344	331	331
   query2	6442	1932	1948	1932
   query3	6660	209	205	205
   query4	28275	22434	22458	22434
   query5	3965	542	509	509
   query6	283	182	185	182
   query7	4576	276	259	259
   query8	217	194	194	194
   query9	8225	2497	2614	2497
   query10	507	227	245	227
   query11	16300	15594	15730	15594
   query12	139	73	74	73
   query13	1633	311	312	311
   query14	11816	7068	7104	7068
   query15	219	186	194	186
   query16	6417	265	260	260
   query17	1884	509	503	503
   query18	1949	260	258	258
   query19	186	133	132	132
   query20	78	77	72	72
   query21	184	100	93	93
   query22	4884	4811	4582	4582
   query23	31935	30980	31050	30980
   query24	11902	2754	2814	2754
   query25	602	349	338	338
   query26	1736	140	135	135
   query27	2928	269	275	269
   query28	7195	1876	1859	1859
   query29	2067	379	394	379
   query30	308	143	146	143
   query31	962	769	776	769
   query32	87	59	53	53
   query33	738	253	249	249
   query34	918	446	432	432
   query35	880	781	769	769
   query36	1306	1180	1201	1180
   query37	188	62	78	62
   query38	3373	3253	3215	3215
   query39	1314	1284	1253	1253
   query40	298	87	86	86
   query41	38	37	35	35
   query42	95	82	83	82
   query43	538	483	506	483
   query44	1044	686	698	686
   query45	198	197	183	183
   query46	1063	644	636	636
   query47	1600	1512	1614	1512
   query48	339	253	265	253
   query49	1214	307	313	307
   query50	751	318	365	318
   query51	5557	5256	5346	5256
   query52	91	80	81	80
   query53	209	147	141	141
   query54	1373	562	549	549
   query55	91	78	83	78
   query56	202	186	193	186
   query57	1047	946	887	887
   query58	235	205	195	195
   query59	2742	2599	2630	2599
   query60	232	222	227	222
   query61	87	86	85	85
   query62	646	456	469	456
   query63	159	141	138	138
   query64	6053	1716	1714	1714
   query65	3316	3257	3241	3241
   query66	1390	350	344	344
   query67	15603	15160	15308	15160
   query68	11022	501	523	501
   query69	507	246	254	246
   query70	1680	1578	1555	1555
   query71	499	213	219	213
   query72	5678	3598	3603	3598
   query73	2122	312	305	305
   query74	7013	6362	6430	6362
   query75	4819	2288	2285	2285
   query76	6319	1112	1088	1088
   query77	659	241	268	241
   query78	9163	8583	8563	8563
   query79	2149	506	486	486
   query80	686	348	338	338
   query81	467	212	209	209
   query82	204	96	91	91
   query83	166	138	147	138
   query84	245	54	53	53
   query85	956	286	282	282
   query86	399	412	416	412
   query87	3518	3372	3352	3352
   query88	3207	2247	2228	2228
   query89	366	255	261	255
   query90	1956	195	191	191
   query91	128	93	93	93
   query92	58	47	48	47
   query93	3140	484	424	424
   query94	831	180	176	176
   query95	460	416	401	401
   query96	633	313	316	313
   query97	4262	4120	4165	4120
   query98	208	185	216	185
   query99	1111	842	833	833
   Total cold run time: 295073 ms
   Total hot run time: 178237 ms
   ```
   




Re: [PR] [Opt](parquet-reader) Opt `ColumnSelectVector::set_run_length_null_map()` in parquet reader. [doris]

Posted by "doris-robot (via GitHub)" <gi...@apache.org>.
doris-robot commented on PR #29527:
URL: https://github.com/apache/doris/pull/29527#issuecomment-1876779520

   
   TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G', run with scripts in https://github.com/apache/doris/tree/master/tools/tpch-tools
   ```
   Tpch sf100 test result on commit b525de8bef59660d0cbc6d636eb1455fca8e9d26, data reload: false
   
   run tpch-sf100 query with default conf and session variables
   q1	5500	5187	5126	5126
   q2	395	170	158	158
   q3	1450	1138	1239	1138
   q4	1077	834	847	834
   q5	3108	3147	3123	3123
   q6	230	139	136	136
   q7	991	567	497	497
   q8	2157	2294	2210	2210
   q9	6724	6684	6678	6678
   q10	3172	3116	3141	3116
   q11	353	209	215	209
   q12	383	233	241	233
   q13	4393	3630	3655	3630
   q14	239	215	224	215
   q15	594	542	529	529
   q16	455	407	404	404
   q17	1044	535	512	512
   q18	7125	6763	6759	6759
   q19	1641	1508	1645	1508
   q20	538	326	364	326
   q21	2898	2450	2519	2450
   q22	396	338	330	330
   Total cold run time: 44863 ms
   Total hot run time: 40121 ms
   
   run tpch-sf100 query with default conf and set session variable runtime_filter_mode=off
   q1	5158	5117	5093	5093
   q2	341	241	246	241
   q3	3348	3320	3327	3320
   q4	2162	2071	2039	2039
   q5	5964	5938	5952	5938
   q6	223	127	131	127
   q7	2380	1971	1967	1967
   q8	3565	3659	3659	3659
   q9	9010	8996	9012	8996
   q10	3868	3907	3896	3896
   q11	574	510	501	501
   q12	802	636	657	636
   q13	3883	3215	3203	3203
   q14	304	279	278	278
   q15	595	531	535	531
   q16	567	524	502	502
   q17	2038	1839	1812	1812
   q18	8732	8357	8402	8357
   q19	1742	1680	1697	1680
   q20	2260	1974	1957	1957
   q21	5649	5282	5303	5282
   q22	563	475	484	475
   Total cold run time: 63728 ms
   Total hot run time: 60490 ms
   ```
   




Re: [PR] [Opt](parquet-reader) Opt `ColumnSelectVector::set_run_length_null_map()` in parquet reader. [doris]

Posted by "kaka11chen (via GitHub)" <gi...@apache.org>.
kaka11chen commented on PR #29527:
URL: https://github.com/apache/doris/pull/29527#issuecomment-1876723245

   run buildall




Re: [PR] [Opt](parquet-reader) Opt `ColumnSelectVector::set_run_length_null_map()` in parquet reader. [doris]

Posted by "doris-robot (via GitHub)" <gi...@apache.org>.
doris-robot commented on PR #29527:
URL: https://github.com/apache/doris/pull/29527#issuecomment-1876785446

   (From new machine) TeamCity pipeline, ClickBench performance test result:
    the sum of best hot time: 47.26 seconds
    stream load tsv:          581 seconds loaded 74807831229 Bytes, about 122 MB/s
    stream load json:         19 seconds loaded 2358488459 Bytes, about 118 MB/s
    stream load orc:          66 seconds loaded 1101869774 Bytes, about 15 MB/s
    stream load parquet:          32 seconds loaded 861443392 Bytes, about 25 MB/s
    insert into select:          28.2 seconds inserted 10000000 Rows, about 354K ops/s
    storage size: 17183807302 Bytes
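
The throughput figures above can be cross-checked from the byte counts and durations. A minimal sketch, assuming the reported "MB/s" means bytes per second divided by 2^20 and truncated to an integer (an inference from the numbers, not taken from the pipeline's scripts); `mib_per_second` is a hypothetical helper name.

```python
MIB = 1024 * 1024  # assumption: the report's "MB" is 2**20 bytes

# (bytes loaded, seconds) per stream load format, copied from the report above.
LOADS = {
    "tsv": (74807831229, 581),
    "json": (2358488459, 19),
    "orc": (1101869774, 66),
    "parquet": (861443392, 32),
}


def mib_per_second(num_bytes, seconds):
    """Truncated throughput in MiB/s using integer arithmetic."""
    return num_bytes // seconds // MIB


rates = {fmt: mib_per_second(b, s) for fmt, (b, s) in LOADS.items()}

# "insert into select": rows per second, expressed in thousands and truncated.
insert_kops = int(10_000_000 / 28.2 / 1000)

for fmt, rate in rates.items():
    print(f"stream load {fmt}: about {rate} MB/s")
print(f"insert into select: about {insert_kops}K ops/s")
```

With decimal megabytes (10^6 bytes) the TSV load would come out near 128 MB/s rather than the reported 122 MB/s, which is why the sketch uses 2^20.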




Re: [PR] [Opt](parquet-reader) Opt `ColumnSelectVector::set_run_length_null_map()` in parquet reader. [doris]

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #29527:
URL: https://github.com/apache/doris/pull/29527#issuecomment-1878024877

   PR approved by anyone and no changes requested.

