Posted to commits@doris.apache.org by "zzzzzzzs (via GitHub)" <gi...@apache.org> on 2023/11/18 13:43:52 UTC

[PR] [Enhancement](Load) Nereids supports http_stream and group_commit with stream load [doris]

zzzzzzzs opened a new pull request, #27233:
URL: https://github.com/apache/doris/pull/27233

   This PR continues the work in https://github.com/apache/doris/pull/25049. That PR went unmerged for a long time and ran into a problem during rebase, so a new PR was created.
   
   ## Proposed changes
   
   Issue Number: close #xxx
   
   <!--Describe your changes.-->
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at [dev@doris.apache.org](mailto:dev@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


Re: [PR] [Enhancement](Load) Nereids supports http_stream and group_commit with stream load [doris]

Posted by "doris-robot (via GitHub)" <gi...@apache.org>.
doris-robot commented on PR #27233:
URL: https://github.com/apache/doris/pull/27233#issuecomment-1820158555

   (From new machine) TeamCity pipeline, clickbench performance test result:
    the sum of best hot time: 45.45 seconds
    stream load tsv:          580 seconds loaded 74807831229 Bytes, about 123 MB/s
    stream load json:         18 seconds loaded 2358488459 Bytes, about 124 MB/s
    stream load orc:          65 seconds loaded 1101869774 Bytes, about 16 MB/s
    stream load parquet:      32 seconds loaded 861443392 Bytes, about 25 MB/s
    insert into select:       28.4 seconds inserted 10000000 Rows, about 352K ops/s
    storage size: 17099610777 Bytes
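
The throughput figures in this report can be rechecked from the byte counts and durations it lists. The sketch below is not the robot's script; it assumes that "MB" means 1024 * 1024 bytes and that results are truncated rather than rounded, which is an inference from the reported numbers. The class and method names are illustrative.

```java
// Sketch only: recomputes the "about N MB/s" figures from the numbers in the report.
// Assumption (not confirmed by the report): 1 MB = 1024 * 1024 bytes, results truncated.
class LoadThroughput {
    static final long BYTES_PER_MB = 1024L * 1024L;

    // Whole megabytes per second, truncated toward zero.
    static long mbPerSecond(long bytes, long seconds) {
        return bytes / seconds / BYTES_PER_MB;
    }

    // Thousands of rows per second, truncated toward zero.
    static long kiloOpsPerSecond(long rows, double seconds) {
        return (long) (rows / seconds / 1000.0);
    }

    public static void main(String[] args) {
        System.out.println("tsv:     " + mbPerSecond(74807831229L, 580) + " MB/s");
        System.out.println("json:    " + mbPerSecond(2358488459L, 18) + " MB/s");
        System.out.println("orc:     " + mbPerSecond(1101869774L, 65) + " MB/s");
        System.out.println("parquet: " + mbPerSecond(861443392L, 32) + " MB/s");
        System.out.println("insert:  " + kiloOpsPerSecond(10000000L, 28.4) + "K ops/s");
    }
}
```

Under these assumptions the computed values agree with the figures reported for all four stream load formats and for insert into select.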




Re: [PR] [Enhancement](Load) Nereids supports http_stream and group_commit with stream load [doris]

Posted by "doris-robot (via GitHub)" <gi...@apache.org>.
doris-robot commented on PR #27233:
URL: https://github.com/apache/doris/pull/27233#issuecomment-1820371529

   
   <details>
   <summary>TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'</summary>
   
   ```
   Tpch sf100 test result on commit 5fa0aab231bff2a122170be9e242146cd2515bf9, data reload: false
   
   run tpch-sf100 query with default conf and session variables
   q1	4955	4678	4663	4663
   q2	366	148	158	148
   q3	2025	1876	1896	1876
   q4	1376	1269	1261	1261
   q5	3947	3954	4022	3954
   q6	249	136	131	131
   q7	1462	881	901	881
   q8	2756	2772	2759	2759
   q9	56967	11728	9569	9569
   q10	10288	3560	3535	3535
   q11	389	254	242	242
   q12	1314	293	296	293
   q13	4591	3827	3816	3816
   q14	325	280	290	280
   q15	590	540	529	529
   q16	660	585	578	578
   q17	1135	993	936	936
   q18	7819	7281	7324	7281
   q19	1708	1681	1658	1658
   q20	555	312	316	312
   q21	7298	3968	3982	3968
   q22	485	376	382	376
   Total cold run time: 111260 ms
   Total hot run time: 49046 ms
   
   run tpch-sf100 query with default conf and set session variable runtime_filter_mode=off
   q1	4642	4554	4569	4554
   q2	337	229	258	229
   q3	4010	3983	3982	3982
   q4	2699	2694	2685	2685
   q5	9766	9731	9728	9728
   q6	241	126	126	126
   q7	3006	2464	2475	2464
   q8	4432	4418	4441	4418
   q9	13222	13228	13198	13198
   q10	4100	4198	4210	4198
   q11	802	633	661	633
   q12	982	812	791	791
   q13	4292	3591	3567	3567
   q14	384	346	363	346
   q15	580	525	523	523
   q16	742	684	685	684
   q17	3938	3904	3850	3850
   q18	9462	8980	9036	8980
   q19	1815	1752	1755	1752
   q20	2388	2050	2030	2030
   q21	8726	8511	8618	8511
   q22	898	804	824	804
   Total cold run time: 81464 ms
   Total hot run time: 78053 ms
   ```
   </details>
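
The result rows above carry no column header. The totals are consistent with reading each row as query name, cold run, two hot runs, and the better of the two hot runs, all in milliseconds: the first numeric column sums to the reported cold total and the last column sums to the reported hot total. The sketch below checks that reading on the first three rows of the first table. The column interpretation is an inference, and the class and method names are illustrative.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch only: sums TPC-H result rows under an inferred column layout
// (query, cold run, hot run 1, hot run 2, best hot run; all in ms).
class TpchTotals {
    static final class Row {
        final String query;
        final long cold, hot1, hot2, reportedBest;

        Row(String query, long cold, long hot1, long hot2, long reportedBest) {
            this.query = query;
            this.cold = cold;
            this.hot1 = hot1;
            this.hot2 = hot2;
            this.reportedBest = reportedBest;
        }

        // The best hot run is the faster of the two hot runs.
        long bestHot() {
            return Math.min(hot1, hot2);
        }
    }

    // Splits one whitespace-separated result line into a Row.
    static Row parse(String line) {
        String[] f = line.trim().split("\\s+");
        return new Row(f[0], Long.parseLong(f[1]), Long.parseLong(f[2]),
                Long.parseLong(f[3]), Long.parseLong(f[4]));
    }

    static List<Row> parseAll(List<String> lines) {
        List<Row> rows = new ArrayList<>();
        for (String line : lines) {
            rows.add(parse(line));
        }
        return rows;
    }

    static long totalCold(List<Row> rows) {
        return rows.stream().mapToLong(r -> r.cold).sum();
    }

    static long totalHot(List<Row> rows) {
        return rows.stream().mapToLong(Row::bestHot).sum();
    }

    public static void main(String[] args) {
        // First three rows of the first table above.
        List<Row> rows = parseAll(Arrays.asList(
                "q1\t4955\t4678\t4663\t4663",
                "q2\t366\t148\t158\t148",
                "q3\t2025\t1876\t1896\t1876"));
        System.out.println("cold total (ms): " + totalCold(rows));
        System.out.println("hot total (ms):  " + totalHot(rows));
    }
}
```

Feeding all 22 rows of a table through `parseAll` gives totals that can be compared against the "Total cold run time" and "Total hot run time" lines.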
   




Re: [PR] [Enhancement](Load) Nereids supports http_stream and group_commit with stream load [doris]

Posted by "doris-robot (via GitHub)" <gi...@apache.org>.
doris-robot commented on PR #27233:
URL: https://github.com/apache/doris/pull/27233#issuecomment-1820369622

   (From new machine) TeamCity pipeline, clickbench performance test result:
    the sum of best hot time: 45.37 seconds
    stream load tsv:          574 seconds loaded 74807831229 Bytes, about 124 MB/s
    stream load json:         18 seconds loaded 2358488459 Bytes, about 124 MB/s
    stream load orc:          65 seconds loaded 1101869774 Bytes, about 16 MB/s
    stream load parquet:      34 seconds loaded 861443392 Bytes, about 24 MB/s
    insert into select:       28.3 seconds inserted 10000000 Rows, about 353K ops/s
    storage size: 17099043152 Bytes




Re: [PR] [Enhancement](Load) Nereids supports http_stream and group_commit with stream load [doris]

Posted by "morrySnow (via GitHub)" <gi...@apache.org>.
morrySnow commented on code in PR #27233:
URL: https://github.com/apache/doris/pull/27233#discussion_r1408709572


##########
fe/fe-core/src/main/java/org/apache/doris/nereids/glue/translator/PhysicalPlanTranslator.java:
##########
@@ -418,12 +423,26 @@ public PlanFragment visitPhysicalOlapTableSink(PhysicalOlapTableSink<? extends P
             slotDesc.setIsNullable(column.isAllowNull());
             slotDesc.setAutoInc(column.isAutoInc());
         }
-        OlapTableSink sink = new OlapTableSink(
+        checkInnerGroupCommit(context.getScanNodes());
+        OlapTableSink sink;
+        if (context.getConnectContext().isGroupCommitTvf()) {

Review Comment:
   Do not implement any logic in the translator. The translator should only translate the Nereids plan into the legacy one.
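
One way to read this comment is that the choice of sink should already be recorded on the plan node by the time translation runs, so the translator maps node to legacy object without consulting the connect context. The sketch below illustrates that separation only. Apart from the name `OlapTableSink`, which appears in the diff, every type, method, and string here is a hypothetical stand-in and not a Doris API.

```java
// Sketch only: hypothetical types showing a decision made during planning
// and a translator that performs a pure one-to-one mapping.
class TranslatorSketch {
    enum SinkKind { OLAP_TABLE, GROUP_COMMIT }

    // Hypothetical physical plan node: the sink kind is fixed before translation.
    static final class PhysicalSink {
        final String table;
        final SinkKind kind;

        PhysicalSink(String table, SinkKind kind) {
            this.table = table;
            this.kind = kind;
        }
    }

    // Hypothetical planning step: this is where session state is inspected.
    static PhysicalSink plan(String table, boolean groupCommitRequested) {
        SinkKind kind = groupCommitRequested ? SinkKind.GROUP_COMMIT : SinkKind.OLAP_TABLE;
        return new PhysicalSink(table, kind);
    }

    // Translator: reads only the plan node and returns a label for the legacy sink.
    // The returned strings stand in for legacy sink objects.
    static String translate(PhysicalSink sink) {
        switch (sink.kind) {
            case GROUP_COMMIT:
                return "LegacyGroupCommitSink(" + sink.table + ")";
            case OLAP_TABLE:
                return "OlapTableSink(" + sink.table + ")";
            default:
                throw new IllegalStateException("unknown sink kind: " + sink.kind);
        }
    }

    public static void main(String[] args) {
        System.out.println(translate(plan("t1", false)));
        System.out.println(translate(plan("t1", true)));
    }
}
```

With this shape, `translate` takes no session or context argument, so a change to the group commit rule touches only `plan`.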





Re: [PR] [Enhancement](Load) Nereids supports http_stream and group_commit with stream load [doris]

Posted by "zzzzzzzs (via GitHub)" <gi...@apache.org>.
zzzzzzzs commented on PR #27233:
URL: https://github.com/apache/doris/pull/27233#issuecomment-1820067619

   run buildall




Re: [PR] [Enhancement](Load) Nereids supports http_stream and group_commit with stream load [doris]

Posted by "zzzzzzzs (via GitHub)" <gi...@apache.org>.
zzzzzzzs commented on PR #27233:
URL: https://github.com/apache/doris/pull/27233#issuecomment-1820330319

   run buildall




Re: [PR] [Enhancement](Load) Nereids supports http_stream and group_commit with stream load [doris]

Posted by "doris-robot (via GitHub)" <gi...@apache.org>.
doris-robot commented on PR #27233:
URL: https://github.com/apache/doris/pull/27233#issuecomment-1820150767

   
   <details>
   <summary>TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'</summary>
   
   ```
   Tpch sf100 test result on commit 77fdfd7e2d5989d019138040b48152c4e057d0ae, data reload: false
   
   run tpch-sf100 query with default conf and session variables
   q1	4933	4695	4659	4659
   q2	356	159	159	159
   q3	2077	1906	1908	1906
   q4	1395	1294	1287	1287
   q5	3940	3977	4031	3977
   q6	259	137	134	134
   q7	1445	889	909	889
   q8	2777	2807	2792	2792
   q9	9817	9877	9536	9536
   q10	3458	3549	3556	3549
   q11	377	261	247	247
   q12	442	295	296	295
   q13	4584	3839	3811	3811
   q14	330	291	302	291
   q15	588	548	524	524
   q16	662	585	588	585
   q17	1150	996	971	971
   q18	7767	7339	7292	7292
   q19	1678	1691	1692	1691
   q20	562	296	301	296
   q21	4377	3938	3992	3938
   q22	485	365	375	365
   Total cold run time: 53459 ms
   Total hot run time: 49194 ms
   
   run tpch-sf100 query with default conf and set session variable runtime_filter_mode=off
   q1	4616	4613	4587	4587
   q2	344	237	267	237
   q3	4031	4037	4009	4009
   q4	2705	2696	2699	2696
   q5	9753	9680	9689	9680
   q6	248	126	122	122
   q7	3006	2433	2462	2433
   q8	4480	4436	4445	4436
   q9	13301	13204	13215	13204
   q10	4103	4203	4196	4196
   q11	840	736	643	643
   q12	981	827	814	814
   q13	4315	3572	3572	3572
   q14	394	339	346	339
   q15	595	523	528	523
   q16	735	685	672	672
   q17	3923	3874	3872	3872
   q18	9449	9090	8898	8898
   q19	1814	1774	1777	1774
   q20	2398	2067	2060	2060
   q21	8787	8704	8715	8704
   q22	907	806	803	803
   Total cold run time: 81725 ms
   Total hot run time: 78274 ms
   ```
   </details>
   

