Posted to commits@doris.apache.org by "felixwluo (via GitHub)" <gi...@apache.org> on 2023/11/28 17:55:51 UTC

[PR] [Enhancement](load) Limit the number of incorrect data drops [doris]

felixwluo opened a new pull request, #27727:
URL: https://github.com/apache/doris/pull/27727

   ## Proposed changes
   
   Issue Number: close #27481
   
   <!--Describe your changes.-->
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at [dev@doris.apache.org](mailto:dev@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


Re: [PR] [Enhancement](load) Limit the number of incorrect data drops [doris]

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #27727:
URL: https://github.com/apache/doris/pull/27727#issuecomment-1830402328

   clang-tidy review says "All clean, LGTM! :+1:"




Re: [PR] [Enhancement](load) Limit the number of incorrect data drops [doris]

Posted by "freemandealer (via GitHub)" <gi...@apache.org>.
freemandealer commented on PR #27727:
URL: https://github.com/apache/doris/pull/27727#issuecomment-1840482036

   Please add docs for the new config in docs/{en,zh-CN}.




Re: [PR] [Enhancement](load) Limit the number of incorrect data drops [doris]

Posted by "doris-robot (via GitHub)" <gi...@apache.org>.
doris-robot commented on PR #27727:
URL: https://github.com/apache/doris/pull/27727#issuecomment-1842521092

   
   <details>
   <summary>TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'</summary>
   
   ```
   Tpch sf100 test result on commit 451de1817ed57cb890737987df3abb4dae8f9b2a, data reload: false
   
   run tpch-sf100 query with default conf and session variables
   q1	4752	4442	4469	4442
   q2	363	152	157	152
   q3	1469	1265	1197	1197
   q4	1116	951	866	866
   q5	3195	3188	3178	3178
   q6	249	130	128	128
   q7	1000	501	497	497
   q8	2227	2225	2199	2199
   q9	6681	6657	6656	6656
   q10	3200	3261	3259	3259
   q11	328	207	212	207
   q12	350	208	205	205
   q13	4559	3848	3780	3780
   q14	243	207	216	207
   q15	570	517	522	517
   q16	448	392	386	386
   q17	1006	611	566	566
   q18	7520	7418	7271	7271
   q19	1514	1569	1370	1370
   q20	620	437	332	332
   q21	3097	2697	2693	2693
   q22	359	295	299	295
   Total cold run time: 44866 ms
   Total hot run time: 40403 ms
   
   run tpch-sf100 query with default conf and set session variable runtime_filter_mode=off
   q1	4427	4376	4377	4376
   q2	269	165	177	165
   q3	3543	3532	3503	3503
   q4	2388	2402	2374	2374
   q5	5737	5734	5744	5734
   q6	238	121	121	121
   q7	2366	1873	1836	1836
   q8	3518	3539	3535	3535
   q9	9028	9022	9021	9021
   q10	3897	4001	4003	4001
   q11	502	374	396	374
   q12	768	608	596	596
   q13	4275	3567	3537	3537
   q14	281	255	257	255
   q15	567	517	516	516
   q16	489	457	486	457
   q17	1875	1848	1863	1848
   q18	8676	8182	8103	8103
   q19	1741	1750	1750	1750
   q20	2237	1970	1938	1938
   q21	6519	6175	6148	6148
   q22	514	411	419	411
   Total cold run time: 63855 ms
   Total hot run time: 60599 ms
   ```
   </details>
   




Re: [PR] [Enhancement](load) Limit the number of incorrect data drops [doris]

Posted by "xy720 (via GitHub)" <gi...@apache.org>.
xy720 commented on PR #27727:
URL: https://github.com/apache/doris/pull/27727#issuecomment-1864011938

   run p0




Re: [PR] [Enhancement](load) Limit the number of incorrect data drops [doris]

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #27727:
URL: https://github.com/apache/doris/pull/27727#issuecomment-1837787713

   PR approved by anyone and no changes requested.




Re: [PR] [Enhancement](load) Limit the number of incorrect data drops [doris]

Posted by "felixwluo (via GitHub)" <gi...@apache.org>.
felixwluo commented on PR #27727:
URL: https://github.com/apache/doris/pull/27727#issuecomment-1842243535

   run buildall




Re: [PR] [Enhancement](load) Limit the number of incorrect data drops [doris]

Posted by "felixwluo (via GitHub)" <gi...@apache.org>.
felixwluo commented on PR #27727:
URL: https://github.com/apache/doris/pull/27727#issuecomment-1833123029

   run buildall




Re: [PR] [Enhancement](load) Limit the number of incorrect data drops [doris]

Posted by "doris-robot (via GitHub)" <gi...@apache.org>.
doris-robot commented on PR #27727:
URL: https://github.com/apache/doris/pull/27727#issuecomment-1865933727

   TeamCity be ut coverage result:
    Function Coverage: 36.47% (8536/23406) 
    Line Coverage: 28.58% (69396/242779)
    Region Coverage: 27.60% (35903/130076)
    Branch Coverage: 24.35% (18355/75376)
    Coverage Report: http://coverage.selectdb-in.cc/coverage/a9fd8a85b7622a8574197c81566e0edbda3ed1cb_a9fd8a85b7622a8574197c81566e0edbda3ed1cb/report/index.html




Re: [PR] [Enhancement](load) Limit the number of incorrect data drops [doris]

Posted by "freemandealer (via GitHub)" <gi...@apache.org>.
freemandealer commented on code in PR #27727:
URL: https://github.com/apache/doris/pull/27727#discussion_r1409159856


##########
regression-test/suites/load_p1/stream_load/test_stream_load_err_log_limit.groovy:
##########
@@ -0,0 +1,70 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+suite("test_stream_load_err_log_limit", "p1") {

Review Comment:
   Why is it p1 instead of p0? The dataset is fairly small, so I believe the test will finish quickly.



##########
regression-test/suites/load_p1/stream_load/test_stream_load_err_log_limit.groovy:
##########
@@ -0,0 +1,70 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+suite("test_stream_load_err_log_limit", "p1") {
+    sql "show tables"
+
+    def tableName = "test_stream_load_err_log_limit_table"
+
+    sql """ DROP TABLE IF EXISTS ${tableName} """
+    sql """
+        CREATE TABLE IF NOT EXISTS ${tableName} (
+            `k1` int NOT NULL,
+            `k2` varchar(20) NOT NULL
+        ) ENGINE=OLAP
+        DUPLICATE KEY(`k1`)
+        DISTRIBUTED BY HASH(`k1`) BUCKETS 3
+        PROPERTIES ("replication_allocation" = "tag.location.default: 1");
+    """
+
+    def backendId_to_backendIP = [:]
+    def backendId_to_backendHttpPort = [:]
+    getBackendIpHttpPort(backendId_to_backendIP, backendId_to_backendHttpPort);
+
+    def set_be_param = { paramName, paramValue ->
+        for (String id in backendId_to_backendIP.keySet()) {
+            def beIp = backendId_to_backendIP.get(id)
+            def bePort = backendId_to_backendHttpPort.get(id)
+            def (code, out, err) = curl("POST", String.format("http://%s:%s/api/update_config?%s=%s", beIp, bePort, paramName, paramValue))
+            assertTrue(out.contains("OK"))
+        }
+    }
+
+    set_be_param.call("load_error_log_limit_bytes", "100")
+
+    streamLoad {
+        table "${tableName}"
+        set 'column_separator', ','
+        set 'columns', 'k1, k2, k3'
+        file 'test_stream_load_err_log_limit.csv'
+
+        check { result, exception, startTime, endTime ->
+            if (exception != null) {
+                throw exception
+            }
+            log.info("Stream load result: ${result}".toString())
+            def json = parseJson(result)
+            def (code, out, err) = curl("GET", json.ErrorURL)
+

Review Comment:
   Check the error log file contents to verify that they are as expected.



##########
regression-test/suites/load_p1/stream_load/test_stream_load_err_log_limit.groovy:
##########
@@ -0,0 +1,70 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+suite("test_stream_load_err_log_limit", "p1") {
+    sql "show tables"
+
+    def tableName = "test_stream_load_err_log_limit_table"
+
+    sql """ DROP TABLE IF EXISTS ${tableName} """
+    sql """
+        CREATE TABLE IF NOT EXISTS ${tableName} (
+            `k1` int NOT NULL,
+            `k2` varchar(20) NOT NULL
+        ) ENGINE=OLAP
+        DUPLICATE KEY(`k1`)
+        DISTRIBUTED BY HASH(`k1`) BUCKETS 3
+        PROPERTIES ("replication_allocation" = "tag.location.default: 1");
+    """
+
+    def backendId_to_backendIP = [:]
+    def backendId_to_backendHttpPort = [:]
+    getBackendIpHttpPort(backendId_to_backendIP, backendId_to_backendHttpPort);
+
+    def set_be_param = { paramName, paramValue ->
+        for (String id in backendId_to_backendIP.keySet()) {
+            def beIp = backendId_to_backendIP.get(id)
+            def bePort = backendId_to_backendHttpPort.get(id)
+            def (code, out, err) = curl("POST", String.format("http://%s:%s/api/update_config?%s=%s", beIp, bePort, paramName, paramValue))
+            assertTrue(out.contains("OK"))
+        }
+    }
+
+    set_be_param.call("load_error_log_limit_bytes", "100")

Review Comment:
   Did you forget to set it back when the test is over? Use try...finally to make sure the original value is restored.



##########
be/src/runtime/runtime_state.cpp:
##########
@@ -414,7 +414,8 @@ Status RuntimeState::append_error_msg_to_file(std::function<std::string()> line,
         }
     }
 
-    if (out.size() > 0) {
+    size_t error_row_size = out.size();
+    if (error_row_size > 0 && error_row_size <= config::load_error_log_limit_bytes) {

Review Comment:
   Maybe we should notify the user that the error log is truncated by appending a line at the end of the error log file saying 'truncation: error log is too long'?
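
The suggestion above could be sketched roughly as follows. This is a minimal, standalone illustration, not the code in this PR: `ErrorLogWriter`, `append`, and `finish` are hypothetical names, and a plain `std::ostream` stands in for the error log file. It assumes rows larger than the limit are skipped (as the condition in the diff does) and that one notice line is written when the log is finalized.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper (not part of Doris): collects error rows, skips rows
// larger than limit_bytes, and remembers whether anything was skipped.
class ErrorLogWriter {
public:
    ErrorLogWriter(std::ostream& out, std::size_t limit_bytes)
            : _out(out), _limit_bytes(limit_bytes) {}

    // Writes the row if it is non-empty and within the limit.
    // Oversized rows are dropped and only recorded via the flag.
    void append(const std::string& row) {
        if (row.empty()) {
            return;
        }
        if (row.size() <= _limit_bytes) {
            _out << row << '\n';
        } else {
            _dropped = true;
        }
    }

    // Call once after the last row: appends the notice line if any row
    // was dropped, so the reader knows the log is incomplete.
    void finish() {
        if (_dropped) {
            _out << "truncation: error log is too long\n";
        }
    }

private:
    std::ostream& _out;
    std::size_t _limit_bytes;
    bool _dropped = false;
};
```

In the real code path the notice would presumably be written where the error log file is closed; where that happens in `RuntimeState` is not shown in this thread, so the sketch leaves it to the caller of `finish()`.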





Re: [PR] [Enhancement](load) Limit the number of incorrect data drops [doris]

Posted by "doris-robot (via GitHub)" <gi...@apache.org>.
doris-robot commented on PR #27727:
URL: https://github.com/apache/doris/pull/27727#issuecomment-1830672211

   (From new machine)TeamCity pipeline, clickbench performance test result:
    the sum of best hot time: 47.19 seconds
    stream load tsv:          570 seconds loaded 74807831229 Bytes, about 125 MB/s
    stream load json:         24 seconds loaded 2358488459 Bytes, about 93 MB/s
    stream load orc:          73 seconds loaded 1101869774 Bytes, about 14 MB/s
    stream load parquet:          32 seconds loaded 861443392 Bytes, about 25 MB/s
    insert into select:          27.9 seconds inserted 10000000 Rows, about 358K ops/s
    storage size: 17100717777 Bytes




Re: [PR] [Enhancement](load) Limit the number of incorrect data drops [doris]

Posted by "doris-robot (via GitHub)" <gi...@apache.org>.
doris-robot commented on PR #27727:
URL: https://github.com/apache/doris/pull/27727#issuecomment-1865981853

   (From new machine)TeamCity pipeline, clickbench performance test result:
    the sum of best hot time: 43.87 seconds
    stream load tsv:          583 seconds loaded 74807831229 Bytes, about 122 MB/s
    stream load json:         19 seconds loaded 2358488459 Bytes, about 118 MB/s
    stream load orc:          66 seconds loaded 1101869774 Bytes, about 15 MB/s
    stream load parquet:          32 seconds loaded 861443392 Bytes, about 25 MB/s
    insert into select:          28.9 seconds inserted 10000000 Rows, about 346K ops/s
    storage size: 17183780607 Bytes




Re: [PR] [Enhancement](load) Limit the number of incorrect data drops [doris]

Posted by "doris-robot (via GitHub)" <gi...@apache.org>.
doris-robot commented on PR #27727:
URL: https://github.com/apache/doris/pull/27727#issuecomment-1842491567

   (From new machine)TeamCity pipeline, clickbench performance test result:
    the sum of best hot time: 44.33 seconds
    stream load tsv:          560 seconds loaded 74807831229 Bytes, about 127 MB/s
    stream load json:         19 seconds loaded 2358488459 Bytes, about 118 MB/s
    stream load orc:          66 seconds loaded 1101869774 Bytes, about 15 MB/s
    stream load parquet:          33 seconds loaded 861443392 Bytes, about 24 MB/s
    insert into select:          29.0 seconds inserted 10000000 Rows, about 344K ops/s
    storage size: 17162175099 Bytes




Re: [PR] [Enhancement](load) Limit the number of incorrect data drops [doris]

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #27727:
URL: https://github.com/apache/doris/pull/27727#issuecomment-1864011260

   PR approved by at least one committer and no changes requested.




Re: [PR] [Enhancement](load) Limit the number of incorrect data drops [doris]

Posted by "doris-robot (via GitHub)" <gi...@apache.org>.
doris-robot commented on PR #27727:
URL: https://github.com/apache/doris/pull/27727#issuecomment-1830653286

   
   <details>
   <summary>TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'</summary>
   
   ```
   Tpch sf100 test result on commit baab1f2b8f5f1455ad9b1012150e8d13d7ce4d3e, data reload: false
   
   run tpch-sf100 query with default conf and session variables
   q1	4903	4665	4640	4640
   q2	350	162	159	159
   q3	1519	1316	1266	1266
   q4	1164	963	928	928
   q5	3219	3235	3226	3226
   q6	255	135	135	135
   q7	991	508	511	508
   q8	2260	2229	2217	2217
   q9	6951	6925	6921	6921
   q10	3308	3373	3368	3368
   q11	345	208	216	208
   q12	355	211	223	211
   q13	4654	3881	3887	3881
   q14	247	221	216	216
   q15	586	539	532	532
   q16	428	405	385	385
   q17	1031	627	632	627
   q18	8125	7535	7365	7365
   q19	1560	1526	1572	1526
   q20	570	303	321	303
   q21	3378	2912	2983	2912
   q22	365	299	307	299
   Total cold run time: 46564 ms
   Total hot run time: 41833 ms
   
   run tpch-sf100 query with default conf and set session variable runtime_filter_mode=off
   q1	4585	4575	4571	4571
   q2	325	202	244	202
   q3	3746	3740	3737	3737
   q4	2509	2503	2497	2497
   q5	6164	6168	6172	6168
   q6	243	122	126	122
   q7	2561	1959	1944	1944
   q8	3733	3684	3702	3684
   q9	9360	9340	9369	9340
   q10	4104	4163	4144	4144
   q11	626	530	516	516
   q12	794	626	618	618
   q13	4375	3654	3653	3653
   q14	273	251	248	248
   q15	585	533	530	530
   q16	506	503	505	503
   q17	2111	2024	2044	2024
   q18	9500	9382	8692	8692
   q19	1766	1784	1761	1761
   q20	2325	1997	1987	1987
   q21	7434	7043	6892	6892
   q22	661	559	517	517
   Total cold run time: 68286 ms
   Total hot run time: 64350 ms
   ```
   </details>
   




Re: [PR] [Enhancement](load) Limit the number of incorrect data drops [doris]

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #27727:
URL: https://github.com/apache/doris/pull/27727#issuecomment-1833113008

   clang-tidy review says "All clean, LGTM! :+1:"




Re: [PR] [Enhancement](load) Limit the number of incorrect data drops [doris]

Posted by "doris-robot (via GitHub)" <gi...@apache.org>.
doris-robot commented on PR #27727:
URL: https://github.com/apache/doris/pull/27727#issuecomment-1833233014

   (From new machine)TeamCity pipeline, clickbench performance test result:
    the sum of best hot time: 44.49 seconds
    stream load tsv:          566 seconds loaded 74807831229 Bytes, about 126 MB/s
    stream load json:         18 seconds loaded 2358488459 Bytes, about 124 MB/s
    stream load orc:          65 seconds loaded 1101869774 Bytes, about 16 MB/s
    stream load parquet:          33 seconds loaded 861443392 Bytes, about 24 MB/s
    insert into select:          28.9 seconds inserted 10000000 Rows, about 346K ops/s
    storage size: 17098959010 Bytes




Re: [PR] [Enhancement](load) Limit the number of incorrect data drops [doris]

Posted by "felixwluo (via GitHub)" <gi...@apache.org>.
felixwluo commented on PR #27727:
URL: https://github.com/apache/doris/pull/27727#issuecomment-1830391754

   run buildall




Re: [PR] [Enhancement](load) Limit the number of incorrect data drops [doris]

Posted by "doris-robot (via GitHub)" <gi...@apache.org>.
doris-robot commented on PR #27727:
URL: https://github.com/apache/doris/pull/27727#issuecomment-1832567745

   (From new machine)TeamCity pipeline, clickbench performance test result:
    the sum of best hot time: 43.96 seconds
    stream load tsv:          563 seconds loaded 74807831229 Bytes, about 126 MB/s
    stream load json:         18 seconds loaded 2358488459 Bytes, about 124 MB/s
    stream load orc:          66 seconds loaded 1101869774 Bytes, about 15 MB/s
    stream load parquet:          32 seconds loaded 861443392 Bytes, about 25 MB/s
    insert into select:          29.3 seconds inserted 10000000 Rows, about 341K ops/s
    storage size: 17162716827 Bytes




Re: [PR] [Enhancement](load) Limit the number of incorrect data drops [doris]

Posted by "doris-robot (via GitHub)" <gi...@apache.org>.
doris-robot commented on PR #27727:
URL: https://github.com/apache/doris/pull/27727#issuecomment-1833600868

   
   <details>
   <summary>TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'</summary>
   
   ```
   Tpch sf100 test result on commit 85390c020e77ffe95ee78fe01bbf41a1c9f11977, data reload: false
   
   run tpch-sf100 query with default conf and session variables
   q1	4916	4604	4622	4604
   q2	356	139	140	139
   q3	1515	1290	1298	1290
   q4	1148	952	936	936
   q5	3255	3231	3225	3225
   q6	253	128	129	128
   q7	996	511	527	511
   q8	2289	2263	2223	2223
   q9	6942	6987	6907	6907
   q10	3304	3374	3375	3374
   q11	329	206	215	206
   q12	350	218	213	213
   q13	4650	5315	4242	4242
   q14	254	220	218	218
   q15	591	545	515	515
   q16	448	384	401	384
   q17	1022	703	614	614
   q18	8040	7855	7736	7736
   q19	1540	1550	1509	1509
   q20	538	301	312	301
   q21	3421	2966	2967	2966
   q22	373	291	301	291
   Total cold run time: 46530 ms
   Total hot run time: 42532 ms
   
   run tpch-sf100 query with default conf and set session variable runtime_filter_mode=off
   q1	4593	4618	4612	4612
   q2	329	197	257	197
   q3	3778	3759	3738	3738
   q4	2549	2537	2534	2534
   q5	6197	6219	6224	6219
   q6	246	123	124	123
   q7	2668	1952	1972	1952
   q8	3797	3737	3724	3724
   q9	9486	9426	9462	9426
   q10	4066	4133	4140	4133
   q11	657	490	520	490
   q12	817	643	641	641
   q13	4401	3645	3646	3645
   q14	277	260	260	260
   q15	582	533	523	523
   q16	529	506	504	504
   q17	2117	2042	2091	2042
   q18	9414	8985	9074	8985
   q19	1822	1761	1763	1761
   q20	2301	1960	1970	1960
   q21	7657	6961	7006	6961
   q22	693	570	554	554
   Total cold run time: 68976 ms
   Total hot run time: 64984 ms
   ```
   </details>
   




Re: [PR] [Enhancement](load) Limit the number of incorrect data drops [doris]

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #27727:
URL: https://github.com/apache/doris/pull/27727#issuecomment-1842074765

   clang-tidy review says "All clean, LGTM! :+1:"




Re: [PR] [Enhancement](load) Limit the number of incorrect data drops [doris]

Posted by "xy720 (via GitHub)" <gi...@apache.org>.
xy720 commented on PR #27727:
URL: https://github.com/apache/doris/pull/27727#issuecomment-1865831495

   run buildall




Re: [PR] [Enhancement](load) Limit the number of incorrect data drops [doris]

Posted by "doris-robot (via GitHub)" <gi...@apache.org>.
doris-robot commented on PR #27727:
URL: https://github.com/apache/doris/pull/27727#issuecomment-1866021394

   
   TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   ```
   Tpch sf100 test result on commit a9fd8a85b7622a8574197c81566e0edbda3ed1cb, data reload: false
   
   run tpch-sf100 query with default conf and session variables
   q1	4705	4405	4408	4405
   q2	363	151	158	151
   q3	1458	1248	1198	1198
   q4	1105	914	888	888
   q5	3155	3156	3131	3131
   q6	249	128	127	127
   q7	989	487	485	485
   q8	2164	2217	2192	2192
   q9	6686	6670	6653	6653
   q10	3203	3286	3278	3278
   q11	307	191	187	187
   q12	351	204	209	204
   q13	4581	3789	3787	3787
   q14	238	212	210	210
   q15	573	518	528	518
   q16	437	384	381	381
   q17	990	580	610	580
   q18	7185	7071	6953	6953
   q19	1507	1362	1412	1362
   q20	503	311	297	297
   q21	3068	2674	2620	2620
   q22	346	279	281	279
   Total cold run time: 44163 ms
   Total hot run time: 39886 ms
   
   run tpch-sf100 query with default conf and set session variable runtime_filter_mode=off
   q1	4356	4417	4329	4329
   q2	269	163	169	163
   q3	3541	3518	3519	3518
   q4	2378	2362	2383	2362
   q5	5745	5745	5748	5745
   q6	242	123	119	119
   q7	2366	1868	1898	1868
   q8	3523	3518	3520	3518
   q9	8983	8988	8966	8966
   q10	3906	3991	3995	3991
   q11	481	372	368	368
   q12	772	585	598	585
   q13	4257	3586	3556	3556
   q14	297	263	259	259
   q15	565	521	530	521
   q16	513	445	471	445
   q17	1855	1857	1848	1848
   q18	8627	8215	8357	8215
   q19	1733	1778	1741	1741
   q20	2260	1932	1932	1932
   q21	6540	6208	6185	6185
   q22	497	414	431	414
   Total cold run time: 63706 ms
   Total hot run time: 60648 ms
   ```
   




Re: [PR] [Enhancement](load) Limit the number of incorrect data drops [doris]

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #27727:
URL: https://github.com/apache/doris/pull/27727#issuecomment-1865839832

   clang-tidy review says "All clean, LGTM! :+1:"




Re: [PR] [Enhancement](load) Limit the number of incorrect data drops [doris]

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #27727:
URL: https://github.com/apache/doris/pull/27727#issuecomment-1832389259

   clang-tidy review says "All clean, LGTM! :+1:"




Re: [PR] [Enhancement](load) Limit the number of incorrect data drops [doris]

Posted by "felixwluo (via GitHub)" <gi...@apache.org>.
felixwluo commented on PR #27727:
URL: https://github.com/apache/doris/pull/27727#issuecomment-1832378528

   run buildall




Re: [PR] [Enhancement](load) Limit the number of incorrect data drops [doris]

Posted by "felixwluo (via GitHub)" <gi...@apache.org>.
felixwluo commented on PR #27727:
URL: https://github.com/apache/doris/pull/27727#issuecomment-1842070911

   > please add docs for the new config in docs/{en,zh-CN}
   
   done




Re: [PR] [Enhancement](load) Limit the number of incorrect data drops [doris]

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #27727:
URL: https://github.com/apache/doris/pull/27727#issuecomment-1837823819

   PR approved by at least one committer and no changes requested.




Re: [PR] [Enhancement](load) Limit the number of incorrect data drops [doris]

Posted by "xy720 (via GitHub)" <gi...@apache.org>.
xy720 merged PR #27727:
URL: https://github.com/apache/doris/pull/27727

