You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@doris.apache.org by GitBox <gi...@apache.org> on 2022/05/26 13:40:29 UTC

[GitHub] [incubator-doris] xinyiZzz opened a new pull request, #9803: brpc_http

xinyiZzz opened a new pull request, #9803:
URL: https://github.com/apache/incubator-doris/pull/9803

   # Proposed changes
   
   Issue Number: close #7163
   
   ## Problem Summary:
   
   When the length of `Tuple/Block data` is greater than 1.8G, serialize the protoBuf request and embed the `Tuple/Block data` into the controller attachment and transmit it through http brpc.
   
   This is to avoid errors when the length of the protoBuf request exceeds 2G: `Bad request, error_text=[E1003]Fail to compress request`.
   
   In #7164, `Tuple/Block data` was put into attachment and sent via default `baidu_std brpc`, but when the attachment exceeds 2G, it will be truncated. There is no 2G limit for sending via `http brpc`.
   
   Also, in #7921, consider putting `Tuple/Block data` into attachment transport by default, as this theoretically reduces one serialization and improves performance. However, the test found that the performance did not improve, but the memory peak increased due to the addition of a memory copy.
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: (Yes)
   2. Has unit tests been added: (No)
   3. Has document been added or modified: (Yes/No/No Need)
   4. Does it need to update dependencies: (Yes/No)
   5. Are there any changes that cannot be rolled back: (Yes/No)
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at [dev@doris.apache.org](mailto:dev@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] xinyiZzz commented on a diff in pull request #9803: [fix](brpc) Embed serialized request into the attachment and transmit it through http brpc

Posted by GitBox <gi...@apache.org>.
xinyiZzz commented on code in PR #9803:
URL: https://github.com/apache/incubator-doris/pull/9803#discussion_r886315918


##########
be/src/exec/tablet_sink.cpp:
##########
@@ -515,14 +519,27 @@ void NodeChannel::try_send_batch(RuntimeState* state) {
         CHECK(_pending_batches_num == 0) << _pending_batches_num;
     }
 
-    if (_parent->_transfer_data_by_brpc_attachment && request.has_row_batch()) {
-        request_row_batch_transfer_attachment<PTabletWriterAddBatchRequest,
-                                              ReusableClosure<PTabletWriterAddBatchResult>>(
+    if (_tuple_data_buffer_ptr != nullptr && _tuple_data_buffer.size() != 0 &&
+        request.has_row_batch()) {
+        request_embed_attachment_contain_tuple<PTabletWriterAddBatchRequest,
+                                               ReusableClosure<PTabletWriterAddBatchResult>>(
                 &request, _tuple_data_buffer, _add_batch_closure);
+        std::string brpc_url;
+        brpc_url = "http://" + _node_info.host + ":" + std::to_string(_node_info.brpc_port);

Review Comment:
   done



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] yangzhg commented on a diff in pull request #9803: [fix](brpc) Embed serialized request into the attachment and transmit it through http brpc

Posted by GitBox <gi...@apache.org>.
yangzhg commented on code in PR #9803:
URL: https://github.com/apache/incubator-doris/pull/9803#discussion_r894110798


##########
be/src/util/brpc_client_cache.h:
##########
@@ -71,10 +71,22 @@ class BrpcClientCache {
         }
 
         // new one stub and insert into map
+        auto stub = get_new_client_no_cache(host_port);
+        _stub_map.try_emplace_l(
+                host_port, [&stub](typename StubMap<T>::mapped_type& v) { stub = v; }, stub);
+        return stub;
+    }
+
+    std::shared_ptr<T> get_new_client_no_cache(const std::string& host_port,
+                                               const std::string& protocol = "baidu_std",
+                                               const std::string& connect_type = "") {
         brpc::ChannelOptions options;
         if constexpr (std::is_same_v<T, PFunctionService_Stub>) {
             options.protocol = config::function_service_protocol;
+        } else {
+            options.protocol = protocol;
         }
+        if (connect_type != "") options.connection_type = connect_type;

Review Comment:
   ```suggestion
           if (connect_type != "") { options.connection_type = connect_type; }
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] github-actions[bot] commented on pull request #9803: [fix](brpc) Embed serialized request into the attachment and transmit it through http brpc

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on PR #9803:
URL: https://github.com/apache/incubator-doris/pull/9803#issuecomment-1152514412

   PR approved by at least one committer and no changes requested.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] github-actions[bot] commented on pull request #9803: [fix](brpc) Embed serialized request into the attachment and transmit it through http brpc

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on PR #9803:
URL: https://github.com/apache/incubator-doris/pull/9803#issuecomment-1153380204

   PR approved by at least one committer and no changes requested.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] xinyiZzz commented on a diff in pull request #9803: [fix](brpc) Embed serialized request into the attachment and transmit it through http brpc

Posted by GitBox <gi...@apache.org>.
xinyiZzz commented on code in PR #9803:
URL: https://github.com/apache/incubator-doris/pull/9803#discussion_r889763246


##########
be/src/service/internal_service.cpp:
##########
@@ -66,19 +66,53 @@ void PInternalServiceImpl::transmit_data(google::protobuf::RpcController* cntl_b
                                          PTransmitDataResult* response,
                                          google::protobuf::Closure* done) {
     SCOPED_SWITCH_BTHREAD();
-    VLOG_ROW << "transmit data: fragment_instance_id=" << print_id(request->finst_id())
-             << " node=" << request->node_id();
+    // TODO(zxy) delete in 1.2 version
     brpc::Controller* cntl = static_cast<brpc::Controller*>(cntl_base);
     attachment_transfer_request_row_batch<PTransmitDataParams>(request, cntl);
+
+    _transmit_data(cntl_base, request, response, done, Status::OK());
+}
+
+void PInternalServiceImpl::transmit_data_by_http(google::protobuf::RpcController* cntl_base,
+                                                 const PEchoRequest* request,
+                                                 PTransmitDataResult* response,
+                                                 google::protobuf::Closure* done) {
+    SCOPED_SWITCH_BTHREAD();
+    PTransmitDataParams* request_raw = new PTransmitDataParams();

Review Comment:
   fix = =



##########
be/src/service/internal_service.cpp:
##########
@@ -122,20 +156,44 @@ void PInternalServiceImpl::tablet_writer_add_block(google::protobuf::RpcControll
                                                    const PTabletWriterAddBlockRequest* request,
                                                    PTabletWriterAddBlockResult* response,
                                                    google::protobuf::Closure* done) {
+    // TODO(zxy) delete in 1.2 version
+    brpc::Controller* cntl = static_cast<brpc::Controller*>(cntl_base);
+    attachment_transfer_request_block<PTabletWriterAddBlockRequest>(request, cntl);
+
+    _tablet_writer_add_block(cntl_base, request, response, done);
+}
+
+void PInternalServiceImpl::tablet_writer_add_block_by_http(
+        google::protobuf::RpcController* cntl_base, const ::doris::PEchoRequest* request,
+        PTabletWriterAddBlockResult* response, google::protobuf::Closure* done) {
+    PTabletWriterAddBlockRequest* request_raw = new PTabletWriterAddBlockRequest();

Review Comment:
   fix



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] yangzhg commented on a diff in pull request #9803: [fix](brpc) Embed serialized request into the attachment and transmit it through http brpc

Posted by GitBox <gi...@apache.org>.
yangzhg commented on code in PR #9803:
URL: https://github.com/apache/incubator-doris/pull/9803#discussion_r883309248


##########
be/src/exec/tablet_sink.cpp:
##########
@@ -515,14 +519,27 @@ void NodeChannel::try_send_batch(RuntimeState* state) {
         CHECK(_pending_batches_num == 0) << _pending_batches_num;
     }
 
-    if (_parent->_transfer_data_by_brpc_attachment && request.has_row_batch()) {
-        request_row_batch_transfer_attachment<PTabletWriterAddBatchRequest,
-                                              ReusableClosure<PTabletWriterAddBatchResult>>(
+    if (_tuple_data_buffer_ptr != nullptr && _tuple_data_buffer.size() != 0 &&
+        request.has_row_batch()) {
+        request_embed_attachment_contain_tuple<PTabletWriterAddBatchRequest,
+                                               ReusableClosure<PTabletWriterAddBatchResult>>(
                 &request, _tuple_data_buffer, _add_batch_closure);
+        std::string brpc_url;
+        brpc_url = "http://" + _node_info.host + ":" + std::to_string(_node_info.brpc_port);

Review Comment:
   ```suggestion
           brpc_url = fmt::format("http://{}:{}", _node_info.host, _node_info.brpc_port);
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] xinyiZzz commented on pull request #9803: [fix](brpc) Embed serialized request into the attachment and transmit it through http brpc

Posted by GitBox <gi...@apache.org>.
xinyiZzz commented on PR #9803:
URL: https://github.com/apache/incubator-doris/pull/9803#issuecomment-1144598040

   @yangzhg @morningman  PTAL


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] yangzhg commented on a diff in pull request #9803: [fix](brpc) Embed serialized request into the attachment and transmit it through http brpc

Posted by GitBox <gi...@apache.org>.
yangzhg commented on code in PR #9803:
URL: https://github.com/apache/incubator-doris/pull/9803#discussion_r888796530


##########
gensrc/proto/internal_service.proto:
##########
@@ -456,14 +457,19 @@ message PResetRPCChannelResponse {
     repeated string channels = 2;
 };
 
+message PEchoRequest {};

Review Comment:
   brpc called EchoRequest because of it's service called `echo`
    you can use `PEmptyRequest`  or `PEmptyTransmitDataParams`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] xinyiZzz commented on a diff in pull request #9803: [fix](brpc) Embed serialized request into the attachment and transmit it through http brpc

Posted by GitBox <gi...@apache.org>.
xinyiZzz commented on code in PR #9803:
URL: https://github.com/apache/incubator-doris/pull/9803#discussion_r886311030


##########
be/src/util/proto_util.h:
##########
@@ -17,10 +17,17 @@
 
 #pragma once
 
-#include "util/stack_util.h"
-
 namespace doris {
 
+// When the tuple/block data is greater than 2G * 0.9, embed the tuple/block data
+// and the request serialization string in the attachment, and use "http" brpc.
+// "http"brpc requires that only one of request and attachment be non-null.
+//
+// 2G: In the default "baidu_std" brpcd, upper limit of the request and attachment length is 2G.
+// 0.9: Reserve a buffer of 0.1 for embedding request serialization strings, etc.
+constexpr size_t MIN_HTTP_BRPC_SIZE = (1ULL << 31) * 0.9;

Review Comment:
   done, great



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] yangzhg commented on a diff in pull request #9803: [fix](brpc) Embed serialized request into the attachment and transmit it through http brpc

Posted by GitBox <gi...@apache.org>.
yangzhg commented on code in PR #9803:
URL: https://github.com/apache/incubator-doris/pull/9803#discussion_r883222702


##########
be/src/util/proto_util.h:
##########
@@ -17,10 +17,17 @@
 
 #pragma once
 
-#include "util/stack_util.h"
-
 namespace doris {
 
+// When the tuple/block data is greater than 2G * 0.9, embed the tuple/block data
+// and the request serialization string in the attachment, and use "http" brpc.
+// "http"brpc requires that only one of request and attachment be non-null.
+//
+// 2G: In the default "baidu_std" brpcd, upper limit of the request and attachment length is 2G.
+// 0.9: Reserve a buffer of 0.1 for embedding request serialization strings, etc.
+constexpr size_t MIN_HTTP_BRPC_SIZE = (1ULL << 31) * 0.9;

Review Comment:
   ```suggestion
   constexpr size_t MIN_HTTP_BRPC_SIZE = (1ULL << 31) -  (1 << 27);
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] yangzhg commented on a diff in pull request #9803: [fix](brpc) Embed serialized request into the attachment and transmit it through http brpc

Posted by GitBox <gi...@apache.org>.
yangzhg commented on code in PR #9803:
URL: https://github.com/apache/incubator-doris/pull/9803#discussion_r883221816


##########
gensrc/proto/internal_service.proto:
##########
@@ -456,14 +457,19 @@ message PResetRPCChannelResponse {
     repeated string channels = 2;
 };
 
+message PEchoRequest {};

Review Comment:
   why named PEchoRequest ? what dose echo mean



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] xinyiZzz commented on pull request #9803: [fix](brpc) Embed serialized request into the attachment and transmit it through http brpc

Posted by GitBox <gi...@apache.org>.
xinyiZzz commented on PR #9803:
URL: https://github.com/apache/incubator-doris/pull/9803#issuecomment-1152890421

   > `PEchoRequest` has not been renamed, should not merge it before resolve it
   
   fix, modified to `PEmptyRequest` , thks~


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] xinyiZzz commented on a diff in pull request #9803: [fix](brpc) Embed serialized request into the attachment and transmit it through http brpc

Posted by GitBox <gi...@apache.org>.
xinyiZzz commented on code in PR #9803:
URL: https://github.com/apache/incubator-doris/pull/9803#discussion_r886310404


##########
be/src/common/config.h:
##########
@@ -553,9 +553,10 @@ CONF_String(default_rowset_type, "BETA");
 CONF_Int64(brpc_max_body_size, "3147483648");
 // Max unwritten bytes in each socket, if the limit is reached, Socket.Write fails with EOVERCROWDED
 CONF_Int64(brpc_socket_max_unwritten_bytes, "1073741824");
-// Whether to transfer RowBatch in ProtoBuf Request to Controller Attachment and send it
-// through brpc, this will be faster and avoid the error of Request length overflow.
-CONF_mBool(transfer_data_by_brpc_attachment, "false");
+// Whether to embed the ProtoBuf Request serialized string together with Tuple/Block data into
+// Controller Attachment and send it through http brpc when the length of the Tuple/Block data
+// is greater than 1.8G. This is to avoid the error of Request length overflow (2G).
+CONF_mBool(brpc_request_embed_attachment_send_by_http, "true");

Review Comment:
   > transfer_large_data_by_brpc
   
   done



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] xinyiZzz commented on a diff in pull request #9803: [fix](brpc) Embed serialized request into the attachment and transmit it through http brpc

Posted by GitBox <gi...@apache.org>.
xinyiZzz commented on code in PR #9803:
URL: https://github.com/apache/incubator-doris/pull/9803#discussion_r886310764


##########
be/src/common/config.h:
##########
@@ -553,9 +553,10 @@ CONF_String(default_rowset_type, "BETA");
 CONF_Int64(brpc_max_body_size, "3147483648");
 // Max unwritten bytes in each socket, if the limit is reached, Socket.Write fails with EOVERCROWDED
 CONF_Int64(brpc_socket_max_unwritten_bytes, "1073741824");
-// Whether to transfer RowBatch in ProtoBuf Request to Controller Attachment and send it
-// through brpc, this will be faster and avoid the error of Request length overflow.
-CONF_mBool(transfer_data_by_brpc_attachment, "false");
+// Whether to embed the ProtoBuf Request serialized string together with Tuple/Block data into
+// Controller Attachment and send it through http brpc when the length of the Tuple/Block data
+// is greater than 1.8G. This is to avoid the error of Request length overflow (2G).
+CONF_mBool(brpc_request_embed_attachment_send_by_http, "true");

Review Comment:
   > The default value is true. It is compatible that an new version BE send request to a old version BE?
   
   yep, should be false, we can set true in v1.3



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] morningman commented on a diff in pull request #9803: [fix](brpc) Embed serialized request into the attachment and transmit it through http brpc

Posted by GitBox <gi...@apache.org>.
morningman commented on code in PR #9803:
URL: https://github.com/apache/incubator-doris/pull/9803#discussion_r888771409


##########
be/src/exec/tablet_sink.cpp:
##########
@@ -515,14 +513,31 @@ void NodeChannel::try_send_batch(RuntimeState* state) {
         CHECK(_pending_batches_num == 0) << _pending_batches_num;
     }
 
-    if (_parent->_transfer_data_by_brpc_attachment && request.has_row_batch()) {
-        request_row_batch_transfer_attachment<PTabletWriterAddBatchRequest,
-                                              ReusableClosure<PTabletWriterAddBatchResult>>(
-                &request, _tuple_data_buffer, _add_batch_closure);
+    if (config::transfer_large_data_by_brpc && request.has_row_batch() &&

Review Comment:
   User can change this config at runtime. So better to save its value when creating tablet sink.
   Otherwise, it may changed during loading process.



##########
be/src/runtime/data_stream_sender.cpp:
##########
@@ -149,12 +156,28 @@ Status DataStreamSender::Channel::send_batch(PRowBatch* batch, bool eos) {
     _closure->ref();
     _closure->cntl.set_timeout_ms(_brpc_timeout_ms);
 
-    if (_parent->_transfer_data_by_brpc_attachment && _brpc_request.has_row_batch()) {
-        request_row_batch_transfer_attachment<PTransmitDataParams,
-                                              RefCountClosure<PTransmitDataResult>>(
-                &_brpc_request, _parent->_tuple_data_buffer, _closure);
+    if (config::transfer_large_data_by_brpc && _brpc_request.has_row_batch() &&

Review Comment:
   Save `transfer_large_data_by_brpc` when creating data stream sender



##########
be/src/service/internal_service.cpp:
##########
@@ -66,19 +66,53 @@ void PInternalServiceImpl::transmit_data(google::protobuf::RpcController* cntl_b
                                          PTransmitDataResult* response,
                                          google::protobuf::Closure* done) {
     SCOPED_SWITCH_BTHREAD();
-    VLOG_ROW << "transmit data: fragment_instance_id=" << print_id(request->finst_id())
-             << " node=" << request->node_id();
+    // TODO(zxy) delete in 1.2 version
     brpc::Controller* cntl = static_cast<brpc::Controller*>(cntl_base);
     attachment_transfer_request_row_batch<PTransmitDataParams>(request, cntl);
+
+    _transmit_data(cntl_base, request, response, done, Status::OK());
+}
+
+void PInternalServiceImpl::transmit_data_by_http(google::protobuf::RpcController* cntl_base,
+                                                 const PEchoRequest* request,
+                                                 PTransmitDataResult* response,
+                                                 google::protobuf::Closure* done) {
+    SCOPED_SWITCH_BTHREAD();
+    PTransmitDataParams* request_raw = new PTransmitDataParams();

Review Comment:
   Memory leak?



##########
be/src/service/internal_service.cpp:
##########
@@ -122,20 +156,44 @@ void PInternalServiceImpl::tablet_writer_add_block(google::protobuf::RpcControll
                                                    const PTabletWriterAddBlockRequest* request,
                                                    PTabletWriterAddBlockResult* response,
                                                    google::protobuf::Closure* done) {
+    // TODO(zxy) delete in 1.2 version
+    brpc::Controller* cntl = static_cast<brpc::Controller*>(cntl_base);
+    attachment_transfer_request_block<PTabletWriterAddBlockRequest>(request, cntl);
+
+    _tablet_writer_add_block(cntl_base, request, response, done);
+}
+
+void PInternalServiceImpl::tablet_writer_add_block_by_http(
+        google::protobuf::RpcController* cntl_base, const ::doris::PEchoRequest* request,
+        PTabletWriterAddBlockResult* response, google::protobuf::Closure* done) {
+    PTabletWriterAddBlockRequest* request_raw = new PTabletWriterAddBlockRequest();

Review Comment:
   Memory leak?



##########
be/src/service/internal_service.cpp:
##########
@@ -66,19 +66,53 @@ void PInternalServiceImpl::transmit_data(google::protobuf::RpcController* cntl_b
                                          PTransmitDataResult* response,
                                          google::protobuf::Closure* done) {
     SCOPED_SWITCH_BTHREAD();
-    VLOG_ROW << "transmit data: fragment_instance_id=" << print_id(request->finst_id())
-             << " node=" << request->node_id();
+    // TODO(zxy) delete in 1.2 version
     brpc::Controller* cntl = static_cast<brpc::Controller*>(cntl_base);
     attachment_transfer_request_row_batch<PTransmitDataParams>(request, cntl);
+
+    _transmit_data(cntl_base, request, response, done, Status::OK());
+}
+
+void PInternalServiceImpl::transmit_data_by_http(google::protobuf::RpcController* cntl_base,
+                                                 const PEchoRequest* request,
+                                                 PTransmitDataResult* response,
+                                                 google::protobuf::Closure* done) {
+    SCOPED_SWITCH_BTHREAD();
+    PTransmitDataParams* request_raw = new PTransmitDataParams();
+    brpc::Controller* cntl = static_cast<brpc::Controller*>(cntl_base);
+    Status st = attachment_extract_request_contain_tuple<PTransmitDataParams>(request_raw, cntl);
+    _transmit_data(cntl_base, request_raw, response, done, st);
+}
+
+void PInternalServiceImpl::_transmit_data(google::protobuf::RpcController* cntl_base,
+                                          const PTransmitDataParams* request,
+                                          PTransmitDataResult* response,
+                                          google::protobuf::Closure* done, Status extract_st) {

Review Comment:
   ```suggestion
                                             google::protobuf::Closure* done, const Status& extract_st) {
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] xinyiZzz commented on a diff in pull request #9803: [fix](brpc) Embed serialized request into the attachment and transmit it through http brpc

Posted by GitBox <gi...@apache.org>.
xinyiZzz commented on code in PR #9803:
URL: https://github.com/apache/incubator-doris/pull/9803#discussion_r886316319


##########
be/src/exec/tablet_sink.cpp:
##########
@@ -515,14 +519,27 @@ void NodeChannel::try_send_batch(RuntimeState* state) {
         CHECK(_pending_batches_num == 0) << _pending_batches_num;
     }
 
-    if (_parent->_transfer_data_by_brpc_attachment && request.has_row_batch()) {
-        request_row_batch_transfer_attachment<PTabletWriterAddBatchRequest,
-                                              ReusableClosure<PTabletWriterAddBatchResult>>(
+    if (_tuple_data_buffer_ptr != nullptr && _tuple_data_buffer.size() != 0 &&
+        request.has_row_batch()) {
+        request_embed_attachment_contain_tuple<PTabletWriterAddBatchRequest,
+                                               ReusableClosure<PTabletWriterAddBatchResult>>(
                 &request, _tuple_data_buffer, _add_batch_closure);
+        std::string brpc_url;
+        brpc_url = "http://" + _node_info.host + ":" + std::to_string(_node_info.brpc_port);

Review Comment:
   fix



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] yangzhg commented on a diff in pull request #9803: [fix](brpc) Embed serialized request into the attachment and transmit it through http brpc

Posted by GitBox <gi...@apache.org>.
yangzhg commented on code in PR #9803:
URL: https://github.com/apache/incubator-doris/pull/9803#discussion_r883306907


##########
be/src/common/config.h:
##########
@@ -553,9 +553,10 @@ CONF_String(default_rowset_type, "BETA");
 CONF_Int64(brpc_max_body_size, "3147483648");
 // Max unwritten bytes in each socket, if the limit is reached, Socket.Write fails with EOVERCROWDED
 CONF_Int64(brpc_socket_max_unwritten_bytes, "1073741824");
-// Whether to transfer RowBatch in ProtoBuf Request to Controller Attachment and send it
-// through brpc, this will be faster and avoid the error of Request length overflow.
-CONF_mBool(transfer_data_by_brpc_attachment, "false");
+// Whether to embed the ProtoBuf Request serialized string together with Tuple/Block data into
+// Controller Attachment and send it through http brpc when the length of the Tuple/Block data
+// is greater than 1.8G. This is to avoid the error of Request length overflow (2G).
+CONF_mBool(brpc_request_embed_attachment_send_by_http, "true");

Review Comment:
   maybe `transfer_large_data_by_brpc`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] yangzhg commented on a diff in pull request #9803: [fix](brpc) Embed serialized request into the attachment and transmit it through http brpc

Posted by GitBox <gi...@apache.org>.
yangzhg commented on code in PR #9803:
URL: https://github.com/apache/incubator-doris/pull/9803#discussion_r883312140


##########
be/src/util/proto_util.h:
##########
@@ -17,10 +17,17 @@
 
 #pragma once
 
-#include "util/stack_util.h"
-
 namespace doris {
 
+// When the tuple/block data is greater than 2G * 0.9, embed the tuple/block data
+// and the request serialization string in the attachment, and use "http" brpc.
+// "http"brpc requires that only one of request and attachment be non-null.
+//
+// 2G: In the default "baidu_std" brpcd, upper limit of the request and attachment length is 2G.
+// 0.9: Reserve a buffer of 0.1 for embedding request serialization strings, etc.
+constexpr size_t MIN_HTTP_BRPC_SIZE = (1ULL << 31) * 0.9;

Review Comment:
   just add comment to say this 2GB - 256M



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] xinyiZzz commented on a diff in pull request #9803: [fix](brpc) Embed serialized request into the attachment and transmit it through http brpc

Posted by GitBox <gi...@apache.org>.
xinyiZzz commented on code in PR #9803:
URL: https://github.com/apache/incubator-doris/pull/9803#discussion_r883235362


##########
be/src/util/proto_util.h:
##########
@@ -17,10 +17,17 @@
 
 #pragma once
 
-#include "util/stack_util.h"
-
 namespace doris {
 
+// When the tuple/block data is greater than 2G * 0.9, embed the tuple/block data
+// and the request serialization string in the attachment, and use "http" brpc.
+// "http"brpc requires that only one of request and attachment be non-null.
+//
+// 2G: In the default "baidu_std" brpcd, upper limit of the request and attachment length is 2G.
+// 0.9: Reserve a buffer of 0.1 for embedding request serialization strings, etc.
+constexpr size_t MIN_HTTP_BRPC_SIZE = (1ULL << 31) * 0.9;

Review Comment:
   It seems more difficult to understand



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] xinyiZzz commented on a diff in pull request #9803: [fix](brpc) Embed serialized request into the attachment and transmit it through http brpc

Posted by GitBox <gi...@apache.org>.
xinyiZzz commented on code in PR #9803:
URL: https://github.com/apache/incubator-doris/pull/9803#discussion_r883235473


##########
gensrc/proto/internal_service.proto:
##########
@@ -456,14 +457,19 @@ message PResetRPCChannelResponse {
     repeated string channels = 2;
 };
 
+message PEchoRequest {};

Review Comment:
   In the official example of brpc, use EchoRequest to represent empty Request.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] yangzhg commented on a diff in pull request #9803: [fix](brpc) Embed serialized request into the attachment and transmit it through http brpc

Posted by GitBox <gi...@apache.org>.
yangzhg commented on code in PR #9803:
URL: https://github.com/apache/incubator-doris/pull/9803#discussion_r888792101


##########
be/src/runtime/row_batch.cpp:
##########
@@ -300,17 +290,13 @@ Status RowBatch::serialize(PRowBatch* output_batch, size_t* uncompressed_size,
 
     // return compressed and uncompressed size
     size_t pb_size = get_batch_size(*output_batch);
-    if (allocated_buf == nullptr) {
-        *uncompressed_size = pb_size - mutable_tuple_data->size() + tuple_byte_size;
-        *compressed_size = pb_size;
-        if (pb_size > std::numeric_limits<int32_t>::max()) {
-            // the protobuf has a hard limit of 2GB for serialized data.
-            return Status::InternalError(fmt::format(
-                    "The rowbatch is large than 2GB({}), can not send by Protobuf.", pb_size));
-        }
-    } else {
-        *uncompressed_size = pb_size + tuple_byte_size;
-        *compressed_size = pb_size + mutable_tuple_data->size();
+    *uncompressed_size = pb_size - mutable_tuple_data->size() + tuple_byte_size;
+    *compressed_size = pb_size;
+    if (config::transfer_large_data_by_brpc == false &&

Review Comment:
   ```suggestion
       if (!config::transfer_large_data_by_brpc &&
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] xinyiZzz commented on a diff in pull request #9803: [fix](brpc) Embed serialized request into the attachment and transmit it through http brpc

Posted by GitBox <gi...@apache.org>.
xinyiZzz commented on code in PR #9803:
URL: https://github.com/apache/incubator-doris/pull/9803#discussion_r895001966


##########
be/src/util/brpc_client_cache.h:
##########
@@ -71,10 +71,22 @@ class BrpcClientCache {
         }
 
         // new one stub and insert into map
+        auto stub = get_new_client_no_cache(host_port);
+        _stub_map.try_emplace_l(
+                host_port, [&stub](typename StubMap<T>::mapped_type& v) { stub = v; }, stub);
+        return stub;
+    }
+
+    std::shared_ptr<T> get_new_client_no_cache(const std::string& host_port,
+                                               const std::string& protocol = "baidu_std",
+                                               const std::string& connect_type = "") {
         brpc::ChannelOptions options;
         if constexpr (std::is_same_v<T, PFunctionService_Stub>) {
             options.protocol = config::function_service_protocol;
+        } else {
+            options.protocol = protocol;
         }
+        if (connect_type != "") options.connection_type = connect_type;

Review Comment:
   done



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] xinyiZzz commented on a diff in pull request #9803: [fix](brpc) Embed serialized request into the attachment and transmit it through http brpc

Posted by GitBox <gi...@apache.org>.
xinyiZzz commented on code in PR #9803:
URL: https://github.com/apache/incubator-doris/pull/9803#discussion_r889754615


##########
be/src/exec/tablet_sink.cpp:
##########
@@ -515,14 +513,31 @@ void NodeChannel::try_send_batch(RuntimeState* state) {
         CHECK(_pending_batches_num == 0) << _pending_batches_num;
     }
 
-    if (_parent->_transfer_data_by_brpc_attachment && request.has_row_batch()) {
-        request_row_batch_transfer_attachment<PTabletWriterAddBatchRequest,
-                                              ReusableClosure<PTabletWriterAddBatchResult>>(
-                &request, _tuple_data_buffer, _add_batch_closure);
+    if (config::transfer_large_data_by_brpc && request.has_row_batch() &&

Review Comment:
   fix, I understand the purpose of your previous code : )



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] xinyiZzz commented on a diff in pull request #9803: [fix](brpc) Embed serialized request into the attachment and transmit it through http brpc

Posted by GitBox <gi...@apache.org>.
xinyiZzz commented on code in PR #9803:
URL: https://github.com/apache/incubator-doris/pull/9803#discussion_r889759625


##########
be/src/runtime/data_stream_sender.cpp:
##########
@@ -149,12 +156,28 @@ Status DataStreamSender::Channel::send_batch(PRowBatch* batch, bool eos) {
     _closure->ref();
     _closure->cntl.set_timeout_ms(_brpc_timeout_ms);
 
-    if (_parent->_transfer_data_by_brpc_attachment && _brpc_request.has_row_batch()) {
-        request_row_batch_transfer_attachment<PTransmitDataParams,
-                                              RefCountClosure<PTransmitDataResult>>(
-                &_brpc_request, _parent->_tuple_data_buffer, _closure);
+    if (config::transfer_large_data_by_brpc && _brpc_request.has_row_batch() &&

Review Comment:
   fix



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] xinyiZzz commented on a diff in pull request #9803: [fix](brpc) Embed serialized request into the attachment and transmit it through http brpc

Posted by GitBox <gi...@apache.org>.
xinyiZzz commented on code in PR #9803:
URL: https://github.com/apache/incubator-doris/pull/9803#discussion_r886310764


##########
be/src/common/config.h:
##########
@@ -553,9 +553,10 @@ CONF_String(default_rowset_type, "BETA");
 CONF_Int64(brpc_max_body_size, "3147483648");
 // Max unwritten bytes in each socket, if the limit is reached, Socket.Write fails with EOVERCROWDED
 CONF_Int64(brpc_socket_max_unwritten_bytes, "1073741824");
-// Whether to transfer RowBatch in ProtoBuf Request to Controller Attachment and send it
-// through brpc, this will be faster and avoid the error of Request length overflow.
-CONF_mBool(transfer_data_by_brpc_attachment, "false");
+// Whether to embed the ProtoBuf Request serialized string together with Tuple/Block data into
+// Controller Attachment and send it through http brpc when the length of the Tuple/Block data
+// is greater than 1.8G. This is to avoid the error of Request length overflow (2G).
+CONF_mBool(brpc_request_embed_attachment_send_by_http, "true");

Review Comment:
   > The default value is true. It is compatible that an new version BE send request to a old version BE?
   
   yep, should be false, we can set true in v1.2



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] xinyiZzz commented on a diff in pull request #9803: [fix](brpc) Embed serialized request into the attachment and transmit it through http brpc

Posted by GitBox <gi...@apache.org>.
xinyiZzz commented on code in PR #9803:
URL: https://github.com/apache/incubator-doris/pull/9803#discussion_r886308720


##########
be/src/vec/sink/vdata_stream_sender.cpp:
##########
@@ -138,13 +144,27 @@ Status VDataStreamSender::Channel::send_block(PBlock* block, bool eos) {
     _closure->ref();
     _closure->cntl.set_timeout_ms(_brpc_timeout_ms);
 
-    if (_brpc_request.has_block()) {
-        request_block_transfer_attachment<PTransmitDataParams,
-                                          RefCountClosure<PTransmitDataResult>>(
+    if (_parent->_column_values_buffer_ptr != nullptr && _brpc_request.has_block() &&
+        !_brpc_request.block().has_column_values()) {
+        DCHECK(_parent->_column_values_buffer.size() != 0);
+        request_embed_attachment_contain_block<PTransmitDataParams,
+                                               RefCountClosure<PTransmitDataResult>>(
                 &_brpc_request, _parent->_column_values_buffer, _closure);
+        std::string brpc_url;
+        brpc_url =

Review Comment:
   `brpc_url` and `_brpc_http_stub` are only used in rare cases where tuple/block > 2G, and no initialization is required in most cases.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] github-actions[bot] commented on pull request #9803: [fix](brpc) Embed serialized request into the attachment and transmit it through http brpc

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on PR #9803:
URL: https://github.com/apache/incubator-doris/pull/9803#issuecomment-1152514470

   PR approved by anyone and no changes requested.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] yangzhg commented on pull request #9803: [fix](brpc) Embed serialized request into the attachment and transmit it through http brpc

Posted by GitBox <gi...@apache.org>.
yangzhg commented on PR #9803:
URL: https://github.com/apache/incubator-doris/pull/9803#issuecomment-1152809258

   `PEchoRequest`  has not been renamed, should not merge it before resolve it 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] xinyiZzz commented on a diff in pull request #9803: [fix](brpc) Embed serialized request into the attachment and transmit it through http brpc

Posted by GitBox <gi...@apache.org>.
xinyiZzz commented on code in PR #9803:
URL: https://github.com/apache/incubator-doris/pull/9803#discussion_r886309105


##########
be/src/util/proto_util.h:
##########
@@ -74,4 +84,88 @@ inline void attachment_transfer_request_block(const Params* brpc_request, brpc::
     }
 }
 
+// Embed tuple_data and brpc request serialization string in controller attachment.
+template <typename Params, typename Closure>
+inline void request_embed_attachment_contain_tuple(Params* brpc_request,
+                                                   const std::string& tuple_data,
+                                                   Closure* closure) {
+    auto row_batch = brpc_request->mutable_row_batch();
+    row_batch->set_tuple_data("");
+    request_embed_attachment(brpc_request, tuple_data, closure);
+}
+
+// Embed column_values and brpc request serialization string in controller attachment.
+template <typename Params, typename Closure>
+inline void request_embed_attachment_contain_block(Params* brpc_request,
+                                                   const std::string& column_values,
+                                                   Closure* closure) {
+    auto block = brpc_request->mutable_block();
+    block->set_column_values("");
+    request_embed_attachment(brpc_request, column_values, closure);
+}
+
+template <typename Params, typename Closure>
+inline void request_embed_attachment(Params* brpc_request, const std::string& data,
+                                     Closure* closure) {
+    LOG(WARNING) << "http_brpc, request_embed_attachment start, data.size: " << data.size()

Review Comment:
   Remove this LOG, this is for test



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] xinyiZzz commented on a diff in pull request #9803: [fix](brpc) Embed serialized request into the attachment and transmit it through http brpc

Posted by GitBox <gi...@apache.org>.
xinyiZzz commented on code in PR #9803:
URL: https://github.com/apache/incubator-doris/pull/9803#discussion_r889754417


##########
be/src/runtime/row_batch.cpp:
##########
@@ -300,17 +290,13 @@ Status RowBatch::serialize(PRowBatch* output_batch, size_t* uncompressed_size,
 
     // return compressed and uncompressed size
     size_t pb_size = get_batch_size(*output_batch);
-    if (allocated_buf == nullptr) {
-        *uncompressed_size = pb_size - mutable_tuple_data->size() + tuple_byte_size;
-        *compressed_size = pb_size;
-        if (pb_size > std::numeric_limits<int32_t>::max()) {
-            // the protobuf has a hard limit of 2GB for serialized data.
-            return Status::InternalError(fmt::format(
-                    "The rowbatch is large than 2GB({}), can not send by Protobuf.", pb_size));
-        }
-    } else {
-        *uncompressed_size = pb_size + tuple_byte_size;
-        *compressed_size = pb_size + mutable_tuple_data->size();
+    *uncompressed_size = pb_size - mutable_tuple_data->size() + tuple_byte_size;
+    *compressed_size = pb_size;
+    if (config::transfer_large_data_by_brpc == false &&

Review Comment:
   fix



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] yangzhg commented on a diff in pull request #9803: [fix](brpc) Embed serialized request into the attachment and transmit it through http brpc

Posted by GitBox <gi...@apache.org>.
yangzhg commented on code in PR #9803:
URL: https://github.com/apache/incubator-doris/pull/9803#discussion_r883222702


##########
be/src/util/proto_util.h:
##########
@@ -17,10 +17,17 @@
 
 #pragma once
 
-#include "util/stack_util.h"
-
 namespace doris {
 
+// When the tuple/block data is greater than 2G * 0.9, embed the tuple/block data
+// and the request serialization string in the attachment, and use "http" brpc.
+// "http"brpc requires that only one of request and attachment be non-null.
+//
+// 2G: In the default "baidu_std" brpcd, upper limit of the request and attachment length is 2G.
+// 0.9: Reserve a buffer of 0.1 for embedding request serialization strings, etc.
+constexpr size_t MIN_HTTP_BRPC_SIZE = (1ULL << 31) * 0.9;

Review Comment:
   ```suggestion
   constexpr size_t MIN_HTTP_BRPC_SIZE = (1ULL << 31) -  (1 << 26);
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] xinyiZzz commented on a diff in pull request #9803: [fix](brpc) Embed serialized request into the attachment and transmit it through http brpc

Posted by GitBox <gi...@apache.org>.
xinyiZzz commented on code in PR #9803:
URL: https://github.com/apache/incubator-doris/pull/9803#discussion_r889762556


##########
be/src/service/internal_service.cpp:
##########
@@ -66,19 +66,53 @@ void PInternalServiceImpl::transmit_data(google::protobuf::RpcController* cntl_b
                                          PTransmitDataResult* response,
                                          google::protobuf::Closure* done) {
     SCOPED_SWITCH_BTHREAD();
-    VLOG_ROW << "transmit data: fragment_instance_id=" << print_id(request->finst_id())
-             << " node=" << request->node_id();
+    // TODO(zxy) delete in 1.2 version
     brpc::Controller* cntl = static_cast<brpc::Controller*>(cntl_base);
     attachment_transfer_request_row_batch<PTransmitDataParams>(request, cntl);
+
+    _transmit_data(cntl_base, request, response, done, Status::OK());
+}
+
+void PInternalServiceImpl::transmit_data_by_http(google::protobuf::RpcController* cntl_base,
+                                                 const PEchoRequest* request,
+                                                 PTransmitDataResult* response,
+                                                 google::protobuf::Closure* done) {
+    SCOPED_SWITCH_BTHREAD();
+    PTransmitDataParams* request_raw = new PTransmitDataParams();
+    brpc::Controller* cntl = static_cast<brpc::Controller*>(cntl_base);
+    Status st = attachment_extract_request_contain_tuple<PTransmitDataParams>(request_raw, cntl);
+    _transmit_data(cntl_base, request_raw, response, done, st);
+}
+
+void PInternalServiceImpl::_transmit_data(google::protobuf::RpcController* cntl_base,
+                                          const PTransmitDataParams* request,
+                                          PTransmitDataResult* response,
+                                          google::protobuf::Closure* done, Status extract_st) {

Review Comment:
   fix



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] morningman merged pull request #9803: [fix](brpc) Embed serialized request into the attachment and transmit it through http brpc

Posted by GitBox <gi...@apache.org>.
morningman merged PR #9803:
URL: https://github.com/apache/incubator-doris/pull/9803


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] xinyiZzz commented on a diff in pull request #9803: [fix](brpc) Embed serialized request into the attachment and transmit it through http brpc

Posted by GitBox <gi...@apache.org>.
xinyiZzz commented on code in PR #9803:
URL: https://github.com/apache/incubator-doris/pull/9803#discussion_r895002953


##########
be/src/service/internal_service.cpp:
##########
@@ -66,19 +66,53 @@ void PInternalServiceImpl::transmit_data(google::protobuf::RpcController* cntl_b
                                          PTransmitDataResult* response,
                                          google::protobuf::Closure* done) {
     SCOPED_SWITCH_BTHREAD();
-    VLOG_ROW << "transmit data: fragment_instance_id=" << print_id(request->finst_id())
-             << " node=" << request->node_id();
+    // TODO(zxy) delete in 1.2 version
     brpc::Controller* cntl = static_cast<brpc::Controller*>(cntl_base);
     attachment_transfer_request_row_batch<PTransmitDataParams>(request, cntl);
+
+    _transmit_data(cntl_base, request, response, done, Status::OK());
+}
+
+void PInternalServiceImpl::transmit_data_by_http(google::protobuf::RpcController* cntl_base,
+                                                 const PEchoRequest* request,
+                                                 PTransmitDataResult* response,
+                                                 google::protobuf::Closure* done) {
+    SCOPED_SWITCH_BTHREAD();
+    PTransmitDataParams* request_raw = new PTransmitDataParams();

Review Comment:
   In the new NewHttpClosure delete req and call the old done.run.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] morningman commented on a diff in pull request #9803: [fix](brpc) Embed serialized request into the attachment and transmit it through http brpc

Posted by GitBox <gi...@apache.org>.
morningman commented on code in PR #9803:
URL: https://github.com/apache/incubator-doris/pull/9803#discussion_r884250864


##########
be/src/util/proto_util.h:
##########
@@ -74,4 +84,88 @@ inline void attachment_transfer_request_block(const Params* brpc_request, brpc::
     }
 }
 
+// Embed tuple_data and brpc request serialization string in controller attachment.
+template <typename Params, typename Closure>
+inline void request_embed_attachment_contain_tuple(Params* brpc_request,
+                                                   const std::string& tuple_data,
+                                                   Closure* closure) {
+    auto row_batch = brpc_request->mutable_row_batch();
+    row_batch->set_tuple_data("");
+    request_embed_attachment(brpc_request, tuple_data, closure);
+}
+
+// Embed column_values and brpc request serialization string in controller attachment.
+template <typename Params, typename Closure>
+inline void request_embed_attachment_contain_block(Params* brpc_request,
+                                                   const std::string& column_values,
+                                                   Closure* closure) {
+    auto block = brpc_request->mutable_block();
+    block->set_column_values("");
+    request_embed_attachment(brpc_request, column_values, closure);
+}
+
+template <typename Params, typename Closure>
+inline void request_embed_attachment(Params* brpc_request, const std::string& data,
+                                     Closure* closure) {
+    LOG(WARNING) << "http_brpc, request_embed_attachment start, data.size: " << data.size()

Review Comment:
   VLOG



##########
be/src/common/config.h:
##########
@@ -553,9 +553,10 @@ CONF_String(default_rowset_type, "BETA");
 CONF_Int64(brpc_max_body_size, "3147483648");
 // Max unwritten bytes in each socket, if the limit is reached, Socket.Write fails with EOVERCROWDED
 CONF_Int64(brpc_socket_max_unwritten_bytes, "1073741824");
-// Whether to transfer RowBatch in ProtoBuf Request to Controller Attachment and send it
-// through brpc, this will be faster and avoid the error of Request length overflow.
-CONF_mBool(transfer_data_by_brpc_attachment, "false");
+// Whether to embed the ProtoBuf Request serialized string together with Tuple/Block data into
+// Controller Attachment and send it through http brpc when the length of the Tuple/Block data
+// is greater than 1.8G. This is to avoid the error of Request length overflow (2G).
+CONF_mBool(brpc_request_embed_attachment_send_by_http, "true");

Review Comment:
   The default value is true. It is compatible that an new version BE send request to a old version BE?



##########
be/src/vec/sink/vdata_stream_sender.cpp:
##########
@@ -138,13 +144,27 @@ Status VDataStreamSender::Channel::send_block(PBlock* block, bool eos) {
     _closure->ref();
     _closure->cntl.set_timeout_ms(_brpc_timeout_ms);
 
-    if (_brpc_request.has_block()) {
-        request_block_transfer_attachment<PTransmitDataParams,
-                                          RefCountClosure<PTransmitDataResult>>(
+    if (_parent->_column_values_buffer_ptr != nullptr && _brpc_request.has_block() &&
+        !_brpc_request.block().has_column_values()) {
+        DCHECK(_parent->_column_values_buffer.size() != 0);
+        request_embed_attachment_contain_block<PTransmitDataParams,
+                                               RefCountClosure<PTransmitDataResult>>(
                 &_brpc_request, _parent->_column_values_buffer, _closure);
+        std::string brpc_url;
+        brpc_url =

Review Comment:
   this `brpc_url` and `_brpc_http_stub` can be initialized when creating data_stream_sender. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org