Posted to commits@doris.apache.org by GitBox <gi...@apache.org> on 2022/05/21 15:40:37 UTC

[GitHub] [incubator-doris] morningman opened a new pull request, #9720: [improvement] Optimize send fragment logic to reduce send fragment timeout error

morningman opened a new pull request, #9720:
URL: https://github.com/apache/incubator-doris/pull/9720

   # Proposed changes
   
   Issue Number: close #xxx
   
   ## Problem Summary:
   
   This CL mainly changes:
   1. Merge multiple fragment instances on the same BE into a single request. This reduces the number of send fragment RPCs, and therefore the RPC timeouts caused by requests waiting for a brpc worker thread.
   
   2. Set the timeout of the send fragment RPC to the query timeout, so that it is consistent with the query timeout the user expects.
   
   3. No longer close the connection when an RPC timeout occurs.
   
   NOTICE:
   1. The definition of the execPlanFragment RPC has changed, so BE must be upgraded first.
   2. The FE config `remote_fragment_exec_timeout_ms` is removed.
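
The merging described in item 1 can be sketched roughly as follows. This is only an illustration of the grouping idea, written in C++ for consistency with the BE code in this thread; `InstanceParams` and `group_by_backend` are hypothetical names and are not taken from the Doris code base, where the sending side is the FE.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for the parameters of one fragment instance.
struct InstanceParams {
    std::string backend;  // address of the BE that will run this instance
    int fragment_id;      // fragment the instance belongs to
};

// Group instances by destination backend. Each map entry would then be
// sent as a single request instead of one request per instance.
std::map<std::string, std::vector<InstanceParams>> group_by_backend(
        const std::vector<InstanceParams>& instances) {
    std::map<std::string, std::vector<InstanceParams>> batches;
    for (const InstanceParams& p : instances) {
        assert(!p.backend.empty());  // every instance needs a destination
        batches[p.backend].push_back(p);
    }
    return batches;
}
```

With this grouping, the number of requests depends on the number of distinct backends rather than on the number of fragment instances.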
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: (Yes)
   2. Has unit tests been added: (No Need)
   3. Has document been added or modified: (Yes)
   4. Does it need to update dependencies: (No)
   5. Are there any changes that cannot be rolled back: (No)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] yangzhg commented on pull request #9720: [improvement] Optimize send fragment logic to reduce send fragment timeout error

Posted by GitBox <gi...@apache.org>.
yangzhg commented on PR #9720:
URL: https://github.com/apache/incubator-doris/pull/9720#issuecomment-1141622786

   I think it would be better to split this into separate RPC interfaces. `exec_plan_fragment` would handle only the case where fragments size < 3. When two-phase RPC is required, two new RPC interfaces, `exec_plan_fragment_prepare` and `exec_plan_fragment_start`, would be used. This may be clearer.
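
The two-phase flow suggested above could look roughly like the sketch below. It is a toy in-memory model, not the Doris implementation: `BackendSketch`, `Phase` and `started_count` are hypothetical, and the two method names are simply borrowed from the interface names proposed in this comment.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

enum class Phase { kPrepared, kStarted };

// Hypothetical model of a backend that handles the two phases separately.
class BackendSketch {
public:
    // Phase 1: register every fragment of the query without running it.
    void exec_plan_fragment_prepare(const std::string& query_id,
                                    const std::vector<int>& fragment_ids) {
        assert(!query_id.empty());
        for (int id : fragment_ids) {
            _state[query_id][id] = Phase::kPrepared;
        }
    }

    // Phase 2: release all prepared fragments of the query.
    // Returns false if nothing was prepared for this query.
    bool exec_plan_fragment_start(const std::string& query_id) {
        auto it = _state.find(query_id);
        if (it == _state.end()) return false;
        for (auto& kv : it->second) {
            kv.second = Phase::kStarted;
        }
        return true;
    }

    // Number of fragments of the query that have been started.
    int started_count(const std::string& query_id) const {
        auto it = _state.find(query_id);
        if (it == _state.end()) return 0;
        int n = 0;
        for (const auto& kv : it->second) {
            if (kv.second == Phase::kStarted) ++n;
        }
        return n;
    }

private:
    std::map<std::string, std::map<int, Phase>> _state;
};
```

In this model a caller would send one prepare request per backend, and then one start request per backend once every prepare has succeeded.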


[GitHub] [incubator-doris] yiguolei commented on a diff in pull request #9720: [improvement] Optimize send fragment logic to reduce send fragment timeout error

Posted by GitBox <gi...@apache.org>.
yiguolei commented on code in PR #9720:
URL: https://github.com/apache/incubator-doris/pull/9720#discussion_r885126765


##########
be/src/runtime/fragment_mgr.cpp:
##########
@@ -225,6 +230,11 @@ Status FragmentExecState::prepare(const TExecPlanFragmentParams& params) {
 }
 
 Status FragmentExecState::execute() {
+    if (_need_wait_execution_trigger) {
+        // if _need_wait_execution_trigger is true, which means this instance
+        // is prepared but need to wait for the signal to do the rest execution.
+        _fragments_ctx->wait_for_start();

Review Comment:
   Add a timeout here to avoid occupying the running thread for too long when something goes wrong.
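
A bounded wait of the kind requested here might be sketched as follows. This is an assumption about what such a change could look like, not the code in the PR: `StartTrigger` is a hypothetical class, and only the method name `wait_for_start` is taken from the diff above.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical start trigger: a prepared instance parks in wait_for_start()
// until start() is called or the timeout expires.
class StartTrigger {
public:
    // Returns true if start() was signalled, false if the wait timed out.
    bool wait_for_start(std::chrono::milliseconds timeout) {
        assert(timeout.count() >= 0);
        std::unique_lock<std::mutex> lock(_mutex);
        // The predicate guards against spurious wakeups.
        return _cv.wait_for(lock, timeout, [this] { return _started; });
    }

    // Signal every waiting instance that execution may begin.
    void start() {
        {
            std::lock_guard<std::mutex> lock(_mutex);
            _started = true;
        }
        _cv.notify_all();
    }

private:
    std::mutex _mutex;
    std::condition_variable _cv;
    bool _started = false;
};
```

A caller would treat a `false` return as a failure and release the thread instead of blocking indefinitely.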



[GitHub] [incubator-doris] xinyiZzz commented on pull request #9720: [improvement] Optimize send fragment logic to reduce send fragment timeout error

Posted by GitBox <gi...@apache.org>.
xinyiZzz commented on PR #9720:
URL: https://github.com/apache/incubator-doris/pull/9720#issuecomment-1135256585

   LGTM


[GitHub] [incubator-doris] morningman commented on pull request #9720: [improvement] Optimize send fragment logic to reduce send fragment timeout error

Posted by GitBox <gi...@apache.org>.
morningman commented on PR #9720:
URL: https://github.com/apache/incubator-doris/pull/9720#issuecomment-1142987017

   > In addition to the observed reduction in the number of RPCs, are there any mitigations for the timeout issue?
   
   I tested a SQL query with 3 fragments on 1 FE and 1 BE, using JMeter with 100 threads.
   Before: send fragment timeouts appeared after running for a few seconds.
   After: no errors.


[GitHub] [incubator-doris] morningman commented on pull request #9720: [improvement] Optimize send fragment logic to reduce send fragment timeout error

Posted by GitBox <gi...@apache.org>.
morningman commented on PR #9720:
URL: https://github.com/apache/incubator-doris/pull/9720#issuecomment-1140383679

   I have run the following tests:
   1. Complex query with 47 fragments. Before: 47 RPCs. After: 2 RPCs.
   2. High concurrency test for a query with 3 fragments: there is no impact on QPS.
   3. High concurrency test for a query with 2 fragments: there is no impact on QPS.
   4. The new BE is compatible with requests from an old FE.


[GitHub] [incubator-doris] yangzhg commented on pull request #9720: [improvement] Optimize send fragment logic to reduce send fragment timeout error

Posted by GitBox <gi...@apache.org>.
yangzhg commented on PR #9720:
URL: https://github.com/apache/incubator-doris/pull/9720#issuecomment-1142036392

   > I have run the following tests:
   > 
   > 1. Complex query with 47 fragments. Before: 47 RPCs. After: 2 RPCs.
   > 2. High concurrency test for a query with 3 fragments: there is no impact on QPS.
   > 3. High concurrency test for a query with 2 fragments: there is no impact on QPS.
   > 4. The new BE is compatible with requests from an old FE.
   
   In addition to the observed reduction in the number of RPCs, are there any mitigations for the timeout issue?


[GitHub] [incubator-doris] github-actions[bot] commented on pull request #9720: [improvement] Optimize send fragment logic to reduce send fragment timeout error

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on PR #9720:
URL: https://github.com/apache/incubator-doris/pull/9720#issuecomment-1145553533

   PR approved by anyone and no changes requested.


[GitHub] [incubator-doris] morningman commented on pull request #9720: [improvement] Optimize send fragment logic to reduce send fragment timeout error

Posted by GitBox <gi...@apache.org>.
morningman commented on PR #9720:
URL: https://github.com/apache/incubator-doris/pull/9720#issuecomment-1142982598

   > I think it would be better to split this into separate RPC interfaces. `exec_plan_fragment` would handle only the case where fragments size < 3. When two-phase RPC is required, two new RPC interfaces, `exec_plan_fragment_prepare` and `exec_plan_fragment_start`, would be used. This may be clearer.
   
   Let me try


[GitHub] [incubator-doris] morningman merged pull request #9720: [improvement] Optimize send fragment logic to reduce send fragment timeout error

Posted by GitBox <gi...@apache.org>.
morningman merged PR #9720:
URL: https://github.com/apache/incubator-doris/pull/9720


[GitHub] [incubator-doris] morningman commented on pull request #9720: [improvement] Optimize send fragment logic to reduce send fragment timeout error

Posted by GitBox <gi...@apache.org>.
morningman commented on PR #9720:
URL: https://github.com/apache/incubator-doris/pull/9720#issuecomment-1140397966

   @xinyiZzz @yangzhg PTAL


[GitHub] [incubator-doris] xinyiZzz commented on a diff in pull request #9720: [improvement] Optimize send fragment logic to reduce send fragment timeout error

Posted by GitBox <gi...@apache.org>.
xinyiZzz commented on code in PR #9720:
URL: https://github.com/apache/incubator-doris/pull/9720#discussion_r879949182


##########
gensrc/thrift/PaloInternalService.thrift:
##########
@@ -346,6 +346,10 @@ struct TExecPlanFragmentParams {
   19: optional TGlobalDict global_dict  // scan node could use the global dict to encode the string value to an integer
 }
 
+struct TExecPlanFragmentParamsList {
+    1: optional list<TExecPlanFragmentParams> paramsList;

Review Comment:
   It would be better to add a comment here.



[GitHub] [incubator-doris] morningman commented on a diff in pull request #9720: [improvement] Optimize send fragment logic to reduce send fragment timeout error

Posted by GitBox <gi...@apache.org>.
morningman commented on code in PR #9720:
URL: https://github.com/apache/incubator-doris/pull/9720#discussion_r885155756


##########
be/src/runtime/fragment_mgr.cpp:
##########
@@ -225,6 +230,11 @@ Status FragmentExecState::prepare(const TExecPlanFragmentParams& params) {
 }
 
 Status FragmentExecState::execute() {
+    if (_need_wait_execution_trigger) {
+        // if _need_wait_execution_trigger is true, which means this instance
+        // is prepared but need to wait for the signal to do the rest execution.
+        _fragments_ctx->wait_for_start();

Review Comment:
   No need. A "timeout checker" on the BE side already checks for this case and sends the notification.



[GitHub] [incubator-doris] github-actions[bot] commented on pull request #9720: [improvement] Optimize send fragment logic to reduce send fragment timeout error

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on PR #9720:
URL: https://github.com/apache/incubator-doris/pull/9720#issuecomment-1145553516

   PR approved by at least one committer and no changes requested.

