Posted to commits@doris.apache.org by GitBox <gi...@apache.org> on 2022/04/02 08:38:20 UTC

[GitHub] [incubator-doris] Henry2SS opened a new pull request #8824: Routine load task doesn't reallocate when previous BE is down.

Henry2SS opened a new pull request #8824:
URL: https://github.com/apache/incubator-doris/pull/8824


   # Proposed changes
   
   Issue Number: close #8823 
   
   ## Problem Summary:
   
   If the previous BE is not alive, the task should be assigned to another available BE instead.
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: (No)
   2. Has unit tests been added: (No Need)
   3. Has document been added or modified: (No Need)
   4. Does it need to update dependencies: (No)
   5. Are there any changes that cannot be rolled back: (No)
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at [dev@doris.apache.org](mailto:dev@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] Henry2SS commented on a change in pull request #8824: Routine load task doesn't reallocate when previous BE is down.

Posted by GitBox <gi...@apache.org>.
Henry2SS commented on a change in pull request #8824:
URL: https://github.com/apache/incubator-doris/pull/8824#discussion_r841098571



##########
File path: fe/fe-core/src/main/java/org/apache/doris/load/routineload/RoutineLoadTaskScheduler.java
##########
@@ -289,6 +289,13 @@ private void submitTask(long beId, TRoutineLoadTask tTask) throws LoadException
     private boolean allocateTaskToBe(RoutineLoadTaskInfo routineLoadTaskInfo) throws LoadException {
         long beId = routineLoadManager.getAvailableBeForTask(routineLoadTaskInfo.getPreviousBeId(), routineLoadTaskInfo.getClusterName());
         if (beId == -1L) {

Review comment:
       Done.






[GitHub] [incubator-doris] morningman commented on a change in pull request #8824: Routine load task doesn't reallocate when previous BE is down.

Posted by GitBox <gi...@apache.org>.
morningman commented on a change in pull request #8824:
URL: https://github.com/apache/incubator-doris/pull/8824#discussion_r841155048



##########
File path: fe/fe-core/src/main/java/org/apache/doris/load/routineload/RoutineLoadManager.java
##########
@@ -410,36 +411,43 @@ public long getMinTaskBeId(String clusterName) throws LoadException {
     // check if the specified BE is available for running task
     // return true if it is available. return false if otherwise.
     // throw exception if unrecoverable errors happen.
-    public long getAvailableBeForTask(long previoudBeId, String clusterName) throws LoadException {
+    public long getAvailableBeForTask(long previousBeId, String clusterName) throws LoadException {
         List<Long> beIdsInCluster = Catalog.getCurrentSystemInfo().getClusterBackendIds(clusterName, true);
         if (beIdsInCluster == null) {
             throw new LoadException("The " + clusterName + " has been deleted");
         }
 
-        if (previoudBeId != -1L && !beIdsInCluster.contains(previoudBeId)) {
+        if (previousBeId != -1L && !beIdsInCluster.contains(previousBeId)) {

Review comment:
       If the previous BE is down, this method still returns -1 and the task will fail to be allocated.
   So nothing changes?






[GitHub] [incubator-doris] caiconghui commented on a change in pull request #8824: Routine load task doesn't reallocate when previous BE is down.

Posted by GitBox <gi...@apache.org>.
caiconghui commented on a change in pull request #8824:
URL: https://github.com/apache/incubator-doris/pull/8824#discussion_r841062607



##########
File path: docs/zh-CN/administrator-guide/config/fe_config.md
##########
@@ -2220,3 +2220,13 @@ load 标签清理器将每隔 `label_clean_interval_second` 运行一次以清
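 Is it possible to dynamically configure: false
 
 Is it a configuration item unique to the Master FE node: false
+
+### routine_load_task_reallocate_times
+
+The maximum number of times a routine load task tries to be allocated; once this value is reached, the task is reallocated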

Review comment:
       ```suggestion
   The maximum number of failed retries when allocating a routine load task to the same BE; once this value is reached, the task is reallocated
   ```
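
The wording suggested in this comment describes a cap on failed allocation retries against the same BE, after which the task is moved elsewhere. A minimal Java sketch of that counting rule follows. The class name `SameBeRetrySketch` and its methods are hypothetical and exist only for illustration; they are not taken from the pull request or from Doris, and the actual change may track failures differently.

```java
// Hypothetical illustration of a "max failed retries on the same BE" rule.
// Nothing here is Doris code; names and behavior are assumptions.
class SameBeRetrySketch {
    private final int maxRetryTimesForSameBe;
    private int failedTimes = 0;

    SameBeRetrySketch(int maxRetryTimesForSameBe) {
        this.maxRetryTimesForSameBe = maxRetryTimesForSameBe;
    }

    // Record one failed allocation attempt on the current BE.
    // Returns true once the cap is reached, meaning the caller should
    // pick another BE; the counter is reset for the next BE.
    boolean recordFailure() {
        failedTimes++;
        if (failedTimes >= maxRetryTimesForSameBe) {
            failedTimes = 0;
            return true;
        }
        return false;
    }

    // A successful allocation clears the failure count.
    void recordSuccess() {
        failedTimes = 0;
    }
}
```

With a cap of 3, the first two failures keep the task on the same BE and the third signals that it should be reallocated.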






[GitHub] [incubator-doris] caiconghui commented on a change in pull request #8824: Routine load task doesn't reallocate when previous BE is down.

Posted by GitBox <gi...@apache.org>.
caiconghui commented on a change in pull request #8824:
URL: https://github.com/apache/incubator-doris/pull/8824#discussion_r841062657



##########
File path: docs/en/administrator-guide/config/fe_config.md
##########
@@ -2197,3 +2197,13 @@ Default: 10
 Is it possible to dynamically configure: false
 
 Is it a configuration item unique to the Master FE node: false
+
+### routine_load_task_reallocate_times
+
+The maximum times of a routine load do allocate before re-allocate to another BE. Once reached this number, the task will be re-allocate to another BE.

Review comment:
       Same as the Chinese document.






[GitHub] [incubator-doris] morningman commented on a change in pull request #8824: Routine load task doesn't reallocate when previous BE is down.

Posted by GitBox <gi...@apache.org>.
morningman commented on a change in pull request #8824:
URL: https://github.com/apache/incubator-doris/pull/8824#discussion_r841082310



##########
File path: fe/fe-core/src/main/java/org/apache/doris/load/routineload/RoutineLoadTaskScheduler.java
##########
@@ -289,6 +289,13 @@ private void submitTask(long beId, TRoutineLoadTask tTask) throws LoadException
     private boolean allocateTaskToBe(RoutineLoadTaskInfo routineLoadTaskInfo) throws LoadException {
         long beId = routineLoadManager.getAvailableBeForTask(routineLoadTaskInfo.getPreviousBeId(), routineLoadTaskInfo.getClusterName());
         if (beId == -1L) {

Review comment:
       So that we don't need to add config for this.

##########
File path: fe/fe-core/src/main/java/org/apache/doris/load/routineload/RoutineLoadTaskScheduler.java
##########
@@ -289,6 +289,13 @@ private void submitTask(long beId, TRoutineLoadTask tTask) throws LoadException
     private boolean allocateTaskToBe(RoutineLoadTaskInfo routineLoadTaskInfo) throws LoadException {
         long beId = routineLoadManager.getAvailableBeForTask(routineLoadTaskInfo.getPreviousBeId(), routineLoadTaskInfo.getClusterName());
         if (beId == -1L) {

Review comment:
       I think we can just modify the `getAvailableBeForTask()` method?
   In that method, check whether the previous Backend is alive: if yes, go on; if no, find a new one.
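
The check proposed in this comment can be sketched roughly as below. This is an illustration only, not the code merged in the pull request: `BeFallbackSketch`, `chooseBe`, and the `aliveBeIds` parameter are hypothetical, and picking the first alive BE is an assumption made to keep the example short (the real scheduler may prefer, for example, the BE with the fewest tasks).

```java
import java.util.List;
import java.util.Set;

// Hypothetical sketch of "keep the previous BE if it is alive, else find a new one".
// Names and the selection policy are assumptions, not Doris code.
class BeFallbackSketch {
    static final long NO_BE = -1L;

    static long chooseBe(long previousBeId, List<Long> beIdsInCluster, Set<Long> aliveBeIds) {
        // Reuse the previous BE only if it still belongs to the cluster and is alive.
        if (previousBeId != NO_BE
                && beIdsInCluster.contains(previousBeId)
                && aliveBeIds.contains(previousBeId)) {
            return previousBeId;
        }
        // Otherwise fall back to another alive BE in the same cluster.
        for (long beId : beIdsInCluster) {
            if (aliveBeIds.contains(beId)) {
                return beId;
            }
        }
        // No alive BE is available; the caller must handle this case.
        return NO_BE;
    }
}
```

Under this sketch a dead previous BE no longer yields -1 as long as some other BE in the cluster is alive, which is the behavior the comment asks for without adding a new config item.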






[GitHub] [incubator-doris] Henry2SS commented on a change in pull request #8824: Routine load task doesn't reallocate when previous BE is down.

Posted by GitBox <gi...@apache.org>.
Henry2SS commented on a change in pull request #8824:
URL: https://github.com/apache/incubator-doris/pull/8824#discussion_r841156963



##########
File path: fe/fe-core/src/main/java/org/apache/doris/load/routineload/RoutineLoadManager.java
##########
@@ -410,36 +411,43 @@ public long getMinTaskBeId(String clusterName) throws LoadException {
     // check if the specified BE is available for running task
     // return true if it is available. return false if otherwise.
     // throw exception if unrecoverable errors happen.
-    public long getAvailableBeForTask(long previoudBeId, String clusterName) throws LoadException {
+    public long getAvailableBeForTask(long previousBeId, String clusterName) throws LoadException {
         List<Long> beIdsInCluster = Catalog.getCurrentSystemInfo().getClusterBackendIds(clusterName, true);
         if (beIdsInCluster == null) {
             throw new LoadException("The " + clusterName + " has been deleted");
         }
 
-        if (previoudBeId != -1L && !beIdsInCluster.contains(previoudBeId)) {
+        if (previousBeId != -1L && !beIdsInCluster.contains(previousBeId)) {

Review comment:
       Fixed.






[GitHub] [incubator-doris] caiconghui commented on a change in pull request #8824: Routine load task doesn't reallocate when previous BE is down.

Posted by GitBox <gi...@apache.org>.
caiconghui commented on a change in pull request #8824:
URL: https://github.com/apache/incubator-doris/pull/8824#discussion_r841062990



##########
File path: docs/zh-CN/administrator-guide/config/fe_config.md
##########
@@ -2220,3 +2220,13 @@ load 标签清理器将每隔 `label_clean_interval_second` 运行一次以清
 Is it possible to dynamically configure: false
 
 Is it a configuration item unique to the Master FE node: false
+
+### routine_load_task_reallocate_times

Review comment:
       ```suggestion
   ### max_retry_reallocate_times_for_same_be
   ```






[GitHub] [incubator-doris] caiconghui commented on a change in pull request #8824: Routine load task doesn't reallocate when previous BE is down.

Posted by GitBox <gi...@apache.org>.
caiconghui commented on a change in pull request #8824:
URL: https://github.com/apache/incubator-doris/pull/8824#discussion_r841062990



##########
File path: docs/zh-CN/administrator-guide/config/fe_config.md
##########
@@ -2220,3 +2220,13 @@ load 标签清理器将每隔 `label_clean_interval_second` 运行一次以清
 Is it possible to dynamically configure: false
 
 Is it a configuration item unique to the Master FE node: false
+
+### routine_load_task_reallocate_times

Review comment:
       ```suggestion
   ### max_retry_allocate_times_for_same_be
   ```



