Posted to commits@doris.apache.org by GitBox <gi...@apache.org> on 2020/02/21 08:10:19 UTC

[GitHub] [incubator-doris] morningman opened a new issue #2964: [Bug][RoutineLoad] Routine Load encounter "label already used" exception

morningman opened a new issue #2964: [Bug][RoutineLoad] Routine Load encounter "label already used" exception
URL: https://github.com/apache/incubator-doris/issues/2964
 
 
   **Describe Bug**
   
   In some scenarios, a Routine Load job will encounter a "label already used" error. This error causes the job to be paused, but the `resume` command can resume it.
   
   **Why**
   The problem occurs when FE schedules a routine load task and the call to `beginTxn()` succeeds but the subsequent call to `submitTask()` fails. After the submission fails, the task is put back into the queue. The next time the task is scheduled, it calls `beginTxn()` again with the same label, which reports the error "label already used".
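
The sequence above can be reproduced with a small, self-contained sketch. This is not the Doris FE code: the class `LabelReuseDemo`, the in-memory label set, and the always-failing `submitTask()` are assumptions made purely for illustration. Only the names `beginTxn()` and `submitTask()` come from the issue.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

public class LabelReuseDemo {
    // Hypothetical stand-in for the FE transaction manager:
    // each label may begin only one transaction.
    static final Set<String> usedLabels = new HashSet<>();

    static void beginTxn(String label) {
        if (!usedLabels.add(label)) {
            throw new IllegalStateException("label already used: " + label);
        }
    }

    // Hypothetical stand-in for the RPC to BE; it always fails here,
    // as if BE had returned TOO_MANY_TASKS.
    static boolean submitTask(String label) {
        return false;
    }

    // Runs the buggy scheduling loop and returns the error message,
    // or null if no error occurred within maxRounds.
    static String runBuggyScheduler(String label, int maxRounds) {
        usedLabels.clear();
        Deque<String> queue = new ArrayDeque<>();
        queue.add(label);
        for (int round = 0; round < maxRounds && !queue.isEmpty(); round++) {
            String task = queue.poll();
            try {
                beginTxn(task);
            } catch (IllegalStateException e) {
                return e.getMessage();
            }
            if (!submitTask(task)) {
                queue.add(task); // the bug: requeue with the same label
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(runBuggyScheduler("task-1", 3));
    }
}
```

In this sketch the first round begins the transaction and fails to submit, and the second round fails in `beginTxn()` because the label is already registered.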
   
   `submitTask()` failed because BE returned the error `TOO_MANY_TASKS`. In the routine load scenario this error should not occur, because FE limits the number of tasks each BE executes concurrently. It occurs anyway because we do not have good control over the actual execution time of each task. A task may encounter an rpc timeout while it executes in the BE, and that timeout is fixed at 10 minutes. As a result, a task that should take 10 seconds may take 10 minutes and occupy a thread for that whole time.
   
   **How to fix**
   To fix the above problems, we need to modify two places:
   
   1. After a task submission fails, the task is no longer put back in the queue. Instead, FE "pretends" that the submission succeeded, and the task is later discarded when it times out. This guarantees that `beginTxn()` is not called again for the task, and the job is not paused because of a failed submission.
   
   2. The rpc timeout of each task executed in the BE is set to the `query_timeout` of that task, to limit how long a task can occupy resources. Although the job may still run longer than expected after this modification, it significantly alleviates the problem.
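
The two changes can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual patch: the class `SubmitFailureFixDemo`, the `Task` fields, the `inFlight` list, and `discardTimedOut()` are hypothetical names, and time is passed in explicitly so the example is deterministic.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

public class SubmitFailureFixDemo {
    // Hypothetical task model; only the label and timeout matter here.
    static class Task {
        final String label;
        final long queryTimeoutMs;
        long startedAtMs;

        Task(String label, long queryTimeoutMs) {
            this.label = label;
            this.queryTimeoutMs = queryTimeoutMs;
        }

        // Change 2: the rpc timeout follows the task's own query_timeout
        // instead of a fixed 10 minutes.
        long rpcTimeoutMs() {
            return queryTimeoutMs;
        }
    }

    final Set<String> usedLabels = new HashSet<>();
    final Deque<Task> queue = new ArrayDeque<>();
    final List<Task> inFlight = new ArrayList<>();

    void beginTxn(String label) {
        if (!usedLabels.add(label)) {
            throw new IllegalStateException("label already used: " + label);
        }
    }

    // Stand-in for the RPC to BE; always fails, as if BE returned TOO_MANY_TASKS.
    boolean submitTask(Task task) {
        return false;
    }

    // Change 1: whether or not submission succeeds, the task is tracked as
    // in flight and never requeued, so beginTxn() runs once per label.
    void scheduleOnce(long nowMs) {
        Task task = queue.poll();
        if (task == null) {
            return;
        }
        beginTxn(task.label);
        task.startedAtMs = nowMs;
        submitTask(task); // result deliberately ignored
        inFlight.add(task);
    }

    // Discards in-flight tasks whose timeout has elapsed; returns their labels.
    List<String> discardTimedOut(long nowMs) {
        List<String> discarded = new ArrayList<>();
        Iterator<Task> it = inFlight.iterator();
        while (it.hasNext()) {
            Task task = it.next();
            if (nowMs - task.startedAtMs >= task.queryTimeoutMs) {
                discarded.add(task.label);
                it.remove();
            }
        }
        return discarded;
    }

    public static void main(String[] args) {
        SubmitFailureFixDemo fe = new SubmitFailureFixDemo();
        fe.queue.add(new Task("task-1", 10_000L));
        fe.scheduleOnce(0L);
        fe.scheduleOnce(1_000L); // queue is empty, so no second beginTxn()
        System.out.println(fe.discardTimedOut(10_000L));
    }
}
```

The trade-off in this sketch is that a task whose submission failed still occupies a slot until its timeout elapses, which is why tying the rpc timeout to the task's own timeout matters.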

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] chaoyli closed issue #2964: [Bug][RoutineLoad] Routine Load encounter "label already used" exception

Posted by GitBox <gi...@apache.org>.
chaoyli closed issue #2964: [Bug][RoutineLoad] Routine Load encounter "label already used" exception
URL: https://github.com/apache/incubator-doris/issues/2964
 
 
   
