You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@doris.apache.org by GitBox <gi...@apache.org> on 2021/12/10 10:20:13 UTC

[GitHub] [incubator-doris] morningman opened a new pull request #7371: [fix][refactor](broker load) refactor the scheduling logic of broker load

morningman opened a new pull request #7371:
URL: https://github.com/apache/incubator-doris/pull/7371


   ## Proposed changes
   
   1. Refactor the scheduling logic of broker load. Details see #7367 
   2. Fix bug that loadedBytes in SHOW LOAD result is wrong.
   3. Cancel the thread of LoadTimeoutChecker
      Now for PENDING load jobs, there will be no timeout. And the timeout of a load job
      start when pending load task is scheduled.
   4. Fix a bug that the loading task is never submitted to the pool.
      The logic of BlockedPolicy is wrong. We should make sure the task is submitted to the pool,
      or the RejectedExecutionException should be thrown.
   5. Now the transaction of a load job will begin in pending task, instead of when submitting the job.
   
   
   ## Types of changes
   
   What types of changes does your code introduce to Doris?
   _Put an `x` in the boxes that apply_
   
   - [ ] Bugfix (non-breaking change which fixes an issue)
   - [ ] New feature (non-breaking change which adds functionality)
   - [ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)
   - [ ] Documentation Update (if none of the other choices apply)
   - [ ] Code refactor (Modify the code structure, format the code, etc...)
   - [ ] Optimization. Including functional usability improvements and performance improvements.
   - [ ] Dependency. Such as changes related to third-party components.
   - [ ] Other.
   
   ## Checklist
   
   _Put an `x` in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your code._
   
   - [ ] I have created an issue on (Fix #7367) and described the bug/feature there in detail
   - [ ] Compiling and unit tests pass locally with my changes
   - [ ] I have added tests that prove my fix is effective or that my feature works
   - [ ] If these changes need document changes, I have updated the document
   - [ ] Any dependent changes have been merged
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at dev@doris.apache.org by explaining why you chose the solution you did and what alternatives you considered, etc...
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] morningman merged pull request #7371: [fix][refactor](broker load) refactor the scheduling logic of broker load

Posted by GitBox <gi...@apache.org>.
morningman merged pull request #7371:
URL: https://github.com/apache/incubator-doris/pull/7371


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] caiconghui commented on a change in pull request #7371: [fix][refactor](broker load) refactor the scheduling logic of broker load

Posted by GitBox <gi...@apache.org>.
caiconghui commented on a change in pull request #7371:
URL: https://github.com/apache/incubator-doris/pull/7371#discussion_r769216843



##########
File path: docs/zh-CN/administrator-guide/load-data/broker-load-manual.md
##########
@@ -469,6 +469,18 @@ LoadFinishTime: 2019-07-27 11:50:16
         注意:一般用户的环境可能达不到 10M/s 的速度,所以建议超过 500G 的文件都进行文件切分,再导入。
         
         ```
+        
+### 作业调度
+
+系统会限制一个集群内,正在运行的 Broker Load 作业数量,以防止同时运行过多的 Load 作业。
+
+首先, FE 的配置参数:`desired_max_waiting_jobs` 会限制一个集群内,正在运行(作业状态为 PENDING 或 LOADING)的 Broker Load 作业数量。默认为 100。如果超过这个阈值,新提交的作业将会被直接拒绝。

Review comment:
       ```suggestion
   首先, FE 的配置参数:`desired_max_waiting_jobs` 会限制一个集群内,未开始和正在运行(作业状态为 PENDING 或 LOADING)的 Broker Load 作业数量。默认为 100。如果超过这个阈值,新提交的作业将会被直接拒绝。
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] morningman commented on a change in pull request #7371: [fix][refactor](broker load) refactor the scheduling logic of broker load

Posted by GitBox <gi...@apache.org>.
morningman commented on a change in pull request #7371:
URL: https://github.com/apache/incubator-doris/pull/7371#discussion_r769236698



##########
File path: docs/zh-CN/administrator-guide/load-data/broker-load-manual.md
##########
@@ -469,6 +469,18 @@ LoadFinishTime: 2019-07-27 11:50:16
         注意:一般用户的环境可能达不到 10M/s 的速度,所以建议超过 500G 的文件都进行文件切分,再导入。
         
         ```
+        
+### 作业调度
+
+系统会限制一个集群内,正在运行的 Broker Load 作业数量,以防止同时运行过多的 Load 作业。
+
+首先, FE 的配置参数:`desired_max_waiting_jobs` 会限制一个集群内,正在运行(作业状态为 PENDING 或 LOADING)的 Broker Load 作业数量。默认为 100。如果超过这个阈值,新提交的作业将会被直接拒绝。

Review comment:
       done




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] caiconghui commented on a change in pull request #7371: [fix][refactor](broker load) refactor the scheduling logic of broker load

Posted by GitBox <gi...@apache.org>.
caiconghui commented on a change in pull request #7371:
URL: https://github.com/apache/incubator-doris/pull/7371#discussion_r766568467



##########
File path: fe/fe-core/src/main/java/org/apache/doris/catalog/Catalog.java
##########
@@ -582,14 +580,19 @@ private Catalog(boolean isCheckpointCatalog) {
         this.tabletScheduler = new TabletScheduler(this, systemInfo, tabletInvertedIndex, stat, Config.tablet_rebalancer_type);
         this.tabletChecker = new TabletChecker(this, systemInfo, tabletScheduler, stat);
 
+        // The pendingLoadTaskScheduler's queue size should not less than Config.desired_max_waiting_jobs.
+        // So that we can guarantee that all submitted load jobs can be scheduled without being starved.
         this.pendingLoadTaskScheduler = new MasterTaskExecutor("pending_load_task_scheduler", Config.async_pending_load_task_pool_size,
-                Config.async_pending_load_task_pool_size, !isCheckpointCatalog);
+                Config.desired_max_waiting_jobs, !isCheckpointCatalog);
+        // The loadingLoadTaskScheduler's queue size is unlimited, so that it can receive all loading tasks
+        // created after pending tasks finish. And don't worry about the high concurrency, because the

Review comment:
       pending task cost less time than loading task. so my idea is that if load task thread pool is full, not submit task of not started job, for that once start job and still no resource to execute loading task, the job will wait to be timeout, not because of slow running but absent of thread resource. 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] morningman commented on a change in pull request #7371: [fix][refactor](broker load) refactor the scheduling logic of broker load

Posted by GitBox <gi...@apache.org>.
morningman commented on a change in pull request #7371:
URL: https://github.com/apache/incubator-doris/pull/7371#discussion_r768788897



##########
File path: fe/fe-core/src/main/java/org/apache/doris/catalog/Catalog.java
##########
@@ -582,14 +580,19 @@ private Catalog(boolean isCheckpointCatalog) {
         this.tabletScheduler = new TabletScheduler(this, systemInfo, tabletInvertedIndex, stat, Config.tablet_rebalancer_type);
         this.tabletChecker = new TabletChecker(this, systemInfo, tabletScheduler, stat);
 
+        // The pendingLoadTaskScheduler's queue size should not less than Config.desired_max_waiting_jobs.
+        // So that we can guarantee that all submitted load jobs can be scheduled without being starved.
         this.pendingLoadTaskScheduler = new MasterTaskExecutor("pending_load_task_scheduler", Config.async_pending_load_task_pool_size,
-                Config.async_pending_load_task_pool_size, !isCheckpointCatalog);
+                Config.desired_max_waiting_jobs, !isCheckpointCatalog);
+        // The loadingLoadTaskScheduler's queue size is unlimited, so that it can receive all loading tasks
+        // created after pending tasks finish. And don't worry about the high concurrency, because the

Review comment:
       OK. I changed. Now when submit a broker load job to the thread pool, I will check if there are available worker thread in `loading_load_task_pool`




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] github-actions[bot] commented on pull request #7371: [fix][refactor](broker load) refactor the scheduling logic of broker load

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #7371:
URL: https://github.com/apache/incubator-doris/pull/7371#issuecomment-994291686






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] caiconghui commented on a change in pull request #7371: [fix][refactor](broker load) refactor the scheduling logic of broker load

Posted by GitBox <gi...@apache.org>.
caiconghui commented on a change in pull request #7371:
URL: https://github.com/apache/incubator-doris/pull/7371#discussion_r769220162



##########
File path: fe/fe-core/src/main/java/org/apache/doris/catalog/Catalog.java
##########
@@ -582,14 +580,19 @@ private Catalog(boolean isCheckpointCatalog) {
         this.tabletScheduler = new TabletScheduler(this, systemInfo, tabletInvertedIndex, stat, Config.tablet_rebalancer_type);
         this.tabletChecker = new TabletChecker(this, systemInfo, tabletScheduler, stat);
 
+        // The pendingLoadTaskScheduler's queue size should not less than Config.desired_max_waiting_jobs.
+        // So that we can guarantee that all submitted load jobs can be scheduled without being starved.
         this.pendingLoadTaskScheduler = new MasterTaskExecutor("pending_load_task_scheduler", Config.async_pending_load_task_pool_size,
-                Config.async_pending_load_task_pool_size, !isCheckpointCatalog);
+                Config.desired_max_waiting_jobs, !isCheckpointCatalog);
+        // The loadingLoadTaskScheduler's queue size is unlimited, so that it can receive all loading tasks
+        // created after pending tasks finish. And don't worry about the high concurrency, because the
+        // concurrency is limited by Config.desired_max_waiting_jobs and Config.async_loading_load_task_pool_size.
         this.loadingLoadTaskScheduler = new MasterTaskExecutor("loading_load_task_scheduler", Config.async_loading_load_task_pool_size,
-                Config.async_loading_load_task_pool_size / 5, !isCheckpointCatalog);
+                Integer.MAX_VALUE, !isCheckpointCatalog);
+
         this.loadJobScheduler = new LoadJobScheduler();
         this.loadManager = new LoadManager(loadJobScheduler);
         this.streamLoadRecordMgr = new StreamLoadRecordMgr("stream_load_record_manager", Config.fetch_stream_load_record_interval_second * 1000);
-        this.loadTimeoutChecker = new LoadTimeoutChecker(loadManager);

Review comment:
       why remove this checker?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] morningman commented on a change in pull request #7371: [fix][refactor](broker load) refactor the scheduling logic of broker load

Posted by GitBox <gi...@apache.org>.
morningman commented on a change in pull request #7371:
URL: https://github.com/apache/incubator-doris/pull/7371#discussion_r769236029



##########
File path: fe/fe-core/src/main/java/org/apache/doris/catalog/Catalog.java
##########
@@ -582,14 +580,19 @@ private Catalog(boolean isCheckpointCatalog) {
         this.tabletScheduler = new TabletScheduler(this, systemInfo, tabletInvertedIndex, stat, Config.tablet_rebalancer_type);
         this.tabletChecker = new TabletChecker(this, systemInfo, tabletScheduler, stat);
 
+        // The pendingLoadTaskScheduler's queue size should not less than Config.desired_max_waiting_jobs.
+        // So that we can guarantee that all submitted load jobs can be scheduled without being starved.
         this.pendingLoadTaskScheduler = new MasterTaskExecutor("pending_load_task_scheduler", Config.async_pending_load_task_pool_size,
-                Config.async_pending_load_task_pool_size, !isCheckpointCatalog);
+                Config.desired_max_waiting_jobs, !isCheckpointCatalog);
+        // The loadingLoadTaskScheduler's queue size is unlimited, so that it can receive all loading tasks
+        // created after pending tasks finish. And don't worry about the high concurrency, because the
+        // concurrency is limited by Config.desired_max_waiting_jobs and Config.async_loading_load_task_pool_size.
         this.loadingLoadTaskScheduler = new MasterTaskExecutor("loading_load_task_scheduler", Config.async_loading_load_task_pool_size,
-                Config.async_loading_load_task_pool_size / 5, !isCheckpointCatalog);
+                Integer.MAX_VALUE, !isCheckpointCatalog);
+
         this.loadJobScheduler = new LoadJobScheduler();
         this.loadManager = new LoadManager(loadJobScheduler);
         this.streamLoadRecordMgr = new StreamLoadRecordMgr("stream_load_record_manager", Config.fetch_stream_load_record_interval_second * 1000);
-        this.loadTimeoutChecker = new LoadTimeoutChecker(loadManager);

Review comment:
       This checker is no logger needed. Because now the timeout is only calculated when the task is actually executed.
   So for now, if a job is in pending state, it will never be cleared.
   
   And the length limit of the pending queue will ensure that there will not be too many pending jobs




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org