Posted to commits@doris.apache.org by GitBox <gi...@apache.org> on 2020/08/17 05:58:50 UTC

[GitHub] [incubator-doris] weizuo93 opened a new pull request #4373: [Optimize]Optimize the disk selection strategy on BE for tablet creation

weizuo93 opened a new pull request #4373:
URL: https://github.com/apache/incubator-doris/pull/4373


   ## Proposed changes
   
   When a tablet is created, the BE node must pick one of its eligible disks to store it. Doris currently picks that disk uniformly at random from all disks that meet the requirements. After a cluster has been running for a long time, we found that the number of tablets per disk within a BE node becomes unbalanced. To address this, we introduce the "two random choices" algorithm for disk selection during tablet creation:
   (1) Randomly select two disks from all disks on the BE node that meet the requirements;
   (2) Of the two disks selected in (1), choose the one with fewer tablets for tablet creation.
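
A minimal sketch of steps (1) and (2), not the code in this PR. The `Disk` struct and `pick_disk_two_choices` are hypothetical names used only for illustration; the real code works on `DataDir*` entries and their tablet sets.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical stand-in for a BE data directory; only the tablet count matters here.
struct Disk {
    int id;
    std::size_t tablet_count;
};

// Returns the index of the disk chosen by "two random choices":
// draw two distinct candidates uniformly at random, keep the one with fewer tablets.
std::size_t pick_disk_two_choices(const std::vector<Disk>& disks, std::mt19937& rng) {
    assert(!disks.empty());
    if (disks.size() == 1) {
        return 0;  // only one eligible disk, nothing to compare
    }
    std::uniform_int_distribution<std::size_t> first(0, disks.size() - 1);
    std::uniform_int_distribution<std::size_t> second(0, disks.size() - 2);
    std::size_t a = first(rng);
    std::size_t b = second(rng);
    if (b >= a) {
        ++b;  // shift past a so the two candidates are always distinct
    }
    return disks[a].tablet_count <= disks[b].tablet_count ? a : b;
}
```

Because the two candidates are distinct, a disk that holds strictly more tablets than every other disk is never chosen, while all other disks keep a non-zero chance of being picked.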
   
   ## Types of changes
   
   What types of changes does your code introduce to Doris?
   _Put an `x` in the boxes that apply_
   
   - [ ] Bugfix (non-breaking change which fixes an issue)
   - [x] New feature (non-breaking change which adds functionality)
   - [ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)
   - [ ] Documentation Update (if none of the other choices apply)
   - [ ] Code refactor (Modify the code structure, format the code, etc...)
   
   ## Checklist
   
   _Put an `x` in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your code._
   
   - [x] I have created an issue (Fix #4329) and have described the bug/feature there in detail
   - [x] Compiling and unit tests pass locally with my changes
   - [ ] I have added tests that prove my fix is effective or that my feature works
   - [ ] If this change needs a document change, I have updated the document
   - [x] Any dependent changes have been merged
   
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] weizuo93 commented on a change in pull request #4373: [Optimize]Optimize the disk selection strategy on BE for tablet creation

Posted by GitBox <gi...@apache.org>.
weizuo93 commented on a change in pull request #4373:
URL: https://github.com/apache/incubator-doris/pull/4373#discussion_r475640545



##########
File path: be/src/olap/storage_engine.cpp
##########
@@ -428,6 +428,18 @@ std::vector<DataDir*> StorageEngine::get_stores_for_create_tablet(
     std::random_device rd;
     srand(rd());
     std::random_shuffle(stores.begin(), stores.end());
+    // Two random choices
+    for (int i = 0; i < stores.size(); i++) {
+        int j = i + 1;
+        if (j < stores.size()) {
+            if (stores[i]->tablet_set().size() > stores[j]->tablet_set().size()) {
+                std::swap(stores[i], stores[j]);
+            }

Review comment:
       @acelyc111 
   If we use std::sort directly, all tablets will be created on the same disk, the one with the fewest tablets. I think it would be better to take into account both the number of tablets on each disk and the randomness of disk selection.
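
A minimal sketch of this concern, assuming (this is not stated in the thread) that a burst of create requests all observe the same snapshot of tablet counts. All names below are hypothetical. It counts how many distinct disks receive tablets under a least-loaded pick, which is what sorting by tablet count and taking the front amounts to, versus a "two random choices" pick.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Least-loaded pick: equivalent to sorting by tablet count and taking the front.
std::size_t pick_least_loaded(const std::vector<std::size_t>& counts) {
    assert(!counts.empty());
    return static_cast<std::size_t>(
        std::min_element(counts.begin(), counts.end()) - counts.begin());
}

// "Two random choices" pick over the same snapshot of tablet counts.
std::size_t pick_two_choices(const std::vector<std::size_t>& counts, std::mt19937& rng) {
    assert(!counts.empty());
    if (counts.size() == 1) {
        return 0;
    }
    std::uniform_int_distribution<std::size_t> first(0, counts.size() - 1);
    std::uniform_int_distribution<std::size_t> second(0, counts.size() - 2);
    std::size_t a = first(rng);
    std::size_t b = second(rng);
    if (b >= a) {
        ++b;  // keep the two candidates distinct
    }
    return counts[a] <= counts[b] ? a : b;
}

// Number of distinct disks hit by `requests` picks against one unchanged snapshot.
template <typename Pick>
std::size_t distinct_disks_hit(std::size_t disk_count, std::size_t requests, Pick pick) {
    std::vector<bool> hit(disk_count, false);
    for (std::size_t r = 0; r < requests; ++r) {
        hit[pick()] = true;
    }
    return static_cast<std::size_t>(std::count(hit.begin(), hit.end(), true));
}
```

Under that assumption the least-loaded pick sends every request in the burst to a single disk, whereas two random choices spreads them over several disks while still favoring the less loaded ones.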






[GitHub] [incubator-doris] chaoyli commented on a change in pull request #4373: [Optimize]Optimize the disk selection strategy on BE for tablet creation

Posted by GitBox <gi...@apache.org>.
chaoyli commented on a change in pull request #4373:
URL: https://github.com/apache/incubator-doris/pull/4373#discussion_r473534122



##########
File path: be/src/olap/storage_engine.cpp
##########
@@ -428,6 +428,18 @@ std::vector<DataDir*> StorageEngine::get_stores_for_create_tablet(
     std::random_device rd;
     srand(rd());
     std::random_shuffle(stores.begin(), stores.end());
+    // Two random choices
+    for (int i = 0; i < stores.size(); i++) {
+        int j = i + 1;
+        if (j < stores.size()) {
+            if (stores[i]->tablet_set().size() > stores[j]->tablet_set().size()) {
+                std::swap(stores[i], stores[j]);
+            }
+            std::random_shuffle(stores.begin() + j, stores.end());

Review comment:
       What's the purpose of this random_shuffle?






[GitHub] [incubator-doris] morningman merged pull request #4373: [Optimize]Optimize the disk selection strategy on BE for tablet creation

Posted by GitBox <gi...@apache.org>.
morningman merged pull request #4373:
URL: https://github.com/apache/incubator-doris/pull/4373


   




[GitHub] [incubator-doris] acelyc111 commented on a change in pull request #4373: [Optimize]Optimize the disk selection strategy on BE for tablet creation

Posted by GitBox <gi...@apache.org>.
acelyc111 commented on a change in pull request #4373:
URL: https://github.com/apache/incubator-doris/pull/4373#discussion_r476063262



##########
File path: be/src/olap/storage_engine.cpp
##########
@@ -428,6 +428,18 @@ std::vector<DataDir*> StorageEngine::get_stores_for_create_tablet(
     std::random_device rd;
     srand(rd());
     std::random_shuffle(stores.begin(), stores.end());
+    // Two random choices
+    for (int i = 0; i < stores.size(); i++) {
+        int j = i + 1;
+        if (j < stores.size()) {
+            if (stores[i]->tablet_set().size() > stores[j]->tablet_set().size()) {
+                std::swap(stores[i], stores[j]);
+            }

Review comment:
       @weizuo93 I got it.






[GitHub] [incubator-doris] acelyc111 commented on a change in pull request #4373: [Optimize]Optimize the disk selection strategy on BE for tablet creation

Posted by GitBox <gi...@apache.org>.
acelyc111 commented on a change in pull request #4373:
URL: https://github.com/apache/incubator-doris/pull/4373#discussion_r473556403



##########
File path: be/src/olap/storage_engine.cpp
##########
@@ -428,6 +428,18 @@ std::vector<DataDir*> StorageEngine::get_stores_for_create_tablet(
     std::random_device rd;
     srand(rd());
     std::random_shuffle(stores.begin(), stores.end());
+    // Two random choices
+    for (int i = 0; i < stores.size(); i++) {
+        int j = i + 1;
+        if (j < stores.size()) {
+            if (stores[i]->tablet_set().size() > stores[j]->tablet_set().size()) {
+                std::swap(stores[i], stores[j]);
+            }

Review comment:
       @weizuo93 Can you use `std::sort` directly, so that `std::random_shuffle` is not needed?






[GitHub] [incubator-doris] acelyc111 commented on a change in pull request #4373: [Optimize]Optimize the disk selection strategy on BE for tablet creation

Posted by GitBox <gi...@apache.org>.
acelyc111 commented on a change in pull request #4373:
URL: https://github.com/apache/incubator-doris/pull/4373#discussion_r473556403



##########
File path: be/src/olap/storage_engine.cpp
##########
@@ -428,6 +428,18 @@ std::vector<DataDir*> StorageEngine::get_stores_for_create_tablet(
     std::random_device rd;
     srand(rd());
     std::random_shuffle(stores.begin(), stores.end());
+    // Two random choices
+    for (int i = 0; i < stores.size(); i++) {
+        int j = i + 1;
+        if (j < stores.size()) {
+            if (stores[i]->tablet_set().size() > stores[j]->tablet_set().size()) {
+                std::swap(stores[i], stores[j]);
+            }

Review comment:
       @weizuo93 Can you use `std::sort` directly, so that the `std::random_shuffle` in line 430 is not needed?






[GitHub] [incubator-doris] weizuo93 commented on a change in pull request #4373: [Optimize]Optimize the disk selection strategy on BE for tablet creation

Posted by GitBox <gi...@apache.org>.
weizuo93 commented on a change in pull request #4373:
URL: https://github.com/apache/incubator-doris/pull/4373#discussion_r475558052



##########
File path: be/src/olap/storage_engine.cpp
##########
@@ -428,6 +428,18 @@ std::vector<DataDir*> StorageEngine::get_stores_for_create_tablet(
     std::random_device rd;
     srand(rd());
     std::random_shuffle(stores.begin(), stores.end());
+    // Two random choices
+    for (int i = 0; i < stores.size(); i++) {
+        int j = i + 1;
+        if (j < stores.size()) {
+            if (stores[i]->tablet_set().size() > stores[j]->tablet_set().size()) {
+                std::swap(stores[i], stores[j]);
+            }
+            std::random_shuffle(stores.begin() + j, stores.end());

Review comment:
       @chaoyli 
   The purpose of this random_shuffle is as follows:
          If creating the tablet fails on the first disk selected by "two random choices", another disk must be selected from the remaining disks by applying "two random choices" again. This random_shuffle starts the next round of "two random choices".
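
A minimal sketch of that retry ordering, written as a standalone function rather than the code in this PR. The `Disk` struct and `order_by_repeated_two_choices` are hypothetical names. It uses `std::shuffle` with an explicit engine, since `std::random_shuffle` was removed in C++17.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

// Hypothetical stand-in for a BE data directory.
struct Disk {
    int id;
    std::size_t tablet_count;
};

// Orders `disks` so that position i holds the winner of a "two random choices"
// round played among the disks not yet placed at positions [0, i).
// A caller would try position 0 first and fall back to 1, 2, ... on failure.
void order_by_repeated_two_choices(std::vector<Disk>& disks, std::mt19937& rng) {
    std::shuffle(disks.begin(), disks.end(), rng);
    for (std::size_t i = 0; i + 1 < disks.size(); ++i) {
        const std::size_t j = i + 1;
        // disks[i] and disks[j] are two random candidates; keep the lighter one at i.
        if (disks[i].tablet_count > disks[j].tablet_count) {
            std::swap(disks[i], disks[j]);
        }
        assert(disks[i].tablet_count <= disks[j].tablet_count);
        // Mix the loser back into the remainder so the next round draws
        // two fresh random candidates from all remaining disks.
        std::shuffle(disks.begin() + static_cast<std::ptrdiff_t>(j), disks.end(), rng);
    }
}
```

One consequence of this ordering is that a disk holding strictly more tablets than all others loses every round it takes part in, so it always ends up as the last fallback.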



