Posted to commits@doris.apache.org by GitBox <gi...@apache.org> on 2020/12/24 12:00:37 UTC

[GitHub] [incubator-doris] weizuo93 opened a new pull request #5143: [Optimize] Add candidate tablets mechanism for compaction producer

weizuo93 opened a new pull request #5143:
URL: https://github.com/apache/incubator-doris/pull/5143


   ## Proposed changes
   
   The current tablet selection strategy for compaction tasks traverses all tablets to find the one with the highest compaction score. Scanning every tablet on each selection is expensive.
   
   This patch adds a `candidate tablets mechanism` for the compaction producer, as described in ISSUE #4988.
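
   The difference between the two strategies can be sketched roughly as follows. This is a minimal illustration rather than the patch's code: `TabletStub`, `ScoreLess`, `pick_by_full_scan`, and `pick_from_candidates` are hypothetical names, and the candidate vector is assumed to already be a max-heap ordered by compaction score.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a tablet; only the fields this sketch needs.
struct TabletStub {
    int64_t tablet_id;
    uint32_t compaction_score;
};

// Orders tablets by ascending score, so the std heap functions build a max-heap.
struct ScoreLess {
    bool operator()(const TabletStub& a, const TabletStub& b) const {
        return a.compaction_score < b.compaction_score;
    }
};

// Full-scan strategy: look at every tablet on each selection, O(n).
// Returns nullptr when there are no tablets.
const TabletStub* pick_by_full_scan(const std::vector<TabletStub>& all_tablets) {
    auto it = std::max_element(all_tablets.begin(), all_tablets.end(), ScoreLess());
    return it == all_tablets.end() ? nullptr : &*it;
}

// Candidate strategy: take the top of a small max-heap of candidates.
// Precondition (assumed by this sketch): `candidates` is a heap under ScoreLess.
bool pick_from_candidates(std::vector<TabletStub>& candidates, TabletStub* out) {
    if (candidates.empty()) {
        return false;
    }
    assert(std::is_heap(candidates.begin(), candidates.end(), ScoreLess()));
    // pop_heap moves the highest-score element to the back of the vector.
    std::pop_heap(candidates.begin(), candidates.end(), ScoreLess());
    *out = candidates.back();
    candidates.pop_back();
    return true;
}
```

   With a small, bounded candidate list, each selection touches only a few entries instead of every tablet.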
   
   ## Types of changes
   
   What types of changes does your code introduce to Doris?
   _Put an `x` in the boxes that apply_
   
   - [ ] Bugfix (non-breaking change which fixes an issue)
   - [x] New feature (non-breaking change which adds functionality)
   - [ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)
   - [x] Documentation Update (if none of the other choices apply)
   - [x] Code refactor (Modify the code structure, format the code, etc...)
   
   ## Checklist
   
   _Put an `x` in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your code._
   
   - [x] I have created an issue (ISSUE #4988) and have described the bug/feature there in detail
   - [x] Compiling and unit tests pass locally with my changes
   - [ ] I have added tests that prove my fix is effective or that my feature works
   - [x] If this change needs a document change, I have updated the document
   - [x] Any dependent changes have been merged
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] weizuo93 commented on a change in pull request #5143: [Optimize][Compaction] Add candidate tablets mechanism for compaction producer

Posted by GitBox <gi...@apache.org>.
weizuo93 commented on a change in pull request #5143:
URL: https://github.com/apache/incubator-doris/pull/5143#discussion_r549195382



##########
File path: be/src/olap/data_dir.cpp
##########
@@ -1018,4 +1018,163 @@ void DataDir::disks_compaction_score_increment(int64_t delta) {
 void DataDir::disks_compaction_num_increment(int64_t delta) {
     disks_compaction_num->increment(delta);
 }
+
+void DataDir::init_compaction_heap() {
+    for (std::set<TabletInfo>::iterator it = _tablet_set.begin(); it != _tablet_set.end(); it++) {
+        TabletSharedPtr tablet = StorageEngine::instance()->tablet_manager()->get_tablet(
+                it->tablet_id, it->schema_hash);
+        if (tablet != nullptr) {
+            push_tablet_into_compaction_heap(CompactionType::BASE_COMPACTION, tablet);
+            push_tablet_into_compaction_heap(CompactionType::CUMULATIVE_COMPACTION, tablet);
+        }
+    }
+}
+
+void DataDir::push_tablet_into_compaction_heap(CompactionType compaction_type,
+                                               TabletSharedPtr tablet) {
+    OlapStopWatch watch;
+    if (compaction_type == CompactionType::BASE_COMPACTION) {
+        std::unique_lock<std::mutex> lock(_base_compaction_heap_mutex);
+        std::vector<TabletSharedPtr>::iterator it =
+                find(_base_compaction_heap.begin(), _base_compaction_heap.end(), tablet);
+        if (it == _base_compaction_heap.end()) {
+            _base_compaction_heap.push_back(tablet);
+        }
+        std::make_heap(_base_compaction_heap.begin(), _base_compaction_heap.end(),
+                       TabletScoreComparator(CompactionType::BASE_COMPACTION));

Review comment:
       @morningman 
   `priority_queue` only supports pushing elements at the rear of the queue and popping them from the head. We need to pop from the `queue head` when selecting a tablet for compaction and from the `queue rear` when sacrificing a tablet, so `priority_queue` is not enough.
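
   A minimal sketch of the difference, under stated assumptions: `highest_via_priority_queue` and `SortedCandidates` are hypothetical names, scores stand in for tablets, and the vector is kept fully sorted in descending order so that the rear really is the lowest score (`std::make_heap` on its own only guarantees that the largest element is at the front).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <queue>
#include <vector>

// std::priority_queue exposes only the highest-priority element via top()/pop().
// It has no back() or pop_back(), so the lowest score cannot be removed directly.
uint32_t highest_via_priority_queue(const std::vector<uint32_t>& scores) {
    assert(!scores.empty());
    std::priority_queue<uint32_t> pq(scores.begin(), scores.end());
    return pq.top();
}

// Hypothetical double-ended candidate list: a vector sorted by descending score.
class SortedCandidates {
public:
    void push(uint32_t score) {
        // Find the first element smaller than `score` and insert before it.
        auto pos = std::upper_bound(_scores.begin(), _scores.end(), score,
                                    std::greater<uint32_t>());
        _scores.insert(pos, score);
    }

    // Selection for compaction: remove and return the highest score (head).
    uint32_t pop_highest() {
        assert(!_scores.empty());
        uint32_t score = _scores.front();
        _scores.erase(_scores.begin());
        return score;
    }

    // Sacrifice: remove and return the lowest score (rear).
    uint32_t pop_lowest() {
        assert(!_scores.empty());
        uint32_t score = _scores.back();
        _scores.pop_back();
        return score;
    }

    std::size_t size() const { return _scores.size(); }

private:
    std::vector<uint32_t> _scores;
};
```

   Erasing from the front of a vector is linear in its size, which is likely acceptable for a candidate list capped at a few dozen entries.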






[GitHub] [incubator-doris] weizuo93 commented on a change in pull request #5143: [Optimize][Compaction] Add candidate tablets mechanism for compaction producer

Posted by GitBox <gi...@apache.org>.
weizuo93 commented on a change in pull request #5143:
URL: https://github.com/apache/incubator-doris/pull/5143#discussion_r549197433



##########
File path: be/src/exec/olap_scanner.cpp
##########
@@ -538,6 +538,14 @@ Status OlapScanner::close(RuntimeState* state) {
     _reader.reset();
     Expr::close(_conjunct_ctxs, state);
     _is_closed = true;
+    if (config::scan_count_push_tablet_into_compaction_heap != 0 &&
+        _tablet->query_scan_count->value() % config::scan_count_push_tablet_into_compaction_heap ==
+                0) {
+        _tablet->data_dir()->push_tablet_into_compaction_heap(CompactionType::BASE_COMPACTION,

Review comment:
       @morningman 
   The default maximum size of the compaction heap is 20. In testing, `push_tablet_into_compaction_heap` costs about `0.05 ms` per tablet on average.






[GitHub] [incubator-doris] morningman commented on a change in pull request #5143: [Optimize][Compaction] Add candidate tablets mechanism for compaction producer

Posted by GitBox <gi...@apache.org>.
morningman commented on a change in pull request #5143:
URL: https://github.com/apache/incubator-doris/pull/5143#discussion_r548980516



##########
File path: be/src/olap/data_dir.cpp
##########
@@ -1018,4 +1018,163 @@ void DataDir::disks_compaction_score_increment(int64_t delta) {
 void DataDir::disks_compaction_num_increment(int64_t delta) {
     disks_compaction_num->increment(delta);
 }
+
+void DataDir::init_compaction_heap() {
+    for (std::set<TabletInfo>::iterator it = _tablet_set.begin(); it != _tablet_set.end(); it++) {
+        TabletSharedPtr tablet = StorageEngine::instance()->tablet_manager()->get_tablet(
+                it->tablet_id, it->schema_hash);
+        if (tablet != nullptr) {
+            push_tablet_into_compaction_heap(CompactionType::BASE_COMPACTION, tablet);
+            push_tablet_into_compaction_heap(CompactionType::CUMULATIVE_COMPACTION, tablet);
+        }
+    }
+}
+
+void DataDir::push_tablet_into_compaction_heap(CompactionType compaction_type,
+                                               TabletSharedPtr tablet) {
+    OlapStopWatch watch;
+    if (compaction_type == CompactionType::BASE_COMPACTION) {
+        std::unique_lock<std::mutex> lock(_base_compaction_heap_mutex);
+        std::vector<TabletSharedPtr>::iterator it =
+                find(_base_compaction_heap.begin(), _base_compaction_heap.end(), tablet);
+        if (it == _base_compaction_heap.end()) {
+            _base_compaction_heap.push_back(tablet);
+        }
+        std::make_heap(_base_compaction_heap.begin(), _base_compaction_heap.end(),
+                       TabletScoreComparator(CompactionType::BASE_COMPACTION));

Review comment:
       Why not just use a priority queue?

##########
File path: be/src/exec/olap_scanner.cpp
##########
@@ -538,6 +538,14 @@ Status OlapScanner::close(RuntimeState* state) {
     _reader.reset();
     Expr::close(_conjunct_ctxs, state);
     _is_closed = true;
+    if (config::scan_count_push_tablet_into_compaction_heap != 0 &&
+        _tablet->query_scan_count->value() % config::scan_count_push_tablet_into_compaction_heap ==
+                0) {
+        _tablet->data_dir()->push_tablet_into_compaction_heap(CompactionType::BASE_COMPACTION,

Review comment:
       Have you tested the performance of `push_tablet_into_compaction_heap`?
   This is on the critical path of READ, and the default `scan_count_push_tablet_into_compaction_heap` is 1,
   so this method is called every time the tablet is read.
   
   If the query concurrency is high, I suspect it may hurt performance.
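
   As a rough illustration of the gate in the hunk above, the following sketch counts how often the push would fire for a given interval. `should_push_on_scan` and `pushes_over` are hypothetical helpers, and `interval` plays the role of `config::scan_count_push_tablet_into_compaction_heap`; the real code reads the counter from `_tablet->query_scan_count`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical free-function version of the condition in OlapScanner::close().
// An interval of 0 disables the push and also avoids a modulo by zero.
bool should_push_on_scan(int64_t scan_count, int64_t interval) {
    if (interval == 0) {
        return false;
    }
    return scan_count % interval == 0;
}

// Counts how many of `total_scans` consecutive scans (counter values 1..total_scans)
// would call push_tablet_into_compaction_heap for the given interval.
int64_t pushes_over(int64_t total_scans, int64_t interval) {
    int64_t pushes = 0;
    for (int64_t count = 1; count <= total_scans; ++count) {
        if (should_push_on_scan(count, interval)) {
            ++pushes;
        }
    }
    return pushes;
}
```

   With an interval of 1, every scan takes the heap mutex; raising the interval to 10 would cut that to one scan in ten, at the price of a staler candidate list.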






[GitHub] [incubator-doris] weizuo93 commented on pull request #5143: [Optimize][Compaction] Add candidate tablets mechanism for compaction producer

Posted by GitBox <gi...@apache.org>.
weizuo93 commented on pull request #5143:
URL: https://github.com/apache/incubator-doris/pull/5143#issuecomment-751554974


   > I have a question about how to deal with a tablet that is sacrificed and popped out of the candidates vector.
   > 
   > For example:
   > there are 3 tablets in the vector with compaction scores:
   > [10, 9, 8]
   > 
   > Then a tablet with score 11 is pushed to the vector, so the tablet with score 8 will be
   > sacrificed. Finally the vector holds:
   > [11, 10, 9]
   
   @morningman 
   The sacrificed tablet is removed with `pop_back` from the underlying vector, because the tablets in the vector are sorted by score.
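
   The quoted example can be traced with a small sketch. It is an illustration under assumptions, not the patch's code: `push_with_sacrifice` is a hypothetical helper, scores stand in for tablets, and the vector is assumed to be sorted in descending order before the call.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Inserts `score` into a descending-sorted vector capped at `max_size` entries.
// If the cap is exceeded, the lowest score is removed with pop_back, written to
// `*sacrificed`, and the function returns true; otherwise it returns false.
// Note that the new score itself is sacrificed when it is the lowest.
bool push_with_sacrifice(std::vector<uint32_t>& candidates, std::size_t max_size,
                         uint32_t score, uint32_t* sacrificed) {
    assert(std::is_sorted(candidates.begin(), candidates.end(), std::greater<uint32_t>()));
    // Find the first element smaller than `score` and insert before it.
    auto pos = std::upper_bound(candidates.begin(), candidates.end(), score,
                                std::greater<uint32_t>());
    candidates.insert(pos, score);
    if (candidates.size() <= max_size) {
        return false;
    }
    *sacrificed = candidates.back();
    candidates.pop_back();
    return true;
}
```

   Starting from `[10, 9, 8]` with a cap of 3, pushing 11 evicts 8 and leaves `[11, 10, 9]`, matching the example.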
   




[GitHub] [incubator-doris] weizuo93 closed pull request #5143: [Optimize][Compaction] Add candidate tablets mechanism for compaction producer

Posted by GitBox <gi...@apache.org>.
weizuo93 closed pull request #5143:
URL: https://github.com/apache/incubator-doris/pull/5143


   




[GitHub] [incubator-doris] weizuo93 commented on a change in pull request #5143: [Optimize][Compaction] Add candidate tablets mechanism for compaction producer

Posted by GitBox <gi...@apache.org>.
weizuo93 commented on a change in pull request #5143:
URL: https://github.com/apache/incubator-doris/pull/5143#discussion_r549197768



##########
File path: be/src/exec/olap_scanner.cpp
##########
@@ -538,6 +538,14 @@ Status OlapScanner::close(RuntimeState* state) {
     _reader.reset();
     Expr::close(_conjunct_ctxs, state);
     _is_closed = true;
+    if (config::scan_count_push_tablet_into_compaction_heap != 0 &&
+        _tablet->query_scan_count->value() % config::scan_count_push_tablet_into_compaction_heap ==
+                0) {
+        _tablet->data_dir()->push_tablet_into_compaction_heap(CompactionType::BASE_COMPACTION,

Review comment:
       The default value of `scan_count_push_tablet_into_compaction_heap` is 1, and it can be changed dynamically.



