You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@doris.apache.org by GitBox <gi...@apache.org> on 2023/01/11 02:25:39 UTC

[GitHub] [doris] zbtzbtzbt opened a new pull request, #15796: [enhancement](load) delete skiplist for duplicate table in memtable

zbtzbtzbt opened a new pull request, #15796:
URL: https://github.com/apache/doris/pull/15796

   # Proposed changes
   
   delete skiplist for dup table when data load.
   ## Problem summary
   
   when load data, data will be insert into skiplist in memtable, for each row skiplist need to **Find** and **Sort**, which time cost is O(log(n)), I don't think it's a good way.
   
   There are two ways to insert data into memtable:
   
   1. insert into skiplist: **O(log(n))**, when need flush, the data is already sorted
   2. insert into a block(append only): **O(1)**, when need flush, sort once.
   
   this pr implement way2 for duplicate table: **append data to a block and sort when need flush.**
   
   I don't know how good or bad it turned out
   This pr needs to use the community's testing framework
   So I submit first to see the result.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [doris] github-actions[bot] commented on pull request #15796: [enhancement](load) delete skiplist for duplicate table in memtable

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on PR #15796:
URL: https://github.com/apache/doris/pull/15796#issuecomment-1378185726

   clang-tidy review says "All clean, LGTM! :+1:"


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [doris] zbtzbtzbt closed pull request #15796: [enhancement](load) delete skiplist for duplicate table in memtable

Posted by GitBox <gi...@apache.org>.
zbtzbtzbt closed pull request #15796: [enhancement](load)  delete skiplist for duplicate table in memtable
URL: https://github.com/apache/doris/pull/15796


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [doris] github-actions[bot] commented on a diff in pull request #15796: [enhancement](load) delete skiplist for duplicate table in memtable

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on code in PR #15796:
URL: https://github.com/apache/doris/pull/15796#discussion_r1066518076


##########
be/src/olap/memtable.cpp:
##########
@@ -281,21 +301,20 @@ void MemTable::_collect_vskiplist_results() {
                     vectorized::MutableBlock::build_mutable_block(empty_input_block.get());
             _output_mutable_block.clear_column_data();
         }
-    }
 
-    if (is_final) {
-        _vec_skip_list.reset();
+        if (is_final) {
+            _vec_skip_list.reset();
+        }
     }
 }
 
 void MemTable::shrink_memtable_by_agg() {
     SCOPED_CONSUME_MEM_TRACKER(_insert_mem_tracker_use_hook.get());
-    if (_keys_type == KeysType::DUP_KEYS) {
-        return;
-    }
-    _collect_vskiplist_results<false>();
+    DCHECK(_keys_type == KeysType::AGG_KEYS);
+    _collect_memtable_results<false>();
 }
 
+bool MemTable::need_flush() const {
 bool MemTable::need_flush() const {

Review Comment:
   warning: function definition is not allowed here [clang-diagnostic-error]
   ```cpp
   bool MemTable::need_flush() const {
                                     ^
   ```
   



##########
be/src/olap/memtable.cpp:
##########
@@ -357,7 +376,7 @@
 
 Status MemTable::_do_flush(int64_t& duration_ns) {

Review Comment:
   warning: function definition is not allowed here [clang-diagnostic-error]
   ```cpp
   Status MemTable::_do_flush(int64_t& duration_ns) {
                                                    ^
   ```
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [doris] github-actions[bot] commented on a diff in pull request #15796: [enhancement](load) delete skiplist for duplicate table in memtable

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on code in PR #15796:
URL: https://github.com/apache/doris/pull/15796#discussion_r1066531592


##########
be/src/olap/memtable.cpp:
##########
@@ -281,90 +301,90 @@
                     vectorized::MutableBlock::build_mutable_block(empty_input_block.get());
             _output_mutable_block.clear_column_data();
         }
-    }
 
-    if (is_final) {
-        _vec_skip_list.reset();
+        if (is_final) {
+            _vec_skip_list.reset();
+        }
     }
 }
 
 void MemTable::shrink_memtable_by_agg() {
     SCOPED_CONSUME_MEM_TRACKER(_insert_mem_tracker_use_hook.get());
-    if (_keys_type == KeysType::DUP_KEYS) {
-        return;
-    }
-    _collect_vskiplist_results<false>();
+    DCHECK(_keys_type == KeysType::AGG_KEYS);
+    _collect_memtable_results<false>();
 }
 
 bool MemTable::need_flush() const {
-    return memory_usage() >= config::write_buffer_size;
-}
-
-bool MemTable::need_agg() const {
-    if (_keys_type == KeysType::AGG_KEYS) {
-        return memory_usage() >= config::write_buffer_size_for_agg;
+    bool MemTable::need_flush() const {
+        return memory_usage() >= config::write_buffer_size;
     }
 
-    return false;
-}
+    bool MemTable::need_agg() const {
+        if (_keys_type == KeysType::AGG_KEYS) {
+            return memory_usage() >= config::write_buffer_size_for_agg;
+        }
 
-Status MemTable::_generate_delete_bitmap(int64_t atomic_num_segments_before_flush,
-                                         int64_t atomic_num_segments_after_flush) {
-    // generate delete bitmap, build a tmp rowset and load recent segment
-    if (!_tablet->enable_unique_key_merge_on_write()) {
-        return Status::OK();
-    }
-    auto rowset = _rowset_writer->build_tmp();
-    auto beta_rowset = reinterpret_cast<BetaRowset*>(rowset.get());
-    std::vector<segment_v2::SegmentSharedPtr> segments;
-    if (atomic_num_segments_before_flush >= atomic_num_segments_after_flush) {
-        return Status::OK();
+        return false;
     }
-    RETURN_IF_ERROR(beta_rowset->load_segments(atomic_num_segments_before_flush,
-                                               atomic_num_segments_after_flush, &segments));
-    std::shared_lock meta_rlock(_tablet->get_header_lock());
-    // tablet is under alter process. The delete bitmap will be calculated after conversion.
-    if (_tablet->tablet_state() == TABLET_NOTREADY &&
-        SchemaChangeHandler::tablet_in_converting(_tablet->tablet_id())) {
+
+    Status MemTable::_generate_delete_bitmap(int64_t atomic_num_segments_before_flush,
+                                             int64_t atomic_num_segments_after_flush) {
+        // generate delete bitmap, build a tmp rowset and load recent segment
+        if (!_tablet->enable_unique_key_merge_on_write()) {
+            return Status::OK();
+        }
+        auto rowset = _rowset_writer->build_tmp();
+        auto beta_rowset = reinterpret_cast<BetaRowset*>(rowset.get());
+        std::vector<segment_v2::SegmentSharedPtr> segments;
+        if (atomic_num_segments_before_flush >= atomic_num_segments_after_flush) {
+            return Status::OK();
+        }
+        RETURN_IF_ERROR(beta_rowset->load_segments(atomic_num_segments_before_flush,
+                                                   atomic_num_segments_after_flush, &segments));
+        std::shared_lock meta_rlock(_tablet->get_header_lock());
+        // tablet is under alter process. The delete bitmap will be calculated after conversion.
+        if (_tablet->tablet_state() == TABLET_NOTREADY &&
+            SchemaChangeHandler::tablet_in_converting(_tablet->tablet_id())) {
+            return Status::OK();
+        }
+        RETURN_IF_ERROR(_tablet->calc_delete_bitmap(beta_rowset->rowset_id(), segments,
+                                                    &_rowset_ids, _delete_bitmap,
+                                                    _cur_max_version));
         return Status::OK();
     }
-    RETURN_IF_ERROR(_tablet->calc_delete_bitmap(beta_rowset->rowset_id(), segments, &_rowset_ids,
-                                                _delete_bitmap, _cur_max_version));
-    return Status::OK();
-}
 
-Status MemTable::flush() {
-    SCOPED_CONSUME_MEM_TRACKER(_flush_mem_tracker);
-    VLOG_CRITICAL << "begin to flush memtable for tablet: " << tablet_id()
-                  << ", memsize: " << memory_usage() << ", rows: " << _rows;
-    int64_t duration_ns = 0;
-    // For merge_on_write table, it must get all segments in this flush.
-    // The id of new segment is set by the _num_segment of beta_rowset_writer,
-    // and new segment ids is between [atomic_num_segments_before_flush, atomic_num_segments_after_flush),
-    // and use the ids to load segment data file for calc delete bitmap.
-    int64_t atomic_num_segments_before_flush = _rowset_writer->get_atomic_num_segment();
-    RETURN_NOT_OK(_do_flush(duration_ns));
-    int64_t atomic_num_segments_after_flush = _rowset_writer->get_atomic_num_segment();
-    RETURN_NOT_OK(_generate_delete_bitmap(atomic_num_segments_before_flush,
-                                          atomic_num_segments_after_flush));
-    DorisMetrics::instance()->memtable_flush_total->increment(1);
-    DorisMetrics::instance()->memtable_flush_duration_us->increment(duration_ns / 1000);
-    VLOG_CRITICAL << "after flush memtable for tablet: " << tablet_id()
-                  << ", flushsize: " << _flush_size;
+    Status MemTable::flush() {

Review Comment:
   warning: function definition is not allowed here [clang-diagnostic-error]
   ```cpp
       Status MemTable::flush() {
                                ^
   ```
   



##########
be/src/olap/memtable.cpp:
##########
@@ -281,90 +301,90 @@
                     vectorized::MutableBlock::build_mutable_block(empty_input_block.get());
             _output_mutable_block.clear_column_data();
         }
-    }
 
-    if (is_final) {
-        _vec_skip_list.reset();
+        if (is_final) {
+            _vec_skip_list.reset();
+        }
     }
 }
 
 void MemTable::shrink_memtable_by_agg() {
     SCOPED_CONSUME_MEM_TRACKER(_insert_mem_tracker_use_hook.get());
-    if (_keys_type == KeysType::DUP_KEYS) {
-        return;
-    }
-    _collect_vskiplist_results<false>();
+    DCHECK(_keys_type == KeysType::AGG_KEYS);
+    _collect_memtable_results<false>();
 }
 
 bool MemTable::need_flush() const {
-    return memory_usage() >= config::write_buffer_size;
-}
-
-bool MemTable::need_agg() const {
-    if (_keys_type == KeysType::AGG_KEYS) {
-        return memory_usage() >= config::write_buffer_size_for_agg;
+    bool MemTable::need_flush() const {
+        return memory_usage() >= config::write_buffer_size;
     }
 
-    return false;
-}
+    bool MemTable::need_agg() const {
+        if (_keys_type == KeysType::AGG_KEYS) {
+            return memory_usage() >= config::write_buffer_size_for_agg;
+        }
 
-Status MemTable::_generate_delete_bitmap(int64_t atomic_num_segments_before_flush,
-                                         int64_t atomic_num_segments_after_flush) {
-    // generate delete bitmap, build a tmp rowset and load recent segment
-    if (!_tablet->enable_unique_key_merge_on_write()) {
-        return Status::OK();
-    }
-    auto rowset = _rowset_writer->build_tmp();
-    auto beta_rowset = reinterpret_cast<BetaRowset*>(rowset.get());
-    std::vector<segment_v2::SegmentSharedPtr> segments;
-    if (atomic_num_segments_before_flush >= atomic_num_segments_after_flush) {
-        return Status::OK();
+        return false;
     }
-    RETURN_IF_ERROR(beta_rowset->load_segments(atomic_num_segments_before_flush,
-                                               atomic_num_segments_after_flush, &segments));
-    std::shared_lock meta_rlock(_tablet->get_header_lock());
-    // tablet is under alter process. The delete bitmap will be calculated after conversion.
-    if (_tablet->tablet_state() == TABLET_NOTREADY &&
-        SchemaChangeHandler::tablet_in_converting(_tablet->tablet_id())) {
+
+    Status MemTable::_generate_delete_bitmap(int64_t atomic_num_segments_before_flush,
+                                             int64_t atomic_num_segments_after_flush) {
+        // generate delete bitmap, build a tmp rowset and load recent segment
+        if (!_tablet->enable_unique_key_merge_on_write()) {
+            return Status::OK();
+        }
+        auto rowset = _rowset_writer->build_tmp();
+        auto beta_rowset = reinterpret_cast<BetaRowset*>(rowset.get());
+        std::vector<segment_v2::SegmentSharedPtr> segments;
+        if (atomic_num_segments_before_flush >= atomic_num_segments_after_flush) {
+            return Status::OK();
+        }
+        RETURN_IF_ERROR(beta_rowset->load_segments(atomic_num_segments_before_flush,
+                                                   atomic_num_segments_after_flush, &segments));
+        std::shared_lock meta_rlock(_tablet->get_header_lock());
+        // tablet is under alter process. The delete bitmap will be calculated after conversion.
+        if (_tablet->tablet_state() == TABLET_NOTREADY &&
+            SchemaChangeHandler::tablet_in_converting(_tablet->tablet_id())) {
+            return Status::OK();
+        }
+        RETURN_IF_ERROR(_tablet->calc_delete_bitmap(beta_rowset->rowset_id(), segments,
+                                                    &_rowset_ids, _delete_bitmap,
+                                                    _cur_max_version));
         return Status::OK();
     }
-    RETURN_IF_ERROR(_tablet->calc_delete_bitmap(beta_rowset->rowset_id(), segments, &_rowset_ids,
-                                                _delete_bitmap, _cur_max_version));
-    return Status::OK();
-}
 
-Status MemTable::flush() {
-    SCOPED_CONSUME_MEM_TRACKER(_flush_mem_tracker);
-    VLOG_CRITICAL << "begin to flush memtable for tablet: " << tablet_id()
-                  << ", memsize: " << memory_usage() << ", rows: " << _rows;
-    int64_t duration_ns = 0;
-    // For merge_on_write table, it must get all segments in this flush.
-    // The id of new segment is set by the _num_segment of beta_rowset_writer,
-    // and new segment ids is between [atomic_num_segments_before_flush, atomic_num_segments_after_flush),
-    // and use the ids to load segment data file for calc delete bitmap.
-    int64_t atomic_num_segments_before_flush = _rowset_writer->get_atomic_num_segment();
-    RETURN_NOT_OK(_do_flush(duration_ns));
-    int64_t atomic_num_segments_after_flush = _rowset_writer->get_atomic_num_segment();
-    RETURN_NOT_OK(_generate_delete_bitmap(atomic_num_segments_before_flush,
-                                          atomic_num_segments_after_flush));
-    DorisMetrics::instance()->memtable_flush_total->increment(1);
-    DorisMetrics::instance()->memtable_flush_duration_us->increment(duration_ns / 1000);
-    VLOG_CRITICAL << "after flush memtable for tablet: " << tablet_id()
-                  << ", flushsize: " << _flush_size;
+    Status MemTable::flush() {
+        SCOPED_CONSUME_MEM_TRACKER(_flush_mem_tracker);
+        VLOG_CRITICAL << "begin to flush memtable for tablet: " << tablet_id()
+                      << ", memsize: " << memory_usage() << ", rows: " << _rows;
+        int64_t duration_ns = 0;
+        // For merge_on_write table, it must get all segments in this flush.
+        // The id of new segment is set by the _num_segment of beta_rowset_writer,
+        // and new segment ids is between [atomic_num_segments_before_flush, atomic_num_segments_after_flush),
+        // and use the ids to load segment data file for calc delete bitmap.
+        int64_t atomic_num_segments_before_flush = _rowset_writer->get_atomic_num_segment();
+        RETURN_NOT_OK(_do_flush(duration_ns));
+        int64_t atomic_num_segments_after_flush = _rowset_writer->get_atomic_num_segment();
+        RETURN_NOT_OK(_generate_delete_bitmap(atomic_num_segments_before_flush,
+                                              atomic_num_segments_after_flush));
+        DorisMetrics::instance()->memtable_flush_total->increment(1);
+        DorisMetrics::instance()->memtable_flush_duration_us->increment(duration_ns / 1000);
+        VLOG_CRITICAL << "after flush memtable for tablet: " << tablet_id()
+                      << ", flushsize: " << _flush_size;
 
-    return Status::OK();
-}
+        return Status::OK();
+    }
 
-Status MemTable::_do_flush(int64_t& duration_ns) {
-    SCOPED_RAW_TIMER(&duration_ns);
-    _collect_vskiplist_results<true>();
-    vectorized::Block block = _output_mutable_block.to_block();
-    RETURN_NOT_OK(_rowset_writer->flush_single_memtable(&block, &_flush_size));
-    return Status::OK();
-}
+    Status MemTable::_do_flush(int64_t & duration_ns) {
+        SCOPED_RAW_TIMER(&duration_ns);
+        _collect_memtable_results<true>();
+        vectorized::Block block = _output_mutable_block.to_block();
+        RETURN_NOT_OK(_rowset_writer->flush_single_memtable(&block, &_flush_size));
+        return Status::OK();
+    }
 
-Status MemTable::close() {
-    return flush();
-}
+    Status MemTable::close() {

Review Comment:
   warning: function definition is not allowed here [clang-diagnostic-error]
   ```cpp
       Status MemTable::close() {
                                ^
   ```
   



##########
be/src/olap/memtable.cpp:
##########
@@ -281,90 +301,90 @@
                     vectorized::MutableBlock::build_mutable_block(empty_input_block.get());
             _output_mutable_block.clear_column_data();
         }
-    }
 
-    if (is_final) {
-        _vec_skip_list.reset();
+        if (is_final) {
+            _vec_skip_list.reset();
+        }
     }
 }
 
 void MemTable::shrink_memtable_by_agg() {
     SCOPED_CONSUME_MEM_TRACKER(_insert_mem_tracker_use_hook.get());
-    if (_keys_type == KeysType::DUP_KEYS) {
-        return;
-    }
-    _collect_vskiplist_results<false>();
+    DCHECK(_keys_type == KeysType::AGG_KEYS);
+    _collect_memtable_results<false>();
 }
 
 bool MemTable::need_flush() const {
-    return memory_usage() >= config::write_buffer_size;
-}
-
-bool MemTable::need_agg() const {
-    if (_keys_type == KeysType::AGG_KEYS) {
-        return memory_usage() >= config::write_buffer_size_for_agg;
+    bool MemTable::need_flush() const {
+        return memory_usage() >= config::write_buffer_size;
     }
 
-    return false;
-}
+    bool MemTable::need_agg() const {
+        if (_keys_type == KeysType::AGG_KEYS) {
+            return memory_usage() >= config::write_buffer_size_for_agg;
+        }
 
-Status MemTable::_generate_delete_bitmap(int64_t atomic_num_segments_before_flush,
-                                         int64_t atomic_num_segments_after_flush) {
-    // generate delete bitmap, build a tmp rowset and load recent segment
-    if (!_tablet->enable_unique_key_merge_on_write()) {
-        return Status::OK();
-    }
-    auto rowset = _rowset_writer->build_tmp();
-    auto beta_rowset = reinterpret_cast<BetaRowset*>(rowset.get());
-    std::vector<segment_v2::SegmentSharedPtr> segments;
-    if (atomic_num_segments_before_flush >= atomic_num_segments_after_flush) {
-        return Status::OK();
+        return false;
     }
-    RETURN_IF_ERROR(beta_rowset->load_segments(atomic_num_segments_before_flush,
-                                               atomic_num_segments_after_flush, &segments));
-    std::shared_lock meta_rlock(_tablet->get_header_lock());
-    // tablet is under alter process. The delete bitmap will be calculated after conversion.
-    if (_tablet->tablet_state() == TABLET_NOTREADY &&
-        SchemaChangeHandler::tablet_in_converting(_tablet->tablet_id())) {
+
+    Status MemTable::_generate_delete_bitmap(int64_t atomic_num_segments_before_flush,
+                                             int64_t atomic_num_segments_after_flush) {
+        // generate delete bitmap, build a tmp rowset and load recent segment
+        if (!_tablet->enable_unique_key_merge_on_write()) {
+            return Status::OK();
+        }
+        auto rowset = _rowset_writer->build_tmp();
+        auto beta_rowset = reinterpret_cast<BetaRowset*>(rowset.get());
+        std::vector<segment_v2::SegmentSharedPtr> segments;
+        if (atomic_num_segments_before_flush >= atomic_num_segments_after_flush) {
+            return Status::OK();
+        }
+        RETURN_IF_ERROR(beta_rowset->load_segments(atomic_num_segments_before_flush,
+                                                   atomic_num_segments_after_flush, &segments));
+        std::shared_lock meta_rlock(_tablet->get_header_lock());
+        // tablet is under alter process. The delete bitmap will be calculated after conversion.
+        if (_tablet->tablet_state() == TABLET_NOTREADY &&
+            SchemaChangeHandler::tablet_in_converting(_tablet->tablet_id())) {
+            return Status::OK();
+        }
+        RETURN_IF_ERROR(_tablet->calc_delete_bitmap(beta_rowset->rowset_id(), segments,
+                                                    &_rowset_ids, _delete_bitmap,
+                                                    _cur_max_version));
         return Status::OK();
     }
-    RETURN_IF_ERROR(_tablet->calc_delete_bitmap(beta_rowset->rowset_id(), segments, &_rowset_ids,
-                                                _delete_bitmap, _cur_max_version));
-    return Status::OK();
-}
 
-Status MemTable::flush() {
-    SCOPED_CONSUME_MEM_TRACKER(_flush_mem_tracker);
-    VLOG_CRITICAL << "begin to flush memtable for tablet: " << tablet_id()
-                  << ", memsize: " << memory_usage() << ", rows: " << _rows;
-    int64_t duration_ns = 0;
-    // For merge_on_write table, it must get all segments in this flush.
-    // The id of new segment is set by the _num_segment of beta_rowset_writer,
-    // and new segment ids is between [atomic_num_segments_before_flush, atomic_num_segments_after_flush),
-    // and use the ids to load segment data file for calc delete bitmap.
-    int64_t atomic_num_segments_before_flush = _rowset_writer->get_atomic_num_segment();
-    RETURN_NOT_OK(_do_flush(duration_ns));
-    int64_t atomic_num_segments_after_flush = _rowset_writer->get_atomic_num_segment();
-    RETURN_NOT_OK(_generate_delete_bitmap(atomic_num_segments_before_flush,
-                                          atomic_num_segments_after_flush));
-    DorisMetrics::instance()->memtable_flush_total->increment(1);
-    DorisMetrics::instance()->memtable_flush_duration_us->increment(duration_ns / 1000);
-    VLOG_CRITICAL << "after flush memtable for tablet: " << tablet_id()
-                  << ", flushsize: " << _flush_size;
+    Status MemTable::flush() {
+        SCOPED_CONSUME_MEM_TRACKER(_flush_mem_tracker);
+        VLOG_CRITICAL << "begin to flush memtable for tablet: " << tablet_id()
+                      << ", memsize: " << memory_usage() << ", rows: " << _rows;
+        int64_t duration_ns = 0;
+        // For merge_on_write table, it must get all segments in this flush.
+        // The id of new segment is set by the _num_segment of beta_rowset_writer,
+        // and new segment ids is between [atomic_num_segments_before_flush, atomic_num_segments_after_flush),
+        // and use the ids to load segment data file for calc delete bitmap.
+        int64_t atomic_num_segments_before_flush = _rowset_writer->get_atomic_num_segment();
+        RETURN_NOT_OK(_do_flush(duration_ns));
+        int64_t atomic_num_segments_after_flush = _rowset_writer->get_atomic_num_segment();
+        RETURN_NOT_OK(_generate_delete_bitmap(atomic_num_segments_before_flush,
+                                              atomic_num_segments_after_flush));
+        DorisMetrics::instance()->memtable_flush_total->increment(1);
+        DorisMetrics::instance()->memtable_flush_duration_us->increment(duration_ns / 1000);
+        VLOG_CRITICAL << "after flush memtable for tablet: " << tablet_id()
+                      << ", flushsize: " << _flush_size;
 
-    return Status::OK();
-}
+        return Status::OK();
+    }
 
-Status MemTable::_do_flush(int64_t& duration_ns) {
-    SCOPED_RAW_TIMER(&duration_ns);
-    _collect_vskiplist_results<true>();
-    vectorized::Block block = _output_mutable_block.to_block();
-    RETURN_NOT_OK(_rowset_writer->flush_single_memtable(&block, &_flush_size));
-    return Status::OK();
-}
+    Status MemTable::_do_flush(int64_t & duration_ns) {
+        SCOPED_RAW_TIMER(&duration_ns);
+        _collect_memtable_results<true>();
+        vectorized::Block block = _output_mutable_block.to_block();
+        RETURN_NOT_OK(_rowset_writer->flush_single_memtable(&block, &_flush_size));
+        return Status::OK();
+    }
 
-Status MemTable::close() {
-    return flush();
-}
+    Status MemTable::close() {
+        return flush();
+    }
 
 } // namespace doris

Review Comment:
   warning: expected '}' [clang-diagnostic-error]
   ```cpp
   } // namespace doris
                       ^
   ```
   **be/src/olap/memtable.cpp:34:** to match this '{'
   ```cpp
   namespace doris {
                   ^
   ```
   



##########
be/src/olap/memtable.cpp:
##########
@@ -281,90 +301,90 @@ void MemTable::_collect_vskiplist_results() {
                     vectorized::MutableBlock::build_mutable_block(empty_input_block.get());
             _output_mutable_block.clear_column_data();
         }
-    }
 
-    if (is_final) {
-        _vec_skip_list.reset();
+        if (is_final) {
+            _vec_skip_list.reset();
+        }
     }
 }
 
 void MemTable::shrink_memtable_by_agg() {
     SCOPED_CONSUME_MEM_TRACKER(_insert_mem_tracker_use_hook.get());
-    if (_keys_type == KeysType::DUP_KEYS) {
-        return;
-    }
-    _collect_vskiplist_results<false>();
+    DCHECK(_keys_type == KeysType::AGG_KEYS);
+    _collect_memtable_results<false>();
 }
 
 bool MemTable::need_flush() const {
-    return memory_usage() >= config::write_buffer_size;
-}
-
-bool MemTable::need_agg() const {
-    if (_keys_type == KeysType::AGG_KEYS) {
-        return memory_usage() >= config::write_buffer_size_for_agg;
+    bool MemTable::need_flush() const {

Review Comment:
   warning: function definition is not allowed here [clang-diagnostic-error]
   ```cpp
       bool MemTable::need_flush() const {
                                         ^
   ```
   



##########
be/src/olap/memtable.cpp:
##########
@@ -281,90 +301,90 @@
                     vectorized::MutableBlock::build_mutable_block(empty_input_block.get());
             _output_mutable_block.clear_column_data();
         }
-    }
 
-    if (is_final) {
-        _vec_skip_list.reset();
+        if (is_final) {
+            _vec_skip_list.reset();
+        }
     }
 }
 
 void MemTable::shrink_memtable_by_agg() {
     SCOPED_CONSUME_MEM_TRACKER(_insert_mem_tracker_use_hook.get());
-    if (_keys_type == KeysType::DUP_KEYS) {
-        return;
-    }
-    _collect_vskiplist_results<false>();
+    DCHECK(_keys_type == KeysType::AGG_KEYS);
+    _collect_memtable_results<false>();
 }
 
 bool MemTable::need_flush() const {
-    return memory_usage() >= config::write_buffer_size;
-}
-
-bool MemTable::need_agg() const {
-    if (_keys_type == KeysType::AGG_KEYS) {
-        return memory_usage() >= config::write_buffer_size_for_agg;
+    bool MemTable::need_flush() const {
+        return memory_usage() >= config::write_buffer_size;
     }
 
-    return false;
-}
+    bool MemTable::need_agg() const {

Review Comment:
   warning: function definition is not allowed here [clang-diagnostic-error]
   ```cpp
       bool MemTable::need_agg() const {
                                       ^
   ```
   



##########
be/src/olap/memtable.cpp:
##########
@@ -281,90 +301,90 @@
                     vectorized::MutableBlock::build_mutable_block(empty_input_block.get());
             _output_mutable_block.clear_column_data();
         }
-    }
 
-    if (is_final) {
-        _vec_skip_list.reset();
+        if (is_final) {
+            _vec_skip_list.reset();
+        }
     }
 }
 
 void MemTable::shrink_memtable_by_agg() {
     SCOPED_CONSUME_MEM_TRACKER(_insert_mem_tracker_use_hook.get());
-    if (_keys_type == KeysType::DUP_KEYS) {
-        return;
-    }
-    _collect_vskiplist_results<false>();
+    DCHECK(_keys_type == KeysType::AGG_KEYS);
+    _collect_memtable_results<false>();
 }
 
 bool MemTable::need_flush() const {
-    return memory_usage() >= config::write_buffer_size;
-}
-
-bool MemTable::need_agg() const {
-    if (_keys_type == KeysType::AGG_KEYS) {
-        return memory_usage() >= config::write_buffer_size_for_agg;
+    bool MemTable::need_flush() const {
+        return memory_usage() >= config::write_buffer_size;
     }
 
-    return false;
-}
+    bool MemTable::need_agg() const {
+        if (_keys_type == KeysType::AGG_KEYS) {
+            return memory_usage() >= config::write_buffer_size_for_agg;
+        }
 
-Status MemTable::_generate_delete_bitmap(int64_t atomic_num_segments_before_flush,
-                                         int64_t atomic_num_segments_after_flush) {
-    // generate delete bitmap, build a tmp rowset and load recent segment
-    if (!_tablet->enable_unique_key_merge_on_write()) {
-        return Status::OK();
-    }
-    auto rowset = _rowset_writer->build_tmp();
-    auto beta_rowset = reinterpret_cast<BetaRowset*>(rowset.get());
-    std::vector<segment_v2::SegmentSharedPtr> segments;
-    if (atomic_num_segments_before_flush >= atomic_num_segments_after_flush) {
-        return Status::OK();
+        return false;
     }
-    RETURN_IF_ERROR(beta_rowset->load_segments(atomic_num_segments_before_flush,
-                                               atomic_num_segments_after_flush, &segments));
-    std::shared_lock meta_rlock(_tablet->get_header_lock());
-    // tablet is under alter process. The delete bitmap will be calculated after conversion.
-    if (_tablet->tablet_state() == TABLET_NOTREADY &&
-        SchemaChangeHandler::tablet_in_converting(_tablet->tablet_id())) {
+
+    Status MemTable::_generate_delete_bitmap(int64_t atomic_num_segments_before_flush,
+                                             int64_t atomic_num_segments_after_flush) {
+        // generate delete bitmap, build a tmp rowset and load recent segment
+        if (!_tablet->enable_unique_key_merge_on_write()) {
+            return Status::OK();
+        }
+        auto rowset = _rowset_writer->build_tmp();
+        auto beta_rowset = reinterpret_cast<BetaRowset*>(rowset.get());
+        std::vector<segment_v2::SegmentSharedPtr> segments;
+        if (atomic_num_segments_before_flush >= atomic_num_segments_after_flush) {
+            return Status::OK();
+        }
+        RETURN_IF_ERROR(beta_rowset->load_segments(atomic_num_segments_before_flush,
+                                                   atomic_num_segments_after_flush, &segments));
+        std::shared_lock meta_rlock(_tablet->get_header_lock());
+        // tablet is under alter process. The delete bitmap will be calculated after conversion.
+        if (_tablet->tablet_state() == TABLET_NOTREADY &&
+            SchemaChangeHandler::tablet_in_converting(_tablet->tablet_id())) {
+            return Status::OK();
+        }
+        RETURN_IF_ERROR(_tablet->calc_delete_bitmap(beta_rowset->rowset_id(), segments,
+                                                    &_rowset_ids, _delete_bitmap,
+                                                    _cur_max_version));
         return Status::OK();
     }
-    RETURN_IF_ERROR(_tablet->calc_delete_bitmap(beta_rowset->rowset_id(), segments, &_rowset_ids,
-                                                _delete_bitmap, _cur_max_version));
-    return Status::OK();
-}
 
-Status MemTable::flush() {
-    SCOPED_CONSUME_MEM_TRACKER(_flush_mem_tracker);
-    VLOG_CRITICAL << "begin to flush memtable for tablet: " << tablet_id()
-                  << ", memsize: " << memory_usage() << ", rows: " << _rows;
-    int64_t duration_ns = 0;
-    // For merge_on_write table, it must get all segments in this flush.
-    // The id of new segment is set by the _num_segment of beta_rowset_writer,
-    // and new segment ids is between [atomic_num_segments_before_flush, atomic_num_segments_after_flush),
-    // and use the ids to load segment data file for calc delete bitmap.
-    int64_t atomic_num_segments_before_flush = _rowset_writer->get_atomic_num_segment();
-    RETURN_NOT_OK(_do_flush(duration_ns));
-    int64_t atomic_num_segments_after_flush = _rowset_writer->get_atomic_num_segment();
-    RETURN_NOT_OK(_generate_delete_bitmap(atomic_num_segments_before_flush,
-                                          atomic_num_segments_after_flush));
-    DorisMetrics::instance()->memtable_flush_total->increment(1);
-    DorisMetrics::instance()->memtable_flush_duration_us->increment(duration_ns / 1000);
-    VLOG_CRITICAL << "after flush memtable for tablet: " << tablet_id()
-                  << ", flushsize: " << _flush_size;
+    Status MemTable::flush() {
+        SCOPED_CONSUME_MEM_TRACKER(_flush_mem_tracker);
+        VLOG_CRITICAL << "begin to flush memtable for tablet: " << tablet_id()
+                      << ", memsize: " << memory_usage() << ", rows: " << _rows;
+        int64_t duration_ns = 0;
+        // For merge_on_write table, it must get all segments in this flush.
+        // The id of new segment is set by the _num_segment of beta_rowset_writer,
+        // and new segment ids is between [atomic_num_segments_before_flush, atomic_num_segments_after_flush),
+        // and use the ids to load segment data file for calc delete bitmap.
+        int64_t atomic_num_segments_before_flush = _rowset_writer->get_atomic_num_segment();
+        RETURN_NOT_OK(_do_flush(duration_ns));
+        int64_t atomic_num_segments_after_flush = _rowset_writer->get_atomic_num_segment();
+        RETURN_NOT_OK(_generate_delete_bitmap(atomic_num_segments_before_flush,
+                                              atomic_num_segments_after_flush));
+        DorisMetrics::instance()->memtable_flush_total->increment(1);
+        DorisMetrics::instance()->memtable_flush_duration_us->increment(duration_ns / 1000);
+        VLOG_CRITICAL << "after flush memtable for tablet: " << tablet_id()
+                      << ", flushsize: " << _flush_size;
 
-    return Status::OK();
-}
+        return Status::OK();
+    }
 
-Status MemTable::_do_flush(int64_t& duration_ns) {
-    SCOPED_RAW_TIMER(&duration_ns);
-    _collect_vskiplist_results<true>();
-    vectorized::Block block = _output_mutable_block.to_block();
-    RETURN_NOT_OK(_rowset_writer->flush_single_memtable(&block, &_flush_size));
-    return Status::OK();
-}
+    Status MemTable::_do_flush(int64_t & duration_ns) {

Review Comment:
   warning: function definition is not allowed here [clang-diagnostic-error]
   ```cpp
       Status MemTable::_do_flush(int64_t & duration_ns) {
                                                         ^
   ```
   



##########
be/src/olap/memtable.cpp:
##########
@@ -281,90 +301,90 @@
                     vectorized::MutableBlock::build_mutable_block(empty_input_block.get());
             _output_mutable_block.clear_column_data();
         }
-    }
 
-    if (is_final) {
-        _vec_skip_list.reset();
+        if (is_final) {
+            _vec_skip_list.reset();
+        }
     }
 }
 
 void MemTable::shrink_memtable_by_agg() {
     SCOPED_CONSUME_MEM_TRACKER(_insert_mem_tracker_use_hook.get());
-    if (_keys_type == KeysType::DUP_KEYS) {
-        return;
-    }
-    _collect_vskiplist_results<false>();
+    DCHECK(_keys_type == KeysType::AGG_KEYS);
+    _collect_memtable_results<false>();
 }
 
 bool MemTable::need_flush() const {
-    return memory_usage() >= config::write_buffer_size;
-}
-
-bool MemTable::need_agg() const {
-    if (_keys_type == KeysType::AGG_KEYS) {
-        return memory_usage() >= config::write_buffer_size_for_agg;
+    bool MemTable::need_flush() const {
+        return memory_usage() >= config::write_buffer_size;
     }
 
-    return false;
-}
+    bool MemTable::need_agg() const {
+        if (_keys_type == KeysType::AGG_KEYS) {
+            return memory_usage() >= config::write_buffer_size_for_agg;
+        }
 
-Status MemTable::_generate_delete_bitmap(int64_t atomic_num_segments_before_flush,
-                                         int64_t atomic_num_segments_after_flush) {
-    // generate delete bitmap, build a tmp rowset and load recent segment
-    if (!_tablet->enable_unique_key_merge_on_write()) {
-        return Status::OK();
-    }
-    auto rowset = _rowset_writer->build_tmp();
-    auto beta_rowset = reinterpret_cast<BetaRowset*>(rowset.get());
-    std::vector<segment_v2::SegmentSharedPtr> segments;
-    if (atomic_num_segments_before_flush >= atomic_num_segments_after_flush) {
-        return Status::OK();
+        return false;
     }
-    RETURN_IF_ERROR(beta_rowset->load_segments(atomic_num_segments_before_flush,
-                                               atomic_num_segments_after_flush, &segments));
-    std::shared_lock meta_rlock(_tablet->get_header_lock());
-    // tablet is under alter process. The delete bitmap will be calculated after conversion.
-    if (_tablet->tablet_state() == TABLET_NOTREADY &&
-        SchemaChangeHandler::tablet_in_converting(_tablet->tablet_id())) {
+
+    Status MemTable::_generate_delete_bitmap(int64_t atomic_num_segments_before_flush,
+                                             int64_t atomic_num_segments_after_flush) {

Review Comment:
   warning: function definition is not allowed here [clang-diagnostic-error]
   ```cpp
                                                int64_t atomic_num_segments_after_flush) {
                                                                                         ^
   ```
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org