Posted to commits@doris.apache.org by GitBox <gi...@apache.org> on 2022/05/31 12:46:41 UTC

[GitHub] [incubator-doris] jacktengg opened a new pull request, #9890: [feature] support convert alpha rowset

jacktengg opened a new pull request, #9890:
URL: https://github.com/apache/incubator-doris/pull/9890

   # Proposed changes
   
   Issue Number: close #xxx
   
   ## Problem Summary:
   
   Describe the overview of changes.
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: (Yes/No/I Don't know)
   2. Has unit tests been added: (Yes/No/No Need)
   3. Has document been added or modified: (Yes/No/No Need)
   4. Does it need to update dependencies: (Yes/No)
   5. Are there any changes that cannot be rolled back: (Yes/No)
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at [dev@doris.apache.org](mailto:dev@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] github-actions[bot] commented on pull request #9890: [feature] support convert alpha rowset

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on PR #9890:
URL: https://github.com/apache/incubator-doris/pull/9890#issuecomment-1146528543

   PR approved by anyone and no changes requested.




[GitHub] [incubator-doris] yiguolei commented on a diff in pull request #9890: [feature] support convert alpha rowset

Posted by GitBox <gi...@apache.org>.
yiguolei commented on code in PR #9890:
URL: https://github.com/apache/incubator-doris/pull/9890#discussion_r886317614


##########
be/src/olap/olap_server.cpp:
##########
@@ -304,6 +320,27 @@ void StorageEngine::_tablet_checkpoint_callback(const std::vector<DataDir*>& dat
     } while (!_stop_background_threads_latch.wait_for(std::chrono::seconds(interval)));
 }
 
+void StorageEngine::_alpha_rowset_scan_thread_callback() {
+    LOG(INFO) << "try to start alpha rowset scan thread!";
+
+    do {
+        std::vector<TabletSharedPtr> tablet_have_alpha_rowset;
+        _tablet_manager->find_tablet_have_alpha_rowset(tablet_have_alpha_rowset);
+        for (int i = 0; i < tablet_have_alpha_rowset.size(); ++i) {

Review Comment:
   I think that after submitting at most thread-count tasks, we should wait for all of them to finish, and only then submit the next batch.
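
A minimal sketch of the batching suggested in this comment, not the code from the PR. `run_in_batches` and its parameters are hypothetical names, tablets are reduced to integer ids, and `std::async` stands in for the real convert thread pool.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <future>
#include <vector>

// Hypothetical helper: runs `convert` for every tablet id, but starts at most
// `batch_size` tasks at a time and waits for the whole batch to finish before
// starting the next one. Returns the number of batches that were run.
std::size_t run_in_batches(const std::vector<std::int64_t>& tablet_ids, std::size_t batch_size,
                           const std::function<void(std::int64_t)>& convert) {
    assert(batch_size > 0);
    std::size_t batches = 0;
    for (std::size_t begin = 0; begin < tablet_ids.size(); begin += batch_size) {
        const std::size_t end = std::min(tablet_ids.size(), begin + batch_size);
        std::vector<std::future<void>> in_flight;
        for (std::size_t i = begin; i < end; ++i) {
            // std::async is an assumption standing in for a pool submit call.
            in_flight.push_back(std::async(std::launch::async, convert, tablet_ids[i]));
        }
        for (std::future<void>& f : in_flight) {
            f.get(); // block until every task of this batch is done
        }
        ++batches;
    }
    return batches;
}
```

With five tablets and a batch size of two, this runs three batches; the scan loop never has more than `batch_size` conversions outstanding.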





[GitHub] [incubator-doris] yiguolei commented on a diff in pull request #9890: [feature] support convert alpha rowset

Posted by GitBox <gi...@apache.org>.
yiguolei commented on code in PR #9890:
URL: https://github.com/apache/incubator-doris/pull/9890#discussion_r886316488


##########
be/src/olap/olap_server.cpp:
##########
@@ -304,6 +320,27 @@ void StorageEngine::_tablet_checkpoint_callback(const std::vector<DataDir*>& dat
     } while (!_stop_background_threads_latch.wait_for(std::chrono::seconds(interval)));
 }
 
+void StorageEngine::_alpha_rowset_scan_thread_callback() {
+    LOG(INFO) << "try to start alpha rowset scan thread!";
+
+    do {
+        std::vector<TabletSharedPtr> tablet_have_alpha_rowset;
+        _tablet_manager->find_tablet_have_alpha_rowset(tablet_have_alpha_rowset);

Review Comment:
   Shuffle the tablet list at this point.
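
A minimal sketch of the shuffle step, presumably intended so that the same tablets are not always handled first in every scan round (the comment does not state the reason). `shuffled_copy` is a hypothetical helper and tablets are reduced to integer ids.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical helper: returns the candidate ids in a pseudo-random order.
// The seed parameter is an assumption that keeps the sketch reproducible;
// real code would more likely seed from std::random_device.
std::vector<std::int64_t> shuffled_copy(std::vector<std::int64_t> ids, std::uint32_t seed) {
    const std::size_t original_size = ids.size();
    std::mt19937 rng(seed);
    std::shuffle(ids.begin(), ids.end(), rng);
    // Debug-only sanity check: shuffling must not add or drop candidates.
    assert(ids.size() == original_size);
    return ids;
}
```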





[GitHub] [incubator-doris] yiguolei commented on a diff in pull request #9890: [feature] support convert alpha rowset

Posted by GitBox <gi...@apache.org>.
yiguolei commented on code in PR #9890:
URL: https://github.com/apache/incubator-doris/pull/9890#discussion_r885674955


##########
be/src/olap/convert_rowset.cpp:
##########
@@ -0,0 +1,169 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "olap/convert_rowset.h"
+
+#include "util/trace.h"
+namespace doris {
+Status ConvertRowset::prepare_convert() {
+    if (!_tablet->init_succeeded()) {
+        return Status::OLAPInternalError(OLAP_ERR_INPUT_PARAMETER_ERROR);
+    }
+    // reuse cumulative compaction lock
+    std::unique_lock<std::mutex> lock(_tablet->get_cumulative_compaction_lock(), std::try_to_lock);
+    if (!lock.owns_lock()) {
+        LOG(INFO) << "The tablet is under cumulative compaction. tablet=" << _tablet->full_name();
+        return Status::OLAPInternalError(OLAP_ERR_CE_TRY_CE_LOCK_ERROR);
+    }
+    TRACE("got cumulative compaction lock");
+
+    pick_rowsets_to_convert();
+    TRACE("rowsets picked");
+
+    TRACE_COUNTER_INCREMENT("input_rowsets_count", _input_rowsets.size());
+    _tablet->set_clone_occurred(false);
+
+    return Status::OK();
+}
+void ConvertRowset::pick_rowsets_to_convert() {
+    _tablet->pick_rowsets_to_convert(&_input_rowsets);
+}
+
+Status ConvertRowset::do_convert() {
+    RETURN_NOT_OK(construct_input_rowset_readers());
+
+    Merger::Statistics stats;
+    Status res;
+    for (size_t i = 0; i < _input_rowsets.size(); ++i) {
+        OlapStopWatch watch;
+
+        Version output_version =
+                Version(_input_rowsets[i]->start_version(), _input_rowsets[i]->end_version());
+        std::unique_ptr<RowsetWriter> output_rs_writer;
+        _tablet->create_rowset_writer(output_version, VISIBLE, NONOVERLAPPING, &output_rs_writer);
+        res = Merger::merge_rowsets(_tablet, ReaderType::READER_CUMULATIVE_COMPACTION,
+                                    {_input_rs_readers[i]}, output_rs_writer.get(), &stats);
+
+        if (!res.ok()) {
+            LOG(WARNING) << "fail to convert rowset. res=" << res
+                         << ", tablet=" << _tablet->full_name()
+                         << ", output_version=" << output_version;
+        } else {
+            TRACE("convert rowset finished");
+
+            auto output_rowset = output_rs_writer->build();
+            if (output_rowset == nullptr) {
+                LOG(WARNING) << "rowset writer build failed. writer version:"
+                             << ", output_version=" << output_version;
+                return Status::OLAPInternalError(OLAP_ERR_MALLOC_ERROR);
+            }
+
+            TRACE_COUNTER_INCREMENT("output_rowset_data_size", output_rowset->data_disk_size());
+            TRACE_COUNTER_INCREMENT("output_row_num", output_rowset->num_rows());
+            TRACE_COUNTER_INCREMENT("output_segments_num", output_rowset->num_segments());
+            TRACE("output rowset built");
+
+            RETURN_NOT_OK(check_correctness(_input_rowsets[i], output_rowset, stats));
+            TRACE("check correctness finished");
+
+            _modify_rowsets(_input_rowsets[i], output_rowset);
+            TRACE("modify rowsets finished");
+
+            int64_t current_max_version;
+            {
+                std::shared_lock rdlock(_tablet->get_header_lock());
+                RowsetSharedPtr max_rowset = _tablet->rowset_with_max_version();
+                if (max_rowset == nullptr) {
+                    current_max_version = -1;
+                } else {
+                    current_max_version = _tablet->rowset_with_max_version()->end_version();
+                }
+            }
+            LOG(INFO) << "succeed to do convert rowset"
+                      << ". tablet=" << _tablet->full_name() << ", output_version=" << output_version
+                      << ", current_max_version=" << current_max_version
+                      << ", disk=" << _tablet->data_dir()->path() << ", segments=" << _input_rowsets[i]->num_segments()
+                      << ". elapsed time=" << watch.get_elapse_second()
+                      << "s.";
+        }
+    }
+    return Status::OK();
+}
+
+Status ConvertRowset::construct_input_rowset_readers() {

Review Comment:
   Drop this function. Since we only have one rowset, just get that rowset's reader directly at the call site.





[GitHub] [incubator-doris] yiguolei commented on a diff in pull request #9890: [feature] support convert alpha rowset

Posted by GitBox <gi...@apache.org>.
yiguolei commented on code in PR #9890:
URL: https://github.com/apache/incubator-doris/pull/9890#discussion_r886649634


##########
be/src/olap/tablet_manager.cpp:
##########
@@ -601,6 +601,22 @@ void TabletManager::get_tablet_stat(TTabletStatResult* result) {
     result->__set_tablet_stat_list(*local_cache);
 }
 
+void TabletManager::find_tablet_have_alpha_rowset(const std::vector<DataDir*>& data_dirs,
+                                                  std::vector<TabletSharedPtr>& tablets) {
+    for (auto data_dir : data_dirs) {
+        for (const auto& tablets_shard : _tablets_shards) {
+            std::shared_lock rdlock(tablets_shard.lock);
+            for (const auto& tablet_map : tablets_shard.tablet_map) {
+                const TabletSharedPtr& tablet_ptr = tablet_map.second;
+                if (!tablet_ptr->all_beta() &&
+                    tablet_ptr->can_do_compaction(data_dir->path_hash(), BASE_COMPACTION)) {

Review Comment:
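   No need to write it this way; I think the find-compaction code you used as a reference is itself wrong. Since can_do_compaction currently requires a data dir argument, just pass one in directly, i.e. tablet_ptr->can_do_compaction(tablet_ptr->data_dir()....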
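
A rough sketch of the shape this comment asks for, using stand-in types rather than the Doris classes. `FakeDataDir`, `FakeTablet` and the one-argument `can_do_compaction` are simplifications for illustration (the call quoted in the diff also takes a compaction type).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in types for illustration only; they are NOT the Doris classes.
struct FakeDataDir {
    std::size_t hash;
    std::size_t path_hash() const { return hash; }
};

struct FakeTablet {
    FakeDataDir* dir;
    bool is_all_beta;
    FakeDataDir* data_dir() const { return dir; }
    bool all_beta() const { return is_all_beta; }
    // Simplified check: the tablet only accepts work for the data dir it lives on.
    bool can_do_compaction(std::size_t path_hash) const { return dir->path_hash() == path_hash; }
};

// One pass over the tablets with no outer loop over data dirs: each tablet
// passes its own data dir to the check.
std::vector<const FakeTablet*> find_tablets_with_alpha_rowset(
        const std::vector<FakeTablet>& tablets) {
    std::vector<const FakeTablet*> result;
    for (const FakeTablet& tablet : tablets) {
        assert(tablet.data_dir() != nullptr);
        if (!tablet.all_beta() && tablet.can_do_compaction(tablet.data_dir()->path_hash())) {
            result.push_back(&tablet);
        }
    }
    return result;
}
```

Dropping the outer loop means the function no longer needs a `data_dirs` parameter at all.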





[GitHub] [incubator-doris] yiguolei commented on a diff in pull request #9890: [feature] support convert alpha rowset

Posted by GitBox <gi...@apache.org>.
yiguolei commented on code in PR #9890:
URL: https://github.com/apache/incubator-doris/pull/9890#discussion_r886312771


##########
be/src/olap/convert_rowset.cpp:
##########
@@ -0,0 +1,127 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "olap/convert_rowset.h"
+
+namespace doris {
+
+Status ConvertRowset::do_convert() {
+    if (!_tablet->init_succeeded()) {
+        return Status::OLAPInternalError(OLAP_ERR_INPUT_PARAMETER_ERROR);
+    }
+    std::unique_lock<std::mutex> base_compaction_lock(_tablet->get_base_compaction_lock(), std::try_to_lock);
+    std::unique_lock<std::mutex> cumulative_compaction_lock(_tablet->get_cumulative_compaction_lock(), std::try_to_lock);
+    if (!base_compaction_lock.owns_lock() || !cumulative_compaction_lock.owns_lock()) {
+        LOG(INFO) << "The tablet is under compaction. tablet=" << _tablet->full_name();
+        return Status::OLAPInternalError(OLAP_ERR_CE_TRY_CE_LOCK_ERROR);
+    }
+
+    std::vector<RowsetSharedPtr> alpah_rowsets;
+    _tablet->find_alpha_rowsets(&alpah_rowsets);
+
+    Merger::Statistics stats;
+    Status res;
+    for (size_t i = 0; i < alpah_rowsets.size(); ++i) {
+        Version output_version =

Review Comment:
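   This must not run indefinitely. Keep track of how many rowsets we have written, or how many rows we have merged, and exit once a limit is reached; otherwise normal compaction will never get to run.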
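
A minimal sketch of such a work budget. `ConvertBudget`, its two caps and `convert_with_budget` are hypothetical names chosen for illustration; the PR thread does not say which limits were eventually used.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical limits for one conversion round (assumed, not from the PR).
struct ConvertBudget {
    std::size_t max_rowsets;
    std::int64_t max_rows;
};

// Walks the rowsets in order and stops once either cap is reached, so the
// compaction locks are held only for a bounded amount of work.
// rowset_row_counts[i] is the row count of the i-th alpha rowset.
// Returns how many rowsets were processed in this round.
std::size_t convert_with_budget(const std::vector<std::int64_t>& rowset_row_counts,
                                const ConvertBudget& budget) {
    assert(budget.max_rows >= 0);
    std::size_t converted = 0;
    std::int64_t merged_rows = 0;
    for (std::int64_t rows : rowset_row_counts) {
        if (converted >= budget.max_rowsets || merged_rows >= budget.max_rows) {
            break; // budget used up; leave the rest for a later round
        }
        // The real conversion of one rowset would happen here.
        merged_rows += rows;
        ++converted;
    }
    return converted;
}
```

Rowsets that are skipped stay in alpha format and would be picked up again by a later scan.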





[GitHub] [incubator-doris] yiguolei commented on a diff in pull request #9890: [feature] support convert alpha rowset

Posted by GitBox <gi...@apache.org>.
yiguolei commented on code in PR #9890:
URL: https://github.com/apache/incubator-doris/pull/9890#discussion_r885707427


##########
be/src/olap/convert_rowset.h:
##########
@@ -0,0 +1,48 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#pragma once
+
+#include "olap/merger.h"
+#include "olap/storage_engine.h"
+#include "olap/tablet.h"
+
+namespace doris {
+class DataDir;
+class ConvertRowset {
+public:
+    ConvertRowset(TabletSharedPtr tablet) : _tablet(tablet) {}
+    Status prepare_convert();
+    Status do_convert();

Review Comment:
   Expose only the do_convert function publicly; none of the other functions should be exposed.
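
A small sketch of the interface shape this comment describes: one public entry point, with every intermediate step private. `ConvertRowsetSketch` and its members are hypothetical stand-ins, not the class from the PR.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-in: callers can only invoke do_convert().
class ConvertRowsetSketch {
public:
    explicit ConvertRowsetSketch(std::string tablet_name)
            : _tablet_name(std::move(tablet_name)) {}

    // The only operation exposed to callers. Returns true on success.
    bool do_convert() {
        if (!_check_tablet()) {
            return false;
        }
        _convert_one_rowset();
        return true;
    }

    // Read-only accessor, included so the sketch can be observed from outside.
    int converted_count() const { return _converted; }

private:
    bool _check_tablet() const { return !_tablet_name.empty(); }

    void _convert_one_rowset() {
        assert(_check_tablet()); // private step relies on the earlier check
        ++_converted;
    }

    std::string _tablet_name;
    int _converted = 0;
};
```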





[GitHub] [incubator-doris] yiguolei commented on a diff in pull request #9890: [feature] support convert alpha rowset

Posted by GitBox <gi...@apache.org>.
yiguolei commented on code in PR #9890:
URL: https://github.com/apache/incubator-doris/pull/9890#discussion_r885708230


##########
be/src/olap/convert_rowset.h:
##########
@@ -0,0 +1,48 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#pragma once
+
+#include "olap/merger.h"
+#include "olap/storage_engine.h"
+#include "olap/tablet.h"
+
+namespace doris {
+class DataDir;
+class ConvertRowset {
+public:
+    ConvertRowset(TabletSharedPtr tablet) : _tablet(tablet) {}
+    Status prepare_convert();
+    Status do_convert();
+
+private:
+    Status construct_output_rowset_writer();

Review Comment:
   construct_input and output rowset are no longer needed; each contains only one line of code, so they should not be separate functions.





[GitHub] [incubator-doris] yiguolei commented on a diff in pull request #9890: [feature] support convert alpha rowset

Posted by GitBox <gi...@apache.org>.
yiguolei commented on code in PR #9890:
URL: https://github.com/apache/incubator-doris/pull/9890#discussion_r886317062


##########
be/src/olap/olap_server.cpp:
##########
@@ -304,6 +320,27 @@ void StorageEngine::_tablet_checkpoint_callback(const std::vector<DataDir*>& dat
     } while (!_stop_background_threads_latch.wait_for(std::chrono::seconds(interval)));
 }
 
+void StorageEngine::_alpha_rowset_scan_thread_callback() {
+    LOG(INFO) << "try to start alpha rowset scan thread!";
+
+    do {
+        std::vector<TabletSharedPtr> tablet_have_alpha_rowset;
+        _tablet_manager->find_tablet_have_alpha_rowset(tablet_have_alpha_rowset);
+        for (int i = 0; i < tablet_have_alpha_rowset.size(); ++i) {

Review Comment:
   Don't submit this many tasks at once; we should submit at most thread count * 2 tasks. As written, this will repeatedly submit many of the same tablets to the queue.
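
A minimal sketch of capping and de-duplicating submissions as this comment asks. `pick_tablets_to_submit` and the `in_flight` set are hypothetical; tablets are reduced to integer ids, and real code would need a lock around `in_flight` because worker threads remove ids when they finish.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_set>
#include <vector>

// Hypothetical helper: picks which tablets to submit in this scan round.
// It skips tablets that are already queued or running (`in_flight`) and stops
// once the number of outstanding tasks reaches thread_num * 2.
// Newly picked ids are added to `in_flight`; the caller is expected to erase
// an id when its task finishes.
std::vector<std::int64_t> pick_tablets_to_submit(const std::vector<std::int64_t>& candidates,
                                                 std::unordered_set<std::int64_t>& in_flight,
                                                 std::size_t thread_num) {
    assert(thread_num > 0);
    const std::size_t max_tasks = thread_num * 2;
    std::vector<std::int64_t> picked;
    for (std::int64_t id : candidates) {
        if (in_flight.size() >= max_tasks) {
            break; // enough work is outstanding; try again next round
        }
        if (in_flight.insert(id).second) {
            picked.push_back(id); // not queued before, so submit it
        }
    }
    return picked;
}
```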
   





[GitHub] [incubator-doris] yiguolei merged pull request #9890: [feature] support convert alpha rowset

Posted by GitBox <gi...@apache.org>.
yiguolei merged PR #9890:
URL: https://github.com/apache/incubator-doris/pull/9890




[GitHub] [incubator-doris] yiguolei commented on a diff in pull request #9890: [feature] support convert alpha rowset

Posted by GitBox <gi...@apache.org>.
yiguolei commented on code in PR #9890:
URL: https://github.com/apache/incubator-doris/pull/9890#discussion_r885630082


##########
be/src/olap/convert_rowset.h:
##########
@@ -0,0 +1,48 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#pragma once
+
+#include "olap/merger.h"
+#include "olap/storage_engine.h"
+#include "olap/tablet.h"
+
+namespace doris {
+class DataDir;
+class ConvertRowset {
+public:
+    ConvertRowset(TabletSharedPtr tablet) : _tablet(tablet) {}
+    Status prepare_convert();

Review Comment:
   We don't need this many functions here; let's drop the whole prepare/convert set. Since we only have one rowset, there won't be many readers either.





[GitHub] [incubator-doris] yiguolei commented on a diff in pull request #9890: [feature] support convert alpha rowset

Posted by GitBox <gi...@apache.org>.
yiguolei commented on code in PR #9890:
URL: https://github.com/apache/incubator-doris/pull/9890#discussion_r886649936


##########
be/src/olap/olap_server.cpp:
##########
@@ -304,6 +320,49 @@ void StorageEngine::_tablet_checkpoint_callback(const std::vector<DataDir*>& dat
     } while (!_stop_background_threads_latch.wait_for(std::chrono::seconds(interval)));
 }
 
+void StorageEngine::_alpha_rowset_scan_thread_callback() {
+    LOG(INFO) << "try to start alpha rowset scan thread!";
+
+    std::vector<DataDir*> data_dirs;

Review Comment:
   There is no need to collect the data dirs here anymore.



##########
be/src/olap/olap_server.cpp:
##########
@@ -304,6 +320,49 @@ void StorageEngine::_tablet_checkpoint_callback(const std::vector<DataDir*>& dat
     } while (!_stop_background_threads_latch.wait_for(std::chrono::seconds(interval)));
 }
 
+void StorageEngine::_alpha_rowset_scan_thread_callback() {
+    LOG(INFO) << "try to start alpha rowset scan thread!";
+
+    std::vector<DataDir*> data_dirs;
+    for (auto& tmp_store : _store_map) {
+        data_dirs.push_back(tmp_store.second);
+    }
+
+    auto scan_interval_sec = config::scan_alpha_rowset_min_interval_sec;
+    auto max_convert_task = config::convert_rowset_thread_num * 2;
+    do {
+        std::vector<TabletSharedPtr> tablet_have_alpha_rowset;
+        _tablet_manager->find_tablet_have_alpha_rowset(data_dirs, tablet_have_alpha_rowset);

Review Comment:
   Don't pass the data dirs parameter here anymore.





[GitHub] [incubator-doris] morningman commented on pull request #9890: [feature] support convert alpha rowset

Posted by GitBox <gi...@apache.org>.
morningman commented on PR #9890:
URL: https://github.com/apache/incubator-doris/pull/9890#issuecomment-1148096379

   Hi @jacktengg , Could you please push this PR to dev-1.0.1?




[GitHub] [incubator-doris] yiguolei commented on a diff in pull request #9890: [feature] support convert alpha rowset

Posted by GitBox <gi...@apache.org>.
yiguolei commented on code in PR #9890:
URL: https://github.com/apache/incubator-doris/pull/9890#discussion_r885701904


##########
be/src/olap/convert_rowset.cpp:
##########
@@ -0,0 +1,169 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "olap/convert_rowset.h"
+
+#include "util/trace.h"
+namespace doris {
+Status ConvertRowset::prepare_convert() {
+    if (!_tablet->init_succeeded()) {
+        return Status::OLAPInternalError(OLAP_ERR_INPUT_PARAMETER_ERROR);
+    }
+    // reuse cumulative compaction lock
+    std::unique_lock<std::mutex> lock(_tablet->get_cumulative_compaction_lock(), std::try_to_lock);
+    if (!lock.owns_lock()) {
+        LOG(INFO) << "The tablet is under cumulative compaction. tablet=" << _tablet->full_name();
+        return Status::OLAPInternalError(OLAP_ERR_CE_TRY_CE_LOCK_ERROR);
+    }
+    TRACE("got cumulative compaction lock");
+
+    pick_rowsets_to_convert();
+    TRACE("rowsets picked");
+
+    TRACE_COUNTER_INCREMENT("input_rowsets_count", _input_rowsets.size());
+    _tablet->set_clone_occurred(false);
+
+    return Status::OK();
+}
+void ConvertRowset::pick_rowsets_to_convert() {
+    _tablet->pick_rowsets_to_convert(&_input_rowsets);
+}
+
+Status ConvertRowset::do_convert() {
+    RETURN_NOT_OK(construct_input_rowset_readers());
+
+    Merger::Statistics stats;
+    Status res;
+    for (size_t i = 0; i < _input_rowsets.size(); ++i) {
+        OlapStopWatch watch;
+
+        Version output_version =
+                Version(_input_rowsets[i]->start_version(), _input_rowsets[i]->end_version());
+        std::unique_ptr<RowsetWriter> output_rs_writer;
+        _tablet->create_rowset_writer(output_version, VISIBLE, NONOVERLAPPING, &output_rs_writer);
+        res = Merger::merge_rowsets(_tablet, ReaderType::READER_CUMULATIVE_COMPACTION,
+                                    {_input_rs_readers[i]}, output_rs_writer.get(), &stats);
+
+        if (!res.ok()) {
+            LOG(WARNING) << "fail to convert rowset. res=" << res
+                         << ", tablet=" << _tablet->full_name()
+                         << ", output_version=" << output_version;
+        } else {
+            TRACE("convert rowset finished");
+
+            auto output_rowset = output_rs_writer->build();
+            if (output_rowset == nullptr) {
+                LOG(WARNING) << "rowset writer build failed. writer version:"
+                             << ", output_version=" << output_version;
+                return Status::OLAPInternalError(OLAP_ERR_MALLOC_ERROR);
+            }
+
+            TRACE_COUNTER_INCREMENT("output_rowset_data_size", output_rowset->data_disk_size());
+            TRACE_COUNTER_INCREMENT("output_row_num", output_rowset->num_rows());
+            TRACE_COUNTER_INCREMENT("output_segments_num", output_rowset->num_segments());
+            TRACE("output rowset built");
+
+            RETURN_NOT_OK(check_correctness(_input_rowsets[i], output_rowset, stats));
+            TRACE("check correctness finished");
+
+            _modify_rowsets(_input_rowsets[i], output_rowset);
+            TRACE("modify rowsets finished");
+
+            int64_t current_max_version;

Review Comment:
   Lines 86 to 101 are not needed; please delete them.





[GitHub] [incubator-doris] yiguolei commented on a diff in pull request #9890: [feature] support convert alpha rowset

Posted by GitBox <gi...@apache.org>.
yiguolei commented on code in PR #9890:
URL: https://github.com/apache/incubator-doris/pull/9890#discussion_r885708768


##########
be/src/olap/convert_rowset.h:
##########
@@ -0,0 +1,48 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#pragma once
+
+#include "olap/merger.h"
+#include "olap/storage_engine.h"
+#include "olap/tablet.h"
+
+namespace doris {
+class DataDir;
+class ConvertRowset {
+public:
+    ConvertRowset(TabletSharedPtr tablet) : _tablet(tablet) {}
+    Status prepare_convert();
+    Status do_convert();
+
+private:
+    Status construct_output_rowset_writer();
+    Status construct_input_rowset_readers();
+    void pick_rowsets_to_convert();
+    Status check_correctness(RowsetSharedPtr input_rowset, RowsetSharedPtr output_rowset,
+                             const Merger::Statistics& stats);
+    int64_t _get_input_num_rows_from_seg_grps(RowsetSharedPtr rowset);
+    void _modify_rowsets(RowsetSharedPtr input_rowset, RowsetSharedPtr output_rowset);
+
+private:
+    TabletSharedPtr _tablet;
+    std::vector<RowsetSharedPtr> _input_rowsets;

Review Comment:
   The input rowsets and input rs readers are not needed either.





[GitHub] [incubator-doris] yiguolei commented on a diff in pull request #9890: [feature] support convert alpha rowset

Posted by GitBox <gi...@apache.org>.
yiguolei commented on code in PR #9890:
URL: https://github.com/apache/incubator-doris/pull/9890#discussion_r886632348


##########
be/src/olap/olap_server.cpp:
##########
@@ -72,6 +73,21 @@ Status StorageEngine::start_bg_threads() {
             .set_max_threads(max_thread_num)
             .build(&_compaction_thread_pool);
 
+    int32_t convert_rowset_thread_num = config::convert_rowset_thread_num;
+    if (convert_rowset_thread_num > 0) {
+        // alpha rowset scan thread
+        RETURN_IF_ERROR(Thread::create(
+                "StorageEngine", "alpha_rowset_scan_thread",
+                [this]() { this->_alpha_rowset_scan_thread_callback(); },

Review Comment:
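   Should the order of starting the scan thread and initializing the thread pool be swapped? Otherwise the scan thread may start before the pool is initialized, and it could crash.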
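
A small sketch of the ordering concern raised here: build the pool first, then start the thread that submits to it. `FakePool` and `EngineSketch` are hypothetical stand-ins, not Doris classes, and `std::thread` replaces the `Thread::create` call quoted in the diff.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <thread>
#include <vector>

// Stand-in for a thread pool; it only records submissions.
struct FakePool {
    std::vector<int> submitted;
    void submit(int task_id) { submitted.push_back(task_id); }
};

class EngineSketch {
public:
    ~EngineSketch() { stop(); }

    // Builds the pool first, then starts the thread that uses it, so the
    // scan callback can never observe a null pool.
    void start_bg_threads() {
        _convert_pool = std::make_unique<FakePool>();
        _scan_thread = std::thread([this]() { _scan_callback(); });
    }

    void stop() {
        if (_scan_thread.joinable()) {
            _scan_thread.join();
        }
    }

    std::size_t submitted_count() const {
        return _convert_pool ? _convert_pool->submitted.size() : 0;
    }

private:
    void _scan_callback() {
        assert(_convert_pool != nullptr); // holds because of the start order
        _convert_pool->submit(1);
    }

    std::unique_ptr<FakePool> _convert_pool;
    std::thread _scan_thread;
};
```

If the two statements in `start_bg_threads` were swapped, the callback could run while `_convert_pool` is still null, which is the crash the comment warns about.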


