Posted to commits@doris.apache.org by GitBox <gi...@apache.org> on 2020/11/26 06:53:22 UTC

[GitHub] [incubator-doris] yangzhg opened a new pull request #4958: Optimized the read performance of the table when have multi versions,

yangzhg opened a new pull request #4958:
URL: https://github.com/apache/incubator-doris/pull/4958


   Changed the merge method of the UNIQUE table:
   the cumulative version data is merged first, and the result is then merged with the base version.
   Data with only a single base version is read directly, without merging.
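   The two-stage merge described above can be sketched in standalone C++ (hypothetical types and names, not Doris's actual iterator classes): the small cumulative rowsets are folded together first, and the result is merged once with the large base rowset, with newer versions replacing older ones on duplicate keys as in a UNIQUE table.

   ```cpp
   #include <cassert>
   #include <cstdio>
   #include <map>
   #include <utility>
   #include <vector>

   // Hypothetical sketch of the two-stage merge in this PR: cumulative
   // (small) rowsets are merged first, then merged once with the base
   // rowset. On duplicate keys the newer rowset wins, mimicking
   // UNIQUE_KEYS REPLACE semantics.
   using Rowset = std::vector<std::pair<int, int>>;  // sorted (key, value)

   static Rowset merge_two(const Rowset& older, const Rowset& newer) {
       std::map<int, int> m;
       for (const auto& kv : older) m[kv.first] = kv.second;
       for (const auto& kv : newer) m[kv.first] = kv.second;  // newer replaces older
       return Rowset(m.begin(), m.end());
   }

   static Rowset two_stage_merge(const Rowset& base,
                                 const std::vector<Rowset>& cumulatives) {
       // Stage 1: fold the small cumulative rowsets together (oldest first).
       Rowset cumu;
       for (const auto& rs : cumulatives) cumu = merge_two(cumu, rs);
       // Stage 2: a single final merge with the large base rowset.
       return merge_two(base, cumu);
   }

   int main() {
       Rowset base = {{1, 10}, {2, 20}, {3, 30}};
       std::vector<Rowset> cumu = {{{2, 21}}, {{4, 40}}, {{2, 22}}};
       Rowset out = two_stage_merge(base, cumu);
       assert(out.size() == 4);
       assert(out[1].second == 22);  // key 2 holds the newest version
       for (const auto& kv : out) std::printf("%d=%d\n", kv.first, kv.second);
       return 0;
   }
   ```

   The point of the two stages is that the expensive k-way merge runs only over the many small versions, while the large base rowset participates in exactly one merge.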
   
   ## Proposed changes
   
   Optimize the read performance of AGG and UNIQUE tables with too many versions.
   The benchmark is as follows.
   
   ## Test data

   The test dataset is the catalog_sales table from the TPC-DS 10 GB dataset, split into two parts: the full data (the "big" tables), about 3 GB with 28,802,522 rows (14,401,261 after UNIQUE deduplication), and a 200 MB sample (the "test" tables) with 1,000,000 rows. Only one partition is used, and the full data is split into 10 buckets.

   All tables use the segment V2 format.
   
   ```
   +-------+--------------+------+-------+---------+---------+
   | Field | Type         | Null | Key   | Default | Extra   |
   +-------+--------------+------+-------+---------+---------+
   | k1    | BIGINT       | No   | true  | NULL    |         |
   | k2    | BIGINT       | No   | true  | NULL    |         |
   | k3    | BIGINT       | Yes  | true  | NULL    |         |
   | k4    | BIGINT       | Yes  | true  | NULL    |         |
   | k5    | BIGINT       | Yes  | true  | NULL    |         |
   | k6    | BIGINT       | Yes  | true  | NULL    |         |
   | k7    | BIGINT       | Yes  | true  | NULL    |         |
   | k8    | BIGINT       | Yes  | true  | NULL    |         |
   | k9    | BIGINT       | Yes  | true  | NULL    |         |
   | k10   | BIGINT       | Yes  | true  | NULL    |         |
   | v1    | BIGINT       | Yes  | false | NULL    | REPLACE |
   | v2    | BIGINT       | Yes  | false | NULL    | REPLACE |
   | v3    | BIGINT       | Yes  | false | NULL    | REPLACE |
   | v4    | BIGINT       | Yes  | false | NULL    | REPLACE |
   | v5    | BIGINT       | Yes  | false | NULL    | REPLACE |
   | v6    | BIGINT       | Yes  | false | NULL    | REPLACE |
   | v7    | BIGINT       | Yes  | false | NULL    | REPLACE |
   | v8    | BIGINT       | Yes  | false | NULL    | REPLACE |
   | v9    | BIGINT       | Yes  | false | NULL    | REPLACE |
   | v10   | DECIMAL(7,2) | Yes  | false | NULL    | REPLACE |
   | v11   | DECIMAL(7,2) | Yes  | false | NULL    | REPLACE |
   | v12   | DECIMAL(7,2) | Yes  | false | NULL    | REPLACE |
   | v13   | DECIMAL(7,2) | Yes  | false | NULL    | REPLACE |
   | v14   | DECIMAL(7,2) | Yes  | false | NULL    | REPLACE |
   | v15   | DECIMAL(7,2) | Yes  | false | NULL    | REPLACE |
   | v16   | DECIMAL(7,2) | Yes  | false | NULL    | REPLACE |
   | v17   | DECIMAL(7,2) | Yes  | false | NULL    | REPLACE |
   | v18   | DECIMAL(7,2) | Yes  | false | NULL    | REPLACE |
   | v19   | DECIMAL(7,2) | Yes  | false | NULL    | REPLACE |
   | v20   | DECIMAL(7,2) | Yes  | false | NULL    | REPLACE |
   | v21   | DECIMAL(7,2) | Yes  | false | NULL    | REPLACE |
   | v22   | DECIMAL(7,2) | Yes  | false | NULL    | REPLACE |
   | v23   | DECIMAL(7,2) | Yes  | false | NULL    | REPLACE |
   | v24   | DECIMAL(7,2) | Yes  | false | NULL    | REPLACE |
   +-------+--------------+------+-------+---------+---------+
   ```
   
   This test mainly measures read performance, especially when there are many small unmerged versions, so the test query is `select count(*) from (select k1,k2,k3,k4,k5,k6,k7,k8,k9,k10 from table_name) a;`
   
   ### UNIQUE_KEY vs. DUPLICATE_KEY comparison
   
   First, this test compares the read performance of a UNIQUE_KEY table against a DUPLICATE_KEY table. Each table has two versions, the first of which is empty.
   
   | Table         | Data size | Partitions | Buckets | Rows       | Versions | Query time (s) |
   | ------------- | --------- | ---------- | ------- | ---------- | -------- | -------------- |
   | test_uniq     | 93.585 MB | 1          | 1       | 1,000,000  | 2        | 9              |
   | test_uniq_big | 1.327 GB  | 10         | 10      | 14,401,261 | 2        | 13.5           |
   | test_dup      | 93.724 MB | 1          | 1       | 1,000,000  | 2        | 4.8            |
   | test_dup_big  | 2.571 GB  | 10         | 10      | 28,802,522 | 2        | 7              |
   
   As the table shows, the duplicate tables read roughly **twice** as fast as the unique tables.
   
   Results after the optimization:
   
   | Table         | Data size | Partitions | Buckets | Rows       | Versions | Query time (s) |
   | ------------- | --------- | ---------- | ------- | ---------- | -------- | -------------- |
   | test_uniq     | 93.585 MB | 1          | 1       | 1,000,000  | 2        | 4.8            |
   | test_uniq_big | 1.327 GB  | 10         | 10      | 14,401,261 | 2        | 7.4            |
   | test_dup      | 93.724 MB | 1          | 1       | 1,000,000  | 2        | 4.7            |
   | test_dup_big  | 2.571 GB  | 10         | 10      | 28,802,522 | 2        | 7              |
   
   After the optimization, when there are few versions besides the base, the query performance gap is already small.
   
   ### UNIQUE_KEY multi-version read optimization comparison
   
   Because the data loaded in the later versions is random, the row counts of the non-full tables differ slightly. The test query is `select count(*) from table_name`.
   
   | Table                                       | Data size      | Partitions | Buckets | Rows         | Versions  | Query time (s) |
   | ------------------------------------------- | -------------- | ---------- | ------- | ------------ | --------- | -------------- |
   | test_uniq (before optimization)             | 136.288 MB     | 1          | 1       | 1008592      | 10000     | 11.7           |
   | **test_uniq (after optimization)**          | **136.266 MB** | **1**      | **1**   | **1008635**  | **10000** | **7.1**        |
   | test_uniq_big (before optimization)         | 1.368 GB       | 1          | 10      | 14401261     | 10000     | 15.5           |
   | **test_uniq_big (after optimization)**      | **1.368 GB**   | **1**      | **10**  | **14401261** | **10000** | **12**         |
   | test_uniq_base (before optimization)        | 94.252 MB      | 1          | 1       | 1000000      | 1         | 4              |
   | **test_uniq_base (after optimization)**     | **94.252 MB**  | **1**      | **1**   | **1000000**  | **1**     | **3.75**       |
   | test_uniq_big_base (before optimization)    | 1.327 GB       | 1          | 10      | 14401261     | 1         | 6              |
   | **test_uniq_big_base (after optimization)** | **1.327 GB**   | **1**      | **10**  | **14401261** | **1**     | **5.6**        |
   | test_uniq (before optimization)             | 96.348 MB      | 1          | 1       | 1008592      | 500       | 7.1            |
   | **test_uniq (after optimization)**          | **96.349 MB**  | **1**      | **1**   | **1008635**  | **500**   | **4.5**        |
   | test_uniq_big (before optimization)         | 1.329 GB       | 1          | 10      | 14401261     | 500       | 8.8            |
   | **test_uniq_big (after optimization)**      | **1.327 GB**   | **1**      | **10**  | **14401261** | **500**   | **6.7**        |
   ## Types of changes
   
   What types of changes does your code introduce to Doris?
   _Put an `x` in the boxes that apply_
   
   - [ ] Bugfix (non-breaking change which fixes an issue)
   - [x] New feature (non-breaking change which adds functionality)
   - [ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)
   - [ ] Documentation Update (if none of the other choices apply)
   - [ ] Code refactor (Modify the code structure, format the code, etc...)
   
   ## Checklist
   
   _Put an `x` in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your code._
   
   - [x] I have created an issue on (Fix #4957 ), and have described the bug/feature there in detail
   - [ ] Compiling and unit tests pass locally with my changes
   - [x] I have added tests that prove my fix is effective or that my feature works
   - [ ] If this change needs a document change, I have updated the document
   - [ ] Any dependent changes have been merged
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at dev@doris.apache.org by explaining why you chose the solution you did and what alternatives you considered, etc...
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] yangzhg merged pull request #4958: Optimized the read performance of the table when have multi versions

Posted by GitBox <gi...@apache.org>.
yangzhg merged pull request #4958:
URL: https://github.com/apache/incubator-doris/pull/4958


   




[GitHub] [incubator-doris] morningman commented on a change in pull request #4958: Optimized the read performance of the table when have multi versions,

Posted by GitBox <gi...@apache.org>.
morningman commented on a change in pull request #4958:
URL: https://github.com/apache/incubator-doris/pull/4958#discussion_r531829637



##########
File path: be/src/olap/collect_iterator.cpp
##########
@@ -0,0 +1,328 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "olap/collect_iterator.h"
+
+#include "olap/reader.h"
+#include "olap/row.h"
+#include "olap/row_block.h"
+#include "olap/row_cursor.h"
+
+namespace doris {
+
+CollectIterator::~CollectIterator() {
+    for (auto child : _children) {
+        if (child != nullptr) {
+            delete child;
+            child = nullptr;
+        }
+    }
+}
+
+void CollectIterator::init(Reader* reader) {
+    _reader = reader;
+    // when aggregate is enabled or key_type is DUP_KEYS, we don't merge
+    // multiple data to aggregate for performance in user fetch
+    if (_reader->_reader_type == READER_QUERY &&
+        (_reader->_aggregation || _reader->_tablet->keys_type() == KeysType::DUP_KEYS)) {
+        _merge = false;
+    }
+}
+
+OLAPStatus CollectIterator::add_child(RowsetReaderSharedPtr rs_reader) {
+    std::unique_ptr<LevelIterator> child(new Level0Iterator(rs_reader, _reader));
+    RETURN_NOT_OK(child->init());
+    if (child->current_row() == nullptr) {
+        return OLAP_SUCCESS;
+    }
+
+    LevelIterator* child_ptr = child.release();
+    _children.push_back(child_ptr);
+    _rs_readers.push_back(rs_reader);
+    return OLAP_SUCCESS;
+}
+
+// Build a merge heap. If _merge is true, a rowset with the max rownum
+// status will be used as the base rowset, and the other rowsets will be merged first and
+// then merged with the base rowset.
+void CollectIterator::build_heap() {
+    DCHECK(_rs_readers.size() == _children.size());
+    _reverse = _reader->_tablet->tablet_schema().keys_type() == KeysType::UNIQUE_KEYS;
+    if (_children.empty()) {
+        _inner_iter.reset(nullptr);
+        return;
+    } else if (_merge) {
+        DCHECK(!_rs_readers.empty());
+        // find base rowset(max rownum),
+        RowsetReaderSharedPtr base_reader = _rs_readers[0];
+        int base_reader_idx = 0;
+        for (size_t i = 1; i < _rs_readers.size(); ++i) {

Review comment:
       We can first check the size of `_children`, then decide whether to find the base rowset.
   
   Also, `cumu_iter` is leaked here.
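
   The leak the reviewer points out is the classic raw-pointer-on-error-path pattern; a minimal standalone sketch (hypothetical types, not Doris's actual classes) of avoiding it by holding the intermediate iterator in a `std::unique_ptr` until ownership is safely transferred:

   ```cpp
   #include <cassert>
   #include <memory>

   // Hypothetical stand-ins for the iterator types in this PR.
   struct LevelIterator {
       virtual ~LevelIterator() = default;
       virtual bool init() { return true; }
   };

   static int live_iters = 0;  // counts constructed-but-not-destroyed iterators

   struct CountingIterator : LevelIterator {
       CountingIterator() { ++live_iters; }
       ~CountingIterator() override { --live_iters; }
       bool init() override { return false; }  // simulate an init failure
   };

   // Returns the cumulative-merge iterator, or nullptr on failure. Because
   // cumu_iter is a unique_ptr, the early-return failure path frees it
   // automatically instead of leaking a raw pointer.
   static std::unique_ptr<LevelIterator> build_cumu_iter() {
       auto cumu_iter = std::make_unique<CountingIterator>();
       if (!cumu_iter->init()) {
           return nullptr;  // cumu_iter destroyed here; no leak
       }
       return cumu_iter;  // ownership moves to the caller on success
   }

   int main() {
       assert(build_cumu_iter() == nullptr);
       assert(live_iters == 0);  // nothing leaked on the failure path
       return 0;
   }
   ```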



