You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@doris.apache.org by GitBox <gi...@apache.org> on 2023/01/09 03:11:14 UTC

[GitHub] [doris] Cai-Yao opened a new pull request, #15718: [fix](schema scanner)change schema_scanner::get_next_row to get_next_block

Cai-Yao opened a new pull request, #15718:
URL: https://github.com/apache/doris/pull/15718

   # Proposed changes
   
   Issue Number: close #xxx
   
   ## Problem summary
   
   delete schema scanner::get_next_row,change to get_to_block
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: 
       - [ ] Yes
       - [ ] No
       - [ ] I don't know
   2. Has unit tests been added:
       - [ ] Yes
       - [ ] No
       - [ ] No Need
   3. Has document been added or modified:
       - [ ] Yes
       - [ ] No
       - [ ] No Need
   4. Does it need to update dependencies:
       - [ ] Yes
       - [ ] No
   5. Are there any changes that cannot be rolled back:
       - [ ] Yes (If Yes, please explain WHY)
       - [ ] No
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at [dev@doris.apache.org](mailto:dev@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [doris] hello-stephen commented on pull request #15718: [fix](schema scanner)change schema_scanner::get_next_row to get_next_block

Posted by GitBox <gi...@apache.org>.
hello-stephen commented on PR #15718:
URL: https://github.com/apache/doris/pull/15718#issuecomment-1375115572

   TeamCity pipeline, clickbench performance test result:
    the sum of best hot time: 35.48 seconds
    load time: 633 seconds
    storage size: 17123553794 Bytes
    https://doris-community-test-1308700295.cos.ap-hongkong.myqcloud.com/tmp/20230109051053_clickbench_pr_75856.html


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [doris] yiguolei merged pull request #15718: [fix](schema scanner)change schema_scanner::get_next_row to get_next_block

Posted by "yiguolei (via GitHub)" <gi...@apache.org>.
yiguolei merged PR #15718:
URL: https://github.com/apache/doris/pull/15718


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [doris] yiguolei commented on a diff in pull request #15718: [fix](schema scanner)change schema_scanner::get_next_row to get_next_block

Posted by GitBox <gi...@apache.org>.
yiguolei commented on code in PR #15718:
URL: https://github.com/apache/doris/pull/15718#discussion_r1067785866


##########
be/src/exec/schema_scanner/schema_rowsets_scanner.cpp:
##########
@@ -97,80 +82,127 @@ Status SchemaRowsetsScanner::get_all_rowsets() {
     return Status::OK();
 }
 
-Status SchemaRowsetsScanner::fill_one_row(Tuple* tuple, MemPool* pool) {
-    // set all bit to not null
-    memset((void*)tuple, 0, _tuple_desc->num_null_bytes());
-    RowsetSharedPtr rowset = rowsets_[rowsets_idx_];
+Status SchemaRowsetsScanner::get_next_block(vectorized::Block* block, bool* eos) {
+    if (!_is_init) {
+        return Status::InternalError("Used before initialized.");
+    }
+    if (nullptr == block || nullptr == eos) {
+        return Status::InternalError("input pointer is nullptr.");
+    }
+
+    *eos = true;

Review Comment:
   rowsets 这个表不一定可以一个block 一下子装进去的,可能有10000000个rowset



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [doris] github-actions[bot] commented on pull request #15718: [fix](schema scanner)change schema_scanner::get_next_row to get_next_block

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on PR #15718:
URL: https://github.com/apache/doris/pull/15718#issuecomment-1380226343

   clang-tidy review says "All clean, LGTM! :+1:"


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [doris] yiguolei commented on a diff in pull request #15718: [fix](schema scanner)change schema_scanner::get_next_row to get_next_block

Posted by GitBox <gi...@apache.org>.
yiguolei commented on code in PR #15718:
URL: https://github.com/apache/doris/pull/15718#discussion_r1064269057


##########
be/src/vec/exec/vschema_scan_node.cpp:
##########
@@ -258,73 +262,112 @@ Status VSchemaScanNode::get_next(RuntimeState* state, vectorized::Block* block,
     std::vector<vectorized::MutableColumnPtr> columns(_slot_num);
     bool schema_eos = false;
 
-    do {
-        bool mem_reuse = block->mem_reuse();
-        DCHECK(block->rows() == 0);
+    if (_schema_scanner->type() == TSchemaTableType::SCH_TABLES) {
+        do {
+            bool mem_reuse = block->mem_reuse();

Review Comment:
   do not use mem reuse here, just call block->clear() to clear all columns in block and then reinsert columns.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [doris] yiguolei commented on a diff in pull request #15718: [fix](schema scanner)change schema_scanner::get_next_row to get_next_block

Posted by GitBox <gi...@apache.org>.
yiguolei commented on code in PR #15718:
URL: https://github.com/apache/doris/pull/15718#discussion_r1064432227


##########
be/src/exec/schema_scanner/schema_tables_scanner.cpp:
##########
@@ -300,295 +132,172 @@ Status SchemaTablesScanner::get_next_row(Tuple* tuple, MemPool* pool, bool* eos)
     if (nullptr == tuple || nullptr == pool || nullptr == eos) {
         return Status::InternalError("input pointer is nullptr.");
     }
-    while (_table_index >= _table_result.tables.size()) {
-        if (_db_index < _db_result.dbs.size()) {
-            RETURN_IF_ERROR(get_new_table());
-        } else {
-            *eos = true;
-            return Status::OK();
-        }
-    }
-    *eos = false;
     return fill_one_row(tuple, pool);
 }
 
-Status SchemaTablesScanner::fill_dest_column(vectorized::Block* block, void* data,
-                                             const SlotDescriptor* slot_desc) {
-    if (!block->has(slot_desc->col_name())) {
-        return Status::OK();
-    }
-    vectorized::IColumn* col_ptr = const_cast<vectorized::IColumn*>(
-            block->get_by_name(slot_desc->col_name()).column.get());
-
-    if (data == nullptr) {
-        auto* nullable_column = reinterpret_cast<vectorized::ColumnNullable*>(col_ptr);
-        nullable_column->get_null_map_data().push_back(1);
-        nullable_column->insert_data(nullptr, 0);
-        return Status::OK();
-    }
-    if (slot_desc->is_nullable()) {
-        auto* nullable_column = reinterpret_cast<vectorized::ColumnNullable*>(col_ptr);
-        nullable_column->get_null_map_data().push_back(0);
-        // if (data == nullptr) {
-        //     nullable_column->insert_data(nullptr, 0);
-        //     return Status::OK();
-        // }
-        col_ptr = &nullable_column->get_nested_column();
-    }
-    switch (slot_desc->type().type) {
-    case TYPE_HLL: {
-        HyperLogLog* hll_slot = reinterpret_cast<HyperLogLog*>(data);
-        reinterpret_cast<vectorized::ColumnHLL*>(col_ptr)->get_data().emplace_back(*hll_slot);
-        break;
-    }
-    case TYPE_VARCHAR:
-    case TYPE_CHAR:
-    case TYPE_STRING: {
-        StringValue* str_slot = reinterpret_cast<StringValue*>(data);
-        reinterpret_cast<vectorized::ColumnString*>(col_ptr)->insert_data(str_slot->ptr,
-                                                                          str_slot->len);
-        break;
-    }
-
-    case TYPE_BOOLEAN: {
-        uint8_t num = *reinterpret_cast<bool*>(data);
-        reinterpret_cast<vectorized::ColumnVector<vectorized::UInt8>*>(col_ptr)->insert_value(num);
-        break;
-    }
-
-    case TYPE_TINYINT: {
-        int8_t num = *reinterpret_cast<int8_t*>(data);
-        reinterpret_cast<vectorized::ColumnVector<vectorized::Int8>*>(col_ptr)->insert_value(num);
-        break;
-    }
-
-    case TYPE_SMALLINT: {
-        int16_t num = *reinterpret_cast<int16_t*>(data);
-        reinterpret_cast<vectorized::ColumnVector<vectorized::Int16>*>(col_ptr)->insert_value(num);
-        break;
-    }
-
-    case TYPE_INT: {
-        int32_t num = *reinterpret_cast<int32_t*>(data);
-        reinterpret_cast<vectorized::ColumnVector<vectorized::Int32>*>(col_ptr)->insert_value(num);
-        break;
-    }
-
-    case TYPE_BIGINT: {
-        int64_t num = *reinterpret_cast<int64_t*>(data);
-        reinterpret_cast<vectorized::ColumnVector<vectorized::Int64>*>(col_ptr)->insert_value(num);
-        break;
-    }
-
-    case TYPE_LARGEINT: {
-        __int128 num;
-        memcpy(&num, data, sizeof(__int128));
-        reinterpret_cast<vectorized::ColumnVector<vectorized::Int128>*>(col_ptr)->insert_value(num);
-        break;
-    }
-
-    case TYPE_FLOAT: {
-        float num = *reinterpret_cast<float*>(data);
-        reinterpret_cast<vectorized::ColumnVector<vectorized::Float32>*>(col_ptr)->insert_value(
-                num);
-        break;
-    }
-
-    case TYPE_DOUBLE: {
-        double num = *reinterpret_cast<double*>(data);
-        reinterpret_cast<vectorized::ColumnVector<vectorized::Float64>*>(col_ptr)->insert_value(
-                num);
-        break;
-    }
-
-    case TYPE_DATE: {
-        vectorized::VecDateTimeValue value;
-        DateTimeValue* ts_slot = reinterpret_cast<DateTimeValue*>(data);
-        value.convert_dt_to_vec_dt(ts_slot);
-        reinterpret_cast<vectorized::ColumnVector<vectorized::Int64>*>(col_ptr)->insert_data(
-                reinterpret_cast<char*>(&value), 0);
-        break;
-    }
-
-    case TYPE_DATEV2: {
-        uint32_t num = *reinterpret_cast<uint32_t*>(data);
-        reinterpret_cast<vectorized::ColumnVector<vectorized::UInt32>*>(col_ptr)->insert_value(num);
-        break;
-    }
-
-    case TYPE_DATETIME: {
-        vectorized::VecDateTimeValue value;
-        DateTimeValue* ts_slot = reinterpret_cast<DateTimeValue*>(data);
-        value.convert_dt_to_vec_dt(ts_slot);
-        reinterpret_cast<vectorized::ColumnVector<vectorized::Int64>*>(col_ptr)->insert_data(
-                reinterpret_cast<char*>(&value), 0);
-        break;
-    }
-
-    case TYPE_DATETIMEV2: {
-        uint32_t num = *reinterpret_cast<uint64_t*>(data);
-        reinterpret_cast<vectorized::ColumnVector<vectorized::UInt64>*>(col_ptr)->insert_value(num);
-        break;
-    }
-
-    case TYPE_DECIMALV2: {
-        const vectorized::Int128 num = (reinterpret_cast<PackedInt128*>(data))->value;
-        reinterpret_cast<vectorized::ColumnDecimal128*>(col_ptr)->insert_data(
-                reinterpret_cast<const char*>(&num), 0);
-        break;
-    }
-    case TYPE_DECIMAL128I: {
-        const vectorized::Int128 num = (reinterpret_cast<PackedInt128*>(data))->value;
-        reinterpret_cast<vectorized::ColumnDecimal128I*>(col_ptr)->insert_data(
-                reinterpret_cast<const char*>(&num), 0);
-        break;
-    }
-
-    case TYPE_DECIMAL32: {
-        const int32_t num = *reinterpret_cast<int32_t*>(data);
-        reinterpret_cast<vectorized::ColumnDecimal32*>(col_ptr)->insert_data(
-                reinterpret_cast<const char*>(&num), 0);
-        break;
-    }
-
-    case TYPE_DECIMAL64: {
-        const int64_t num = *reinterpret_cast<int64_t*>(data);
-        reinterpret_cast<vectorized::ColumnDecimal64*>(col_ptr)->insert_data(
-                reinterpret_cast<const char*>(&num), 0);
-        break;
-    }
-
-    default: {
-        DCHECK(false) << "bad slot type: " << slot_desc->type();
-        std::stringstream ss;
-        ss << "Fail to convert schema type:'" << slot_desc->type() << " on column:`"
-           << slot_desc->col_name() + "`";
-        return Status::InternalError(ss.str());
-    }
-    }
-
-    return Status::OK();
-}
-
-Status SchemaTablesScanner::fill_one_row(vectorized::Block* block) {
-    const TTableStatus& tbl_status = _table_result.tables[_table_index];
+Status SchemaTablesScanner::fill_block_imp(vectorized::Block* block) {

Review Comment:
   fill_block_impl



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [doris] github-actions[bot] commented on a diff in pull request #15718: [fix](schema scanner)change schema_scanner::get_next_row to get_next_block

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on code in PR #15718:
URL: https://github.com/apache/doris/pull/15718#discussion_r1064259517


##########
be/src/exec/schema_scanner/schema_tables_scanner.h:
##########
@@ -29,10 +31,13 @@ class SchemaTablesScanner : public SchemaScanner {
 
     virtual Status start(RuntimeState* state);
     virtual Status get_next_row(Tuple* tuple, MemPool* pool, bool* eos);
+    virtual Status get_next_block(vectorized::Block* block, bool* eos);

Review Comment:
   warning: prefer using 'override' or (rarely) 'final' instead of 'virtual' [modernize-use-override]
   
   ```suggestion
       Status get_next_block(vectorized::Block* block, bool* eos) override;
   ```
   



##########
be/src/vec/exec/vschema_scan_node.cpp:
##########
@@ -258,73 +262,112 @@ Status VSchemaScanNode::get_next(RuntimeState* state, vectorized::Block* block,
     std::vector<vectorized::MutableColumnPtr> columns(_slot_num);
     bool schema_eos = false;
 
-    do {
-        bool mem_reuse = block->mem_reuse();
-        DCHECK(block->rows() == 0);
+    if (_schema_scanner->type() == TSchemaTableType::SCH_TABLES) {
+        do {
+            bool mem_reuse = block->mem_reuse();
+            DCHECK(block->rows() == 0);
 
-        columns.resize(_slot_num);
-        for (int i = 0; i < _slot_num; ++i) {
-            if (mem_reuse) {
-                columns[i] = std::move(*block->get_by_position(i).column).mutate();
-            } else {
-                columns[i] = _dest_tuple_desc->slots()[i]->get_empty_mutable_column();
+            if (!mem_reuse) {
+                for (int i = 0; i < _slot_num; ++i) {
+                    int j = _index_map[i];
+                    const auto slot_desc = _src_tuple_desc->slots()[j];
+                    block->insert(ColumnWithTypeAndName(slot_desc->get_empty_mutable_column(),
+                                                        slot_desc->get_data_type_ptr(),
+                                                        slot_desc->col_name()));
+                }
             }
-        }
-        while (true) {
-            RETURN_IF_CANCELLED(state);
-
-            // get all slots from schema table.
-            RETURN_IF_ERROR(_schema_scanner->get_next_row(_src_single_tuple, _tuple_pool.get(),
-                                                          &schema_eos));
-            if (schema_eos) {
-                *eos = true;
-                break;
+
+            while (true) {
+                RETURN_IF_CANCELLED(state);
+
+                // get all slots from schema table.
+                RETURN_IF_ERROR(_schema_scanner->get_next_block(block, &schema_eos));
+
+                if (schema_eos) {
+                    *eos = true;
+                    break;
+                }
+
+                if (block->rows() == state->batch_size()) {
+                    break;
+                }
             }
-            // tuple project
-            project_tuple();
 
+            if (block->rows()) {
+                RETURN_IF_ERROR(VExprContext::filter_block(_vconjunct_ctx_ptr, block,
+                                                           _dest_tuple_desc->slots().size()));
+                VLOG_ROW << "VSchemaScanNode output rows: " << block->rows();
+            }
+        } while (block->rows() == 0 && !(*eos));
+    } else {
+        do {
+            bool mem_reuse = block->mem_reuse();
+            DCHECK(block->rows() == 0);
+
+            columns.resize(_slot_num);
             for (int i = 0; i < _slot_num; ++i) {
-                auto slot_desc = _dest_tuple_desc->slots()[i];
-                if (!slot_desc->is_materialized()) {
-                    continue;
+                if (mem_reuse) {
+                    columns[i] = std::move(*block->get_by_position(i).column).mutate();

Review Comment:
   warning: std::move of the const expression has no effect; remove std::move() [performance-move-const-arg]
   
   ```suggestion
                       columns[i] = *block->get_by_position(i).column.mutate();
   ```
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [doris] github-actions[bot] commented on a diff in pull request #15718: [fix](schema scanner)change schema_scanner::get_next_row to get_next_block

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on code in PR #15718:
URL: https://github.com/apache/doris/pull/15718#discussion_r1066619623


##########
be/src/exec/schema_scanner/schema_files_scanner.h:
##########
@@ -28,7 +28,7 @@ class SchemaFilesScanner : public SchemaScanner {
     virtual ~SchemaFilesScanner();
 
     virtual Status start(RuntimeState* state);

Review Comment:
   warning: 'start' overrides a member function but is not marked 'override' [clang-diagnostic-inconsistent-missing-override]
   ```cpp
       virtual Status start(RuntimeState* state);
                      ^
   ```
   **be/src/exec/schema_scanner.h:84:** overridden virtual function is here
   ```cpp
       virtual Status start(RuntimeState* state);
                      ^
   ```
   



##########
be/src/exec/schema_scanner/schema_columns_scanner.h:
##########
@@ -29,19 +29,18 @@ class SchemaColumnsScanner : public SchemaScanner {
     SchemaColumnsScanner();
     virtual ~SchemaColumnsScanner();
     virtual Status start(RuntimeState* state);

Review Comment:
   warning: 'start' overrides a member function but is not marked 'override' [clang-diagnostic-inconsistent-missing-override]
   ```cpp
       virtual Status start(RuntimeState* state);
                      ^
   ```
   **be/src/exec/schema_scanner.h:84:** overridden virtual function is here
   ```cpp
       virtual Status start(RuntimeState* state);
                      ^
   ```
   



##########
be/src/exec/schema_scanner/schema_partitions_scanner.h:
##########
@@ -28,7 +28,7 @@ class SchemaPartitionsScanner : public SchemaScanner {
     virtual ~SchemaPartitionsScanner();
 
     virtual Status start(RuntimeState* state);

Review Comment:
   warning: 'start' overrides a member function but is not marked 'override' [clang-diagnostic-inconsistent-missing-override]
   ```cpp
       virtual Status start(RuntimeState* state);
                      ^
   ```
   **be/src/exec/schema_scanner.h:84:** overridden virtual function is here
   ```cpp
       virtual Status start(RuntimeState* state);
                      ^
   ```
   



##########
be/src/exec/schema_scanner/schema_dummy_scanner.h:
##########
@@ -26,7 +26,7 @@ class SchemaDummyScanner : public SchemaScanner {
     SchemaDummyScanner();
     virtual ~SchemaDummyScanner();
     virtual Status start(RuntimeState* state = nullptr);

Review Comment:
   warning: 'start' overrides a member function but is not marked 'override' [clang-diagnostic-inconsistent-missing-override]
   ```cpp
       virtual Status start(RuntimeState* state = nullptr);
                      ^
   ```
   **be/src/exec/schema_scanner.h:84:** overridden virtual function is here
   ```cpp
       virtual Status start(RuntimeState* state);
                      ^
   ```
   



##########
be/src/exec/schema_scanner/schema_table_privileges_scanner.h:
##########
@@ -28,14 +28,12 @@ class SchemaTablePrivilegesScanner : public SchemaScanner {
     virtual ~SchemaTablePrivilegesScanner();
 
     virtual Status start(RuntimeState* state);

Review Comment:
   warning: 'start' overrides a member function but is not marked 'override' [clang-diagnostic-inconsistent-missing-override]
   ```cpp
       virtual Status start(RuntimeState* state);
                      ^
   ```
   **be/src/exec/schema_scanner.h:84:** overridden virtual function is here
   ```cpp
       virtual Status start(RuntimeState* state);
                      ^
   ```
   



##########
be/src/exec/schema_scanner/schema_user_privileges_scanner.h:
##########
@@ -28,14 +28,12 @@ class SchemaUserPrivilegesScanner : public SchemaScanner {
     virtual ~SchemaUserPrivilegesScanner();
 
     virtual Status start(RuntimeState* state);

Review Comment:
   warning: 'start' overrides a member function but is not marked 'override' [clang-diagnostic-inconsistent-missing-override]
   ```cpp
       virtual Status start(RuntimeState* state);
                      ^
   ```
   **be/src/exec/schema_scanner.h:84:** overridden virtual function is here
   ```cpp
       virtual Status start(RuntimeState* state);
                      ^
   ```
   



##########
be/src/exec/schema_scanner/schema_views_scanner.h:
##########
@@ -28,14 +28,13 @@ class SchemaViewsScanner : public SchemaScanner {
     virtual ~SchemaViewsScanner();
 
     virtual Status start(RuntimeState* state);

Review Comment:
   warning: 'start' overrides a member function but is not marked 'override' [clang-diagnostic-inconsistent-missing-override]
   ```cpp
       virtual Status start(RuntimeState* state);
                      ^
   ```
   **be/src/exec/schema_scanner.h:84:** overridden virtual function is here
   ```cpp
       virtual Status start(RuntimeState* state);
                      ^
   ```
   



##########
be/src/exec/schema_scanner/schema_schemata_scanner.h:
##########
@@ -28,12 +28,11 @@ class SchemaSchemataScanner : public SchemaScanner {
     virtual ~SchemaSchemataScanner();
 
     virtual Status start(RuntimeState* state);

Review Comment:
   warning: 'start' overrides a member function but is not marked 'override' [clang-diagnostic-inconsistent-missing-override]
   ```cpp
       virtual Status start(RuntimeState* state);
                      ^
   ```
   **be/src/exec/schema_scanner.h:84:** overridden virtual function is here
   ```cpp
       virtual Status start(RuntimeState* state);
                      ^
   ```
   



##########
be/src/exec/schema_scanner/schema_variables_scanner.h:
##########
@@ -31,21 +31,20 @@ class SchemaVariablesScanner : public SchemaScanner {
     virtual ~SchemaVariablesScanner();
 
     virtual Status start(RuntimeState* state);

Review Comment:
   warning: 'start' overrides a member function but is not marked 'override' [clang-diagnostic-inconsistent-missing-override]
   ```cpp
       virtual Status start(RuntimeState* state);
                      ^
   ```
   **be/src/exec/schema_scanner.h:84:** overridden virtual function is here
   ```cpp
       virtual Status start(RuntimeState* state);
                      ^
   ```
   



##########
be/src/exec/schema_scanner/schema_schema_privileges_scanner.h:
##########
@@ -28,14 +28,12 @@ class SchemaSchemaPrivilegesScanner : public SchemaScanner {
     virtual ~SchemaSchemaPrivilegesScanner();
 
     virtual Status start(RuntimeState* state);

Review Comment:
   warning: 'start' overrides a member function but is not marked 'override' [clang-diagnostic-inconsistent-missing-override]
   ```cpp
       virtual Status start(RuntimeState* state);
                      ^
   ```
   **be/src/exec/schema_scanner.h:84:** overridden virtual function is here
   ```cpp
       virtual Status start(RuntimeState* state);
                      ^
   ```
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [doris] yiguolei commented on a diff in pull request #15718: [fix](schema scanner)change schema_scanner::get_next_row to get_next_block

Posted by GitBox <gi...@apache.org>.
yiguolei commented on code in PR #15718:
URL: https://github.com/apache/doris/pull/15718#discussion_r1070201936


##########
be/src/exec/schema_scanner.cpp:
##########
@@ -231,4 +232,161 @@ Status SchemaScanner::create_tuple_desc(ObjectPool* pool) {
     return Status::OK();
 }
 
+Status SchemaScanner::fill_dest_column(vectorized::Block* block, void* data,
+                                       const SlotDescriptor* slot_desc) {
+    if (!block->has(slot_desc->col_name())) {
+        return Status::OK();
+    }
+    vectorized::IColumn* col_ptr = const_cast<vectorized::IColumn*>(

Review Comment:
   这里你不要直接使用裸指针, 尽量使用 ColumnPtr 这种类似智能指针的结构



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [doris] github-actions[bot] commented on a diff in pull request #15718: [fix](schema scanner)change schema_scanner::get_next_row to get_next_block

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on code in PR #15718:
URL: https://github.com/apache/doris/pull/15718#discussion_r1070919026


##########
be/src/exec/schema_scanner.cpp:
##########
@@ -231,4 +233,162 @@ Status SchemaScanner::create_tuple_desc(ObjectPool* pool) {
     return Status::OK();
 }
 
+Status SchemaScanner::fill_dest_column(vectorized::Block* block, void* data,
+                                       const SlotDescriptor* slot_desc) {
+    if (!block->has(slot_desc->col_name())) {
+        return Status::OK();
+    }
+    vectorized::MutableColumnPtr column_ptr =
+            std::move(*block->get_by_name(slot_desc->col_name()).column).mutate();

Review Comment:
   warning: std::move of the const expression has no effect; remove std::move() [performance-move-const-arg]
   
   ```suggestion
               *block->get_by_name(slot_desc->col_name()).column.mutate();
   ```
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [doris] github-actions[bot] commented on a diff in pull request #15718: [fix](schema scanner)change schema_scanner::get_next_row to get_next_block

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on code in PR #15718:
URL: https://github.com/apache/doris/pull/15718#discussion_r1071188209


##########
be/src/exec/schema_scanner.cpp:
##########
@@ -231,4 +233,162 @@ Status SchemaScanner::create_tuple_desc(ObjectPool* pool) {
     return Status::OK();
 }
 
+Status SchemaScanner::fill_dest_column(vectorized::Block* block, void* data,
+                                       const SlotDescriptor* slot_desc) {
+    if (!block->has(slot_desc->col_name())) {
+        return Status::OK();
+    }
+    vectorized::MutableColumnPtr column_ptr =
+            std::move(*block->get_by_name(slot_desc->col_name()).column).assume_mutable();

Review Comment:
   warning: std::move of the const expression has no effect; remove std::move() [performance-move-const-arg]
   
   ```suggestion
               *block->get_by_name(slot_desc->col_name()).column.assume_mutable();
   ```
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [doris] github-actions[bot] commented on a diff in pull request #15718: [fix](schema scanner)change schema_scanner::get_next_row to get_next_block

Posted by github-actions.
github-actions[bot] commented on code in PR #15718:
URL: https://github.com/apache/doris/pull/15718#discussion_r1089869561


##########
be/src/exec/schema_scanner.cpp:
##########
@@ -231,4 +233,156 @@ Status SchemaScanner::create_tuple_desc(ObjectPool* pool) {
     return Status::OK();
 }
 
+Status SchemaScanner::fill_dest_column(vectorized::Block* block, void* data,
+                                       const ColumnDesc& col_desc) {
+    if (!block->has(col_desc.name)) {
+        return Status::OK();
+    }
+    vectorized::MutableColumnPtr column_ptr =
+            std::move(*block->get_by_name(col_desc.name).column).assume_mutable();

Review Comment:
   warning: std::move of the const expression has no effect; remove std::move() [performance-move-const-arg]
   
   ```suggestion
               *block->get_by_name(col_desc.name).column.assume_mutable();
   ```
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [doris] github-actions[bot] commented on a diff in pull request #15718: [fix](schema scanner)change schema_scanner::get_next_row to get_next_block

Posted by github-actions.
github-actions[bot] commented on code in PR #15718:
URL: https://github.com/apache/doris/pull/15718#discussion_r1089757755


##########
be/src/vec/exec/vschema_scan_node.cpp:
##########
@@ -255,74 +257,61 @@ Status VSchemaScanNode::get_next(RuntimeState* state, vectorized::Block* block,
         return Status::InternalError("used before initialize.");
     }
     RETURN_IF_CANCELLED(state);
-    std::vector<vectorized::MutableColumnPtr> columns(_slot_num);
     bool schema_eos = false;
 
+    block->clear();
+
+    std::vector<int> index_map_inv(_src_tuple_desc->slots().size());
+    for (int i = 0; i < _slot_num; ++i) {
+        auto dest_slot_desc = _dest_tuple_desc->slots()[i];
+        block->insert(ColumnWithTypeAndName(dest_slot_desc->get_empty_mutable_column(),
+                                            dest_slot_desc->get_data_type_ptr(),
+                                            dest_slot_desc->col_name()));
+
+        // Map from index in column of schema table to slots.
+        index_map_inv[_index_map[i]] = i;
+    }
+
     do {
-        bool mem_reuse = block->mem_reuse();
-        DCHECK(block->rows() == 0);
-
-        columns.resize(_slot_num);
-        for (int i = 0; i < _slot_num; ++i) {
-            if (mem_reuse) {
-                columns[i] = std::move(*block->get_by_position(i).column).mutate();
-            } else {
-                columns[i] = _dest_tuple_desc->slots()[i]->get_empty_mutable_column();
-            }
+        vectorized::Block src_block;
+
+        for (int i = 0; i < _src_tuple_desc->slots().size(); ++i) {
+            int j = index_map_inv[i];
+            auto slot_desc = _dest_tuple_desc->slots()[j];
+            src_block.insert(ColumnWithTypeAndName(slot_desc->get_empty_mutable_column(),
+                                                   slot_desc->get_data_type_ptr(),
+                                                   slot_desc->col_name()));
         }
         while (true) {
             RETURN_IF_CANCELLED(state);
 
             // get all slots from schema table.
-            RETURN_IF_ERROR(_schema_scanner->get_next_row(_src_single_tuple, _tuple_pool.get(),
-                                                          &schema_eos));
+            RETURN_IF_ERROR(_schema_scanner->get_next_block(&src_block, &schema_eos));
+
             if (schema_eos) {
                 *eos = true;
                 break;
             }
-            // tuple project
-            project_tuple();
 
-            for (int i = 0; i < _slot_num; ++i) {
-                auto slot_desc = _dest_tuple_desc->slots()[i];
-                if (!slot_desc->is_materialized()) {
-                    continue;
-                }
-
-                if (_dest_single_tuple->is_null(slot_desc->null_indicator_offset())) {
-                    if (slot_desc->is_nullable()) {
-                        auto* nullable_column =
-                                reinterpret_cast<vectorized::ColumnNullable*>(columns[i].get());
-                        nullable_column->insert_data(nullptr, 0);
-                    } else {
-                        return Status::InternalError(
-                                "nonnull column contains NULL. table={}, column={}", _table_name,
-                                slot_desc->col_name());
-                    }
-                } else {
-                    RETURN_IF_ERROR(write_slot_to_vectorized_column(
-                            _dest_single_tuple->get_slot(slot_desc->tuple_offset()), slot_desc,
-                            &columns[i]));
-                }
-            }
-            if (columns[0]->size() == state->batch_size()) {
+            if (src_block.rows() >= state->batch_size()) {
                 break;
             }
         }
-        if (!columns.empty() && !columns[0]->empty()) {
-            auto n_columns = 0;
-            if (!mem_reuse) {
-                for (const auto slot_desc : _dest_tuple_desc->slots()) {
-                    block->insert(ColumnWithTypeAndName(std::move(columns[n_columns++]),
-                                                        slot_desc->get_data_type_ptr(),
-                                                        slot_desc->col_name()));
-                }
-            } else {
-                columns.clear();
+
+        if (src_block.rows()) {
+            // block->check_number_of_rows();
+            for (int i = 0; i < _slot_num; ++i) {
+                auto dest_slot_desc = _dest_tuple_desc->slots()[i];
+                vectorized::MutableColumnPtr column_ptr =
+                        std::move(*block->get_by_position(i).column).mutate();

Review Comment:
   warning: std::move of the const expression has no effect; remove std::move() [performance-move-const-arg]
   
   ```suggestion
                           *block->get_by_position(i).column.mutate();
   ```
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [doris] github-actions[bot] commented on a diff in pull request #15718: [fix](schema scanner)change schema_scanner::get_next_row to get_next_block

Posted by github-actions.
github-actions[bot] commented on code in PR #15718:
URL: https://github.com/apache/doris/pull/15718#discussion_r1089915305


##########
be/src/exec/schema_scanner.cpp:
##########
@@ -33,34 +33,24 @@
 #include "exec/schema_scanner/schema_variables_scanner.h"
 #include "exec/schema_scanner/schema_views_scanner.h"
 #include "runtime/define_primitive_type.h"
+#include "vec/columns/column.h"
 #include "vec/common/string_ref.h"
+#include "vec/core/block.h"
 
 namespace doris {
 
 DorisServer* SchemaScanner::_s_doris_server;
 
-SchemaScanner::SchemaScanner(ColumnDesc* columns, int column_num)
+SchemaScanner::SchemaScanner(const std::vector<ColumnDesc>& columns)
         : _is_init(false),
           _param(nullptr),
           _columns(columns),
-          _column_num(column_num),
-          _tuple_desc(nullptr),
           _schema_table_type(TSchemaTableType::SCH_INVALID) {}
 
-SchemaScanner::SchemaScanner(ColumnDesc* columns, int column_num, TSchemaTableType::type type)
-        : _is_init(false),
-          _param(nullptr),
-          _columns(columns),
-          _column_num(column_num),
-          _tuple_desc(nullptr),
-          _schema_table_type(type) {}
+SchemaScanner::SchemaScanner(const std::vector<ColumnDesc>& columns, TSchemaTableType::type type)
+        : _is_init(false), _param(nullptr), _columns(columns), _schema_table_type(type) {}
 
-SchemaScanner::~SchemaScanner() {
-    if (_is_create_columns == true && _columns != nullptr) {
-        delete[] _columns;
-        _columns = nullptr;
-    }
-}
+SchemaScanner::~SchemaScanner() {}

Review Comment:
   warning: use '= default' to define a trivial destructor [modernize-use-equals-default]
   
   ```suggestion
   SchemaScanner::~SchemaScanner() = default;
   ```
   



##########
be/src/vec/exec/vschema_scan_node.cpp:
##########
@@ -33,26 +40,13 @@ VSchemaScanNode::VSchemaScanNode(ObjectPool* pool, const TPlanNode& tnode,
           _is_init(false),
           _table_name(tnode.schema_scan_node.table_name),
           _tuple_id(tnode.schema_scan_node.tuple_id),
-          _src_tuple_desc(nullptr),
           _dest_tuple_desc(nullptr),
           _tuple_idx(0),
           _slot_num(0),
           _tuple_pool(nullptr),
-          _schema_scanner(nullptr),
-          _src_tuple(nullptr),
-          _src_single_tuple(nullptr),
-          _dest_single_tuple(nullptr) {}
-
-VSchemaScanNode::~VSchemaScanNode() {
-    delete[] reinterpret_cast<char*>(_src_tuple);
-    _src_tuple = nullptr;
+          _schema_scanner(nullptr) {}
 
-    delete[] reinterpret_cast<char*>(_src_single_tuple);
-    _src_single_tuple = nullptr;
-
-    delete[] reinterpret_cast<char*>(_dest_single_tuple);
-    _dest_single_tuple = nullptr;
-}
+VSchemaScanNode::~VSchemaScanNode() {}

Review Comment:
   warning: use '= default' to define a trivial destructor [modernize-use-equals-default]
   
   ```suggestion
   VSchemaScanNode::~VSchemaScanNode() = default;
   ```
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [doris] yiguolei commented on a diff in pull request #15718: [fix](schema scanner)change schema_scanner::get_next_row to get_next_block

Posted by GitBox <gi...@apache.org>.
yiguolei commented on code in PR #15718:
URL: https://github.com/apache/doris/pull/15718#discussion_r1064269297


##########
be/src/exec/schema_scanner/schema_tables_scanner.cpp:
##########
@@ -52,7 +55,9 @@ SchemaTablesScanner::SchemaTablesScanner()
         : SchemaScanner(_s_tbls_columns,
                         sizeof(_s_tbls_columns) / sizeof(SchemaScanner::ColumnDesc)),
           _db_index(0),
-          _table_index(0) {}
+          _table_index(0) {

Review Comment:
   _table_index is useless



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [doris] github-actions[bot] commented on a diff in pull request #15718: [fix](schema scanner)change schema_scanner::get_next_row to get_next_block

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on code in PR #15718:
URL: https://github.com/apache/doris/pull/15718#discussion_r1064429007


##########
be/src/exec/schema_scanner/schema_tables_scanner.h:
##########
@@ -29,13 +31,14 @@ class SchemaTablesScanner : public SchemaScanner {
 
     virtual Status start(RuntimeState* state);

Review Comment:
   warning: 'start' overrides a member function but is not marked 'override' [clang-diagnostic-inconsistent-missing-override]
   ```cpp
       virtual Status start(RuntimeState* state);
                      ^
   ```
   **be/src/exec/schema_scanner.h:84:** overridden virtual function is here
   ```cpp
       virtual Status start(RuntimeState* state);
                      ^
   ```
   



##########
be/src/exec/schema_scanner/schema_tables_scanner.h:
##########
@@ -29,13 +31,14 @@
 
     virtual Status start(RuntimeState* state);
     virtual Status get_next_row(Tuple* tuple, MemPool* pool, bool* eos);

Review Comment:
   warning: 'get_next_row' overrides a member function but is not marked 'override' [clang-diagnostic-inconsistent-missing-override]
   ```cpp
       virtual Status get_next_row(Tuple* tuple, MemPool* pool, bool* eos);
                      ^
   ```
   **be/src/exec/schema_scanner.h:85:** overridden virtual function is here
   ```cpp
       virtual Status get_next_row(Tuple* tuple, MemPool* pool, bool* eos);
                      ^
   ```
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [doris] yiguolei commented on a diff in pull request #15718: [fix](schema scanner)change schema_scanner::get_next_row to get_next_block

Posted by GitBox <gi...@apache.org>.
yiguolei commented on code in PR #15718:
URL: https://github.com/apache/doris/pull/15718#discussion_r1067780272


##########
be/src/exec/schema_scan_node.cpp:
##########
@@ -279,7 +279,7 @@ Status SchemaScanNode::get_next(RuntimeState* state, RowBatch* row_batch, bool*
             return Status::OK();
         }
 
-        RETURN_IF_ERROR(_schema_scanner->get_next_row(_src_tuple, _tuple_pool.get(), &scanner_eos));
+        // RETURN_IF_ERROR(_schema_scanner->get_next_row(_src_tuple, _tuple_pool.get(), &scanner_eos));

Review Comment:
   just remove the code



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org