Posted to commits@doris.apache.org by GitBox <gi...@apache.org> on 2022/07/27 14:10:41 UTC

[GitHub] [doris] BePPPower opened a new pull request, #11266: [feature](information_schema) add 'segments' table into information_s…

BePPPower opened a new pull request, #11266:
URL: https://github.com/apache/doris/pull/11266

   …chema
   
   # Proposed changes
   
   Issue Number: close #xxx
   
   ## Problem Summary:
   In some cases, we would like to check the status of `rowsets` and `segments`, so this PR adds a `segments` table to the `information_schema` database.
   The table schema of `segments` is:
   ```
   MySQL [(none)]> desc information_schema.segments;
   +-------------------+------------+------+-------+---------+-------+
   | Field             | Type       | Null | Key   | Default | Extra |
   +-------------------+------------+------+-------+---------+-------+
   | ROWSET_ID         | VARCHAR(*) | Yes  | false | NULL    |       |
   | TABLET_ID         | BIGINT     | Yes  | false | NULL    |       |
   | ROWSET_NUM_ROWS   | BIGINT     | Yes  | false | NULL    |       |
   | TXN_ID            | BIGINT     | Yes  | false | NULL    |       |
   | PARTITION_ID      | BIGINT     | Yes  | false | NULL    |       |
   | NUM_SEGMENTS      | BIGINT     | Yes  | false | NULL    |       |
   | START_VERSION     | BIGINT     | Yes  | false | NULL    |       |
   | END_VERSION       | BIGINT     | Yes  | false | NULL    |       |
   | INDEX_DISK_SIZE   | BIGINT     | Yes  | false | NULL    |       |
   | DATA_DISK_SIZE    | BIGINT     | Yes  | false | NULL    |       |
   | SEGMENT_VERSION   | BIGINT     | Yes  | false | NULL    |       |
   | SEGMENTS_NUM_ROWS | BIGINT     | Yes  | false | NULL    |       |
   +-------------------+------------+------+-------+---------+-------+
   ```
   
   We can then query `rowsets` and `segments` information from the `segments` table, for example:
   ```
   MySQL [(none)]> select * from information_schema.segments limit 10;
   +--------------------------------------------------+-----------+-----------------+--------+--------------+--------------+---------------+-------------+-----------------+----------------+-----------------+-------------------+
   | ROWSET_ID                                        | TABLET_ID | ROWSET_NUM_ROWS | TXN_ID | PARTITION_ID | NUM_SEGMENTS | START_VERSION | END_VERSION | INDEX_DISK_SIZE | DATA_DISK_SIZE | SEGMENT_VERSION | SEGMENTS_NUM_ROWS |
   +--------------------------------------------------+-----------+-----------------+--------+--------------+--------------+---------------+-------------+-----------------+----------------+-----------------+-------------------+
   | 0200000000000186ed4e15ad2de2d6c659b97e076eb43383 |     12583 |               5 |   1014 |        12580 |            1 |             2 |           2 |             147 |           1116 |               1 |                 5 |
   | 0200000000000190ed4e15ad2de2d6c659b97e076eb43383 |     12573 |             831 |   1016 |        12548 |            1 |             2 |           2 |             614 |          78556 |               1 |               831 |
   | 0200000000000191ed4e15ad2de2d6c659b97e076eb43383 |     12571 |             829 |   1016 |        12548 |            1 |             2 |           2 |             558 |          78484 |               1 |               829 |
   | 0200000000000189ed4e15ad2de2d6c659b97e076eb43383 |     12569 |             835 |   1016 |        12548 |            1 |             2 |           2 |             573 |          79380 |               1 |               835 |
   | 020000000000018eed4e15ad2de2d6c659b97e076eb43383 |     12567 |             844 |   1016 |        12548 |            1 |             2 |           2 |             590 |          79990 |               1 |               844 |
   | 020000000000018ced4e15ad2de2d6c659b97e076eb43383 |     12565 |             846 |   1016 |        12548 |            1 |             2 |           2 |             615 |          80299 |               1 |               846 |
   | 020000000000018bed4e15ad2de2d6c659b97e076eb43383 |     12563 |             823 |   1016 |        12548 |            1 |             2 |           2 |             580 |          77952 |               1 |               823 |
   | 020000000000018ded4e15ad2de2d6c659b97e076eb43383 |     12561 |             842 |   1016 |        12548 |            1 |             2 |           2 |             582 |          79726 |               1 |               842 |
   | 0200000000000188ed4e15ad2de2d6c659b97e076eb43383 |     12559 |             803 |   1016 |        12548 |            1 |             2 |           2 |             559 |          75996 |               1 |               803 |
   | 0200000000000192ed4e15ad2de2d6c659b97e076eb43383 |     12557 |             823 |   1016 |        12548 |            1 |             2 |           2 |             629 |          78780 |               1 |               823 |
   +--------------------------------------------------+-----------+-----------------+--------+--------------+--------------+---------------+-------------+-----------------+----------------+-----------------+-------------------+
   10 rows in set (0.58 sec)
   ```
   
   
   ## Checklist(Required)
   
   1. Type of your changes:
       - [ ] Improvement
       - [ ] Fix
       - [ ] Feature-WIP
       - [x] Feature
       - [ ] Doc
       - [ ] Refactor
       - [ ] Others: 
   2. Does it affect the original behavior: 
       - [ ] Yes
       - [x] No
       - [ ] I don't know
   3. Has unit tests been added:
       - [ ] Yes
       - [x] No
       - [ ] No Need
   4. Has document been added or modified:
       - [ ] Yes
       - [ ] No
       - [x] No Need
   5. Does it need to update dependencies:
       - [ ] Yes
       - [x] No
   6. Are there any changes that cannot be rolled back:
       - [ ] Yes
       - [x] No
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at [dev@doris.apache.org](mailto:dev@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [doris] BePPPower commented on a diff in pull request #11266: [feature](information_schema) add 'segments' table into information_s…

Posted by GitBox <gi...@apache.org>.
BePPPower commented on code in PR #11266:
URL: https://github.com/apache/doris/pull/11266#discussion_r940068980


##########
fe/fe-core/src/main/java/org/apache/doris/planner/SingleNodePlanner.java:
##########
@@ -1699,7 +1699,11 @@ private PlanNode createScanNode(Analyzer analyzer, TableRef tblRef, SelectStmt s
                 scanNode = new MysqlScanNode(ctx.getNextNodeId(), tblRef.getDesc(), (MysqlTable) tblRef.getTable());
                 break;
             case SCHEMA:
-                scanNode = new SchemaScanNode(ctx.getNextNodeId(), tblRef.getDesc());
+                if (BackendSchemaScanNode.isBackendSchemaTable(tblRef.getDesc().getTable().getName())) {

Review Comment:
   done





[GitHub] [doris] BePPPower commented on a diff in pull request #11266: [feature](information_schema) add 'segments' table into information_s…

Posted by GitBox <gi...@apache.org>.
BePPPower commented on code in PR #11266:
URL: https://github.com/apache/doris/pull/11266#discussion_r940068779


##########
be/src/exec/schema_scanner/schema_segments_scanner.cpp:
##########
@@ -0,0 +1,195 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "exec/schema_scanner/schema_segments_scanner.h"
+
+#include <cstddef>
+
+#include "common/status.h"
+#include "gutil/integral_types.h"
+#include "olap/rowset/beta_rowset.h"
+#include "olap/rowset/rowset.h"
+#include "olap/rowset/segment_v2/segment.h"
+#include "olap/segment_loader.h"
+#include "olap/storage_engine.h"
+#include "olap/tablet.h"
+#include "runtime/descriptors.h"
+#include "runtime/primitive_type.h"
+#include "runtime/string_value.h"
+namespace doris {
+SchemaScanner::ColumnDesc SchemaSegmentsScanner::_s_tbls_columns[] = {
+        //   name,       type,          size,     is_null
+        {"BACKEND_ID", TYPE_BIGINT, sizeof(int64_t), true},
+        {"ROWSET_ID", TYPE_VARCHAR, sizeof(StringValue), true},
+        {"TABLET_ID", TYPE_BIGINT, sizeof(int64_t), true},
+        {"ROWSET_NUM_ROWS", TYPE_BIGINT, sizeof(int64_t), true},
+        {"TXN_ID", TYPE_BIGINT, sizeof(int64_t), true},
+        {"PARTITION_ID", TYPE_BIGINT, sizeof(int64_t), true},
+        {"NUM_SEGMENTS", TYPE_BIGINT, sizeof(int64_t), true},
+        {"START_VERSION", TYPE_BIGINT, sizeof(int64_t), true},
+        {"END_VERSION", TYPE_BIGINT, sizeof(int64_t), true},
+        // size_t or int64_t???
+        {"INDEX_DISK_SIZE", TYPE_BIGINT, sizeof(size_t), true},
+        {"DATA_DISK_SIZE", TYPE_BIGINT, sizeof(size_t), true},
+        {"SEGMENT_VERSION", TYPE_BIGINT, sizeof(int64_t), true},
+        {"SEGMENTS_NUM_ROWS", TYPE_BIGINT, sizeof(int64_t), true},
+};
+
+SchemaSegmentsScanner::SchemaSegmentsScanner()
+        : SchemaScanner(_s_tbls_columns,
+                        sizeof(_s_tbls_columns) / sizeof(SchemaScanner::ColumnDesc)),
+          backend_id_(0),
+          rowsets_idx_(0),
+          segments_idx_(0) {};
+
+Status SchemaSegmentsScanner::start(RuntimeState* state) {
+    if (!_is_init) {
+        return Status::InternalError("used before initialized.");
+    }
+    backend_id_ = state->backend_id();
+    RETURN_IF_ERROR(get_all_rowsets());
+    return Status::OK();
+}
+
+Status SchemaSegmentsScanner::get_next_row(Tuple* tuple, MemPool* pool, bool* eos) {
+    if (!_is_init) {
+        return Status::InternalError("Used before initialized.");
+    }
+    if (nullptr == tuple || nullptr == pool || nullptr == eos) {
+        return Status::InternalError("input pointer is nullptr.");
+    }
+    while (segments_idx_ >= segments_.size()) {
+        if (rowsets_idx_ < rowsets_.size()) {
+            RETURN_IF_ERROR(get_new_segments());
+        } else {
+            *eos = true;
+            return Status::OK();
+        }
+    }
+    *eos = false;
+    return fill_one_row(tuple, pool);
+}
+
+Status SchemaSegmentsScanner::get_all_rowsets() {
+    std::vector<TabletSharedPtr> tablets =
+            StorageEngine::instance()->tablet_manager()->get_all_tablet();
+    for (const auto& tablet : tablets) {
+        TabletMetaSharedPtr tabletMetas = tablet->tablet_meta();

Review Comment:
   It's not used. Done.





[GitHub] [doris] BePPPower commented on a diff in pull request #11266: [feature](information_schema) add `rowsets` table into information_s…

Posted by GitBox <gi...@apache.org>.
BePPPower commented on code in PR #11266:
URL: https://github.com/apache/doris/pull/11266#discussion_r940803305


##########
be/src/exec/schema_scanner/schema_segments_scanner.cpp:
##########
@@ -0,0 +1,195 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "exec/schema_scanner/schema_segments_scanner.h"
+
+#include <cstddef>
+
+#include "common/status.h"
+#include "gutil/integral_types.h"
+#include "olap/rowset/beta_rowset.h"
+#include "olap/rowset/rowset.h"
+#include "olap/rowset/segment_v2/segment.h"
+#include "olap/segment_loader.h"
+#include "olap/storage_engine.h"
+#include "olap/tablet.h"
+#include "runtime/descriptors.h"
+#include "runtime/primitive_type.h"
+#include "runtime/string_value.h"
+namespace doris {
+SchemaScanner::ColumnDesc SchemaSegmentsScanner::_s_tbls_columns[] = {
+        //   name,       type,          size,     is_null
+        {"BACKEND_ID", TYPE_BIGINT, sizeof(int64_t), true},
+        {"ROWSET_ID", TYPE_VARCHAR, sizeof(StringValue), true},
+        {"TABLET_ID", TYPE_BIGINT, sizeof(int64_t), true},
+        {"ROWSET_NUM_ROWS", TYPE_BIGINT, sizeof(int64_t), true},
+        {"TXN_ID", TYPE_BIGINT, sizeof(int64_t), true},
+        {"PARTITION_ID", TYPE_BIGINT, sizeof(int64_t), true},
+        {"NUM_SEGMENTS", TYPE_BIGINT, sizeof(int64_t), true},
+        {"START_VERSION", TYPE_BIGINT, sizeof(int64_t), true},
+        {"END_VERSION", TYPE_BIGINT, sizeof(int64_t), true},
+        // size_t or int64_t???
+        {"INDEX_DISK_SIZE", TYPE_BIGINT, sizeof(size_t), true},
+        {"DATA_DISK_SIZE", TYPE_BIGINT, sizeof(size_t), true},
+        {"SEGMENT_VERSION", TYPE_BIGINT, sizeof(int64_t), true},
+        {"SEGMENTS_NUM_ROWS", TYPE_BIGINT, sizeof(int64_t), true},
+};
+
+SchemaSegmentsScanner::SchemaSegmentsScanner()
+        : SchemaScanner(_s_tbls_columns,
+                        sizeof(_s_tbls_columns) / sizeof(SchemaScanner::ColumnDesc)),
+          backend_id_(0),
+          rowsets_idx_(0),
+          segments_idx_(0) {};
+
+Status SchemaSegmentsScanner::start(RuntimeState* state) {
+    if (!_is_init) {
+        return Status::InternalError("used before initialized.");
+    }
+    backend_id_ = state->backend_id();
+    RETURN_IF_ERROR(get_all_rowsets());
+    return Status::OK();
+}
+
+Status SchemaSegmentsScanner::get_next_row(Tuple* tuple, MemPool* pool, bool* eos) {
+    if (!_is_init) {
+        return Status::InternalError("Used before initialized.");
+    }
+    if (nullptr == tuple || nullptr == pool || nullptr == eos) {
+        return Status::InternalError("input pointer is nullptr.");
+    }
+    while (segments_idx_ >= segments_.size()) {
+        if (rowsets_idx_ < rowsets_.size()) {
+            RETURN_IF_ERROR(get_new_segments());
+        } else {
+            *eos = true;
+            return Status::OK();
+        }
+    }
+    *eos = false;
+    return fill_one_row(tuple, pool);
+}
+
+Status SchemaSegmentsScanner::get_all_rowsets() {
+    std::vector<TabletSharedPtr> tablets =
+            StorageEngine::instance()->tablet_manager()->get_all_tablet();
+    for (const auto& tablet : tablets) {
+        TabletMetaSharedPtr tabletMetas = tablet->tablet_meta();
+
+        // all rowset
+        std::vector<std::pair<Version, RowsetSharedPtr>> all_rowsets;
+        {
+            std::shared_lock rowset_ldlock(tablet->get_header_lock());
+            tablet->acquire_version_and_rowsets(&all_rowsets);
+        }
+        for (const auto& version_and_rowset : all_rowsets) {
+            RowsetSharedPtr rowset = version_and_rowset.second;
+            rowsets_.emplace_back(rowset);
+        }
+    }
+    return Status::OK();
+}
+
+Status SchemaSegmentsScanner::get_new_segments() {
+    BetaRowsetSharedPtr beta_rowset = std::dynamic_pointer_cast<BetaRowset>(rowsets_[rowsets_idx_]);
+    segments_.clear();
+    RETURN_IF_ERROR(beta_rowset->load_segments(&segments_));

Review Comment:
   Done. We can't load segments in a lightweight way, so we have to give up the segment information.
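
The outcome described in this comment, one output row per rowset built from metadata alone instead of one row per loaded segment, can be sketched roughly as follows. This is a minimal illustration, not the code merged in the PR: `RowsetMetaSketch`, `RowsetRow`, and `RowsetRowCursor` are hypothetical names, and the fields are only a subset of the columns shown in the PR description.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for per-rowset metadata; not a Doris type.
struct RowsetMetaSketch {
    std::string rowset_id;
    int64_t tablet_id;
    int64_t num_rows;
    int64_t num_segments;
    int64_t index_disk_size;
    int64_t data_disk_size;
};

// One output row of the metadata-only table (hypothetical layout).
struct RowsetRow {
    int64_t backend_id;
    std::string rowset_id;
    int64_t tablet_id;
    int64_t rowset_num_rows;
    int64_t num_segments;
    int64_t index_disk_size;
    int64_t data_disk_size;
};

// Row-at-a-time cursor over rowset metadata. No segment file is opened,
// so the per-segment columns simply cannot be produced here.
class RowsetRowCursor {
public:
    RowsetRowCursor(int64_t backend_id, std::vector<RowsetMetaSketch> metas)
            : backend_id_(backend_id), metas_(std::move(metas)) {}

    // Writes the next row into *row and returns true,
    // or returns false once every rowset has been emitted.
    bool next(RowsetRow* row) {
        assert(row != nullptr);
        if (idx_ >= metas_.size()) {
            return false;
        }
        const RowsetMetaSketch& m = metas_[idx_++];
        *row = RowsetRow{backend_id_,    m.rowset_id,       m.tablet_id,     m.num_rows,
                         m.num_segments, m.index_disk_size, m.data_disk_size};
        return true;
    }

private:
    int64_t backend_id_;
    std::vector<RowsetMetaSketch> metas_;
    std::size_t idx_ = 0;
};
```

A caller would construct `RowsetRowCursor` with the backend id and the collected metadata, then call `next` until it returns false. The trade-off is the one stated in the comment: the scan stays cheap, but columns that require opening segment files are dropped.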





[GitHub] [doris] yiguolei commented on a diff in pull request #11266: [feature](information_schema) add 'segments' table into information_s…

Posted by GitBox <gi...@apache.org>.
yiguolei commented on code in PR #11266:
URL: https://github.com/apache/doris/pull/11266#discussion_r939784668


##########
fe/fe-core/src/main/java/org/apache/doris/planner/SingleNodePlanner.java:
##########
@@ -1699,7 +1699,11 @@ private PlanNode createScanNode(Analyzer analyzer, TableRef tblRef, SelectStmt s
                 scanNode = new MysqlScanNode(ctx.getNextNodeId(), tblRef.getDesc(), (MysqlTable) tblRef.getTable());
                 break;
             case SCHEMA:
-                scanNode = new SchemaScanNode(ctx.getNextNodeId(), tblRef.getDesc());
+                if (BackendSchemaScanNode.isBackendSchemaTable(tblRef.getDesc().getTable().getName())) {

Review Comment:
   BackendPartitionedSchemaScanNode





[GitHub] [doris] yiguolei commented on a diff in pull request #11266: [feature](information_schema) add 'segments' table into information_s…

Posted by GitBox <gi...@apache.org>.
yiguolei commented on code in PR #11266:
URL: https://github.com/apache/doris/pull/11266#discussion_r939952743


##########
be/src/exec/schema_scanner/schema_segments_scanner.cpp:
##########
@@ -0,0 +1,195 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "exec/schema_scanner/schema_segments_scanner.h"
+
+#include <cstddef>
+
+#include "common/status.h"
+#include "gutil/integral_types.h"
+#include "olap/rowset/beta_rowset.h"
+#include "olap/rowset/rowset.h"
+#include "olap/rowset/segment_v2/segment.h"
+#include "olap/segment_loader.h"
+#include "olap/storage_engine.h"
+#include "olap/tablet.h"
+#include "runtime/descriptors.h"
+#include "runtime/primitive_type.h"
+#include "runtime/string_value.h"
+namespace doris {
+SchemaScanner::ColumnDesc SchemaSegmentsScanner::_s_tbls_columns[] = {
+        //   name,       type,          size,     is_null
+        {"BACKEND_ID", TYPE_BIGINT, sizeof(int64_t), true},
+        {"ROWSET_ID", TYPE_VARCHAR, sizeof(StringValue), true},
+        {"TABLET_ID", TYPE_BIGINT, sizeof(int64_t), true},
+        {"ROWSET_NUM_ROWS", TYPE_BIGINT, sizeof(int64_t), true},
+        {"TXN_ID", TYPE_BIGINT, sizeof(int64_t), true},
+        {"PARTITION_ID", TYPE_BIGINT, sizeof(int64_t), true},
+        {"NUM_SEGMENTS", TYPE_BIGINT, sizeof(int64_t), true},
+        {"START_VERSION", TYPE_BIGINT, sizeof(int64_t), true},
+        {"END_VERSION", TYPE_BIGINT, sizeof(int64_t), true},
+        // size_t or int64_t???
+        {"INDEX_DISK_SIZE", TYPE_BIGINT, sizeof(size_t), true},
+        {"DATA_DISK_SIZE", TYPE_BIGINT, sizeof(size_t), true},
+        {"SEGMENT_VERSION", TYPE_BIGINT, sizeof(int64_t), true},
+        {"SEGMENTS_NUM_ROWS", TYPE_BIGINT, sizeof(int64_t), true},
+};
+
+SchemaSegmentsScanner::SchemaSegmentsScanner()
+        : SchemaScanner(_s_tbls_columns,
+                        sizeof(_s_tbls_columns) / sizeof(SchemaScanner::ColumnDesc)),
+          backend_id_(0),
+          rowsets_idx_(0),
+          segments_idx_(0) {};
+
+Status SchemaSegmentsScanner::start(RuntimeState* state) {
+    if (!_is_init) {
+        return Status::InternalError("used before initialized.");
+    }
+    backend_id_ = state->backend_id();
+    RETURN_IF_ERROR(get_all_rowsets());
+    return Status::OK();
+}
+
+Status SchemaSegmentsScanner::get_next_row(Tuple* tuple, MemPool* pool, bool* eos) {
+    if (!_is_init) {
+        return Status::InternalError("Used before initialized.");
+    }
+    if (nullptr == tuple || nullptr == pool || nullptr == eos) {
+        return Status::InternalError("input pointer is nullptr.");
+    }
+    while (segments_idx_ >= segments_.size()) {
+        if (rowsets_idx_ < rowsets_.size()) {
+            RETURN_IF_ERROR(get_new_segments());
+        } else {
+            *eos = true;
+            return Status::OK();
+        }
+    }
+    *eos = false;
+    return fill_one_row(tuple, pool);
+}
+
+Status SchemaSegmentsScanner::get_all_rowsets() {
+    std::vector<TabletSharedPtr> tablets =
+            StorageEngine::instance()->tablet_manager()->get_all_tablet();
+    for (const auto& tablet : tablets) {
+        TabletMetaSharedPtr tabletMetas = tablet->tablet_meta();
+
+        // all rowset
+        std::vector<std::pair<Version, RowsetSharedPtr>> all_rowsets;
+        {
+            std::shared_lock rowset_ldlock(tablet->get_header_lock());
+            tablet->acquire_version_and_rowsets(&all_rowsets);
+        }
+        for (const auto& version_and_rowset : all_rowsets) {
+            RowsetSharedPtr rowset = version_and_rowset.second;
+            rowsets_.emplace_back(rowset);
+        }
+    }
+    return Status::OK();
+}
+
+Status SchemaSegmentsScanner::get_new_segments() {
+    BetaRowsetSharedPtr beta_rowset = std::dynamic_pointer_cast<BetaRowset>(rowsets_[rowsets_idx_]);
+    segments_.clear();
+    RETURN_IF_ERROR(beta_rowset->load_segments(&segments_));

Review Comment:
   Look at line 121: we only need to obtain the PB structure, so going to this much trouble seems unnecessary.





[GitHub] [doris] yiguolei commented on a diff in pull request #11266: [feature](information_schema) add 'segments' table into information_s…

Posted by GitBox <gi...@apache.org>.
yiguolei commented on code in PR #11266:
URL: https://github.com/apache/doris/pull/11266#discussion_r939942694


##########
be/src/exec/schema_scanner/schema_segments_scanner.cpp:
##########
@@ -0,0 +1,195 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "exec/schema_scanner/schema_segments_scanner.h"
+
+#include <cstddef>
+
+#include "common/status.h"
+#include "gutil/integral_types.h"
+#include "olap/rowset/beta_rowset.h"
+#include "olap/rowset/rowset.h"
+#include "olap/rowset/segment_v2/segment.h"
+#include "olap/segment_loader.h"
+#include "olap/storage_engine.h"
+#include "olap/tablet.h"
+#include "runtime/descriptors.h"
+#include "runtime/primitive_type.h"
+#include "runtime/string_value.h"
+namespace doris {
+SchemaScanner::ColumnDesc SchemaSegmentsScanner::_s_tbls_columns[] = {
+        //   name,       type,          size,     is_null
+        {"BACKEND_ID", TYPE_BIGINT, sizeof(int64_t), true},
+        {"ROWSET_ID", TYPE_VARCHAR, sizeof(StringValue), true},
+        {"TABLET_ID", TYPE_BIGINT, sizeof(int64_t), true},
+        {"ROWSET_NUM_ROWS", TYPE_BIGINT, sizeof(int64_t), true},
+        {"TXN_ID", TYPE_BIGINT, sizeof(int64_t), true},
+        {"PARTITION_ID", TYPE_BIGINT, sizeof(int64_t), true},
+        {"NUM_SEGMENTS", TYPE_BIGINT, sizeof(int64_t), true},
+        {"START_VERSION", TYPE_BIGINT, sizeof(int64_t), true},
+        {"END_VERSION", TYPE_BIGINT, sizeof(int64_t), true},
+        // size_t or int64_t???
+        {"INDEX_DISK_SIZE", TYPE_BIGINT, sizeof(size_t), true},
+        {"DATA_DISK_SIZE", TYPE_BIGINT, sizeof(size_t), true},
+        {"SEGMENT_VERSION", TYPE_BIGINT, sizeof(int64_t), true},
+        {"SEGMENTS_NUM_ROWS", TYPE_BIGINT, sizeof(int64_t), true},
+};
+
+SchemaSegmentsScanner::SchemaSegmentsScanner()
+        : SchemaScanner(_s_tbls_columns,
+                        sizeof(_s_tbls_columns) / sizeof(SchemaScanner::ColumnDesc)),
+          backend_id_(0),
+          rowsets_idx_(0),
+          segments_idx_(0) {};
+
+Status SchemaSegmentsScanner::start(RuntimeState* state) {
+    if (!_is_init) {
+        return Status::InternalError("used before initialized.");
+    }
+    backend_id_ = state->backend_id();
+    RETURN_IF_ERROR(get_all_rowsets());
+    return Status::OK();
+}
+
+Status SchemaSegmentsScanner::get_next_row(Tuple* tuple, MemPool* pool, bool* eos) {
+    if (!_is_init) {
+        return Status::InternalError("Used before initialized.");
+    }
+    if (nullptr == tuple || nullptr == pool || nullptr == eos) {
+        return Status::InternalError("input pointer is nullptr.");
+    }
+    while (segments_idx_ >= segments_.size()) {
+        if (rowsets_idx_ < rowsets_.size()) {
+            RETURN_IF_ERROR(get_new_segments());
+        } else {
+            *eos = true;
+            return Status::OK();
+        }
+    }
+    *eos = false;
+    return fill_one_row(tuple, pool);
+}
+
+Status SchemaSegmentsScanner::get_all_rowsets() {
+    std::vector<TabletSharedPtr> tablets =
+            StorageEngine::instance()->tablet_manager()->get_all_tablet();
+    for (const auto& tablet : tablets) {
+        TabletMetaSharedPtr tabletMetas = tablet->tablet_meta();

Review Comment:
   Is this field useful?





[GitHub] [doris] yiguolei commented on a diff in pull request #11266: [feature](information_schema) add 'segments' table into information_s…

Posted by GitBox <gi...@apache.org>.
yiguolei commented on code in PR #11266:
URL: https://github.com/apache/doris/pull/11266#discussion_r939944266


##########
be/src/exec/schema_scanner/schema_segments_scanner.cpp:
##########
@@ -0,0 +1,195 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "exec/schema_scanner/schema_segments_scanner.h"
+
+#include <cstddef>
+
+#include "common/status.h"
+#include "gutil/integral_types.h"
+#include "olap/rowset/beta_rowset.h"
+#include "olap/rowset/rowset.h"
+#include "olap/rowset/segment_v2/segment.h"
+#include "olap/segment_loader.h"
+#include "olap/storage_engine.h"
+#include "olap/tablet.h"
+#include "runtime/descriptors.h"
+#include "runtime/primitive_type.h"
+#include "runtime/string_value.h"
+namespace doris {
+SchemaScanner::ColumnDesc SchemaSegmentsScanner::_s_tbls_columns[] = {
+        //   name,       type,          size,     is_null
+        {"BACKEND_ID", TYPE_BIGINT, sizeof(int64_t), true},
+        {"ROWSET_ID", TYPE_VARCHAR, sizeof(StringValue), true},
+        {"TABLET_ID", TYPE_BIGINT, sizeof(int64_t), true},
+        {"ROWSET_NUM_ROWS", TYPE_BIGINT, sizeof(int64_t), true},
+        {"TXN_ID", TYPE_BIGINT, sizeof(int64_t), true},
+        {"PARTITION_ID", TYPE_BIGINT, sizeof(int64_t), true},
+        {"NUM_SEGMENTS", TYPE_BIGINT, sizeof(int64_t), true},
+        {"START_VERSION", TYPE_BIGINT, sizeof(int64_t), true},
+        {"END_VERSION", TYPE_BIGINT, sizeof(int64_t), true},
+        // size_t or int64_t???
+        {"INDEX_DISK_SIZE", TYPE_BIGINT, sizeof(size_t), true},
+        {"DATA_DISK_SIZE", TYPE_BIGINT, sizeof(size_t), true},
+        {"SEGMENT_VERSION", TYPE_BIGINT, sizeof(int64_t), true},
+        {"SEGMENTS_NUM_ROWS", TYPE_BIGINT, sizeof(int64_t), true},
+};
+
+SchemaSegmentsScanner::SchemaSegmentsScanner()
+        : SchemaScanner(_s_tbls_columns,
+                        sizeof(_s_tbls_columns) / sizeof(SchemaScanner::ColumnDesc)),
+          backend_id_(0),
+          rowsets_idx_(0),
+          segments_idx_(0) {};
+
+Status SchemaSegmentsScanner::start(RuntimeState* state) {
+    if (!_is_init) {
+        return Status::InternalError("used before initialized.");
+    }
+    backend_id_ = state->backend_id();
+    RETURN_IF_ERROR(get_all_rowsets());
+    return Status::OK();
+}
+
+Status SchemaSegmentsScanner::get_next_row(Tuple* tuple, MemPool* pool, bool* eos) {
+    if (!_is_init) {

Review Comment:
   Implement the vectorized version here; do not implement the non-vectorized version.
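
   To illustrate the distinction this comment draws, here is a minimal sketch of a batch-oriented scanner that fills whole columns per call instead of one tuple per call. All types and names (`BlockSketch`, `SourceRow`, `BatchScannerSketch`, `get_next_block`) are hypothetical and are not the Doris `SchemaScanner` or block API.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical columnar batch: one vector per output column.
struct BlockSketch {
    std::vector<int64_t> tablet_id;
    std::vector<int64_t> num_rows;
    std::size_t rows() const { return tablet_id.size(); }
};

// Hypothetical source record, standing in for one rowset.
struct SourceRow {
    int64_t tablet_id;
    int64_t num_rows;
};

class BatchScannerSketch {
public:
    explicit BatchScannerSketch(std::vector<SourceRow> rows) : rows_(std::move(rows)) {}

    // Appends up to max_rows rows to the block, one column at a time,
    // and sets *eos once the source is exhausted.
    void get_next_block(BlockSketch* block, std::size_t max_rows, bool* eos) {
        const std::size_t end = std::min(rows_.size(), pos_ + max_rows);
        for (std::size_t i = pos_; i < end; ++i) {
            block->tablet_id.push_back(rows_[i].tablet_id);
        }
        for (std::size_t i = pos_; i < end; ++i) {
            block->num_rows.push_back(rows_[i].num_rows);
        }
        pos_ = end;
        *eos = pos_ >= rows_.size();
    }

private:
    std::vector<SourceRow> rows_;
    std::size_t pos_ = 0;
};
```

   Filling each column in its own loop keeps the per-row call overhead out of the hot path, which is the usual motivation for a vectorized interface.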



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [doris] BePPPower commented on a diff in pull request #11266: [feature](information_schema) add 'segments' table into information_s…

Posted by GitBox <gi...@apache.org>.
BePPPower commented on code in PR #11266:
URL: https://github.com/apache/doris/pull/11266#discussion_r940068430


##########
fe/fe-core/src/main/java/org/apache/doris/catalog/SchemaTable.java:
##########
@@ -384,7 +384,22 @@ public class SchemaTable extends Table {
                                     .column("ORIGINATOR", ScalarType.createType(PrimitiveType.INT))
                                     .column("CHARACTER_SET_CLIENT", ScalarType.createVarchar(32))
                                     .column("COLLATION_CONNECTION", ScalarType.createVarchar(32))
-                                    .column("DATABASE_COLLATION", ScalarType.createVarchar(32)).build())).build();
+                                    .column("DATABASE_COLLATION", ScalarType.createVarchar(32)).build()))
+            .put("segments", new SchemaTable(SystemIdGenerator.getNextId(), "segments", TableType.SCHEMA,
+                            builder().column("BACKEND_ID", ScalarType.createType(PrimitiveType.BIGINT))
+                                    .column("ROWSET_ID", ScalarType.createVarchar(64))
+                                    .column("TABLET_ID", ScalarType.createType(PrimitiveType.BIGINT))
+                                    .column("ROWSET_NUM_ROWS", ScalarType.createType(PrimitiveType.BIGINT))
+                                    .column("TXN_ID", ScalarType.createType(PrimitiveType.BIGINT))
+                                    .column("PARTITION_ID", ScalarType.createType(PrimitiveType.BIGINT))

Review Comment:
   done





[GitHub] [doris] yiguolei commented on a diff in pull request #11266: [feature](information_schema) add 'segments' table into information_s…

Posted by GitBox <gi...@apache.org>.
yiguolei commented on code in PR #11266:
URL: https://github.com/apache/doris/pull/11266#discussion_r939954348


##########
fe/fe-core/src/main/java/org/apache/doris/catalog/SchemaTable.java:
##########
@@ -384,7 +384,22 @@ public class SchemaTable extends Table {
                                     .column("ORIGINATOR", ScalarType.createType(PrimitiveType.INT))
                                     .column("CHARACTER_SET_CLIENT", ScalarType.createVarchar(32))
                                     .column("COLLATION_CONNECTION", ScalarType.createVarchar(32))
-                                    .column("DATABASE_COLLATION", ScalarType.createVarchar(32)).build())).build();
+                                    .column("DATABASE_COLLATION", ScalarType.createVarchar(32)).build()))
+            .put("segments", new SchemaTable(SystemIdGenerator.getNextId(), "segments", TableType.SCHEMA,
+                            builder().column("BACKEND_ID", ScalarType.createType(PrimitiveType.BIGINT))
+                                    .column("ROWSET_ID", ScalarType.createVarchar(64))
+                                    .column("TABLET_ID", ScalarType.createType(PrimitiveType.BIGINT))
+                                    .column("ROWSET_NUM_ROWS", ScalarType.createType(PrimitiveType.BIGINT))
+                                    .column("TXN_ID", ScalarType.createType(PrimitiveType.BIGINT))
+                                    .column("PARTITION_ID", ScalarType.createType(PrimitiveType.BIGINT))

Review Comment:
   Drop the partition ID column.





[GitHub] [doris] yiguolei commented on a diff in pull request #11266: [feature](information_schema) add 'segments' table into information_s…

Posted by GitBox <gi...@apache.org>.
yiguolei commented on code in PR #11266:
URL: https://github.com/apache/doris/pull/11266#discussion_r939951547


##########
be/src/exec/schema_scanner/schema_segments_scanner.cpp:
##########
@@ -0,0 +1,195 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "exec/schema_scanner/schema_segments_scanner.h"
+
+#include <cstddef>
+
+#include "common/status.h"
+#include "gutil/integral_types.h"
+#include "olap/rowset/beta_rowset.h"
+#include "olap/rowset/rowset.h"
+#include "olap/rowset/segment_v2/segment.h"
+#include "olap/segment_loader.h"
+#include "olap/storage_engine.h"
+#include "olap/tablet.h"
+#include "runtime/descriptors.h"
+#include "runtime/primitive_type.h"
+#include "runtime/string_value.h"
+namespace doris {
+SchemaScanner::ColumnDesc SchemaSegmentsScanner::_s_tbls_columns[] = {
+        //   name,       type,          size,     is_null
+        {"BACKEND_ID", TYPE_BIGINT, sizeof(int64_t), true},
+        {"ROWSET_ID", TYPE_VARCHAR, sizeof(StringValue), true},
+        {"TABLET_ID", TYPE_BIGINT, sizeof(int64_t), true},
+        {"ROWSET_NUM_ROWS", TYPE_BIGINT, sizeof(int64_t), true},
+        {"TXN_ID", TYPE_BIGINT, sizeof(int64_t), true},
+        {"PARTITION_ID", TYPE_BIGINT, sizeof(int64_t), true},
+        {"NUM_SEGMENTS", TYPE_BIGINT, sizeof(int64_t), true},
+        {"START_VERSION", TYPE_BIGINT, sizeof(int64_t), true},
+        {"END_VERSION", TYPE_BIGINT, sizeof(int64_t), true},
+        // size_t or int64_t???
+        {"INDEX_DISK_SIZE", TYPE_BIGINT, sizeof(size_t), true},
+        {"DATA_DISK_SIZE", TYPE_BIGINT, sizeof(size_t), true},
+        {"SEGMENT_VERSION", TYPE_BIGINT, sizeof(int64_t), true},
+        {"SEGMENTS_NUM_ROWS", TYPE_BIGINT, sizeof(int64_t), true},
+};
+
+SchemaSegmentsScanner::SchemaSegmentsScanner()
+        : SchemaScanner(_s_tbls_columns,
+                        sizeof(_s_tbls_columns) / sizeof(SchemaScanner::ColumnDesc)),
+          backend_id_(0),
+          rowsets_idx_(0),
+          segments_idx_(0) {};
+
+Status SchemaSegmentsScanner::start(RuntimeState* state) {
+    if (!_is_init) {
+        return Status::InternalError("used before initialized.");
+    }
+    backend_id_ = state->backend_id();
+    RETURN_IF_ERROR(get_all_rowsets());
+    return Status::OK();
+}
+
+Status SchemaSegmentsScanner::get_next_row(Tuple* tuple, MemPool* pool, bool* eos) {
+    if (!_is_init) {
+        return Status::InternalError("Used before initialized.");
+    }
+    if (nullptr == tuple || nullptr == pool || nullptr == eos) {
+        return Status::InternalError("input pointer is nullptr.");
+    }
+    while (segments_idx_ >= segments_.size()) {
+        if (rowsets_idx_ < rowsets_.size()) {
+            RETURN_IF_ERROR(get_new_segments());
+        } else {
+            *eos = true;
+            return Status::OK();
+        }
+    }
+    *eos = false;
+    return fill_one_row(tuple, pool);
+}
+
+Status SchemaSegmentsScanner::get_all_rowsets() {
+    std::vector<TabletSharedPtr> tablets =
+            StorageEngine::instance()->tablet_manager()->get_all_tablet();
+    for (const auto& tablet : tablets) {
+        TabletMetaSharedPtr tabletMetas = tablet->tablet_meta();
+
+        // all rowset
+        std::vector<std::pair<Version, RowsetSharedPtr>> all_rowsets;
+        {
+            std::shared_lock rowset_ldlock(tablet->get_header_lock());
+            tablet->acquire_version_and_rowsets(&all_rowsets);
+        }
+        for (const auto& version_and_rowset : all_rowsets) {
+            RowsetSharedPtr rowset = version_and_rowset.second;
+            rowsets_.emplace_back(rowset);
+        }
+    }
+    return Status::OK();
+}
+
+Status SchemaSegmentsScanner::get_new_segments() {
+    BetaRowsetSharedPtr beta_rowset = std::dynamic_pointer_cast<BetaRowset>(rowsets_[rowsets_idx_]);
+    segments_.clear();
+    RETURN_IF_ERROR(beta_rowset->load_segments(&segments_));

Review Comment:
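   Calling load_segments here is too heavy; it opens a large number of files.
   Could this information be obtained from the rowset meta, or a similar protobuf structure, instead?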
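
   A minimal sketch of the metadata-only approach suggested here, assuming the needed fields are already held in memory per rowset. `RowsetMetaSketch`, `RowsetRowSketch`, and `rows_from_meta` are hypothetical names; the fields mirror the columns in the diff above, not the real Doris rowset meta class or its protobuf.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for in-memory rowset metadata. Field names follow the
// columns in the diff above; this is not the real Doris RowsetMeta class.
struct RowsetMetaSketch {
    std::string rowset_id;
    int64_t tablet_id = 0;
    int64_t num_rows = 0;
    int64_t num_segments = 0;
    int64_t start_version = 0;
    int64_t end_version = 0;
    int64_t index_disk_size = 0;
    int64_t data_disk_size = 0;
};

// One output row of the schema table: the backend id plus the rowset metadata.
struct RowsetRowSketch {
    int64_t backend_id = 0;
    RowsetMetaSketch meta;
};

// Builds one row per rowset from metadata alone; no segment file is opened.
std::vector<RowsetRowSketch> rows_from_meta(int64_t backend_id,
                                            const std::vector<RowsetMetaSketch>& metas) {
    std::vector<RowsetRowSketch> rows;
    rows.reserve(metas.size());
    for (const auto& meta : metas) {
        RowsetRowSketch row;
        row.backend_id = backend_id;
        row.meta = meta;
        rows.push_back(row);
    }
    return rows;
}
```

   The trade-off in this sketch is that per-segment columns such as SEGMENT_VERSION and SEGMENTS_NUM_ROWS are not produced, since they are not part of the assumed metadata.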





[GitHub] [doris] yiguolei merged pull request #11266: [feature](information_schema) add `rowsets` table into information_s…

Posted by GitBox <gi...@apache.org>.
yiguolei merged PR #11266:
URL: https://github.com/apache/doris/pull/11266

