Posted to commits@doris.apache.org by GitBox <gi...@apache.org> on 2021/09/03 08:35:02 UTC

[GitHub] [incubator-doris] ccoffline opened a new pull request #6567: [Performance] Improve show proc statistic #6477

ccoffline opened a new pull request #6567:
URL: https://github.com/apache/incubator-doris/pull/6567


   ## Proposed changes
   
   To speed up `show proc '/statistic'`, this PR refactors the statistic structure to make the logic clearer and uses parallel computation.
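
A minimal sketch of the idea, not the code in this PR: `DbStat`, `scan`, `collect`, and `total` are hypothetical names used only for illustration, and `scan` stands in for the real per-database traversal. It builds one row per database on a parallel stream, sorts the rows by name, and folds them into a summary row.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

class ParallelStatisticSketch {
    // Hypothetical per-database counters; not the DBStatistic class from the diff.
    static final class DbStat {
        final String name;
        final int tableNum;
        final int tabletNum;

        DbStat(String name, int tableNum, int tabletNum) {
            this.name = name;
            this.tableNum = tableNum;
            this.tabletNum = tabletNum;
        }

        // Adds two rows together; the result is labelled as the summary row.
        static DbStat merge(DbStat a, DbStat b) {
            return new DbStat("Total", a.tableNum + b.tableNum, a.tabletNum + b.tabletNum);
        }
    }

    // Stand-in for the expensive scan of one database (assumption: derived from the name).
    static DbStat scan(String dbName) {
        return new DbStat(dbName, dbName.length(), dbName.length() * 10);
    }

    // One row per database, computed in parallel and sorted by database name.
    static List<DbStat> collect(List<String> dbNames) {
        return dbNames.parallelStream()
                .map(ParallelStatisticSketch::scan)
                .sorted(Comparator.comparing((DbStat s) -> s.name))
                .collect(Collectors.toList());
    }

    // Sequential fold of the already computed rows into a summary row.
    static DbStat total(List<DbStat> rows) {
        return rows.stream().reduce(new DbStat("Total", 0, 0), DbStat::merge);
    }

    public static void main(String[] args) {
        List<DbStat> rows = collect(Arrays.asList("db_b", "db_a"));
        for (DbStat row : rows) {
            System.out.println(row.name + " " + row.tableNum + " " + row.tabletNum);
        }
        DbStat sum = total(rows);
        System.out.println(sum.name + " " + sum.tableNum + " " + sum.tabletNum);
    }
}
```

Each row is independent of the others, so the per-database work can run concurrently while the final fold stays sequential and cheap.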
   
   ## Types of changes
   
   - [ ] Bugfix (non-breaking change which fixes an issue)
   - [ ] New feature (non-breaking change which adds functionality)
   - [ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)
   - [ ] Documentation Update (if none of the other choices apply)
   - [x] Code refactor (Modify the code structure, format the code, etc...)
   - [x] Optimization. Including functional usability improvements and performance improvements.
   - [ ] Dependency. Such as changes related to third-party components.
   - [ ] Other.
   
   ## Checklist
   
   - [x] I have created an issue on (Fix #6477) and described the bug/feature there in detail
   - [x] Compiling and unit tests pass locally with my changes
   - [ ] I have added tests that prove my fix is effective or that my feature works
   - [ ] If these changes need document changes, I have updated the document
   - [x] Any dependent changes have been merged
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] github-actions[bot] commented on pull request #6567: [Performance] Improve show proc statistic #6477

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #6567:
URL: https://github.com/apache/incubator-doris/pull/6567#issuecomment-919695183








[GitHub] [incubator-doris] ccoffline commented on a change in pull request #6567: [Performance] Improve show proc statistic #6477

Posted by GitBox <gi...@apache.org>.
ccoffline commented on a change in pull request #6567:
URL: https://github.com/apache/incubator-doris/pull/6567#discussion_r708827801



##########
File path: fe/fe-core/src/main/java/org/apache/doris/common/proc/StatisticProcDir.java
##########
@@ -58,86 +60,95 @@
 
     private Catalog catalog;
 
-    // db id -> set(tablet id)
-    Multimap<Long, Long> unhealthyTabletIds;
-    // db id -> set(tablet id)
-    Multimap<Long, Long> inconsistentTabletIds;
-    // db id -> set(tablet id)
-    Multimap<Long, Long> cloningTabletIds;
-    // db id -> set(tablet id)
-    Multimap<Long, Long> unrecoverableTabletIds;
-
     public StatisticProcDir(Catalog catalog) {
+        Preconditions.checkNotNull(catalog);
         this.catalog = catalog;
-        unhealthyTabletIds = HashMultimap.create();
-        inconsistentTabletIds = HashMultimap.create();
-        cloningTabletIds = HashMultimap.create();
-        unrecoverableTabletIds = HashMultimap.create();
     }
 
     @Override
     public ProcResult fetchResult() throws AnalysisException {
-        Preconditions.checkNotNull(catalog);
+        List<DBStatistic> statistics = catalog.getDbIds().parallelStream()
+                // skip information_schema database
+                .flatMap(id -> Stream.of(id == 0 ? null : catalog.getDbNullable(id)))
+                .filter(Objects::nonNull).map(DBStatistic::new)
+                // sort by dbName
+                .sorted(Comparator.comparing(db -> db.db.getFullName()))
+                .collect(Collectors.toList());
+
+        List<List<String>> rows = new ArrayList<>(statistics.size() + 1);
+        for (DBStatistic statistic : statistics) {
+            rows.add(statistic.toRow());
+        }
+        rows.add(statistics.stream().reduce(new DBStatistic(), DBStatistic::reduce).toRow());
 
-        BaseProcResult result = new BaseProcResult();
+        return new BaseProcResult(TITLE_NAMES, rows);
+    }
+
+    @Override
+    public boolean register(String name, ProcNodeInterface node) {
+        return false;
+    }
 
-        result.setNames(TITLE_NAMES);
-        List<Long> dbIds = catalog.getDbIds();
-        if (dbIds == null || dbIds.isEmpty()) {
-            // empty
-            return result;
+    @Override
+    public ProcNodeInterface lookup(String dbIdStr) throws AnalysisException {
+        try {
+            long dbId = Long.parseLong(dbIdStr);
+            return catalog.getDb(dbId).map(IncompleteTabletsProcNode::new).orElse(null);
+        } catch (NumberFormatException e) {
+            throw new AnalysisException("Invalid db id format: " + dbIdStr);
         }
+    }
 
-        SystemInfoService infoService = Catalog.getCurrentSystemInfo();
-
-        int totalDbNum = 0;
-        int totalTableNum = 0;
-        int totalPartitionNum = 0;
-        int totalIndexNum = 0;
-        int totalTabletNum = 0;
-        int totalReplicaNum = 0;
-
-        unhealthyTabletIds.clear();
-        inconsistentTabletIds.clear();
-        cloningTabletIds = AgentTaskQueue.getTabletIdsByType(TTaskType.CLONE);
-        List<List<Comparable>> lines = new ArrayList<List<Comparable>>();
-        for (Long dbId : dbIds) {
-            if (dbId == 0) {
-                // skip information_schema database
-                continue;
-            }
-            Database db = catalog.getDbNullable(dbId);
-            if (db == null) {
-                continue;
-            }
+    static class DBStatistic {
+        boolean summary;
+        Database db;
+        int dbNum;
+        int tableNum;
+        int partitionNum;
+        int indexNum;
+        int tabletNum;
+        int replicaNum;
+        int unhealthyTabletNum;
+        int inconsistentTabletNum;
+        int cloningTabletNum;
+        int badTabletNum;
+        Set<Long> unhealthyTabletIds;
+        Set<Long> inconsistentTabletIds;
+        Set<Long> cloningTabletIds;
+        Set<Long> unrecoverableTabletIds;
+
+        DBStatistic() {
+            this.summary = true;
+        }
 
-            ++totalDbNum;
+        DBStatistic(Database db) {
+            Preconditions.checkNotNull(db);
+            this.summary = false;
+            this.db = db;
+            this.dbNum = 1;
+            this.unhealthyTabletIds = new HashSet<>();
+            this.inconsistentTabletIds = new HashSet<>();
+            this.unrecoverableTabletIds = new HashSet<>();
+            this.cloningTabletIds = AgentTaskQueue.getTask(db.getId(), TTaskType.CLONE)
+                    .stream().map(AgentTask::getTabletId).collect(Collectors.toSet());
+
+            SystemInfoService infoService = Catalog.getCurrentSystemInfo();
             List<Long> aliveBeIdsInCluster = infoService.getClusterBackendIds(db.getClusterName(), true);
             db.readLock();

Review comment:
       ok






[GitHub] [incubator-doris] morningman commented on pull request #6567: [Performance] Improve show proc statistic #6477

Posted by GitBox <gi...@apache.org>.
morningman commented on pull request #6567:
URL: https://github.com/apache/incubator-doris/pull/6567#issuecomment-919684832


   Hi @ccoffline, please rebase onto master to resolve the conflict.




[GitHub] [incubator-doris] ccoffline commented on pull request #6567: [Performance] Improve show proc statistic #6477

Posted by GitBox <gi...@apache.org>.
ccoffline commented on pull request #6567:
URL: https://github.com/apache/incubator-doris/pull/6567#issuecomment-912380266


   > There is a problem with the use of `parallelStream` and its `ForkJoinPool`.
   
   Avoided using nested `parallelStream`; problem solved.
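
The thread does not say what the problem was. As a rough illustration only, the shape of the fix might look like the sketch below: the outer stream is parallel and the inner one stays sequential, so work is not split a second time on the shared common `ForkJoinPool`. `NestedStreamSketch` and `countTablets` are hypothetical names, and the nested lists stand in for databases and their per-table tablet counts.

```java
import java.util.Arrays;
import java.util.List;

class NestedStreamSketch {
    // Hypothetical model: each database is a list of per-table tablet counts.
    static int countTablets(List<List<Integer>> dbs) {
        return dbs.parallelStream()                  // parallel across databases only
                .mapToInt(tables -> tables.stream()  // sequential inside one database
                        .mapToInt(Integer::intValue)
                        .sum())
                .sum();
    }

    public static void main(String[] args) {
        List<List<Integer>> dbs = Arrays.asList(
                Arrays.asList(1, 2),
                Arrays.asList(3, 4, 5));
        System.out.println(countTablets(dbs));
    }
}
```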




[GitHub] [incubator-doris] ccoffline commented on pull request #6567: [Performance] Improve show proc statistic #6477

Posted by GitBox <gi...@apache.org>.
ccoffline commented on pull request #6567:
URL: https://github.com/apache/incubator-doris/pull/6567#issuecomment-912377241


   There is a problem with the use of `parallelStream` and its `ForkJoinPool`.




[GitHub] [incubator-doris] morningman commented on a change in pull request #6567: [Performance] Improve show proc statistic #6477

Posted by GitBox <gi...@apache.org>.
morningman commented on a change in pull request #6567:
URL: https://github.com/apache/incubator-doris/pull/6567#discussion_r708820813



##########
File path: fe/fe-core/src/main/java/org/apache/doris/common/proc/StatisticProcDir.java
##########
@@ -58,86 +60,95 @@
 
     private Catalog catalog;
 
-    // db id -> set(tablet id)
-    Multimap<Long, Long> unhealthyTabletIds;
-    // db id -> set(tablet id)
-    Multimap<Long, Long> inconsistentTabletIds;
-    // db id -> set(tablet id)
-    Multimap<Long, Long> cloningTabletIds;
-    // db id -> set(tablet id)
-    Multimap<Long, Long> unrecoverableTabletIds;
-
     public StatisticProcDir(Catalog catalog) {
+        Preconditions.checkNotNull(catalog);
         this.catalog = catalog;
-        unhealthyTabletIds = HashMultimap.create();
-        inconsistentTabletIds = HashMultimap.create();
-        cloningTabletIds = HashMultimap.create();
-        unrecoverableTabletIds = HashMultimap.create();
     }
 
     @Override
     public ProcResult fetchResult() throws AnalysisException {
-        Preconditions.checkNotNull(catalog);
+        List<DBStatistic> statistics = catalog.getDbIds().parallelStream()
+                // skip information_schema database
+                .flatMap(id -> Stream.of(id == 0 ? null : catalog.getDbNullable(id)))
+                .filter(Objects::nonNull).map(DBStatistic::new)
+                // sort by dbName
+                .sorted(Comparator.comparing(db -> db.db.getFullName()))
+                .collect(Collectors.toList());
+
+        List<List<String>> rows = new ArrayList<>(statistics.size() + 1);
+        for (DBStatistic statistic : statistics) {
+            rows.add(statistic.toRow());
+        }
+        rows.add(statistics.stream().reduce(new DBStatistic(), DBStatistic::reduce).toRow());
 
-        BaseProcResult result = new BaseProcResult();
+        return new BaseProcResult(TITLE_NAMES, rows);
+    }
+
+    @Override
+    public boolean register(String name, ProcNodeInterface node) {
+        return false;
+    }
 
-        result.setNames(TITLE_NAMES);
-        List<Long> dbIds = catalog.getDbIds();
-        if (dbIds == null || dbIds.isEmpty()) {
-            // empty
-            return result;
+    @Override
+    public ProcNodeInterface lookup(String dbIdStr) throws AnalysisException {
+        try {
+            long dbId = Long.parseLong(dbIdStr);
+            return catalog.getDb(dbId).map(IncompleteTabletsProcNode::new).orElse(null);
+        } catch (NumberFormatException e) {
+            throw new AnalysisException("Invalid db id format: " + dbIdStr);
         }
+    }
 
-        SystemInfoService infoService = Catalog.getCurrentSystemInfo();
-
-        int totalDbNum = 0;
-        int totalTableNum = 0;
-        int totalPartitionNum = 0;
-        int totalIndexNum = 0;
-        int totalTabletNum = 0;
-        int totalReplicaNum = 0;
-
-        unhealthyTabletIds.clear();
-        inconsistentTabletIds.clear();
-        cloningTabletIds = AgentTaskQueue.getTabletIdsByType(TTaskType.CLONE);
-        List<List<Comparable>> lines = new ArrayList<List<Comparable>>();
-        for (Long dbId : dbIds) {
-            if (dbId == 0) {
-                // skip information_schema database
-                continue;
-            }
-            Database db = catalog.getDbNullable(dbId);
-            if (db == null) {
-                continue;
-            }
+    static class DBStatistic {
+        boolean summary;
+        Database db;
+        int dbNum;
+        int tableNum;
+        int partitionNum;
+        int indexNum;
+        int tabletNum;
+        int replicaNum;
+        int unhealthyTabletNum;
+        int inconsistentTabletNum;
+        int cloningTabletNum;
+        int badTabletNum;
+        Set<Long> unhealthyTabletIds;
+        Set<Long> inconsistentTabletIds;
+        Set<Long> cloningTabletIds;
+        Set<Long> unrecoverableTabletIds;
+
+        DBStatistic() {
+            this.summary = true;
+        }
 
-            ++totalDbNum;
+        DBStatistic(Database db) {
+            Preconditions.checkNotNull(db);
+            this.summary = false;
+            this.db = db;
+            this.dbNum = 1;
+            this.unhealthyTabletIds = new HashSet<>();
+            this.inconsistentTabletIds = new HashSet<>();
+            this.unrecoverableTabletIds = new HashSet<>();
+            this.cloningTabletIds = AgentTaskQueue.getTask(db.getId(), TTaskType.CLONE)
+                    .stream().map(AgentTask::getTabletId).collect(Collectors.toSet());
+
+            SystemInfoService infoService = Catalog.getCurrentSystemInfo();
             List<Long> aliveBeIdsInCluster = infoService.getClusterBackendIds(db.getClusterName(), true);
             db.readLock();

Review comment:
       I think the db lock is not needed here.

##########
File path: fe/fe-core/src/main/java/org/apache/doris/common/proc/StatisticProcDir.java
##########
@@ -58,86 +60,95 @@
 
     private Catalog catalog;
 
-    // db id -> set(tablet id)
-    Multimap<Long, Long> unhealthyTabletIds;
-    // db id -> set(tablet id)
-    Multimap<Long, Long> inconsistentTabletIds;
-    // db id -> set(tablet id)
-    Multimap<Long, Long> cloningTabletIds;
-    // db id -> set(tablet id)
-    Multimap<Long, Long> unrecoverableTabletIds;
-
     public StatisticProcDir(Catalog catalog) {
+        Preconditions.checkNotNull(catalog);
         this.catalog = catalog;
-        unhealthyTabletIds = HashMultimap.create();
-        inconsistentTabletIds = HashMultimap.create();
-        cloningTabletIds = HashMultimap.create();
-        unrecoverableTabletIds = HashMultimap.create();
     }
 
     @Override
     public ProcResult fetchResult() throws AnalysisException {
-        Preconditions.checkNotNull(catalog);
+        List<DBStatistic> statistics = catalog.getDbIds().parallelStream()
+                // skip information_schema database
+                .flatMap(id -> Stream.of(id == 0 ? null : catalog.getDbNullable(id)))
+                .filter(Objects::nonNull).map(DBStatistic::new)
+                // sort by dbName
+                .sorted(Comparator.comparing(db -> db.db.getFullName()))
+                .collect(Collectors.toList());
+
+        List<List<String>> rows = new ArrayList<>(statistics.size() + 1);
+        for (DBStatistic statistic : statistics) {
+            rows.add(statistic.toRow());
+        }
+        rows.add(statistics.stream().reduce(new DBStatistic(), DBStatistic::reduce).toRow());
 
-        BaseProcResult result = new BaseProcResult();
+        return new BaseProcResult(TITLE_NAMES, rows);
+    }
+
+    @Override
+    public boolean register(String name, ProcNodeInterface node) {
+        return false;
+    }
 
-        result.setNames(TITLE_NAMES);
-        List<Long> dbIds = catalog.getDbIds();
-        if (dbIds == null || dbIds.isEmpty()) {
-            // empty
-            return result;
+    @Override
+    public ProcNodeInterface lookup(String dbIdStr) throws AnalysisException {
+        try {
+            long dbId = Long.parseLong(dbIdStr);
+            return catalog.getDb(dbId).map(IncompleteTabletsProcNode::new).orElse(null);
+        } catch (NumberFormatException e) {
+            throw new AnalysisException("Invalid db id format: " + dbIdStr);
         }
+    }
 
-        SystemInfoService infoService = Catalog.getCurrentSystemInfo();
-
-        int totalDbNum = 0;
-        int totalTableNum = 0;
-        int totalPartitionNum = 0;
-        int totalIndexNum = 0;
-        int totalTabletNum = 0;
-        int totalReplicaNum = 0;
-
-        unhealthyTabletIds.clear();
-        inconsistentTabletIds.clear();
-        cloningTabletIds = AgentTaskQueue.getTabletIdsByType(TTaskType.CLONE);
-        List<List<Comparable>> lines = new ArrayList<List<Comparable>>();
-        for (Long dbId : dbIds) {
-            if (dbId == 0) {
-                // skip information_schema database
-                continue;
-            }
-            Database db = catalog.getDbNullable(dbId);
-            if (db == null) {
-                continue;
-            }
+    static class DBStatistic {
+        boolean summary;
+        Database db;
+        int dbNum;
+        int tableNum;
+        int partitionNum;
+        int indexNum;
+        int tabletNum;
+        int replicaNum;
+        int unhealthyTabletNum;
+        int inconsistentTabletNum;
+        int cloningTabletNum;
+        int badTabletNum;
+        Set<Long> unhealthyTabletIds;
+        Set<Long> inconsistentTabletIds;
+        Set<Long> cloningTabletIds;
+        Set<Long> unrecoverableTabletIds;
+
+        DBStatistic() {
+            this.summary = true;
+        }
 
-            ++totalDbNum;
+        DBStatistic(Database db) {
+            Preconditions.checkNotNull(db);
+            this.summary = false;
+            this.db = db;
+            this.dbNum = 1;
+            this.unhealthyTabletIds = new HashSet<>();
+            this.inconsistentTabletIds = new HashSet<>();
+            this.unrecoverableTabletIds = new HashSet<>();
+            this.cloningTabletIds = AgentTaskQueue.getTask(db.getId(), TTaskType.CLONE)
+                    .stream().map(AgentTask::getTabletId).collect(Collectors.toSet());
+
+            SystemInfoService infoService = Catalog.getCurrentSystemInfo();
             List<Long> aliveBeIdsInCluster = infoService.getClusterBackendIds(db.getClusterName(), true);
             db.readLock();
             try {
-                int dbTableNum = 0;
-                int dbPartitionNum = 0;
-                int dbIndexNum = 0;
-                int dbTabletNum = 0;
-                int dbReplicaNum = 0;
-
-                for (Table table : db.getTables()) {
-                    if (table.getType() != TableType.OLAP) {
-                        continue;
-                    }
-
-                    ++dbTableNum;
-                    OlapTable olapTable = (OlapTable) table;
-                    table.readLock();
+                db.getTables().stream().filter(t -> t != null && t.getType() == TableType.OLAP).forEach(t -> {
+                    ++tableNum;
+                    OlapTable olapTable = (OlapTable) t;
+                    olapTable.readLock();
                     try {
                         for (Partition partition : olapTable.getAllPartitions()) {
                             ReplicaAllocation replicaAlloc = olapTable.getPartitionInfo().getReplicaAllocation(partition.getId());
-                            ++dbPartitionNum;
+                            ++partitionNum;
                             for (MaterializedIndex materializedIndex : partition.getMaterializedIndices(IndexExtState.VISIBLE)) {
-                                ++dbIndexNum;
+                                ++indexNum;
                                 for (Tablet tablet : materializedIndex.getTablets()) {
-                                    ++dbTabletNum;
-                                    dbReplicaNum += tablet.getReplicas().size();
+                                    ++tableNum;

Review comment:
       ```suggestion
                                       ++tabletNum;
   ```
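
To make the effect of the suggestion concrete, here is a small sketch under simplifying assumptions: `CounterSketch` is a hypothetical class, and each table is modelled as a plain list of tablet ids rather than the partition and index hierarchy in the diff. Incrementing `tableNum` in the innermost loop would inflate the table count and leave `tabletNum` at zero, whereas the suggested line keeps the two counters separate.

```java
import java.util.Arrays;
import java.util.List;

class CounterSketch {
    int tableNum;
    int tabletNum;

    // Hypothetical model: each table is a list of tablet ids.
    void count(List<List<Long>> tables) {
        for (List<Long> tablets : tables) {
            ++tableNum;          // once per table
            for (Long ignored : tablets) {
                ++tabletNum;     // once per tablet, as in the suggestion
            }
        }
    }

    public static void main(String[] args) {
        CounterSketch sketch = new CounterSketch();
        sketch.count(Arrays.asList(Arrays.asList(1L, 2L, 3L), Arrays.asList(4L)));
        System.out.println(sketch.tableNum + " tables, " + sketch.tabletNum + " tablets");
    }
}
```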






[GitHub] [incubator-doris] caiconghui merged pull request #6567: [Performance] Improve show proc statistic #6477

Posted by GitBox <gi...@apache.org>.
caiconghui merged pull request #6567:
URL: https://github.com/apache/incubator-doris/pull/6567


   

