Posted to commits@doris.apache.org by ji...@apache.org on 2022/07/22 01:45:38 UTC

[doris-website] branch master updated: tablet repair and balance fix

This is an automated email from the ASF dual-hosted git repository.

jiafengzheng pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris-website.git


The following commit(s) were added to refs/heads/master by this push:
     new d62aaa7ed0 tablet repair and balance fix
     new ae431ef0b5 Merge branch 'master' of https://github.com/apache/doris-website
d62aaa7ed0 is described below

commit d62aaa7ed0aeb4b364415c7b34b95ffc8d7d7071
Author: jiafeng.zhang <zh...@gmail.com>
AuthorDate: Fri Jul 22 09:44:58 2022 +0800

    tablet repair and balance fix
    
    tablet repair and balance fix
---
 .../maint-monitor/tablet-repair-and-balance.md     | 47 ++++++++++++----------
 .../import/import-way/stream-load-manual.md        |  2 +-
 .../Show-Statements/SHOW-STREAM-LOAD.md            |  2 +-
 .../maint-monitor/tablet-repair-and-balance.md     | 43 +++++++++++---------
 .../import/import-way/stream-load-manual.md        |  2 +-
 .../Show-Statements/SHOW-STREAM-LOAD.md            |  2 +-
 6 files changed, 52 insertions(+), 46 deletions(-)

diff --git a/docs/admin-manual/maint-monitor/tablet-repair-and-balance.md b/docs/admin-manual/maint-monitor/tablet-repair-and-balance.md
index 44898313db..b29dace40f 100644
--- a/docs/admin-manual/maint-monitor/tablet-repair-and-balance.md
+++ b/docs/admin-manual/maint-monitor/tablet-repair-and-balance.md
@@ -267,34 +267,36 @@ Duplicate status view mainly looks at the status of the duplicate, as well as th
 
 1. Global state checking
 
-	Through `SHOW PROC'/ statistic'; `commands can view the replica status of the entire cluster.
+  The `SHOW PROC '/cluster_health/tablet_health';` command shows the replica status of the entire cluster.
 
-    ```
-	+----------+-----------------------------+----------+--------------+----------+-----------+------------+--------------------+-----------------------+
-	| DbId     | DbName                      | TableNum | PartitionNum | IndexNum | TabletNum | ReplicaNum | UnhealthyTabletNum | InconsistentTabletNum |
-	+----------+-----------------------------+----------+--------------+----------+-----------+------------+--------------------+-----------------------+
-	| 35153636 | default_cluster:DF_Newrisk  | 3        | 3            | 3        | 96        | 288        | 0                  | 0                     |
-	| 48297972 | default_cluster:PaperData   | 0        | 0            | 0        | 0         | 0          | 0                  | 0                     |
-	| 5909381  | default_cluster:UM_TEST     | 7        | 7            | 10       | 320       | 960        | 1                  | 0                     |
-	| Total    | 240                         | 10       | 10           | 13       | 416       | 1248       | 1                  | 0                     |
-	+----------+-----------------------------+----------+--------------+----------+-----------+------------+--------------------+-----------------------+
-    ```
+   ```
+      +-------+--------------------------------+-----------+------------+-------------------+----------------------+----------------------+--------------+----------------------------+-------------------------+-------------------+---------------------+----------------------+----------------------+------------------+-----------------------------+-----------------+-------------+------------+
+      | DbId  | DbName                         | TabletNum | HealthyNum | ReplicaMissingNum | VersionIncompleteNum | ReplicaRelocatingNum | RedundantNum | ReplicaMissingInClusterNum | ReplicaMissingForTagNum | ForceRedundantNum | ColocateMismatchNum | ColocateRedundantNum | NeedFurtherRepairNum | UnrecoverableNum | ReplicaCompactionTooSlowNum | InconsistentNum | OversizeNum | CloningNum |
+      +-------+--------------------------------+-----------+------------+-------------------+----------------------+----------------------+--------------+----------------------------+-------------------------+-------------------+---------------------+----------------------+----------------------+------------------+-----------------------------+-----------------+-------------+------------+
+      | 10005 | default_cluster:doris_audit_db | 84        | 84         | 0                 | 0                    | 0                    | 0            | 0                          | 0                       | 0                 | 0                   | 0                    | 0                    | 0                | 0                           | 0               | 0           | 0          |
+      | 13402 | default_cluster:ssb1           | 709       | 708        | 1                 | 0                    | 0                    | 0            | 0                          | 0                       | 0                 | 0                   | 0                    | 0                    | 0                | 0                           | 0               | 0           | 0          |
+      | 10108 | default_cluster:tpch1          | 278       | 278        | 0                 | 0                    | 0                    | 0            | 0                          | 0                       | 0                 | 0                   | 0                    | 0                    | 0                | 0                           | 0               | 0           | 0          |
+      | Total | 3                              | 1071      | 1070       | 1                 | 0                    | 0                    | 0            | 0                          | 0                       | 0                 | 0                   | 0                    | 0                    | 0                | 0                           | 0               | 0           | 0          |
+      +-------+--------------------------------+-----------+------------+-------------------+----------------------+----------------------+--------------+----------------------------+-------------------------+-------------------+---------------------+----------------------+----------------------+------------------+-----------------------------+-----------------+-------------+------------+
+   ```
 
-	The `UnhealthyTabletNum` column shows how many Tablets are in an unhealthy state in the corresponding database. `The Inconsistent Tablet Num` column shows how many Tablets are in an inconsistent replica state in the corresponding database. The last `Total` line counts the entire cluster. Normally `Unhealth Tablet Num` and `Inconsistent Tablet Num` should be 0. If it's not zero, you can further see which Tablets are there. As shown in the figure above, one table in the UM_TEST database i [...]
+  The `HealthyNum` column shows how many Tablets in the corresponding database are healthy. The `ReplicaCompactionTooSlowNum` column shows how many Tablets have too many versions, and the `InconsistentNum` column shows how many Tablets have inconsistent replicas. The last `Total` row summarizes the entire cluster. Normally `TabletNum` and `HealthyNum` should be equal. If they are not, you can further see which  [...]
 
-	`SHOW PROC '/statistic/5909381';`
+  ```sql
+  SHOW PROC '/cluster_health/tablet_health/13402';
+  ```
 
-	Among them `5909381'is the corresponding DbId.
+  Here `13402` is the corresponding DbId.
 
-    ```
-	+------------------+---------------------+
-	| UnhealthyTablets | InconsistentTablets |
-	+------------------+---------------------+
-	| [40467980]       | []                  |
-	+------------------+---------------------+
-    ```
+   ```
+  +-----------------------+--------------------------+--------------------------+------------------+--------------------------------+-----------------------------+-----------------------+-------------------------+--------------------------+--------------------------+----------------------+---------------------------------+---------------------+-----------------+
+  | ReplicaMissingTablets | VersionIncompleteTablets | ReplicaRelocatingTablets | RedundantTablets | ReplicaMissingInClusterTablets | ReplicaMissingForTagTablets | ForceRedundantTablets | ColocateMismatchTablets | ColocateRedundantTablets | NeedFurtherRepairTablets | UnrecoverableTablets | ReplicaCompactionTooSlowTablets | InconsistentTablets | OversizeTablets |
+  +-----------------------+--------------------------+--------------------------+------------------+--------------------------------+-----------------------------+-----------------------+-------------------------+--------------------------+--------------------------+----------------------+---------------------------------+---------------------+-----------------+
+  | 14679                 |                          |                          |                  |                                |                             |                       |                         |                          |                          |                      |                                 |                     |                 |
+  +-----------------------+--------------------------+--------------------------+------------------+--------------------------------+-----------------------------+-----------------------+-------------------------+--------------------------+--------------------------+----------------------+---------------------------------+---------------------+-----------------+
+   ```
 
-	The figure above shows the specific unhealthy Tablet ID (40467980). Later we'll show you how to view the status of each copy of a specific Tablet.
+  The output above shows the specific unhealthy Tablet ID (14679). Later we'll show how to view the status of each replica of a specific Tablet.
 
 2. Table (partition) level status checking
 
@@ -765,6 +767,7 @@ This section describes how to control and manage the progress of replica repair
 
     This operation may cause some import tasks to fail during balancing (requiring a retry), but it will speed up balancing significantly.
     
+
 Overall, when we need to bring the cluster back to a normal state quickly, consider handling it along the following lines.
 
 1. Find the tablet that is causing the high-priority task to report an error and set the problematic replica to bad.
diff --git a/docs/data-operate/import/import-way/stream-load-manual.md b/docs/data-operate/import/import-way/stream-load-manual.md
index 0b0a67913b..17c0c4276d 100644
--- a/docs/data-operate/import/import-way/stream-load-manual.md
+++ b/docs/data-operate/import/import-way/stream-load-manual.md
@@ -302,7 +302,7 @@ Users can't cancel Stream load manually. Stream load will be cancelled automatic
 
 Users can view completed stream load tasks through `show stream load`.
 
-By default, BE does not record Stream Load records. If you want to view the records that need to be enabled on BE `enable_stream_load_record=true`, you need to restart BE here
+By default, BE does not keep Stream Load records. To view them, enable recording on BE with the configuration parameter `enable_stream_load_record=true`. For details, please refer to [BE Configuration Items](https://doris.apache.org/zh-CN/docs/admin-manual/config/be-config)
 
 ## Relevant System Configuration
 
diff --git a/docs/sql-manual/sql-reference/Show-Statements/SHOW-STREAM-LOAD.md b/docs/sql-manual/sql-reference/Show-Statements/SHOW-STREAM-LOAD.md
index d108a4b879..bb74e405d1 100644
--- a/docs/sql-manual/sql-reference/Show-Statements/SHOW-STREAM-LOAD.md
+++ b/docs/sql-manual/sql-reference/Show-Statements/SHOW-STREAM-LOAD.md
@@ -50,7 +50,7 @@ SHOW STREAM LOAD
 
 illustrate:
 
-1. BE does not record Stream Load records. If you want to view the need to re-enable the configuration `enable_stream_load_record=true`, you need to restart BE here
+1. By default, BE does not keep Stream Load records. To view them, enable recording on BE with the configuration parameter `enable_stream_load_record=true`. For details, please refer to [BE Configuration Items](https://doris.apache.org/zh-CN/docs/admin-manual/config/be-config)
 1. If db_name is not specified, the current default db is used
 2. If LABEL LIKE is used, it will match the tasks whose label of the Stream Load task contains label_matcher
 3. If LABEL = is used, it will match the specified label exactly
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/admin-manual/maint-monitor/tablet-repair-and-balance.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/admin-manual/maint-monitor/tablet-repair-and-balance.md
index bfa095c8a1..f5ab675841 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/admin-manual/maint-monitor/tablet-repair-and-balance.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/admin-manual/maint-monitor/tablet-repair-and-balance.md
@@ -265,34 +265,36 @@ In each scheduling round, TabletScheduler uses LoadBalancer to select a certain
 
 1. Global state checking
 
-    The `SHOW PROC '/statistic';` command shows the replica status of the entire cluster.
+    The `SHOW PROC '/cluster_health/tablet_health';` command shows the replica status of the entire cluster.
 
     ```
-    +----------+-----------------------------+----------+--------------+----------+-----------+------------+--------------------+-----------------------+
-    | DbId     | DbName                      | TableNum | PartitionNum | IndexNum | TabletNum | ReplicaNum | UnhealthyTabletNum | InconsistentTabletNum |
-    +----------+-----------------------------+----------+--------------+----------+-----------+------------+--------------------+-----------------------+
-    | 35153636 | default_cluster:DF_Newrisk  | 3        | 3            | 3        | 96        | 288        | 0                  | 0                     |
-    | 48297972 | default_cluster:PaperData   | 0        | 0            | 0        | 0         | 0          | 0                  | 0                     |
-    | 5909381  | default_cluster:UM_TEST     | 7        | 7            | 10       | 320       | 960        | 1                  | 0                     |
-    | Total    | 240                         | 10       | 10           | 13       | 416       | 1248       | 1                  | 0                     |
-    +----------+-----------------------------+----------+--------------+----------+-----------+------------+--------------------+-----------------------+
+        +-------+--------------------------------+-----------+------------+-------------------+----------------------+----------------------+--------------+----------------------------+-------------------------+-------------------+---------------------+----------------------+----------------------+------------------+-----------------------------+-----------------+-------------+------------+
+        | DbId  | DbName                         | TabletNum | HealthyNum | ReplicaMissingNum | VersionIncompleteNum | ReplicaRelocatingNum | RedundantNum | ReplicaMissingInClusterNum | ReplicaMissingForTagNum | ForceRedundantNum | ColocateMismatchNum | ColocateRedundantNum | NeedFurtherRepairNum | UnrecoverableNum | ReplicaCompactionTooSlowNum | InconsistentNum | OversizeNum | CloningNum |
+        +-------+--------------------------------+-----------+------------+-------------------+----------------------+----------------------+--------------+----------------------------+-------------------------+-------------------+---------------------+----------------------+----------------------+------------------+-----------------------------+-----------------+-------------+------------+
+        | 10005 | default_cluster:doris_audit_db | 84        | 84         | 0                 | 0                    | 0                    | 0            | 0                          | 0                       | 0                 | 0                   | 0                    | 0                    | 0                | 0                           | 0               | 0           | 0          |
+        | 13402 | default_cluster:ssb1           | 709       | 708        | 1                 | 0                    | 0                    | 0            | 0                          | 0                       | 0                 | 0                   | 0                    | 0                    | 0                | 0                           | 0               | 0           | 0          |
+        | 10108 | default_cluster:tpch1          | 278       | 278        | 0                 | 0                    | 0                    | 0            | 0                          | 0                       | 0                 | 0                   | 0                    | 0                    | 0                | 0                           | 0               | 0           | 0          |
+        | Total | 3                              | 1071      | 1070       | 1                 | 0                    | 0                    | 0            | 0                          | 0                       | 0                 | 0                   | 0                    | 0                    | 0                | 0                           | 0               | 0           | 0          |
+        +-------+--------------------------------+-----------+------------+-------------------+----------------------+----------------------+--------------+----------------------------+-------------------------+-------------------+---------------------+----------------------+----------------------+------------------+-----------------------------+-----------------+-------------+------------+
     ```
 
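-    The `UnhealthyTabletNum` column shows how many Tablets in the corresponding database are unhealthy. The `InconsistentTabletNum` column shows how many Tablets have inconsistent replicas. The last `Total` row summarizes the entire cluster. Normally `UnhealthyTabletNum` and `InconsistentTabletNum` should be 0. If they are not, you can further see which Tablets are affected. In the output above, one Tablet in the UM_TEST database is unhealthy, so you can use the following command to find out which one.
+    The `HealthyNum` column shows how many Tablets in the corresponding database are healthy. The `ReplicaCompactionTooSlowNum` column shows how many Tablets have too many replica versions, and the `InconsistentNum` column shows how many Tablets have inconsistent replicas. The last `Total` row summarizes the entire cluster. Normally `TabletNum` and `HealthyNum` should be equal. If they are not, you can further see which Tablets are affected. In the output above, one Tablet in the ssb1 database is unhealthy, so you can use the following command to find out which one.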
     
-    `SHOW PROC '/statistic/5909381';`
-    
-    Here `5909381` is the corresponding DbId.
-
     ```
-    +------------------+---------------------+
-    | UnhealthyTablets | InconsistentTablets |
-    +------------------+---------------------+
-    | [40467980]       | []                  |
-    +------------------+---------------------+
+    SHOW PROC '/cluster_health/tablet_health/13402';
     ```
 
-    The output above shows the specific unhealthy Tablet ID (40467980). Later we'll show how to view the status of each replica of a specific Tablet.
+    Here `13402` is the corresponding DbId.
+    
+    ```
+     +-----------------------+--------------------------+--------------------------+------------------+--------------------------------+-----------------------------+-----------------------+-------------------------+--------------------------+--------------------------+----------------------+---------------------------------+---------------------+-----------------+
+       | ReplicaMissingTablets | VersionIncompleteTablets | ReplicaRelocatingTablets | RedundantTablets | ReplicaMissingInClusterTablets | ReplicaMissingForTagTablets | ForceRedundantTablets | ColocateMismatchTablets | ColocateRedundantTablets | NeedFurtherRepairTablets | UnrecoverableTablets | ReplicaCompactionTooSlowTablets | InconsistentTablets | OversizeTablets |
+       +-----------------------+--------------------------+--------------------------+------------------+--------------------------------+-----------------------------+-----------------------+-------------------------+--------------------------+--------------------------+----------------------+---------------------------------+---------------------+-----------------+
+       | 14679                 |                          |                          |                  |                                |                             |                       |                         |                          |                          |                      |                                 |                     |                 |
+       +-----------------------+--------------------------+--------------------------+------------------+--------------------------------+-----------------------------+-----------------------+-------------------------+--------------------------+--------------------------+----------------------+---------------------------------+---------------------+-----------------+
+    ```
+    
+    The output above shows the specific unhealthy Tablet ID (14679), which is in the ReplicaMissing state. Later we'll show how to view the status of each replica of a specific Tablet.
     
 2. Table (partition) level status checking
    
@@ -763,6 +765,7 @@ In each scheduling round, TabletScheduler uses LoadBalancer to select a certain
 
     This operation may cause some import tasks to fail during balancing (requiring a retry), but it will speed up balancing significantly.
     
+
 Overall, when we need to bring the cluster back to a normal state quickly, consider handling it along the following lines:
 
 1. Find the tablet that is causing the high-priority task to report an error and set the problematic replica to bad.
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/data-operate/import/import-way/stream-load-manual.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/data-operate/import/import-way/stream-load-manual.md
index 606ca5f2a2..35d8048a0c 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/data-operate/import/import-way/stream-load-manual.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/data-operate/import/import-way/stream-load-manual.md
@@ -314,7 +314,7 @@ Because Stream Load uses the HTTP protocol, all import-task-related
 
 Users can view completed stream load tasks through `show stream load`.
 
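-By default, BE does not keep Stream Load records. To view them, enable recording on BE with `enable_stream_load_record=true`; BE must be restarted.
+By default, BE does not keep Stream Load records. To view them, enable recording on BE with the configuration parameter `enable_stream_load_record=true`. For configuration details, see [BE Configuration Items](https://doris.apache.org/zh-CN/docs/admin-manual/config/be-config)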
 
 ## 相关系统配置
 
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-reference/Show-Statements/SHOW-STREAM-LOAD.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-reference/Show-Statements/SHOW-STREAM-LOAD.md
index 87da5910b3..75936198ca 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-reference/Show-Statements/SHOW-STREAM-LOAD.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-reference/Show-Statements/SHOW-STREAM-LOAD.md
@@ -50,7 +50,7 @@ SHOW STREAM LOAD
 
 Description:
 
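-1. By default, BE does not keep Stream Load records. To view them, enable the configuration `enable_stream_load_record=true` on BE; BE must be restarted.
+1. By default, BE does not keep Stream Load records. To view them, enable recording on BE with the configuration parameter `enable_stream_load_record=true`. For configuration details, see [BE Configuration Items](https://doris.apache.org/zh-CN/docs/admin-manual/config/be-config)
 1. If db_name is not specified, the current default db is used
 2. If LABEL LIKE is used, it matches Stream Load tasks whose label contains label_matcher
 3. If LABEL = is used, it matches the specified label exactly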
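
The health check described in the updated tablet-repair document (compare `TabletNum` with `HealthyNum` for each database, then drill into the database with `SHOW PROC '/cluster_health/tablet_health/<DbId>';`) can be sketched as a small script. This is only an illustration and not part of the commit: it assumes the table text was copied from a MySQL client session, the helper names `parse_proc_table` and `unhealthy_databases` are hypothetical, and the sample is trimmed to the first four columns of the output shown in the diff.

```python
def parse_proc_table(text):
    """Parse MySQL-client style table text into a list of dicts keyed by column name."""
    rows = []
    header = None
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and +----+ border lines; keep only | ... | rows.
        if not (line.startswith("|") and line.endswith("|")):
            continue
        cells = [cell.strip() for cell in line[1:-1].split("|")]
        if header is None:
            header = cells  # the first data-like row is the header
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def unhealthy_databases(rows):
    """Return (DbId, DbName, unhealthy_count) where TabletNum != HealthyNum."""
    result = []
    for row in rows:
        if row["DbId"] == "Total":
            continue  # the summary row is not a database
        diff = int(row["TabletNum"]) - int(row["HealthyNum"])
        if diff != 0:
            result.append((row["DbId"], row["DbName"], diff))
    return result


# Sample trimmed from the output in the diff above (first four columns only).
SAMPLE = """
+-------+--------------------------------+-----------+------------+
| DbId  | DbName                         | TabletNum | HealthyNum |
+-------+--------------------------------+-----------+------------+
| 10005 | default_cluster:doris_audit_db | 84        | 84         |
| 13402 | default_cluster:ssb1           | 709       | 708        |
| 10108 | default_cluster:tpch1          | 278       | 278        |
| Total | 3                              | 1071      | 1070       |
+-------+--------------------------------+-----------+------------+
"""

for db_id, db_name, count in unhealthy_databases(parse_proc_table(SAMPLE)):
    # Print the follow-up statement an operator would run next.
    print(f"SHOW PROC '/cluster_health/tablet_health/{db_id}';  -- {db_name}: {count} unhealthy")
```

With the sample above, only the `ssb1` database (DbId `13402`) is reported, which matches the follow-up statement used in the document.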

