Posted to commits@doris.apache.org by yi...@apache.org on 2022/07/18 08:02:35 UTC

[doris] branch master updated: [Docs] add doc of tablet local debug (#10944)

This is an automated email from the ASF dual-hosted git repository.

yiguolei pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris.git


The following commit(s) were added to refs/heads/master by this push:
     new d199283df0 [Docs] add doc of tablet local debug (#10944)
d199283df0 is described below

commit d199283df08bfe722da37f1f97c91529f7bfb7a0
Author: HappenLee <ha...@hotmail.com>
AuthorDate: Mon Jul 18 16:02:29 2022 +0800

    [Docs] add doc of tablet local debug (#10944)
    
    Co-authored-by: lihaopeng <li...@baidu.com>
---
 docs/.vuepress/sidebar/en/docs.js                  |  1 +
 docs/.vuepress/sidebar/zh-CN/docs.js               |  1 +
 .../maint-monitor/tablet-local-debug.md            | 81 ++++++++++++++++++++++
 .../maint-monitor/tablet-local-debug.md            | 81 ++++++++++++++++++++++
 4 files changed, 164 insertions(+)

diff --git a/docs/.vuepress/sidebar/en/docs.js b/docs/.vuepress/sidebar/en/docs.js
index 111470a313..8409409e5e 100644
--- a/docs/.vuepress/sidebar/en/docs.js
+++ b/docs/.vuepress/sidebar/en/docs.js
@@ -887,6 +887,7 @@ module.exports = [
           "disk-capacity",
           "metadata-operation",
           "tablet-meta-tool",
+          "tablet-local-debug",
           "tablet-repair-and-balance",
           "tablet-restore-tool",
           "monitor-alert",
diff --git a/docs/.vuepress/sidebar/zh-CN/docs.js b/docs/.vuepress/sidebar/zh-CN/docs.js
index a4ef26fc7d..9373af45a1 100644
--- a/docs/.vuepress/sidebar/zh-CN/docs.js
+++ b/docs/.vuepress/sidebar/zh-CN/docs.js
@@ -887,6 +887,7 @@ module.exports = [
           "disk-capacity",
           "metadata-operation",
           "tablet-meta-tool",
+          "tablet-local-debug",
           "tablet-repair-and-balance",
           "tablet-restore-tool",
           "monitor-alert",
diff --git a/docs/en/docs/admin-manual/maint-monitor/tablet-local-debug.md b/docs/en/docs/admin-manual/maint-monitor/tablet-local-debug.md
new file mode 100644
index 0000000000..c47d4320da
--- /dev/null
+++ b/docs/en/docs/admin-manual/maint-monitor/tablet-local-debug.md
@@ -0,0 +1,81 @@
+
+While Doris is running online, bugs can occur for many reasons, for example inconsistent replicas or version diffs in the data.
+
+In such cases, copy the replica data of the online tablet to a local environment, reproduce the problem there, and then locate it.
+
+## 1\. Prepare the environment
+
+Deploy a single-node Doris cluster locally, with the same deployment version as the online cluster.
+
+If the online deployment is DORIS-1.0.1, deploy DORIS-1.0.1 in the local environment as well.
+
+After deploying the cluster, create a local table identical to the online one, with two changes: the local table has only one replica of the tablet, and `version` and `version_hash` must be specified.
+
+If you are investigating inconsistent data, you can create three separate tables, one for each online replica.
+
+## 2\. Copy Data
+
+Find the machine that hosts the replica of the online tablet. The following two commands locate it.
+
+* Show the tablet metadata:
+```show tablet 10011```
+```
+mysql> show tablet 10011;
++-------------------------------+-----------+---------------+-----------+-------+---------+-------------+---------+--------+-------+------------------------------------------------------------+
+| DbName                        | TableName | PartitionName | IndexName | DbId  | TableId | PartitionId | IndexId | IsSync | Order | DetailCmd                                                  |
++-------------------------------+-----------+---------------+-----------+-------+---------+-------------+---------+--------+-------+------------------------------------------------------------+
+| default_cluster:test_query_qa | baseall   | baseall       | baseall   | 10007 | 10009   | 10008       | 10010   | true   | 0     | SHOW PROC '/dbs/10007/10009/partitions/10008/10010/10011'; |
++-------------------------------+-----------+---------------+-----------+-------+---------+-------------+---------+--------+-------+------------------------------------------------------------+
+1 row in set (0.00 sec)
+```
+* Execute the `DetailCmd` command to locate the BE IP:
+
+```SHOW PROC '/dbs/10007/10009/partitions/10008/10010/10011';```
+```
+mysql> SHOW PROC '/dbs/10007/10009/partitions/10008/10010/10011';
++-----------+-----------+---------+-------------------+------------------+---------------+------------+----------+----------+--------+-------+--------------+----------------------+-------------------------------------------------+---------------------------------------------------------------+
+| ReplicaId | BackendId | Version | LstSuccessVersion | LstFailedVersion | LstFailedTime | SchemaHash | DataSize | RowCount | State  | IsBad | VersionCount | PathHash             | MetaUrl                                         | CompactionStatus                                              |
++-----------+-----------+---------+-------------------+------------------+---------------+------------+----------+----------+--------+-------+--------------+----------------------+-------------------------------------------------+---------------------------------------------------------------+
+| 10012     | 10003     | 2       | 2                 | -1               | NULL          | 945014548  | 2195     | 5        | NORMAL | false | 2            | 844142863681807094   | http://192.168.0.2:8001/api/meta/header/10011  | http://192.168.0.2:8001/api/compaction/show?tablet_id=10011  |
+| 10013     | 10002     | 2       | 2                 | -1               | NULL          | 945014548  | 2195     | 5        | NORMAL | false | 2            | -6740067817150249792 | http://192.168.0.1:8001/api/meta/header/10011  | http://192.168.0.1:8001/api/compaction/show?tablet_id=10011  |
+| 10014     | 10005     | 2       | 2                 | -1               | NULL          | 945014548  | 2195     | 5        | NORMAL | false | 2            | 4758004238194195485  | http://192.168.0.3:8001/api/meta/header/10011  | http://192.168.0.3:8001/api/compaction/show?tablet_id=10011  |
++-----------+-----------+---------+-------------------+------------------+---------------+------------+----------+----------+--------+-------+--------------+----------------------+-------------------------------------------------+---------------------------------------------------------------+
+3 rows in set (0.01 sec)
+```
+* Log in to the corresponding machine and find the directory where the replica is located.
+
+```
+ll ./data.HDD/data/0/10011/945014548/
+total 4
+-rw-rw-r-- 1 palo-qa palo-qa 2195 Jul 14 20:16 0200000000000018bb4a69226c414ace42487209dc145dbb_0.dat
+```
+
+Then copy the directory to the corresponding local directory with the `scp` command.
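The lookup and copy in this step can be sketched in Python as follows. This is only an illustration: the helper names `be_address` and `scp_command` are hypothetical, and both directory paths are placeholders that must be replaced with the real replica directory on the online BE and the matching tablet directory of the local BE.

```python
from typing import List, Tuple
from urllib.parse import urlparse


def be_address(meta_url: str) -> Tuple[str, int]:
    """Return (host, http_port) parsed from a MetaUrl column value."""
    parsed = urlparse(meta_url.strip())
    return parsed.hostname, parsed.port


def scp_command(host: str, remote_dir: str, local_dir: str) -> List[str]:
    """Build an scp argument list that recursively copies the replica directory."""
    return ["scp", "-r", f"{host}:{remote_dir}", local_dir]


# MetaUrl taken from the SHOW PROC output above.
host, port = be_address("http://192.168.0.2:8001/api/meta/header/10011")

# Both paths are placeholders (assumptions), not values from a real cluster.
cmd = scp_command(
    host,
    "/path/to/online/be/data.HDD/data/0/10011/945014548",
    "/path/to/local/replica/dir",
)
print(" ".join(cmd))
```

Building the command as a list keeps it usable with `subprocess.run(cmd, check=True)` if you prefer to run the copy from the script rather than paste it into a shell.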
+
+## 3\. Download meta data
+
+Get the metadata of the replica:
+
+```
+wget "http://host:be_http_port/api/meta/header/$tablet_id?byte_to_base64=true" -O meta_data
+```
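The `MetaUrl` column from step 2 already contains the header URL, so the download URL can be derived from it. A minimal Python sketch, assuming the BE HTTP port is reachable from where the script runs; the function names are hypothetical and `meta_data` simply mirrors the output file name of the `wget` command:

```python
from urllib.request import urlretrieve


def meta_download_url(meta_url: str) -> str:
    """Append the byte_to_base64=true query used by the wget command."""
    return f"{meta_url}?byte_to_base64=true"


def download_meta(meta_url: str, out_path: str = "meta_data") -> str:
    """Fetch the tablet metadata and save it to out_path; returns the path."""
    path, _headers = urlretrieve(meta_download_url(meta_url), out_path)
    return path


# Usage (needs network access to the BE, so it is left commented out):
# download_meta("http://192.168.0.2:8001/api/meta/header/10011")
print(meta_download_url("http://192.168.0.2:8001/api/meta/header/10011"))
```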
+
+## 4\. Modify meta data
+
+Modify the metadata downloaded from the online cluster so that the online data copied in step 2 can be recognized locally.
+
+In the same way, download the tablet metadata from the locally deployed cluster and, following the correspondence between the two, change `table_id`, `tablet_id`, `partition_id`, `schema_hash`, and `shard_id` in the online metadata to the local values. Other fields do not need to be changed.
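The field replacement described here could be scripted roughly as below. This is a sketch, not an official tool: it assumes the downloaded metadata is a JSON object in which the five fields appear as top-level keys, and the function names, file paths, and sample values (other than the tablet IDs and schema hashes quoted in this document) are made up for illustration.

```python
import json

# Fields that must take the values of the local tablet.
LOCAL_FIELDS = ("table_id", "tablet_id", "partition_id", "schema_hash", "shard_id")


def patch_meta(online_meta: dict, local_meta: dict) -> dict:
    """Return a copy of online_meta whose identifying fields come from local_meta."""
    patched = dict(online_meta)
    for field in LOCAL_FIELDS:
        patched[field] = local_meta[field]
    return patched


def patch_meta_file(online_path: str, local_path: str, out_path: str) -> None:
    """Read both metadata files, patch the online one, and write the result."""
    with open(online_path) as f:
        online_meta = json.load(f)
    with open(local_path) as f:
        local_meta = json.load(f)
    with open(out_path, "w") as f:
        json.dump(patch_meta(online_meta, local_meta), f, indent=2)


# Tiny stand-ins for the two metadata files; "other" represents every
# field that is left untouched.
online = {"table_id": 10009, "partition_id": 10008, "tablet_id": 10011,
          "schema_hash": 945014548, "shard_id": 0, "other": "unchanged"}
local = {"table_id": 20001, "partition_id": 20002, "tablet_id": 10027,
         "schema_hash": 112641656, "shard_id": 3, "other": "local-only"}

patched = patch_meta(online, local)
print(json.dumps(patched, indent=2))
```

The patched file written by `patch_meta_file` is what would be passed to `--json_meta_path` in step 5.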
+
+## 5\. Take effect
+
+(1) Stop the local BE so that the new metadata can take effect.
+
+(2) Use the following commands to delete the existing metadata in the local BE and load the new metadata.
+```
+./lib/meta_tool --root_path=/home/doris/be --operation=get_meta --tablet_id=10027 --schema_hash=112641656
+./lib/meta_tool --root_path=/home/doris/be --operation=delete_meta --tablet_id=10027 --schema_hash=112641656
+./lib/meta_tool --root_path=/home/doris/be --operation=load_meta --json_meta_path=/home/doris/error/tablet/112641656_1
+
+```
+
+(3) Restart the BE and query the corresponding data.
diff --git a/docs/zh-CN/docs/admin-manual/maint-monitor/tablet-local-debug.md b/docs/zh-CN/docs/admin-manual/maint-monitor/tablet-local-debug.md
new file mode 100644
index 0000000000..e8aedfde65
--- /dev/null
+++ b/docs/zh-CN/docs/admin-manual/maint-monitor/tablet-local-debug.md
@@ -0,0 +1,81 @@
+
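+While Doris is running online, various bugs can occur for various reasons, for example inconsistent replicas or version diffs in the data.
+
+In such cases, the replica data of the online tablet needs to be copied to a local environment to reproduce the problem and then locate it.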
+
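+## 1\. Prepare the environment
+
+Deploy a single-node Doris cluster locally, using the same version as the online cluster.
+
+If the online cluster runs DORIS-1.0.1, deploy DORIS-1.0.1 in the local environment as well.
+
+After deploying the cluster, create a local table identical to the online one, with two changes: the local table has only one replica of the tablet, and version and version_hash must be specified.
+
+If you are investigating inconsistent data, you can create three separate tables, one for each online replica.
+
+## 2\. Copy data
+
+Find the machine that hosts the replica of the online tablet. The following two commands locate it.
+
+* View the tablet metadata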
+```show tablet 10011```
+```
+mysql> show tablet 10011;
++-------------------------------+-----------+---------------+-----------+-------+---------+-------------+---------+--------+-------+------------------------------------------------------------+
+| DbName                        | TableName | PartitionName | IndexName | DbId  | TableId | PartitionId | IndexId | IsSync | Order | DetailCmd                                                  |
++-------------------------------+-----------+---------------+-----------+-------+---------+-------------+---------+--------+-------+------------------------------------------------------------+
+| default_cluster:test_query_qa | baseall   | baseall       | baseall   | 10007 | 10009   | 10008       | 10010   | true   | 0     | SHOW PROC '/dbs/10007/10009/partitions/10008/10010/10011'; |
++-------------------------------+-----------+---------------+-----------+-------+---------+-------------+---------+--------+-------+------------------------------------------------------------+
+1 row in set (0.00 sec)
+```
+* Execute the `DetailCmd` command to locate the BE:
+
+```SHOW PROC '/dbs/10007/10009/partitions/10008/10010/10011';```
+```
+mysql> SHOW PROC '/dbs/10007/10009/partitions/10008/10010/10011';
++-----------+-----------+---------+-------------------+------------------+---------------+------------+----------+----------+--------+-------+--------------+----------------------+-------------------------------------------------+---------------------------------------------------------------+
+| ReplicaId | BackendId | Version | LstSuccessVersion | LstFailedVersion | LstFailedTime | SchemaHash | DataSize | RowCount | State  | IsBad | VersionCount | PathHash             | MetaUrl                                         | CompactionStatus                                              |
++-----------+-----------+---------+-------------------+------------------+---------------+------------+----------+----------+--------+-------+--------------+----------------------+-------------------------------------------------+---------------------------------------------------------------+
+| 10012     | 10003     | 2       | 2                 | -1               | NULL          | 945014548  | 2195     | 5        | NORMAL | false | 2            | 844142863681807094   | http://192.168.0.2:8001/api/meta/header/10011  | http://192.168.0.2:8001/api/compaction/show?tablet_id=10011  |
+| 10013     | 10002     | 2       | 2                 | -1               | NULL          | 945014548  | 2195     | 5        | NORMAL | false | 2            | -6740067817150249792 | http://192.168.0.1:8001/api/meta/header/10011  | http://192.168.0.1:8001/api/compaction/show?tablet_id=10011  |
+| 10014     | 10005     | 2       | 2                 | -1               | NULL          | 945014548  | 2195     | 5        | NORMAL | false | 2            | 4758004238194195485  | http://192.168.0.3:8001/api/meta/header/10011  | http://192.168.0.3:8001/api/compaction/show?tablet_id=10011  |
++-----------+-----------+---------+-------------------+------------------+---------------+------------+----------+----------+--------+-------+--------------+----------------------+-------------------------------------------------+---------------------------------------------------------------+
+3 rows in set (0.01 sec)
+```
+* Log in to the corresponding machine and find the directory where the replica is located.
+
+```
+ll ./data.HDD/data/0/10011/945014548/
+total 4
+-rw-rw-r-- 1 palo-qa palo-qa 2195 Jul 14 20:16 0200000000000018bb4a69226c414ace42487209dc145dbb_0.dat
+```
+
+Then copy the directory to the corresponding local directory with the scp command.
+
+## 3\. Download metadata
+
+Get the metadata of the replica
+
+```
+wget "http://host:be_http_port/api/meta/header/$tablet_id?byte_to_base64=true" -O meta_data
+```
+
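+## 4\. Modify metadata
+
+Modify the metadata downloaded from the online cluster. The purpose of the modification is to make the online data copied in step 2 recognizable.
+
+In the same way, download the tablet metadata from the locally deployed cluster and, following the correspondence between the two, change `table_id, tablet_id, partition_id, schema_hash, shard_id` in the metadata to the local values. Other fields do not need to be changed.
+
+## 5\. Take effect
+
+(1) Stop the local BE so that the metadata can take effect.
+
+(2) Use the following commands to delete the metadata in the local BE and load the new metadata.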
+```
+./lib/meta_tool --root_path=/home/doris/be --operation=get_meta --tablet_id=10027 --schema_hash=112641656
+./lib/meta_tool --root_path=/home/doris/be --operation=delete_meta --tablet_id=10027 --schema_hash=112641656
+./lib/meta_tool --root_path=/home/doris/be --operation=load_meta --json_meta_path=/home/doris/error/tablet/112641656_1
+
+```
+
+(3) Restart the BE and query the corresponding data.

