Posted to commits@iotdb.apache.org by ca...@apache.org on 2022/12/22 00:42:15 UTC

[iotdb] branch master updated: Update FAQ for cluster setup (#8567)

This is an automated email from the ASF dual-hosted git repository.

caogaofei pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/iotdb.git


The following commit(s) were added to refs/heads/master by this push:
     new 656d281150 Update FAQ for cluster setup (#8567)
656d281150 is described below

commit 656d281150299e57bb8b24aa79ef387cd2ff45ac
Author: Beyyes <cg...@foxmail.com>
AuthorDate: Thu Dec 22 08:42:07 2022 +0800

    Update FAQ for cluster setup (#8567)
---
 docs/UserGuide/Cluster/Cluster-Setup.md            |  2 +-
 docs/UserGuide/FAQ/FAQ-for-cluster-setup.md        | 99 ++++++++++++++++++++++
 .../db/mpp/common/header/ColumnHeaderConstant.java |  6 +-
 site/src/main/.vuepress/config.js                  |  2 +
 4 files changed, 106 insertions(+), 3 deletions(-)

diff --git a/docs/UserGuide/Cluster/Cluster-Setup.md b/docs/UserGuide/Cluster/Cluster-Setup.md
index 9f643c4c42..61e5700cde 100644
--- a/docs/UserGuide/Cluster/Cluster-Setup.md
+++ b/docs/UserGuide/Cluster/Cluster-Setup.md
@@ -410,4 +410,4 @@ Run the remove-datanode script on an active DataNode:
 
 # 7. FAQ
 
-See [FAQ](https://iotdb.apache.org/UserGuide/Master/FAQ/Frequently-asked-questions.html)
+See [FAQ](https://iotdb.apache.org/UserGuide/Master/FAQ/FAQ-for-cluster-setup.html)
diff --git a/docs/UserGuide/FAQ/FAQ-for-cluster-setup.md b/docs/UserGuide/FAQ/FAQ-for-cluster-setup.md
new file mode 100644
index 0000000000..b28c062c73
--- /dev/null
+++ b/docs/UserGuide/FAQ/FAQ-for-cluster-setup.md
@@ -0,0 +1,99 @@
+<!--
+
+    Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+    
+        http://www.apache.org/licenses/LICENSE-2.0
+    
+    Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+-->
+
+<!-- TOC -->
+
+# FAQ for Cluster Setup
+
+## 1. Cluster Startup and Shutdown
+
+### 1). ConfigNode fails to start for the first time. How do I find the cause?
+
+- Make sure that the data/confignode directory is empty when starting the ConfigNode for the first time.
+- Make sure that the <IP+Port> used by the ConfigNode is not occupied and does not conflict with other ConfigNodes.
+- Make sure that `cn_target_confignode_list` is configured correctly and points to a live ConfigNode. If the ConfigNode is being started for the first time, make sure that `cn_target_confignode_list` points to itself.
+- Make sure that the configuration (consensus protocol and replica number) of the ConfigNode being started matches that of the ConfigNode specified in `cn_target_confignode_list`.
+
+### 2). ConfigNode started successfully, but why doesn't the node appear in the results of `show cluster`?
+
+- Check whether `cn_target_confignode_list` points to the correct address. If `cn_target_confignode_list` points to the node itself, a new ConfigNode cluster is started.
+
+### 3). DataNode fails to start for the first time. How do I find the cause?
+
+- Make sure that the data/datanode directory is empty when starting the DataNode for the first time. If the startup result is "Reject DataNode restart.", the data/datanode directory has probably not been cleared.
+- Make sure that the <IP+Port> used by the DataNode is not occupied and does not conflict with other DataNodes.
+- Make sure that `dn_target_confignode_list` points to a live ConfigNode.
+
+### 4). Removing a DataNode fails. How do I find the cause?
+
+- Check whether the parameter passed to remove-datanode.sh is correct. Only rpcIp:rpcPort and dataNodeId are valid parameters.
+- The remove operation can be executed only when the number of available DataNodes in the cluster is greater than max(schema_replication_factor, data_replication_factor).
+- Removing a DataNode migrates its data to other live DataNodes. Data migration is performed per Region; if some Regions fail to migrate, the DataNode being removed stays in the `Removing` status.
+- If the DataNode is in the `Removing` status, the Regions on it are also in the `Removing` or `Unknown` status, both of which are unavailable statuses. In addition, the DataNode being removed does not receive new write requests from clients.
+Users can run the command `set system status to running` to change the status of the DataNode from Removing to Running.
+To bring the Regions from Removing back to an available status, users can run the command `migrate region from datanodeId1 to datanodeId2`, which migrates the Regions to other live DataNodes.
+In addition, IoTDB will provide the `remove-datanode.sh -f` command in the next version, which removes DataNodes forcibly (Regions that fail to migrate will be discarded).
+
+### 5). Can a DataNode that is down be removed?
+
+- A DataNode that is down can be removed only when the replication factors of both schema and data are greater than 1.  
+In addition, IoTDB will provide the `remove-datanode.sh -f` option in the next version.
+
+### 6). What should I pay attention to when upgrading from 0.13 to 1.0?
+
+- The file structure differs between 0.13 and 1.0, so the data directory from 0.13 cannot be copied to 1.0 and used directly.
+To load data from 0.13 into 1.0, you can use the LOAD function.
+- The default RPC address in 0.13 is `0.0.0.0`, whereas the default RPC address in 1.0 is `127.0.0.1`.
+
+
+## 2. Cluster Restart
+
+### 1). How do I restart a ConfigNode in the cluster?
+- Step 1: stop the process with stop-confignode.sh or by killing the PID of the ConfigNode.
+- Step 2: run start-confignode.sh to restart the ConfigNode.
+
+### 2). How do I restart a DataNode in the cluster?
+- Step 1: stop the process with stop-datanode.sh or by killing the PID of the DataNode.
+- Step 2: run start-datanode.sh to restart the DataNode.
+
+### 3). Is it possible to restart a ConfigNode with its old data directory after it has been removed?
+- No. The result will be "Reject ConfigNode restart. Because there are no corresponding ConfigNode(whose nodeId=xx) in the cluster".
+
+### 4). Is it possible to restart a DataNode with its old data directory after it has been removed?
+- No. The result will be "Reject DataNode restart. Because there are no corresponding DataNode(whose nodeId=xx) in the cluster. Possible solutions are as follows:...".
+
+### 5). Can start-confignode.sh/start-datanode.sh run successfully if the data directory of a ConfigNode/DataNode is deleted without killing its process?
+- No. The result will be "The port is already occupied".
+
+## 3. Cluster Maintenance
+
+### 1). `show cluster` fails and error logs such as "please check server status" are shown. How do I find the cause?
+- Make sure that more than half of the ConfigNodes are alive.
+- Make sure that the DataNode the client is connected to is alive.
+
+### 2). How do I repair a DataNode whose disk files are corrupted?
+- Use remove-datanode.sh to repair it. The script migrates the data on the DataNode being removed to other live DataNodes.
+- IoTDB will provide Node-Fix tools in the next version.
+
+### 3). How do I reduce the memory usage of a ConfigNode/DataNode?
+- Adjust the MAX_HEAP_SIZE and MAX_DIRECT_MEMORY_SIZE options in conf/confignode-env.sh and conf/datanode-env.sh.
+
+
diff --git a/server/src/main/java/org/apache/iotdb/db/mpp/common/header/ColumnHeaderConstant.java b/server/src/main/java/org/apache/iotdb/db/mpp/common/header/ColumnHeaderConstant.java
index c8fffdb2e7..b625667c75 100644
--- a/server/src/main/java/org/apache/iotdb/db/mpp/common/header/ColumnHeaderConstant.java
+++ b/server/src/main/java/org/apache/iotdb/db/mpp/common/header/ColumnHeaderConstant.java
@@ -98,6 +98,8 @@ public class ColumnHeaderConstant {
   public static final String DATA_NODE_ID = "DataNodeId";
   public static final String SERIES_SLOT_NUM = "SeriesSlotNum";
   public static final String TIME_SLOT_NUM = "TimeSlotNum";
+  public static final String SERIES_SLOT_ID = "SeriesSlotId";
+  public static final String TIME_SLOT_ID = "TimeSlotId";
   public static final String ROLE = "Role";
 
   // column names for show datanodes
@@ -324,10 +326,10 @@ public class ColumnHeaderConstant {
       ImmutableList.of(new ColumnHeader(REGION_ID, TSDataType.INT32));
 
   public static final List<ColumnHeader> getTimeSlotListColumnHeaders =
-      ImmutableList.of(new ColumnHeader(TIME_SLOT_NUM, TSDataType.INT64));
+      ImmutableList.of(new ColumnHeader(TIME_SLOT_ID, TSDataType.INT64));
 
   public static final List<ColumnHeader> getSeriesSlotListColumnHeaders =
-      ImmutableList.of(new ColumnHeader(SERIES_SLOT_NUM, TSDataType.INT32));
+      ImmutableList.of(new ColumnHeader(SERIES_SLOT_ID, TSDataType.INT32));
 
   public static final List<ColumnHeader> showContinuousQueriesColumnHeaders =
       ImmutableList.of(
diff --git a/site/src/main/.vuepress/config.js b/site/src/main/.vuepress/config.js
index b723dddb62..fdd2e5a2fd 100644
--- a/site/src/main/.vuepress/config.js
+++ b/site/src/main/.vuepress/config.js
@@ -1269,6 +1269,7 @@ var config = {
 						title: 'FAQ',
 						children: [
 							['FAQ/Frequently-asked-questions','Frequently asked questions'],
+							['FAQ/FAQ-for-cluster-setup','FAQ for cluster setup'],
 						]
 					},
 					{
@@ -2465,6 +2466,7 @@ var config = {
 						title: 'FAQ',
 						children: [
 							['FAQ/Frequently-asked-questions','常见问题'],
+							['FAQ/FAQ-for-cluster-setup','分布式部署FAQ'],
 						]
 					},
 					{