Posted to commits@kylin.apache.org by ya...@apache.org on 2022/04/02 08:53:56 UTC
[kylin] branch kylin4_on_cloud updated: Modify docs (#1847)
This is an automated email from the ASF dual-hosted git repository.
yaqian pushed a commit to branch kylin4_on_cloud
in repository https://gitbox.apache.org/repos/asf/kylin.git
The following commit(s) were added to refs/heads/kylin4_on_cloud by this push:
new 113e417 Modify docs (#1847)
113e417 is described below
commit 113e417c6096d424c3c1bf8bcdf9ca78abb8013d
Author: Yaqian Zhang <59...@qq.com>
AuthorDate: Sat Apr 2 16:53:51 2022 +0800
Modify docs (#1847)
---
README.md | 5 ++-
instances/aws_instance.py | 2 +-
...vanced_configs.md => advanced_configuration.md} | 10 ++---
readme/commands.md | 43 +++++++++++++++-------
readme/{configs.md => configuration.md} | 16 +++++---
readme/quick_start.md | 27 ++++++++------
readme/quick_start_for_multiple_clusters.md | 4 +-
7 files changed, 68 insertions(+), 39 deletions(-)
diff --git a/README.md b/README.md
index c4704e4..2599bc9 100644
--- a/README.md
+++ b/README.md
@@ -40,7 +40,7 @@ When cluster(s) created, services and nodes will be like below:
1. For more details about `cost` of tool, see document [cost calculation](./readme/cost_calculation.md).
2. For more details about `commands` of tool, see document [commands](./readme/commands.md).
3. For more details about the `prerequisites` of tool, see document [prerequisites](./readme/prerequisites.md).
-4. For more details about `advanced configs` of tool, see document [advanced configs](./readme/advanced_configs.md).
+4. For more details about `advanced configs` of tool, see document [configuration](./readme/configuration.md) and [advanced configuration](./readme/advanced_configuration.md).
5. For more details about `monitor services` supported by tool, see document [monitor](./readme/monitor.md).
6. For more details about `troubleshooting`, see document [troubleshooting](./readme/trouble_shooting.md).
7. The current tool has already opened the public port for some services. You can access the service by `public IP` of related EC2 instances.
@@ -49,6 +49,7 @@ When cluster(s) created, services and nodes will be like below:
3. `Prometheus`: 9090, 9100.
4. `Kylin`: 7070.
5. `Spark`: 8080, 4040.
+ 6. `MDX for Kylin`: 7080.
8. More about cloudformation syntax, please check [aws website](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/Welcome.html).
-9. The current Kylin version is 4.0.0.
+9. The current Kylin version is 4.0.1.
10. The current Spark version is 3.1.1.
diff --git a/instances/aws_instance.py b/instances/aws_instance.py
index f370d95..9260e79 100644
--- a/instances/aws_instance.py
+++ b/instances/aws_instance.py
@@ -1808,7 +1808,7 @@ class AWSInstance:
logger.info(f"Fetching messages successfully ...")
header_msg = '\n=================== List Alive Nodes ===========================\n'
- result = header_msg + f"Stack Name\t\tInstance ID\t\tPrivate Ip\t\tPublic Ip\t\t\n"
+ result = header_msg + f"Node Name\t\tInstance ID\t\tPrivate Ip\t\tPublic Ip\t\t\n"
for msg in msgs:
result += msg + '\n'
result += header_msg
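As a sketch of the change above, the listing output is assembled roughly like this (the sample message below is hypothetical; the real method lives in `AWSInstance` in `instances/aws_instance.py`):

```python
# Minimal sketch of how the node-listing message is built after this change:
# the column header now reads "Node Name" instead of "Stack Name".
def format_alive_nodes(msgs):
    header_msg = '\n=================== List Alive Nodes ===========================\n'
    result = header_msg + "Node Name\t\tInstance ID\t\tPrivate Ip\t\tPublic Ip\t\t\n"
    for msg in msgs:
        result += msg + '\n'
    result += header_msg
    return result

# Hypothetical sample row, for illustration only
print(format_alive_nodes(["kylin-node-1\t\ti-0abc\t\t10.0.0.5\t\t54.1.2.3"]))
```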
diff --git a/readme/advanced_configs.md b/readme/advanced_configuration.md
similarity index 83%
rename from readme/advanced_configs.md
rename to readme/advanced_configuration.md
index cd36d3b..0218650 100644
--- a/readme/advanced_configs.md
+++ b/readme/advanced_configuration.md
@@ -16,7 +16,7 @@ There are `9` modules params for tools. Introductions as below:
- EC2_KYLIN4_PARAMS: These params of the module are for creating a Kylin4.
-- EC2_SPARK_WORKER_PARAMS: These params of the module are for creating **Spark Workers**, the default is **3** spark workers for all clusters.
+- EC2_SPARK_WORKER_PARAMS: These params of the module are for creating **Spark Workers**, the default is **3** spark workers for all of the clusters.
- EC2_KYLIN4_SCALE_PARAMS: these params of the module are for scaling **Kylin4 nodes**, the range of **Kylin4 nodes** is related to `KYLIN_SCALE_UP_NODES` and `KYLIN_SCALE_DOWN_NODES`.
@@ -25,16 +25,16 @@ There are `9` modules params for tools. Introductions as below:
> 1. `KYLIN_SCALE_UP_NODES` is for the range of Kylin nodes to scale up.
> 2. `KYLIN_SCALE_DOWN_NODES` is for the range of Kylin nodes to scale down.
> 3. The range of `KYLIN_SCALE_UP_NODES` must contain the range of `KYLIN_SCALE_DOWN_NODES`.
- > 4. **They are effective to all clusters which are not only `default cluster` but also another cluster whose index is in `${CLUSTER_INDEXES}`.**
+ > 4. **They take effect on all of the clusters: not only the `default cluster` but also any cluster whose index is in `${CLUSTER_INDEXES}`.**
- EC2_SPARK_SCALE_SLAVE_PARAMS: these params of the module are for scaling **Spark workers**, the range of **Spark Workers** is related to `SPARK_WORKER_SCALE_UP_NODES` and `SPARK_WORKER_SCALE_DOWN_NODES`.
> Note:
>
- > 1. `SPARK_WORKER_SCALE_UP_NODES` is for the range for spark workers to scale up. **It's effective to all clusters which are not only `default cluster` but also another cluster whose index is in `${CLUSTER_INDEXES}`.**
- > 2. `SPARK_WORKER_SCALE_DOWN_NODES` is for the range for spark workers to scale down. **It's effective to all clusters which are not only `default cluster` but also another cluster whose index is in `${CLUSTER_INDEXES}`.**
+ > 1. `SPARK_WORKER_SCALE_UP_NODES` sets the range of spark workers to scale up. **It takes effect on all of the clusters: not only the `default cluster` but also any cluster whose index is in `${CLUSTER_INDEXES}`.**
+ > 2. `SPARK_WORKER_SCALE_DOWN_NODES` sets the range of spark workers to scale down. **It takes effect on all of the clusters: not only the `default cluster` but also any cluster whose index is in `${CLUSTER_INDEXES}`.**
> 3. The range of `SPARK_WORKER_SCALE_UP_NODES` must contain the range of `SPARK_WORKER_SCALE_DOWN_NODES`.
- > 4. **They are effective to all clusters which are not only `default cluster` but also another cluster whose index is in `${CLUSTER_INDEXES}`.**
+ > 4. **They take effect on all of the clusters: not only the `default cluster` but also any cluster whose index is in `${CLUSTER_INDEXES}`.**
### Customize Configs
diff --git a/readme/commands.md b/readme/commands.md
index 7bca88f..81dd510 100644
--- a/readme/commands.md
+++ b/readme/commands.md
@@ -1,16 +1,21 @@
## Commands<a name="run"></a>
Command:
+ > Note:
+ >
+ > Options are placed in `[]`, and different options are separated by `|`.
```shell
-python deploy.py --type [deploy|destroy|list|scale] --scale-type [up|down] --node-type [kylin|spark_worker] [--cluster {1..6}|all|default]
+python deploy.py --type [deploy|destroy|destroy-all|list|scale] --mode [all|job|query] --scale-type [up|down] --node-type [kylin|spark_worker] --cluster [{1..6}|all|default]
```
- deploy: create cluster(s).
-- destroy: destroy created cluster(s).
+- destroy: destroy created cluster(s), including the kylin node, spark master node, spark slave nodes, and zookeeper node.
+
+- destroy-all: destroy all of the nodes, including the kylin node, spark master node, spark slave nodes, zookeeper node, rds node, monitor node, and vpc node.
-- list: list alive nodes which are with stack name, instance id, private IP, and public IP.
+- list: list alive nodes with their node name, instance ID, private IP, and public IP.
- scale: Must be used with `--scale-type` and `--node-type`.
@@ -19,14 +24,26 @@ python deploy.py --type [deploy|destroy|list|scale] --scale-type [up|down] --nod
> 1. Currently supports scaling `kylin` or `spark_worker` nodes up/down for a specific cluster.
> 2. Before scaling `kylin` or `spark_worker` nodes up/down, cluster services must be ready.
> 3. If you want to scale a `kylin` or `spark_worker` node to a specified cluster, please add the `--cluster ${cluster ID}` to specify the expected node add to the cluster `${cluster ID}`.
- > 4. For details about the index of the cluster, please check [Indexes of clusters](./configs.md#indexofcluster).
+ > 4. For details about the index of the cluster, please check [Indexes of clusters](./configuration.md#indexofcluster).
### Command for deploy
-- Deploy a default cluster
+- Deploy a cluster whose kylin node mode is `all`
+
+```shell
+$ python deploy.py --type deploy
+```
+
+- Deploy a cluster whose kylin node mode is `job`
+
+```shell
+$ python deploy.py --type deploy --mode job
+```
+
+- Deploy a cluster whose kylin node mode is `query`
```shell
-$ python deploy.py --type deploy [--cluster default]
+$ python deploy.py --type deploy --mode query
```
- Deploy a cluster with a specific cluster index. <a name="deploycluster"></a>
@@ -37,7 +54,7 @@ $ python deploy.py --type deploy --cluster ${cluster ID}
> Note: the `${cluster ID}` must be in the range of `CLUSTER_INDEXES`.
-- Deploy all clusters which contain the default cluster and all clusters whose index is in the range of `CLUSTER_INDEXES`.
+- Deploy all of the clusters: the default cluster and every cluster whose index is in the range of `CLUSTER_INDEXES`.
```shell
$ python deploy.py --type deploy --cluster all
@@ -47,12 +64,12 @@ $ python deploy.py --type deploy --cluster all
> Note:
>
-> Destroy all clusters will not delete vpc, rds, and monitor node. So if user doesn't want to hold the env, please set the `ALWAYS_DESTROY_VPC_RDS_MONITOR` to be `'true'`.
+> By default, the `destroy` command does not delete the vpc, rds, and monitor nodes. So if the user doesn't want to keep the env, please use the `destroy-all` command.
-- Destroy a default cluster
+- Destroy the default cluster
```shell
-$ python deploy.py --type destroy [--cluster default]
+$ python deploy.py --type destroy
```
- Destroy a cluster with a specific cluster index.
@@ -63,7 +80,7 @@ $ python deploy.py --type destroy --cluster ${cluster ID}
> Note: the `${cluster ID}` must be in the range of `CLUSTER_INDEXES`.
-- Destroy all clusters which contain the default cluster and all clusters whose index is in the range of `CLUSTER_INDEXES`.
+- Destroy all of the clusters: the default cluster and every cluster whose index is in the range of `CLUSTER_INDEXES`.
```shell
$ python deploy.py --type destroy --cluster all
@@ -71,7 +88,7 @@ $ python deploy.py --type destroy --cluster all
### Command for list
-- List nodes that are with **stack name**, **instance id**, **private IP,** and **public IP** in **available stacks**.
+- List nodes with their **node name**, **instance ID**, **private IP**, and **public IP** in **available stacks**.
```shell
$ python deploy.py --type list
@@ -83,7 +100,7 @@ $ python deploy.py --type list
>
> 1. Scale command must be used with `--scale-type` and `--node-type`.
> 2. If the scale command does not specify a `cluster ID`, then the scaled node(Kylin or spark worker) will be added to the `default` cluster.
-> 3. Scale command **not support** to **scale** node (kylin or spark worker) to **all clusters** at **one time**. It means that `python ./deploy.py --type scale --scale-type up[|down] --node-type kylin[|spark_worker] --cluster all` is invalid commad.
+> 3. The scale command does **not support** scaling a node (kylin or spark worker) to **all of the clusters** at **one time**; that is, `python ./deploy.py --type scale --scale-type up[|down] --node-type kylin[|spark_worker] --cluster all` is an invalid command.
> 4. The scale params `KYLIN_SCALE_UP_NODES`, `KYLIN_SCALE_DOWN_NODES`, `SPARK_WORKER_SCALE_UP_NODES` and `SPARK_WORKER_SCALE_DOWN_NODES` take effect on all clusters. So if the user wants to scale a node for a specific cluster, modify the scale params before **every run.**
> 5. **(Important!!!)** The current cluster is created with default `3` spark workers and `1` Kylin node. The `3` spark workers can not be scaled down. The `1` Kylin node also can not be scaled down.
> 6. **(Important!!!)** The current cluster can only scale nodes up or down within the ranges given by `KYLIN_SCALE_UP_NODES`, `KYLIN_SCALE_DOWN_NODES`, `SPARK_WORKER_SCALE_UP_NODES`, and `SPARK_WORKER_SCALE_DOWN_NODES`, not the default `3` spark workers and `1` kylin node in a cluster.
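As an illustration of the range constraint described earlier — the scale-up range must contain the scale-down range — here is a minimal sketch. The `(start, end)` tuple encoding and the sample values are assumptions; the real ranges come from `kylin_configs.yaml`.

```python
# Hedged sketch of the containment rule: the scale-up node range must
# contain the scale-down node range. Ranges are (start, end) tuples here,
# which is an assumption about how to model them for illustration.
def contains(outer, inner):
    # True if inner lies entirely within outer
    return outer[0] <= inner[0] and inner[1] <= outer[1]

KYLIN_SCALE_UP_NODES = (2, 6)    # hypothetical values
KYLIN_SCALE_DOWN_NODES = (3, 5)  # hypothetical values

# A valid configuration under the rule above
assert contains(KYLIN_SCALE_UP_NODES, KYLIN_SCALE_DOWN_NODES)
```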
diff --git a/readme/configs.md b/readme/configuration.md
similarity index 87%
rename from readme/configs.md
rename to readme/configuration.md
index b3281f7..52e0559 100644
--- a/readme/configs.md
+++ b/readme/configuration.md
@@ -1,18 +1,24 @@
-## Configs
+## Configuration
#### I. Configure the `kylin_configs.yaml`
**Required parameters**:
-- `AWS_REGION`: Current region for EC2 instances.
+- `AWS_REGION`: Current region for EC2 instances. Default is `cn-northwest-1`.
- `IAMRole`: IAM role which has the access to aws authority. This parameter will be set to the created **name** of the IAM role.
- `S3_URI`: the prefix path of storing `jars/scripts/tar`. For example, this parameter will be set to `s3://.../kylin4-aws-test`.
- `KeyName`: Security key name is a set of security credentials that you use to prove your identity when connecting to an instance. This parameter will be set to the created **name** of key pair`.
- `CIDR_IP`: An inbound rule permits instances to receive traffic from the specified IPv4 or IPv6 CIDR address range, or the instances associated with the specified security group.
+
+**Optional parameters**:
+
- `DB_IDENTIFIER`: this param must be unique in `RDS -> Databases`, and it will be the name of the created RDS database.
-- `DB_PORT`: this param will be the port of created RDS database, default is `3306`.
-- `DB_USER`: this param will be a login ID for the master user of your DB instance, the default is `root`.
-- `DB_PASSWORD`: this param will be the password of `DB_USER` to access the DB instance. default is `123456test`, it's strongly suggested you change it.
+- `DB_PORT`: this param will be the port of created RDS database. The default value is `3306`.
+- `DB_USER`: this param will be a login ID for the master user of your DB instance. The default value is `root`.
+- `DB_PASSWORD`: this param will be the password of `DB_USER` to access the DB instance. The default value is `123456test`; it is strongly suggested that you change it.
+
+- `ENABLE_MDX`: Whether to start the `MDX for Kylin` service when starting the cluster. The default value is `false`.
+- `SUPPORT_GLUE`: Whether to use AWS Glue as the metastore service of the hive data source. The default value is `true`, effective only when deploying a kylin node in `job` mode.
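Taken together, the parameters above can be sketched as a mapping of documented defaults. This is a Python sketch for illustration only; the real file is `kylin_configs.yaml`, and `missing_required` is a hypothetical helper, not part of the tool.

```python
# Sketch of the documented parameters: required keys plus default values,
# as listed above. The exact structure of kylin_configs.yaml may differ.
REQUIRED = ["AWS_REGION", "IAMRole", "S3_URI", "KeyName", "CIDR_IP"]
DEFAULTS = {
    "AWS_REGION": "cn-northwest-1",
    "DB_PORT": 3306,
    "DB_USER": "root",
    "DB_PASSWORD": "123456test",  # strongly suggested to change
    "ENABLE_MDX": False,
    "SUPPORT_GLUE": True,
}

def missing_required(config):
    # Hypothetical helper: report required keys the user has not set
    return [k for k in REQUIRED if not config.get(k)]

print(missing_required({"AWS_REGION": "us-east-1"}))
```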
#### II. Configure the `kylin.properties` in `backup/properties` directories.<a name="cluster"></a>
diff --git a/readme/quick_start.md b/readme/quick_start.md
index 7311c2b..f86f026 100644
--- a/readme/quick_start.md
+++ b/readme/quick_start.md
@@ -15,26 +15,30 @@
git clone https://github.com/apache/kylin.git && cd kylin && git checkout kylin4_on_cloud
```
-3. Modify the `kylin_config.yml`.
+3. Configure the `kylin_config.yaml`.
- 1. Set the `AWS_REGION`, such as us-east-1.
+ 1. Set the `AWS_REGION`, such as `us-east-1`.
- 2. Set the `IAMRole`, please check [Create an IAM role](./prerequisites.md#IAM).
+ 2. Set the `IAMRole`, please check [create an IAM role](./prerequisites.md#IAM).
- 3. Set the `S3_URI`, please check [Create a S3 direcotry](./prerequisites.md#S3).
+ 3. Set the `S3_URI`, please check [create an S3 directory](./prerequisites.md#S3).
- 4. Set the `KeyName`, please check [Create a keypair](./prerequisites.md#keypair).
+ 4. Set the `KeyName`, please check [create a keypair](./prerequisites.md#keypair).
5. Set the `CIDR_IP`, make sure that the `CIDR_IP` match the pattern `xxx.xxx.xxx.xxx/16[|24|32]`.
- > Note:
+ > Note:
>
> 1. this `CIDR_IP` is the specified IPv4 or IPv6 CIDR address range which an inbound rule can permit instances to receive traffic from.
>
> 2. In one word, it will let your mac which IP is in the `CIDR_IP` to access instances.
+ 6. Set the `ENABLE_MDX`. If you want to use `MDX for Kylin`, set this parameter to `true`. For `MDX for Kylin`, please refer to [the manual of MDX for Kylin](https://kyligence.github.io/mdx-kylin/).
+
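The `CIDR_IP` pattern from step 5 can be checked with a minimal sketch. The regex below is an assumption derived from the `xxx.xxx.xxx.xxx/16[|24|32]` pattern stated above; the tool's real validation may differ.

```python
import re

# Hedged sketch: validate CIDR_IP against the xxx.xxx.xxx.xxx/16|24|32
# pattern from step 5. This does not check that each octet is <= 255.
CIDR_RE = re.compile(r"^(\d{1,3}\.){3}\d{1,3}/(16|24|32)$")

def is_valid_cidr_ip(value: str) -> bool:
    return CIDR_RE.match(value) is not None

print(is_valid_cidr_ip("10.1.0.0/16"))  # expect True
```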
4. Init python env.
+> Note: You need to ensure that the local machine has Python 3.6.6 or later installed.
+
```shell
$ bin/init.sh
$ source venv/bin/activate
@@ -46,8 +50,6 @@ Check the python version:
$ python --version
```
-> Note: If Python is already installed locally, you need to ensure that the python version is 3.6.6 or later.
-
5. Execute commands to deploy a cluster quickly.
```shell
@@ -58,8 +60,9 @@ After this cluster is ready, you will see the message `Kylin Cluster already sta
> Note:
>
-> 1. For more details about the properties of kylin4 in a cluster, please check [configure kylin.properties](./configs.md#cluster).
-> 2. For more details about the index of the clusters, please check [Indexes of clusters](./configs.md#indexofcluster).
+> 1. By default, the mode of the kylin node in the deployed cluster is `all`, which supports both `job` and `query`. If you want to deploy a read-write separated cluster, use `python deploy.py --type deploy --mode job` to deploy a `job` cluster and `python deploy.py --type deploy --mode query` to deploy a `query` cluster. AWS Glue is supported by default in a `job` cluster.
+> 2. For more details about the properties of kylin4 in a cluster, please check [configure kylin.properties](./configuration.md#cluster).
+> 3. For more details about the index of the clusters, please check [Indexes of clusters](./configuration.md#indexofcluster).
6. Execute commands to list nodes of the cluster.
@@ -73,6 +76,8 @@ You can access `Kylin` web by `http://{kylin public ip}:7070/kylin`.
![kylin login](../images/kylinlogin.png)
+If you set `ENABLE_MDX` to true, you can access `MDX for Kylin` by `http://{kylin public ip}:7080/kylin`.
+
7. Destroy the cluster quickly.
```shell
@@ -82,5 +87,5 @@ $ python deploy.py --type destroy
> Note:
>
> 1. If you want a quick start for multiple clusters, please refer to the [quick start for multiple clusters](./quick_start_for_multiple_clusters.md).
-> 2. **Current destroy operation will remain some stack which contains `RDS` and so on**. So if user want to destroy clearly, please modify the `ALWAYS_DESTROY_VPC_RDS_MONITOR` in `kylin_configs.yml` to be `true` and re-execute `destroy` command.
+> 2. **The current destroy operation will retain some stacks, such as the `RDS` stack**. So if the user wants to destroy everything, please use `python deploy.py --type destroy-all`.
diff --git a/readme/quick_start_for_multiple_clusters.md b/readme/quick_start_for_multiple_clusters.md
index fd18836..e025941 100644
--- a/readme/quick_start_for_multiple_clusters.md
+++ b/readme/quick_start_for_multiple_clusters.md
@@ -8,9 +8,9 @@
>
> 1. `CLUSTER_INDEXES` means that cluster index is in the range of `CLUSTER_INDEXES`.
> 2. Configs for multiple clusters are also from `kylin_configs.yaml`.
- > 3. For more details about the index of the clusters, please check [Indexes of clusters](./configs.md#indexofcluster).
+ > 3. For more details about the index of the clusters, please check [Indexes of clusters](./configuration.md#indexofcluster).
-2. Copy `kylin.properties.template` for expecting clusters to deploy, please check the [details](./configs.md#cluster).
+2. Copy `kylin.properties.template` for the clusters you expect to deploy; please check the [details](./configuration.md#cluster).
3. Execute commands to deploy all of the clusters.