Posted to commits@seatunnel.apache.org by ty...@apache.org on 2022/10/03 05:17:41 UTC

[incubator-seatunnel] branch dev updated: [Improve][doc] remove Chinese doc (#2972)

This is an automated email from the ASF dual-hosted git repository.

tyrantlucifer pushed a commit to branch dev
in repository https://gitbox.apache.org/repos/asf/incubator-seatunnel.git


The following commit(s) were added to refs/heads/dev by this push:
     new 29182ec15 [Improve][doc] remove Chinese doc (#2972)
29182ec15 is described below

commit 29182ec1579c2e9014b227aa03d817947f2156d2
Author: Zongwen Li <zo...@apache.org>
AuthorDate: Mon Oct 3 13:17:34 2022 +0800

    [Improve][doc] remove Chinese doc (#2972)
---
 .../flink/configuration/sink-plugins/Doris.md      |  72 --------------
 .../spark/commands/start-seatunnel-spark.sh.md     |  43 ---------
 docs/zh-CN/spark/configuration/ConfigExamples.md   |   9 --
 docs/zh-CN/spark/deployment.md                     |  72 --------------
 docs/zh-CN/spark/installation.md                   |  32 ------
 docs/zh-CN/spark/quick-start.md                    | 107 ---------------------
 6 files changed, 335 deletions(-)

diff --git a/docs/zh-CN/flink/configuration/sink-plugins/Doris.md b/docs/zh-CN/flink/configuration/sink-plugins/Doris.md
deleted file mode 100644
index 185161409..000000000
--- a/docs/zh-CN/flink/configuration/sink-plugins/Doris.md
+++ /dev/null
@@ -1,72 +0,0 @@
-# Sink plugin: Doris [Flink]
-
-### Description
-
-Write data to a Doris table.
-
-### Configuration
-
-| name | type | required | default value | supported engine |
-| --- | --- | --- | --- | --- |
-| fenodes | string | yes | - | Flink |
-| database | string | yes | - | Flink |
-| table | string | yes | - | Flink |
-| user | string | yes | - | Flink |
-| password | string | yes | - | Flink |
-| batch_size | int | no | 100 | Flink |
-| interval | int | no | 1000 | Flink |
-| max_retries | int | no | 1 | Flink |
-| doris.* | - | no | - | Flink |
-
-##### fenodes [string]
-
-Doris FE HTTP address.
-
-##### database [string]
-
-Doris database name.
-
-##### table [string]
-
-Doris table name.
-
-##### user [string]
-
-Doris username.
-
-##### password [string]
-
-Doris password.
-
-##### batch_size [int]
-
-The maximum number of rows per single write to Doris. Default value: 100.
-
-##### interval [int]
-
-The flush interval (in milliseconds) after which the asynchronous thread writes buffered data to Doris. Set it to 0 to disable periodic flushing.
-
-Default value: 1000 (milliseconds), i.e. one flush per second.
-
-##### max_retries [int]
-
-The number of retries after a write to Doris fails.
-
-##### doris.* [string]
-
-Import parameters for Stream load. For example: 'doris.column_separator' = ', ' defines the column separator.
-
-### Examples
-
-```
-DorisSink {
-  fenodes = "127.0.0.1:8030"
-  database = database
-  table = table
-  user = root
-  password = password
-  batch_size = 1
-  doris.column_separator = "\t"
-  doris.columns = "id,user_name,user_name_cn,create_time,last_login_time"
-}
-```
diff --git a/docs/zh-CN/spark/commands/start-seatunnel-spark.sh.md b/docs/zh-CN/spark/commands/start-seatunnel-spark.sh.md
deleted file mode 100644
index 13d9368ff..000000000
--- a/docs/zh-CN/spark/commands/start-seatunnel-spark.sh.md
+++ /dev/null
@@ -1,43 +0,0 @@
-# Command usage
-
-> Command usage instructions [Spark]
-
-## SeaTunnel Spark startup command
-
-```bash
-bin/start-seatunnel-spark.sh
-```
-
-### Usage
-
-```bash
-bin/start-seatunnel-spark.sh \
--c config-path \
--m master \
--e deploy-mode \
--i city=beijing
-```
-
-- Use `-c` or `--config` to specify the path of the configuration file
-
-- Use `-m` or `--master` to specify the cluster manager
-
-- Use `-e` or `--deploy-mode` to specify the deploy mode
-
-- Use `-i` or `--variable` to specify a variable used in the configuration file; this option can be used multiple times (see the sketch below)
-
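-As a hedged illustration (the `city` variable and the `sql` transform below are assumed examples, not part of any fixed template), a variable passed as `-i city=beijing` can be referenced from string values in the configuration file:
-
-```bash
-# illustrative job fragment; the job would be started with
-#   bin/start-seatunnel-spark.sh ... -i city=beijing
-transform {
-  sql {
-    # ${city} is substituted with "beijing" before the job runs
-    sql = "select * from user_info where city = '"${city}"'"
-  }
-}
-```
-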
-#### Examples
-
-```bash
-# Yarn client mode
-./bin/start-seatunnel-spark.sh \
---master yarn \
---deploy-mode client \
---config ./config/application.conf
-
-# Yarn cluster mode
-./bin/start-seatunnel-spark.sh \
---master yarn \
---deploy-mode cluster \
---config ./config/application.conf
-```
diff --git a/docs/zh-CN/spark/configuration/ConfigExamples.md b/docs/zh-CN/spark/configuration/ConfigExamples.md
deleted file mode 100644
index 6c828b57b..000000000
--- a/docs/zh-CN/spark/configuration/ConfigExamples.md
+++ /dev/null
@@ -1,9 +0,0 @@
-# Configuration examples
-
-> Complete configuration examples [Spark]
-
-- Example 1: [Streaming computing](https://github.com/apache/incubator-seatunnel/blob/dev/config/spark.streaming.conf.template)
-
-- Example 2: [Offline batch processing](https://github.com/apache/incubator-seatunnel/blob/dev/config/spark.batch.conf.template)
-
-If you want to know the details of the configuration format, please refer to [HOCON](https://github.com/lightbend/config/blob/main/HOCON.md).
\ No newline at end of file
diff --git a/docs/zh-CN/spark/deployment.md b/docs/zh-CN/spark/deployment.md
deleted file mode 100644
index 72dc1038d..000000000
--- a/docs/zh-CN/spark/deployment.md
+++ /dev/null
@@ -1,72 +0,0 @@
-## Deployment and running
-
-> SeaTunnel v2 for Spark depends on the Java runtime environment and Spark. For detailed SeaTunnel installation steps, refer to [installing SeaTunnel](./installation.md).
-
-The following introduces the different job run modes:
-
-## Run in local mode
-
-```bash
-./bin/start-seatunnel-spark.sh \
---master local[4] \
---deploy-mode client \
---config ./config/application.conf
-```
-
-## Run in Spark Standalone cluster mode
-
-```bash
-# client mode
-./bin/start-seatunnel-spark.sh \
---master spark://ip:7077 \
---deploy-mode client \
---config ./config/application.conf
-
-# cluster mode
-./bin/start-seatunnel-spark.sh \
---master spark://ip:7077 \
---deploy-mode cluster \
---config ./config/application.conf
-```
-
-## Run in Yarn mode
-
-```bash
-# client mode
-./bin/start-seatunnel-spark.sh \
---master yarn \
---deploy-mode client \
---config ./config/application.conf
-
-# cluster mode
-./bin/start-seatunnel-spark.sh \
---master yarn \
---deploy-mode cluster \
---config ./config/application.conf
-```
-
-## Run in Mesos cluster mode
-
-```bash
-# cluster mode
-./bin/start-seatunnel-spark.sh \
---master mesos://ip:7077 \
---deploy-mode cluster \
---config ./config/application.conf
-```
-
-For the meaning of the `master` and `deploy-mode` parameters of `start-seatunnel-spark.sh`, refer to: [command usage](./commands/start-seatunnel-spark.sh.md)
-
-If you want to specify the resources used when `seatunnel` runs, or other `Spark parameters`, you can specify them in the configuration file passed via `--config`:
-
-```bash
-env {
-  spark.executor.instances = 2
-  spark.executor.cores = 1
-  spark.executor.memory = "1g"
-  ...
-}
-...
-```
-
-For how to configure `seatunnel`, refer to the `seatunnel` [common configuration](./configuration).
diff --git a/docs/zh-CN/spark/installation.md b/docs/zh-CN/spark/installation.md
deleted file mode 100644
index b901cad89..000000000
--- a/docs/zh-CN/spark/installation.md
+++ /dev/null
@@ -1,32 +0,0 @@
-# Download and installation
-
-## Download
-
-```bash
-https://github.com/apache/incubator-seatunnel/releases
-```
-
-## Environment preparation
-
-### Prepare JDK 1.8
-
-`SeaTunnel` depends on `JDK 1.8`.
-
-### Prepare Spark
-
-`SeaTunnel` supports the `Spark` engine. If you want to use `SeaTunnel` with `Spark`, you need to prepare the `Spark` environment in advance. [Download Spark](https://spark.apache.org/downloads.html) and choose `Spark version >= 2.x.x`; Spark 3.x is not supported yet. After downloading and extracting, you can submit jobs with `deploy-mode = local` without any configuration. If you need other modes, such as `Standalone cluster`, `Yarn cluster`, or `Mesos cluster`, refer to the official Spark documentation.
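-
-A minimal sketch of that preparation, assuming an illustrative Spark 2.x release and install path (both are examples, not requirements):
-
-```bash
-# download and extract a Spark 2.x release (version and mirror are examples)
-wget https://archive.apache.org/dist/spark/spark-2.4.8/spark-2.4.8-bin-hadoop2.7.tgz
-tar -xzf spark-2.4.8-bin-hadoop2.7.tgz
-# SeaTunnel finds Spark through the SPARK_HOME environment variable
-export SPARK_HOME=$(pwd)/spark-2.4.8-bin-hadoop2.7
-```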
-
-## Install SeaTunnel
-
-Download the `seatunnel` installation package and extract it:
-
-```bash
-wget https://github.com/apache/incubator-seatunnel/releases/download/v<version>/seatunnel-<version>.zip -O seatunnel-<version>.zip
-unzip seatunnel-<version>.zip
-ln -s seatunnel-<version> seatunnel
-```
-
-The complete installation and configuration steps are not given here. Refer to [quick start](./quick-start.md) and [configuration](./configuration) to start using `seatunnel`.
diff --git a/docs/zh-CN/spark/quick-start.md b/docs/zh-CN/spark/quick-start.md
deleted file mode 100644
index 24c017f42..000000000
--- a/docs/zh-CN/spark/quick-start.md
+++ /dev/null
@@ -1,107 +0,0 @@
-# Quick start
-
-> Learn how to use `SeaTunnel` through an example that receives data from a `socket`, splits it into multiple fields, and outputs the result.
-
-## Step 1: Prepare the Spark runtime environment
-
-> If you are familiar with Spark or already have a Spark environment, you can skip this step. Spark does not need any special configuration.
-
-Please [download Spark](https://spark.apache.org/downloads.html) first and choose `Spark version >= 2.x.x`; Spark 3.x is not supported yet. After downloading and extracting, you can submit jobs with `deploy-mode = local` without modifying any configuration. If you want to run jobs on `Standalone clusters`, `Yarn clusters`, or `Mesos clusters`, refer to the [Spark deployment documentation](https://spark.apache.org/docs/latest/cluster-overview.html) on the Spark website.
-
-## Step 2: Download SeaTunnel
-
-Download the latest version of `seatunnel-<version>-bin.tar.gz` from the [SeaTunnel download page](https://seatunnel.apache.org/download),
-
-or download a specific version (take `2.1.0` as an example):
-
-```bash
-wget https://downloads.apache.org/incubator/seatunnel/2.1.0/apache-seatunnel-incubating-2.1.0-bin.tar.gz -O seatunnel-2.1.0.tar.gz
-```
-
-Extract it after the download completes:
-
-```bash
-tar -xvzf seatunnel-<version>.tar.gz
-ln -s seatunnel-<version> seatunnel
-```
-
-## Step 3: Configure SeaTunnel
-
-- Edit `config/seatunnel-env.sh` and specify the required environment variables, such as `SPARK_HOME` (the directory extracted in step 1); a sketch follows the config example below
-
-- Create a new `config/application.conf`, which determines how data is read, processed, and written once `SeaTunnel` starts
-
-```bash
-env {
-  # SeaTunnel-defined streaming batch duration, in seconds
-  spark.stream.batchDuration = 5
-
-  spark.app.name = "seatunnel"
-  spark.ui.port = 13000
-}
-
-source {
-  socketStream {}
-}
-
-transform {
-  split {
-    fields = ["msg", "name"]
-    delimiter = ","
-  }
-}
-
-sink {
-  console {}
-}
-```
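-
-For the first bullet above, a minimal `config/seatunnel-env.sh` sketch could look like this (the `/opt/spark` fallback path is an assumed example):
-
-```bash
-# config/seatunnel-env.sh
-# point SeaTunnel at the Spark directory prepared in step 1;
-# /opt/spark is only an example fallback
-SPARK_HOME=${SPARK_HOME:-/opt/spark}
-```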
-
-## Step 4: Start a `netcat` server to send data
-
-```bash
-nc -lk 9999
-```
-
-## Step 5: Start SeaTunnel
-
-```bash
-cd seatunnel
-./bin/start-seatunnel-spark.sh \
---master local[4] \
---deploy-mode client \
---config ./config/application.conf
-```
-
-## Step 6: Input data at the `nc` terminal
-
-```bash
-Hello World, seatunnel
-```
-
-`SeaTunnel` log output:
-
-```bash
-+----------------------+-----------+---------+
-|raw_message           |msg        |name     |
-+----------------------+-----------+---------+
-|Hello World, seatunnel|Hello World|seatunnel|
-+----------------------+-----------+---------+
-```
-
-## Summary
-
-`SeaTunnel` is simple and easy to use, and there are richer data processing capabilities waiting to be discovered. The data processing shown in this article requires no code, compilation, or packaging, and is simpler than the official [quick example](https://spark.apache.org/docs/latest/streaming-programming-guide.html#a-quick-example).
-
-If you want to see more `SeaTunnel` configuration examples, refer to:
-
-- Configuration example 2: [Offline batch processing](https://github.com/apache/incubator-seatunnel/blob/dev/config/spark.batch.conf.template)
-
-The above configuration is the default offline batch configuration template, which can be run directly with the following command:
-
-```bash
-cd seatunnel
-./bin/start-seatunnel-spark.sh \
---master 'local[2]' \
---deploy-mode client \
---config ./config/spark.batch.conf.template
-```