Posted to commits@inlong.apache.org by do...@apache.org on 2022/04/18 12:00:23 UTC
[incubator-inlong-website] branch master updated: [INLONG-355][Doc] Refactor the File collector guide (#356)
This is an automated email from the ASF dual-hosted git repository.
dockerzhang pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-inlong-website.git
The following commit(s) were added to refs/heads/master by this push:
new f0f79923f [INLONG-355][Doc] Refactor the File collector guide (#356)
f0f79923f is described below
commit f0f79923fbdfb6e2cb29d1d3ab57244decf0d190
Author: dockerzhang <do...@apache.org>
AuthorDate: Mon Apr 18 20:00:18 2022 +0800
[INLONG-355][Doc] Refactor the File collector guide (#356)
---
docs/administration/_category_.json | 2 +-
docs/contact.md | 2 +-
docs/data_node/_category_.json | 4 ++
docs/data_node/extract_node/_category_.json | 4 ++
.../agent => data_node/extract_node}/file.md | 21 +++++---
docs/data_node/extract_node/img/file_param.png | Bin 0 -> 17445 bytes
docs/data_node/load_node/_category_.json | 4 ++
docs/development/_category_.json | 2 +-
docs/modules/agent/sql.md | 60 ---------------------
docs/sdk/_category_.json | 2 +-
docs/user_guide/_category_.json | 2 +-
.../current/how-to-release.md | 2 +-
.../docusaurus-plugin-content-docs/current.json | 4 ++
.../agent => data_node/extract_node}/file.md | 18 +++++--
.../data_node/extract_node/img/file_param.png | Bin 0 -> 12487 bytes
.../current/modules/agent/sql.md | 60 ---------------------
16 files changed, 49 insertions(+), 138 deletions(-)
diff --git a/docs/administration/_category_.json b/docs/administration/_category_.json
index cc9b71eac..56431da33 100644
--- a/docs/administration/_category_.json
+++ b/docs/administration/_category_.json
@@ -1,4 +1,4 @@
{
"label": "Administration",
- "position": 9
+ "position": 10
}
\ No newline at end of file
diff --git a/docs/contact.md b/docs/contact.md
index 867e82d26..918702dae 100644
--- a/docs/contact.md
+++ b/docs/contact.md
@@ -1,6 +1,6 @@
---
title: Contact Us
-sidebar_position: 10
+sidebar_position: 11
---
Contact information
diff --git a/docs/data_node/_category_.json b/docs/data_node/_category_.json
new file mode 100644
index 000000000..c44226d96
--- /dev/null
+++ b/docs/data_node/_category_.json
@@ -0,0 +1,4 @@
+{
+ "label": "Data Nodes",
+ "position": 6
+}
\ No newline at end of file
diff --git a/docs/data_node/extract_node/_category_.json b/docs/data_node/extract_node/_category_.json
new file mode 100644
index 000000000..c61e62a3b
--- /dev/null
+++ b/docs/data_node/extract_node/_category_.json
@@ -0,0 +1,4 @@
+{
+ "label": "Extract Nodes",
+ "position": 1
+}
\ No newline at end of file
diff --git a/docs/modules/agent/file.md b/docs/data_node/extract_node/file.md
similarity index 67%
rename from docs/modules/agent/file.md
rename to docs/data_node/extract_node/file.md
index a85201c76..d8bc22a92 100644
--- a/docs/modules/agent/file.md
+++ b/docs/data_node/extract_node/file.md
@@ -1,9 +1,18 @@
---
title: File
-sidebar_position: 3
+sidebar_position: 1
---
-## File Agent Configuration
+## Parameters
+![File Params](img/file_param.png)
+- DataSource Name
+- Data source IP: The IP of the Agent on the collection node.
+- File path: Must be an absolute path; regular expressions are supported.
+- Time offset: Collection starts from a point in time relative to now: '1m' means 1 minute later and '-1m' means 1 minute earlier. The units m (minute), h (hour), and d (day) are supported. If empty, collection starts from the current time.
+- Source data file delimiter: Vertical bar (|), comma (,), semicolon (;)...
+- Source data field: The fields obtained after splitting by the delimiter.
+
+## Path Configuration
```
/data/inlong-agent/test.log //Represents reading the new file test.log in the inlong-agent folder
/data/inlong-agent/test[0-9]{1} // Reads new files in the inlong-agent folder whose name is test followed by a single digit
@@ -11,19 +20,17 @@ sidebar_position: 3
/data/inlong-agent/^\\d+(\\.\\d+)? // Starts with one or more digits, optionally followed by a "." and one or more digits (? marks the group as optional); matches, for example, "5", "1.5" and "2.21"
```
-## Get data time from file name
-
+## Data Time
Agent supports obtaining the time from the file name as the production time of the data. The configuration instructions are as follows:
```
/data/inlong-agent/***YYYYMMDDHH***
```
+
Where YYYYMMDDHH represents the data time: YYYY is the year, MM the month, DD the day, and HH the hour.
*** stands for any characters.
You also need to add the data cycle to the job conf; day and hour cycles are currently supported.
-When adding a task, add the property job.cycleUnit
-
-job.cycleUnit contains the following two types:
+When adding a task, set the property job.cycleUnit, which takes one of two values:
- D: The data time has day granularity
- H: The data time has hour granularity
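The time-offset and data-time rules described in the refactored file.md above can be illustrated with a short Python sketch. It is not taken from the InLong Agent source: the function names `parse_time_offset` and `extract_data_time` are hypothetical, and truncating the hour when the cycle unit is D is only an assumption about how the day dimension is applied.

```python
import re
from datetime import datetime, timedelta

# Hypothetical helpers for illustration; not part of the InLong Agent code base.
_UNITS = {"m": "minutes", "h": "hours", "d": "days"}


def parse_time_offset(offset: str) -> timedelta:
    """Convert an offset such as '1m', '-1h' or '2d' into a timedelta.

    An empty string means "start from the current time", i.e. a zero offset.
    """
    offset = offset.strip()
    if not offset:
        return timedelta(0)
    match = re.fullmatch(r"([+-]?\d+)([mhd])", offset)
    if match is None:
        raise ValueError(f"unsupported time offset: {offset!r}")
    value, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(value)})


def extract_data_time(file_name: str, cycle_unit: str) -> datetime:
    """Read a YYYYMMDDHH stamp from a file name and apply the cycle unit.

    Assumption: with cycle unit 'D' the hour is dropped, with 'H' it is kept.
    """
    if cycle_unit not in ("D", "H"):
        raise ValueError(f"unsupported cycle unit: {cycle_unit!r}")
    # Look for exactly ten consecutive digits, not part of a longer digit run.
    match = re.search(r"(?<!\d)\d{10}(?!\d)", file_name)
    if match is None:
        raise ValueError(f"no YYYYMMDDHH stamp in: {file_name!r}")
    data_time = datetime.strptime(match.group(0), "%Y%m%d%H")
    if cycle_unit == "D":
        data_time = data_time.replace(hour=0)
    return data_time


if __name__ == "__main__":
    print(parse_time_offset("-1m"))
    print(extract_data_time("/data/inlong-agent/app_2022041820.log", "H"))
```

The stamp is searched anywhere in the name because the documented pattern `***YYYYMMDDHH***` allows arbitrary characters on both sides.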
diff --git a/docs/data_node/extract_node/img/file_param.png b/docs/data_node/extract_node/img/file_param.png
new file mode 100644
index 000000000..c3cbcdaea
Binary files /dev/null and b/docs/data_node/extract_node/img/file_param.png differ
diff --git a/docs/data_node/load_node/_category_.json b/docs/data_node/load_node/_category_.json
new file mode 100644
index 000000000..429218e09
--- /dev/null
+++ b/docs/data_node/load_node/_category_.json
@@ -0,0 +1,4 @@
+{
+ "label": "Load Nodes",
+ "position": 2
+}
\ No newline at end of file
diff --git a/docs/development/_category_.json b/docs/development/_category_.json
index 6244591cb..dc8cb43dc 100644
--- a/docs/development/_category_.json
+++ b/docs/development/_category_.json
@@ -1,4 +1,4 @@
{
"label": "Development",
- "position": 8
+ "position": 9
}
\ No newline at end of file
diff --git a/docs/modules/agent/sql.md b/docs/modules/agent/sql.md
deleted file mode 100644
index c4256c05f..000000000
--- a/docs/modules/agent/sql.md
+++ /dev/null
@@ -1,60 +0,0 @@
----
-title: MySQL SQL
-sidebar_position: 3
----
-
-## Overview
-Currently, Agent supports MYSQL version 5.1.x , 5.5.x , 5.6.x , 5.7.x , 8.0.x
-Currently, the Agent only supports the curl request to create a Job to submit collection tasks, and temporarily does not support the manager front-end to create SQL collection
-
-## Create A MySQL Job
-
-1. Apply for access on the manager, when filling in the data information, select the message source as "Independent Push"
-2. Select the source data field separator
-3. Fill in the source data fields, and the field order should be consistent with the field order in the sql query result
-4. Create a SQL read task using curl request
-
-## Parameter Description
-
-````
-Each parameter used in SQL Agent Job is described as
-1. job.sql.command: The actual executed sql statement, for example: select * from apache_inlong_manager.user
-2. job.sql.user: the user used when connecting to the database, for example: abc
-3. job.sql.password: The password used when connecting to the database, for example: 123456
-4. job.sql.hostname: The IP address of the connected database, for example: 127.0.0.1
-5. job.sql.port: the connected database port, for example: 3306
-6. job.sql.separator: The separator used to separate multiple fields needs to be used with the manager front-end
-````
-
-## Example
-
-```bash
-curl --location --request POST 'http://localhost:8008/config/job' \--header 'Content-Type: application/json' \--data '{
- "job": {
- "sql": {
- "command": "select * from apache_inlong_manager.user",
- "user": "root",
- "password": "inlong",
- "hostname": "10.0.0.6",
- "port": "3306",
- "separator": "|"
- },
- "id": 1,
- "thread": {
- "running": {
- "core": "4"
- }
- },
- "name": "test",
- "source": "org.apache.inlong.agent.plugin.sources.DataBaseSource",
- "sink": "org.apache.inlong.agent.plugin.sinks.ProxySink",
- "channel": "org.apache.inlong.agent.plugin.channel.MemoryChannel"
- },
- "proxy": {
- "inlongGroupId": "b_test_tube_hive_20211221_01",
- "inlongStreamId": "test_data_stream_20211221_01_01"
- },
- "op": "add"
-}
-'
-```
\ No newline at end of file
diff --git a/docs/sdk/_category_.json b/docs/sdk/_category_.json
index 6b355ca6b..9c997dee9 100644
--- a/docs/sdk/_category_.json
+++ b/docs/sdk/_category_.json
@@ -1,4 +1,4 @@
{
"label": "SDK",
- "position": 6
+ "position": 7
}
\ No newline at end of file
diff --git a/docs/user_guide/_category_.json b/docs/user_guide/_category_.json
index 891650af2..514ec933b 100644
--- a/docs/user_guide/_category_.json
+++ b/docs/user_guide/_category_.json
@@ -1,4 +1,4 @@
{
"label": "User Guide",
- "position": 7
+ "position": 8
}
\ No newline at end of file
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs-community/current/how-to-release.md b/i18n/zh-CN/docusaurus-plugin-content-docs-community/current/how-to-release.md
index 63a1cd790..cc3b5d3bb 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs-community/current/how-to-release.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs-community/current/how-to-release.md
@@ -232,7 +232,7 @@ cd /tmp/apache-inlong-${release_version}-${rc_version} # 进入源码包目录
tar xzvf apache-inlong-${release_version}-src.tar.gz #解压源码包
cd apache-inlong-${release_version} # 进入源码目录
mvn compile clean install package -DskipTests # 编译
-cp ./inlong-distribution/target/apache-inlong-${release_version}-bin.tar.gz /tmp/apache-inlong-${release_version}-${rc_version}/ # 拷贝二进制包拷到源码包目录下,方面下一步对包进行签名
+cp ./inlong-distribution/target/apache-inlong-${release_version}-bin.tar.gz /tmp/apache-inlong-${release_version}-${rc_version}/ # 拷贝二进制包拷到源码包目录下,方便下一步对包进行签名
```
### 对源码包/二进制包进行签名/sha512
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current.json b/i18n/zh-CN/docusaurus-plugin-content-docs/current.json
index c5ca17905..f0e315eb8 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/current.json
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current.json
@@ -58,5 +58,9 @@
"sidebar.tutorialSidebar.category.User Guide": {
"message": "用户指引",
"description": "The label for category User Guide in sidebar tutorialSidebar"
+ },
+ "sidebar.tutorialSidebar.category.Data Nodes": {
+ "message": "数据节点",
+ "description": "The label for category Data Nodes in sidebar tutorialSidebar"
}
}
\ No newline at end of file
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/modules/agent/file.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/data_node/extract_node/file.md
similarity index 66%
rename from i18n/zh-CN/docusaurus-plugin-content-docs/current/modules/agent/file.md
rename to i18n/zh-CN/docusaurus-plugin-content-docs/current/data_node/extract_node/file.md
index 0744cce32..313fdcdd2 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/modules/agent/file.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/data_node/extract_node/file.md
@@ -3,7 +3,16 @@ title: 文件
sidebar_position: 3
---
-## 文件Agent配置
+## 参数说明
+![File Params](img/file_param.png)
+- 数据源名称
+- 数据源IP:采集节点 Agent IP
+- ⽂件路径:必须是绝对路径,支持正则表达式,多个时以逗号分隔
+- 时间偏移量:从文件的某个时间开始采集,'1m'表示1分钟之后,'-1m'表示1分钟之前,支持m(分钟),h(小时),d(天),空则从当前时间开始采集
+- 文件分隔符:支持竖线(|), 逗号(,),分好(;)
+- 源数据字段:分隔符切分后的字段
+
+## 路径配置
```
/data/inlong-agent/test.log //代表读取inlong-agent文件夹下的的新增文件test.log
/data/inlong-agent/test[0-9]{1} //代表读取inlong-agent文件夹下的新增文件test后接一个数字结尾
@@ -11,18 +20,17 @@ sidebar_position: 3
/data/inlong-agent/^\\d+(\\.\\d+)? // 以一个或多个数字开头,之后可以是.或者一个.或多个数字结尾,?代表可选,可以匹配的实例:"5", "1.5" 和 "2.21"
```
-## 从文件名称中获取数据时间
+## 数据时间
Agent支持从文件名称中获取时间当作数据的生产时间,配置说明如下:
```
/data/inlong-agent/***YYYYMMDDHH***
```
+
其中YYYYDDMMHH代表数据时间,YYYY表示年,MM表示月份,DD表示天,HH表示小时
其中***为任意字符
同时需要在job conf中加入当前数据的周期,当前支持天周期以及小时周期,
-在添加任务时,加入属性job.cycleUnit
-
-job.cycleUnit 包含如下两种类型:
+在添加任务时,加入属性 job.cycleUnit。job.cycleUnit 包含如下两种类型:
- D : 代表数据时间天维度
- H : 代表数据时间小时维度
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/data_node/extract_node/img/file_param.png b/i18n/zh-CN/docusaurus-plugin-content-docs/current/data_node/extract_node/img/file_param.png
new file mode 100644
index 000000000..d4226b267
Binary files /dev/null and b/i18n/zh-CN/docusaurus-plugin-content-docs/current/data_node/extract_node/img/file_param.png differ
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/modules/agent/sql.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/modules/agent/sql.md
deleted file mode 100644
index 8299313f3..000000000
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/modules/agent/sql.md
+++ /dev/null
@@ -1,60 +0,0 @@
----
-title: MySQL SQL
-sidebar_position: 3
----
-
-## 总览
-目前 Agent 支持 MySQL 版本为5.1.x , 5.5.x , 5.6.x , 5.7.x , 8.0.x
-目前 Agent 只支持 curl 请求创建 Job 方式提交采集任务,暂时不支持 manager 前端创建 SQL 采集
-
-## MySQL Job创建步骤
-
-1、在 manager 上申请接入,填写数据信息时,选择消息来源为"自主推送"
-2、选择源数据字段分隔符
-3、填写源数据字段,字段顺序与 sql 查询结果中的字段顺序保持一致
-4、使用 curl 请求创建一个 SQL 读取任务
-
-## 参数说明
-
-```
-SQL Agent Job 中各个使用参数说明为
-1、job.sql.command: 实际执行的 sql 语句,举例: select * from apache_inlong_manager.user
-2、job.sql.user: 连接数据库时使用的 user,举例: abc
-3、job.sql.password: 连接数据库时使用的 password, 举例: 123456
-4、job.sql.hostname: 连接的数据库 ip 地址,举例:127.0.0.1
-5、job.sql.port:连接的数据库端口,举例:3306
-6、job.sql.separator: 使用的分割符来分割多个字段,需要与 manager 前端
-```
-
-## 举例
-
-```bash
-curl --location --request POST 'http://localhost:8008/config/job' \--header 'Content-Type: application/json' \--data '{
- "job": {
- "sql": {
- "command": "select * from apache_inlong_manager.user",
- "user": "root",
- "password": "inlong",
- "hostname": "10.0.0.6",
- "port": "3306",
- "separator": "|"
- },
- "id": 1,
- "thread": {
- "running": {
- "core": "4"
- }
- },
- "name": "test",
- "source": "org.apache.inlong.agent.plugin.sources.DataBaseSource",
- "sink": "org.apache.inlong.agent.plugin.sinks.ProxySink",
- "channel": "org.apache.inlong.agent.plugin.channel.MemoryChannel"
- },
- "proxy": {
- "inlongGroupId": "b_test_tube_hive_20211221_01",
- "inlongStreamId": "test_data_stream_20211221_01_01"
- },
- "op": "add"
-}
-'
-```
\ No newline at end of file