Posted to commits@inlong.apache.org by GitBox <gi...@apache.org> on 2022/06/13 14:03:07 UTC

[GitHub] [incubator-inlong-website] GanfengTan opened a new pull request, #406: [INLONG-405][Sort] Add sqlserver cdc,hdfs,hive doc

GanfengTan opened a new pull request, #406:
URL: https://github.com/apache/incubator-inlong-website/pull/406

   Fixes #405 
   
   Add SQLServer, HDFS, and Hive docs.
   
   ### Motivation
   
   The related documents are missing, so this PR adds docs for SQLServer, HDFS, and Hive.
   
   ### Modifications
   
   Add docs for the HDFS extract node and load node, the SQLServer extract node, and the Hive load node.
   
   
   ### Verifying this change
   
   - [ ] Make sure that the change passes the CI checks.
   
   *(Please pick either of the following options)*
   
   This change is a trivial rework / code cleanup without any test coverage.
   
   *(or)*
   
   This change is already covered by existing tests, such as *(please describe tests)*.
   
   *(or)*
   
   This change added tests and can be verified as follows:
   
   *(example:)*
     - *Added integration tests for end-to-end deployment with large payloads (10MB)*
     - *Extended integration test for recovery after broker failure*
   
   ### Documentation
   
     - Does this pull request introduce a new feature? (yes / no)
     - If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented)
     - If a feature is not applicable for documentation, explain why?
     - If a feature is not documented yet in this PR, please create a followup issue for adding the documentation
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@inlong.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-inlong-website] yunqingmoswu commented on a diff in pull request #406: [INLONG-405][Sort] Add sqlserver cdc,hdfs,hive doc

Posted by GitBox <gi...@apache.org>.
yunqingmoswu commented on code in PR #406:
URL: https://github.com/apache/incubator-inlong-website/pull/406#discussion_r896661428


##########
docs/data_node/extract_node/sqlserver-cdc.md:
##########
@@ -1,4 +1,336 @@
 ---
-title: SqlServer-CDC
+title: SQLServer-CDC
 sidebar_position: 11
----
\ No newline at end of file
+---
+## SQLServer Extract Node
+
+The SqlServer extract node reads data and incremental data from the SqlServer database. The following will describe how to set up the SqlServer extraction node.
+
+## Supported Version
+
+| Extract Node                | Version                                                                                                                                                                                                                                                                                                                                                                                                |
+|-----------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| [sqlserver-cdc](./sqlserver-cdc.md) | [SQLServer](https://docs.microsoft.com/en-us/sql/database-engine/install-windows/install-sql-server?view=sql-server-ver16): 2014、2016、2017、2019、2022 |      |

Review Comment:
   sqlserver-cdc -> SQLServer-CDC



##########
docs/data_node/extract_node/sqlserver-cdc.md:
##########
@@ -1,4 +1,336 @@
 ---
-title: SqlServer-CDC
+title: SQLServer-CDC
 sidebar_position: 11
----
\ No newline at end of file
+---
+## SQLServer Extract Node
+
+The SqlServer extract node reads data and incremental data from the SqlServer database. The following will describe how to set up the SqlServer extraction node.

Review Comment:
   The SqlServer extract node -> The SQLServer Extract Node



##########
docs/data_node/extract_node/sqlserver-cdc.md:
##########
@@ -1,4 +1,336 @@
 ---
-title: SqlServer-CDC
+title: SQLServer-CDC
 sidebar_position: 11
----
\ No newline at end of file
+---
+## SQLServer Extract Node
+
+The SqlServer extract node reads data and incremental data from the SqlServer database. The following will describe how to set up the SqlServer extraction node.
+
+## Supported Version
+
+| Extract Node                | Version                                                                                                                                                                                                                                                                                                                                                                                                |
+|-----------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| [sqlserver-cdc](./sqlserver-cdc.md) | [SQLServer](https://docs.microsoft.com/en-us/sql/database-engine/install-windows/install-sql-server?view=sql-server-ver16): 2014、2016、2017、2019、2022 |      |
+
+## Dependencies
+
+Introduce related SQLServer cdc connector dependencies through maven.
+
+### Maven dependency
+
+```
+<dependency>
+    <groupId>org.apache.inlong</groupId>
+    <artifactId>sort-connector-sqlserver-cdc</artifactId>
+    <!-- Choose the version that suits your application -->
+    <version>inlong_version</version>
+</dependency>
+```
+## Setup SQLServer CDC
+
+SQLServer CDC needs to open related libraries and tables, the steps are as follows:
+
+1. Enable the CDC function for the database.
+```sql
+if exists(select 1 from sys.databases where name='dbName' and is_cdc_enabled=0)
+begin
+    exec sys.sp_cdc_enable_db
+end
+```
+2. Check the database CDC capability status.
+```sql
+select is_cdc_enabled from sys.databases where name='dbName'
+```
+note: 1 is running CDC of DB.
+
+3. Turn on CDC for the table
+```sql
+IF EXISTS(SELECT 1 FROM sys.tables WHERE name='tableName' AND is_tracked_by_cdc = 0)
+BEGIN
+    EXEC sys.sp_cdc_enable_table
+        @source_schema = 'dbo', -- source_schema
+        @source_name = 'tableName', -- table_name
+        @capture_instance = NULL, -- capture_instance
+        @supports_net_changes = 1, -- supports_net_changes
+        @role_name = NULL, -- role_name
+        @index_name = NULL, -- index_name
+        @captured_column_list = NULL, -- captured_column_list
+        @filegroup_name = 'PRIMARY' -- filegroup_name
+END
+```
+note: The table must have a primary key or unique index.
+
+4. Check the table CDC capability status.
+```sql
+SELECT is_tracked_by_cdc FROM sys.tables WHERE name='tableName'
+```
+note: 1 is running CDC of table.
+
+## How to create a SQLServer Extract Node
+
+### Usage for SQL API
+
+The example below shows how to create a SqlServer Extract Node with `Flink SQL Cli` :
+
+```sql
+-- Set checkpoint every 3000 milliseconds                       
+Flink SQL> SET 'execution.checkpointing.interval' = '3s';   
+
+-- Create a SqlServer table 'sqlserver_extract_node' in Flink SQL Cli

Review Comment:
   SqlServer -> SQLServer
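
For readers following the thread: the hunk above stops at the comment that introduces the table definition. A minimal sketch of what such a definition could look like is given below. It is not the text of the PR. The connector identifier and option names are assumed to follow the Flink CDC `sqlserver-cdc` connector, and the columns, host, and credentials are hypothetical placeholders.

```sql
-- Sketch only: columns and connection values are illustrative placeholders.
CREATE TABLE sqlserver_extract_node (
    order_id INT,
    customer_name STRING,
    price DECIMAL(10, 5),
    PRIMARY KEY (order_id) NOT ENFORCED
) WITH (
    'connector' = 'sqlserver-cdc',   -- assumed connector identifier
    'hostname' = 'localhost',        -- hypothetical host
    'port' = '1433',                 -- default SQL Server port
    'username' = 'flink_user',       -- hypothetical credentials
    'password' = 'flink_pw',
    'database-name' = 'dbName',      -- database with CDC enabled
    'schema-name' = 'dbo',
    'table-name' = 'tableName'       -- table with CDC enabled
);

-- Read the snapshot and subsequent changes from the table.
SELECT * FROM sqlserver_extract_node;
```

The database and table names reuse the `dbName` and `tableName` placeholders from the setup steps, so the table read here is the one on which CDC was enabled.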



##########
docs/data_node/load_node/hive.md:
##########
@@ -2,8 +2,207 @@
 title: Hive
 sidebar_position: 3
 ---
+## Hive Load Node
+Hive Load Node can write data to hive. Using the flink dialect, the insert operation is currently supported, and the data in the upsert mode will be converted into insert.
+Manipulating hive tables using the hive dialect is currently not supported.
 
-## Configuration
+## Supported Version
+
+| Load Node                           | Version                                            | 
+|-------------------------------------|----------------------------------------------------|
+| [Hive](./hive.md) | [Hive](https://nightlies.apache.org/flink/flink-docs-master/docs/connectors/table/hive/overview/#supported-hive-versions): 1.x, 2.x, 3.x |
+
+### Dependencies
+
+Using Hive load requires the introduction of dependencies.
+
+```
+<dependency>
+    <groupId>org.apache.inlong</groupId>
+    <artifactId>sort-connector-hive</artifactId>
+    <version>inlong_version</version>
+</dependency>
+```
+## How to create a Hive Load Node
+
+### Usage for SQL API
+
+The example below shows how to create a Hive Load Node with `Flink SQL Cli` :
+
+```sql
+CREATE TABLE hiveTableName (
+  id STRING,
+  name STRING,
+  uv BIGINT,
+  pv BIGINT
+) WITH (
+  'connector' = 'hive',
+  'default-database' = 'default',
+  'hive-version' = '3.1.2',
+  'hive-conf-dir' = 'hdfs://localhost:9000/user/hive/hive-site.xml'
+);
+```
+### Usage for InLong Dashboard
+
+#### Configuration
 When creating a data flow, select `Hive` for the data stream direction, and click "Add" to configure it.

Review Comment:
   data flow -> data stream
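
As an illustration of the insert-only behavior described in the Hive doc above, the following sketch writes rows into the `hiveTableName` table from the example. The source table `demo_source` is hypothetical and uses Flink's built-in `datagen` connector only to produce test rows; it is not part of the PR.

```sql
-- Hypothetical source that generates random rows for testing.
CREATE TABLE demo_source (
  id STRING,
  name STRING,
  uv BIGINT,
  pv BIGINT
) WITH (
  'connector' = 'datagen',
  'rows-per-second' = '1'
);

-- Append the generated rows to the Hive load node table defined in the doc.
INSERT INTO hiveTableName
SELECT id, name, uv, pv FROM demo_source;
```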



##########
docs/data_node/extract_node/sqlserver-cdc.md:
##########
@@ -1,4 +1,336 @@
 ---
-title: SqlServer-CDC
+title: SQLServer-CDC
 sidebar_position: 11
----
\ No newline at end of file
+---
+## SQLServer Extract Node
+
+The SqlServer extract node reads data and incremental data from the SqlServer database. The following will describe how to set up the SqlServer extraction node.
+
+## Supported Version
+
+| Extract Node                | Version                                                                                                                                                                                                                                                                                                                                                                                                |
+|-----------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| [sqlserver-cdc](./sqlserver-cdc.md) | [SQLServer](https://docs.microsoft.com/en-us/sql/database-engine/install-windows/install-sql-server?view=sql-server-ver16): 2014、2016、2017、2019、2022 |      |
+
+## Dependencies
+
+Introduce related SQLServer cdc connector dependencies through maven.
+
+### Maven dependency
+
+```
+<dependency>
+    <groupId>org.apache.inlong</groupId>
+    <artifactId>sort-connector-sqlserver-cdc</artifactId>
+    <!-- Choose the version that suits your application -->
+    <version>inlong_version</version>
+</dependency>
+```
+## Setup SQLServer CDC

Review Comment:
   SQLServer CDC -> SQLServer Extract Node



##########
docs/data_node/extract_node/sqlserver-cdc.md:
##########
@@ -1,4 +1,336 @@
 ---
-title: SqlServer-CDC
+title: SQLServer-CDC
 sidebar_position: 11
----
\ No newline at end of file
+---
+## SQLServer Extract Node
+
+The SqlServer extract node reads data and incremental data from the SqlServer database. The following will describe how to set up the SqlServer extraction node.
+
+## Supported Version
+
+| Extract Node                | Version                                                                                                                                                                                                                                                                                                                                                                                                |
+|-----------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| [sqlserver-cdc](./sqlserver-cdc.md) | [SQLServer](https://docs.microsoft.com/en-us/sql/database-engine/install-windows/install-sql-server?view=sql-server-ver16): 2014、2016、2017、2019、2022 |      |
+
+## Dependencies
+
+Introduce related SQLServer cdc connector dependencies through maven.
+
+### Maven dependency
+
+```
+<dependency>
+    <groupId>org.apache.inlong</groupId>
+    <artifactId>sort-connector-sqlserver-cdc</artifactId>
+    <!-- Choose the version that suits your application -->
+    <version>inlong_version</version>
+</dependency>
+```
+## Setup SQLServer CDC
+
+SQLServer CDC needs to open related libraries and tables, the steps are as follows:

Review Comment:
   SQLServer CDC -> SQLServer Extract Node



##########
docs/data_node/load_node/hive.md:
##########
@@ -2,8 +2,207 @@
 title: Hive
 sidebar_position: 3
 ---
+## Hive Load Node
+Hive Load Node can write data to hive. Using the flink dialect, the insert operation is currently supported, and the data in the upsert mode will be converted into insert.
+Manipulating hive tables using the hive dialect is currently not supported.
 
-## Configuration
+## Supported Version
+
+| Load Node                           | Version                                            | 
+|-------------------------------------|----------------------------------------------------|
+| [Hive](./hive.md) | [Hive](https://nightlies.apache.org/flink/flink-docs-master/docs/connectors/table/hive/overview/#supported-hive-versions): 1.x, 2.x, 3.x |
+
+### Dependencies
+
+Using Hive load requires the introduction of dependencies.
+
+```
+<dependency>
+    <groupId>org.apache.inlong</groupId>
+    <artifactId>sort-connector-hive</artifactId>
+    <version>inlong_version</version>

Review Comment:
   Add a comment for inlong_version



##########
docs/data_node/extract_node/hdfs.md:
##########
@@ -1,4 +1,9 @@
 ---
 title: HDFS
 sidebar_position: 6
----
\ No newline at end of file
+---
+The file system connector can be used to read single files or entire directories into a single table.

Review Comment:
   Keep the title and content consistent.



##########
docs/data_node/extract_node/sqlserver-cdc.md:
##########
@@ -1,4 +1,336 @@
 ---
-title: SqlServer-CDC
+title: SQLServer-CDC
 sidebar_position: 11
----
\ No newline at end of file
+---
+## SQLServer Extract Node
+
+The SqlServer extract node reads data and incremental data from the SqlServer database. The following will describe how to set up the SqlServer extraction node.
+
+## Supported Version
+
+| Extract Node                | Version                                                                                                                                                                                                                                                                                                                                                                                                |
+|-----------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| [sqlserver-cdc](./sqlserver-cdc.md) | [SQLServer](https://docs.microsoft.com/en-us/sql/database-engine/install-windows/install-sql-server?view=sql-server-ver16): 2014、2016、2017、2019、2022 |      |
+
+## Dependencies
+
+Introduce related SQLServer cdc connector dependencies through maven.
+
+### Maven dependency
+
+```
+<dependency>
+    <groupId>org.apache.inlong</groupId>
+    <artifactId>sort-connector-sqlserver-cdc</artifactId>
+    <!-- Choose the version that suits your application -->
+    <version>inlong_version</version>
+</dependency>
+```
+## Setup SQLServer CDC
+
+SQLServer CDC needs to open related libraries and tables, the steps are as follows:
+
+1. Enable the CDC function for the database.
+```sql
+if exists(select 1 from sys.databases where name='dbName' and is_cdc_enabled=0)
+begin
+    exec sys.sp_cdc_enable_db
+end
+```
+2. Check the database CDC capability status.
+```sql
+select is_cdc_enabled from sys.databases where name='dbName'
+```
+note: 1 is running CDC of DB.
+
+3. Turn on CDC for the table
+```sql
+IF EXISTS(SELECT 1 FROM sys.tables WHERE name='tableName' AND is_tracked_by_cdc = 0)
+BEGIN
+    EXEC sys.sp_cdc_enable_table
+        @source_schema = 'dbo', -- source_schema
+        @source_name = 'tableName', -- table_name
+        @capture_instance = NULL, -- capture_instance
+        @supports_net_changes = 1, -- supports_net_changes
+        @role_name = NULL, -- role_name
+        @index_name = NULL, -- index_name
+        @captured_column_list = NULL, -- captured_column_list
+        @filegroup_name = 'PRIMARY' -- filegroup_name
+END
+```
+note: The table must have a primary key or unique index.
+
+4. Check the table CDC capability status.
+```sql
+SELECT is_tracked_by_cdc FROM sys.tables WHERE name='tableName'
+```
+note: 1 is running CDC of table.
+
+## How to create a SQLServer Extract Node
+
+### Usage for SQL API
+
+The example below shows how to create a SqlServer Extract Node with `Flink SQL Cli` :

Review Comment:
   SqlServer -> SQLServer
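
A note for readers trying the setup steps quoted above: `sys.sp_cdc_enable_db` acts on the current database, so the statements are normally run after switching to the target database. The sketch below, which assumes the same `dbName` placeholder and a client such as `sqlcmd` or SSMS that understands the `GO` batch separator, shows one way to inspect the result using SQL Server's CDC helper procedures. It is a suggestion, not part of the PR.

```sql
-- Switch to the database on which CDC was enabled.
USE dbName;
GO

-- List the capture instances configured for tables in this database.
EXEC sys.sp_cdc_help_change_data_capture;
GO

-- Show the capture and cleanup jobs; these jobs need SQL Server Agent to be running.
EXEC sys.sp_cdc_help_jobs;
GO
```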



##########
docs/data_node/extract_node/sqlserver-cdc.md:
##########
@@ -1,4 +1,336 @@
 ---
-title: SqlServer-CDC
+title: SQLServer-CDC
 sidebar_position: 11
----
\ No newline at end of file
+---
+## SQLServer Extract Node
+
+The SqlServer extract node reads data and incremental data from the SqlServer database. The following will describe how to set up the SqlServer extraction node.
+
+## Supported Version
+
+| Extract Node                | Version                                                                                                                                                                                                                                                                                                                                                                                                |
+|-----------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| [sqlserver-cdc](./sqlserver-cdc.md) | [SQLServer](https://docs.microsoft.com/en-us/sql/database-engine/install-windows/install-sql-server?view=sql-server-ver16): 2014、2016、2017、2019、2022 |      |
+
+## Dependencies
+
+Introduce related SQLServer cdc connector dependencies through maven.

Review Comment:
   SQLServer cdc connector -> SQLServer Extract Node



##########
i18n/zh-CN/docusaurus-plugin-content-docs/current/data_node/extract_node/sqlserver-cdc.md:
##########
@@ -1,4 +1,334 @@
 ---
-title: SqlServer-CDC
+title: SQLServer-CDC
 sidebar_position: 11
----
\ No newline at end of file
+---
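+## SQLServer Extract Node
+
+The SQLServer extract node reads data and incremental data from the SQLServer database. The following describes how to configure the SQLServer extract node.
+
+## Supported Version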
+
+| Extract Node                | Version                                                                                                                                                                                                                                                                                                                                                                                                |
+|-----------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| [sqlserver-cdc](./sqlserver-cdc.md) | [SQLServer](https://docs.microsoft.com/en-us/sql/database-engine/install-windows/install-sql-server?view=sql-server-ver16): 2014、2016、2017、2019、2022 |      |

Review Comment:
   sqlserver-cdc -> SQLServer-CDC



##########
i18n/zh-CN/docusaurus-plugin-content-docs/current/data_node/extract_node/sqlserver-cdc.md:
##########
@@ -1,4 +1,334 @@
 ---
-title: SqlServer-CDC
+title: SQLServer-CDC
 sidebar_position: 11
----
\ No newline at end of file
+---
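+## SQLServer Extract Node
+
+The SQLServer extract node reads data and incremental data from the SQLServer database. The following describes how to configure the SQLServer extract node.
+
+## Supported Version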
+
+| Extract Node                | Version                                                                                                                                                                                                                                                                                                                                                                                                |
+|-----------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| [sqlserver-cdc](./sqlserver-cdc.md) | [SQLServer](https://docs.microsoft.com/en-us/sql/database-engine/install-windows/install-sql-server?view=sql-server-ver16): 2014、2016、2017、2019、2022 |      |
+
+## Dependencies
+
+Introduce sort-connector-sqlserver-cdc through Maven to build your own project.
+
+### Maven dependency
+
+```
+<dependency>
+    <groupId>org.apache.inlong</groupId>
+    <artifactId>sort-connector-sqlserver-cdc</artifactId>
+    <!-- Choose the version that suits your application -->

Review Comment:
   Translate `Choose the version that suits your application`



##########
i18n/zh-CN/docusaurus-plugin-content-docs/current/data_node/load_node/hive.md:
##########
@@ -2,8 +2,206 @@
 title: Hive
 sidebar_position: 2
 ---
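+## Hive Load Node
 
-## Configuration
-When creating a data stream, select `Hive` for the data stream direction, and click "Add" to configure it.
+The Hive load node can write data to Hive. Using the Flink dialect, only the insert operation is currently supported, and data in upsert mode is converted into inserts.
+Manipulating Hive tables using the Hive dialect is currently not supported.
 
-![Hive Configuration](img/hive.png)
\ No newline at end of file
+## Supported Version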
+
+| Load Node                           | Version                                            | 
+|-------------------------------------|----------------------------------------------------|
+| [Hive](./hive.md) | [Hive](https://nightlies.apache.org/flink/flink-docs-master/docs/connectors/table/hive/overview/#supported-hive-versions): 1.x, 2.x, 3.x |
+
+### Dependencies
+
+Introduce sort-connector-hive through Maven to build your own project.
+
+```
+<dependency>
+    <groupId>org.apache.inlong</groupId>
+    <artifactId>sort-connector-hive</artifactId>
+    <version>inlong_version</version>

Review Comment:
   Add a comment for version





[GitHub] [incubator-inlong-website] dockerzhang merged pull request #406: [INLONG-405][Sort] Add sqlserver cdc,hdfs,hive doc

Posted by GitBox <gi...@apache.org>.
dockerzhang merged PR #406:
URL: https://github.com/apache/incubator-inlong-website/pull/406




[GitHub] [incubator-inlong-website] yunqingmoswu commented on a diff in pull request #406: [INLONG-405][Sort] Add sqlserver cdc,hdfs,hive doc

Posted by GitBox <gi...@apache.org>.
yunqingmoswu commented on code in PR #406:
URL: https://github.com/apache/incubator-inlong-website/pull/406#discussion_r896305901


##########
docs/data_node/extract_node/sqlserver-cdc.md:
##########
@@ -1,4 +1,336 @@
 ---
 title: SqlServer-CDC
 sidebar_position: 11
----
\ No newline at end of file
+---
+## SqlServer Extract Node

Review Comment:
   SqlServer -> SQLServer



##########
docs/data_node/extract_node/sqlserver-cdc.md:
##########
@@ -1,4 +1,336 @@
 ---
 title: SqlServer-CDC

Review Comment:
   SqlServer-CDC -> SQLServer-CDC



##########
docs/data_node/extract_node/sqlserver-cdc.md:
##########
@@ -1,4 +1,336 @@
 ---
 title: SqlServer-CDC
 sidebar_position: 11
----
\ No newline at end of file
+---
+## SqlServer Extract Node
+
+The SqlServer extract node reads data and incremental data from the SqlServer database. The following will describe how to set up the SqlServer extraction node.
+
+## Supported Version
+
+| Extract Node                | Version                                                                                                                                                                                                                                                                                                                                                                                                |
+|-----------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| [sqlserver-cdc](./sqlserver-cdc.md) | [SqlServer](https://docs.microsoft.com/en-us/sql/database-engine/install-windows/install-sql-server?view=sql-server-ver16): 2014、2016、2017、2019、2022 |      |
+
+## Dependencies
+
+Introduce related SQLServer cdc connector dependencies through maven.
+
+### Maven dependency
+
+```
+<dependency>
+    <groupId>org.apache.inlong</groupId>
+    <artifactId>sort-connector-sqlserver-cdc</artifactId>
+    <!-- Choose the version that suits your application -->
+    <version>inlong_version</version>
+</dependency>
+```
+## Setup SqlServer CDC
+
+SqlServer CDC needs to open related libraries and tables, the steps are as follows:
+
+1.Enable the CDC function for the database.

Review Comment:
   `1.` -> `-`


