Posted to commits@inlong.apache.org by zi...@apache.org on 2022/11/08 06:31:57 UTC
[inlong-website] branch master updated: [INLONG-578][Doc] Add doc for MySQL connector for filtering and allmigrate (#582)
This is an automated email from the ASF dual-hosted git repository.
zirui pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/inlong-website.git
The following commit(s) were added to refs/heads/master by this push:
new 621a384521 [INLONG-578][Doc] Add doc for MySQL connector for filtering and allmigrate (#582)
621a384521 is described below
commit 621a3845212bfa2ffaef0d9943c989783924f288
Author: Schnapps <zp...@connect.ust.hk>
AuthorDate: Tue Nov 8 14:31:52 2022 +0800
[INLONG-578][Doc] Add doc for MySQL connector for filtering and allmigrate (#582)
---
docs/data_node/extract_node/kafka.md | 2 +-
docs/data_node/extract_node/mysql-cdc.md | 91 ++++++++++++++++------
.../current/data_node/extract_node/mysql-cdc.md | 60 ++++++++++++--
3 files changed, 124 insertions(+), 29 deletions(-)
diff --git a/docs/data_node/extract_node/kafka.md b/docs/data_node/extract_node/kafka.md
index 1a7a23b8ab..5c88e49649 100644
--- a/docs/data_node/extract_node/kafka.md
+++ b/docs/data_node/extract_node/kafka.md
@@ -15,7 +15,7 @@ upsert fashion. The `upsert-kafka` connector produces a `changelog stream`, wher
| Extract Node | Kafka version |
|-----------------------------|---------------|
-| [Kafka](./kafka.md) | 0.10+ |
+| [Kafka](./kafka.md) | 0.10+ |
## Dependencies
diff --git a/docs/data_node/extract_node/mysql-cdc.md b/docs/data_node/extract_node/mysql-cdc.md
index d5694b9341..f1f42e9ceb 100644
--- a/docs/data_node/extract_node/mysql-cdc.md
+++ b/docs/data_node/extract_node/mysql-cdc.md
@@ -293,7 +293,18 @@ TODO: It will be supported in the future.
<td>optional</td>
<td style={{wordWrap: 'break-word'}}>false</td>
<td>Boolean</td>
- <td>Whether it is a whole library migration, Whether it is a whole database migration scenario, if true, it compresses physical fields and other meta fields supported by MySQL Extract Node into a special meta field `data` in canal-json format.</td>
+ <td>Whether this is a whole-database migration scenario. If 'true', the MySQL Extract Node compresses the physical fields and other meta fields of the table into a special 'data' meta field in JSON format.
+ Two data formats are currently supported: use the 'data_canal' metadata field for data in 'canal-json' format,
+ or the 'data_debezium' metadata field for data in 'debezium-json' format.</td>
+ </tr>
+ <tr>
+ <td>row-kinds-filtered</td>
+ <td>optional</td>
+ <td style={{wordWrap: 'break-word'}}>false</td>
+ <td>Boolean</td>
+ <td>The operation types to retain: +I is inserted data (existing data is emitted as insert-type data), -U is the row data before an update, +U is the row data after an update, and -D is deleted data.
+ To keep multiple operation types, join them with &.
+ For example, with +I&-D the connector outputs only inserted and deleted data and does not output updated data.</td>
</tr>
<tr>
<td>debezium.*</td>
@@ -349,9 +360,14 @@ The following format metadata can be exposed as read-only (VIRTUAL) columns in a
<td>Type of database operation, such as INSERT/DELETE, etc.</td>
</tr>
<tr>
- <td>meta.data</td>
- <td>STRING</td>
- <td>Data of the row that format by `canal-json` only exists when the option `migrate-all` is 'true'.</td>
+ <td>meta.data_canal</td>
+ <td>STRING/BYTES</td>
+ <td>Row data in `canal-json` format. Only available when the `migrate-all` option is 'true'.</td>
+ </tr>
+ <tr>
+ <td>meta.data_debezium</td>
+ <td>STRING/BYTES</td>
+ <td>Row data in `debezium-json` format. Only available when the `migrate-all` option is 'true'.</td>
</tr>
<tr>
<td>meta.is_ddl</td>
@@ -394,30 +410,31 @@ The following format metadata can be exposed as read-only (VIRTUAL) columns in a
The extended CREATE TABLE example demonstrates the syntax for exposing these metadata fields:
```sql
CREATE TABLE `mysql_extract_node` (
- `id` INT,
- `name` STRING,
- `database_name` string METADATA FROM 'meta.database_name',
- `table_name` string METADATA FROM 'meta.table_name',
- `op_ts` timestamp(3) METADATA FROM 'meta.op_ts',
- `op_type` string METADATA FROM 'meta.op_type',
- `batch_id` bigint METADATA FROM 'meta.batch_id',
- `is_ddl` boolean METADATA FROM 'meta.is_ddl',
- `update_before` ARRAY<MAP<STRING, STRING>> METADATA FROM 'meta.update_before',
- `mysql_type` MAP<STRING, STRING> METADATA FROM 'meta.mysql_type',
- `pk_names` ARRAY<STRING> METADATA FROM 'meta.pk_names',
- `data` STRING METADATA FROM 'meta.data',
- `sql_type` MAP<STRING, INT> METADATA FROM 'meta.sql_type',
- `ingestion_ts` TIMESTAMP(3) METADATA FROM 'meta.ts',
- PRIMARY KEY (`id`) NOT ENFORCED
+ `id` INT,
+ `name` STRING,
+ `database_name` string METADATA FROM 'meta.database_name',
+ `table_name` string METADATA FROM 'meta.table_name',
+ `op_ts` timestamp(3) METADATA FROM 'meta.op_ts',
+ `op_type` string METADATA FROM 'meta.op_type',
+ `batch_id` bigint METADATA FROM 'meta.batch_id',
+ `is_ddl` boolean METADATA FROM 'meta.is_ddl',
+ `update_before` ARRAY<MAP<STRING, STRING>> METADATA FROM 'meta.update_before',
+ `mysql_type` MAP<STRING, STRING> METADATA FROM 'meta.mysql_type',
+ `pk_names` ARRAY<STRING> METADATA FROM 'meta.pk_names',
+ `data` STRING METADATA FROM 'meta.data_canal',
+ `sql_type` MAP<STRING, INT> METADATA FROM 'meta.sql_type',
+ `ingestion_ts` TIMESTAMP(3) METADATA FROM 'meta.ts',
+ PRIMARY KEY (`id`) NOT ENFORCED
) WITH (
- 'connector' = 'mysql-cdc-inlong',
+ 'connector' = 'mysql-cdc-inlong',
'hostname' = 'YourHostname',
'migrate-all' = 'true',
- 'port' = '3306',
+ 'port' = '3306',
'username' = 'YourUsername',
'password' = 'YourPassword',
'database-name' = 'YourDatabase',
- 'table-name' = 'YourTable'
+ 'table-name' = 'YourTable',
+ 'row-kinds-filtered' = '+I'
);
```
@@ -615,3 +632,33 @@ CREATE TABLE `mysql_extract_node` (
</table>
</div>
+## Features
+
+### Multi-database multi-table synchronization
+
+The MySQL Extract Node supports whole-database and multi-table synchronization. When this feature is enabled, the MySQL Extract Node compresses the physical fields of the table into the special meta field 'data_canal' in 'canal-json' format; the metadata field 'data_debezium' provides the same data in 'debezium-json' format.
+
+Configuration parameters:
+
+| Parameter | Required | Default Value | Data Type | Description |
+|---------------|----------|---------------|-----------|-------------|
+| migrate-all | optional | false | Boolean | Enables whole-database migration mode; all physical fields are obtained through the data_canal field |
+| table-name | optional | false | String | Regular expression of the tables to read; use "\." to separate database and table, and "," to separate multiple regular expressions |
+| database-name | optional | false | String | Regular expression of the databases to read; multiple regular expressions are separated by "," |
+
+The following CREATE TABLE example demonstrates the syntax for this feature:
+
+```sql
+CREATE TABLE `table_1`(
+`data` STRING METADATA FROM 'meta.data_canal' VIRTUAL)
+WITH (
+'inlong.metric.labels' = 'groupId=1&streamId=1&nodeId=1',
+'migrate-all' = 'true',
+'connector' = 'mysql-cdc-inlong',
+'hostname' = 'localhost',
+'database-name' = 'test,test01',
+'username' = 'root',
+'password' = 'inlong',
+'table-name' = 'test01\.a{2}[0-9]$, test\.[\s\S]*'
+)
+```
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/data_node/extract_node/mysql-cdc.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/data_node/extract_node/mysql-cdc.md
index 46b6e03b7a..40932bf45f 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/data_node/extract_node/mysql-cdc.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/data_node/extract_node/mysql-cdc.md
@@ -289,7 +289,18 @@ TODO: It will be supported in the future.
<td>optional</td>
<td style={{wordWrap: 'break-word'}}>false</td>
<td>Boolean</td>
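- <td>Whether this is a whole-database migration scenario. If 'true', the MySQL Extract Node compresses the physical fields and other meta fields of the table into a special meta field 'data' in 'canal-json' format.</td>
+ <td>Whether this is a whole-database migration scenario. If 'true', the MySQL Extract Node compresses the physical fields and other meta fields of the table into a special 'data' meta field in JSON format.
+ Two data formats are currently supported: use the 'data_canal' metadata field for data in 'canal-json' format,
+ or the 'data_debezium' metadata field for data in 'debezium-json' format.</td>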
+ </tr>
+ <tr>
+ <td>row-kinds-filtered</td>
+ <td>optional</td>
+ <td style={{wordWrap: 'break-word'}}>false</td>
+ <td>Boolean</td>
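+ <td>The operation types to retain: +I is inserted data (existing data is emitted as insert-type data), -U is the row data before an update, +U is the row data after an update,
+ and -D is deleted data. To keep multiple operation types, join them with &.
+ For example, with +I&-D the connector outputs only inserted and deleted data and does not output updated data.</td>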
</tr>
<tr>
<td>debezium.*</td>
@@ -345,10 +356,15 @@ TODO: 将在未来支持此功能。
<td>Type of database operation, such as INSERT/DELETE, etc.</td>
</tr>
<tr>
- <td>meta.data</td>
- <td>STRING</td>
+ <td>meta.data_canal</td>
+ <td>STRING/BYTES</td>
<td>Row data in `canal-json` format. Only available when the `migrate-all` option is 'true'.</td>
</tr>
+ <tr>
+ <td>meta.data_debezium</td>
+ <td>STRING/BYTES</td>
+ <td>Row data in `debezium-json` format. Only available when the `migrate-all` option is 'true'.</td>
+ </tr>
<tr>
<td>meta.is_ddl</td>
<td>BOOLEAN</td>
@@ -402,7 +418,7 @@ CREATE TABLE `mysql_extract_node` (
`update_before` ARRAY<MAP<STRING, STRING>> METADATA FROM 'meta.update_before',
`mysql_type` MAP<STRING, STRING> METADATA FROM 'meta.mysql_type',
`pk_names` ARRAY<STRING> METADATA FROM 'meta.pk_names',
- `data` STRING METADATA FROM 'meta.data',
+ `data` STRING METADATA FROM 'meta.data_canal',
`sql_type` MAP<STRING, INT> METADATA FROM 'meta.sql_type',
`ingestion_ts` TIMESTAMP(3) METADATA FROM 'meta.ts',
PRIMARY KEY (`id`) NOT ENFORCED
@@ -414,8 +430,9 @@ CREATE TABLE `mysql_extract_node` (
'username' = 'YourUsername',
'password' = 'YourPassword',
'database-name' = 'YourDatabase',
- 'table-name' = 'YourTable'
- );
+ 'table-name' = 'YourTable',
+ 'row-kinds-filtered' = '+I'
+ );
```
## Data Type Mapping
@@ -612,3 +629,34 @@ CREATE TABLE `mysql_extract_node` (
</table>
</div>
+
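+## Features
+
+### Multi-database multi-table synchronization
+
+The MySQL Extract Node supports whole-database and multi-table synchronization. When this feature is enabled, the MySQL Extract Node compresses the physical fields of the table into the special meta field 'data_canal' in 'canal-json' format; the metadata field 'data_debezium' provides the same data in 'debezium-json' format.
+
+Configuration parameters:
+
+| Parameter | Required | Default Value | Data Type | Description |
+|---------------|----------|---------------|-----------|-------------|
+| migrate-all | optional | false | Boolean | Enables whole-database migration mode; all physical fields are obtained through the data_canal field |
+| table-name | optional | false | String | Regular expression of the tables to read; use "\." to separate database and table, and "," to separate multiple regular expressions |
+| database-name | optional | false | String | Regular expression of the databases to read; multiple regular expressions are separated by "," |
+
+The following CREATE TABLE example demonstrates the syntax for this feature: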
+
+```sql
+CREATE TABLE `table_1`(
+`data` STRING METADATA FROM 'meta.data_canal' VIRTUAL)
+WITH (
+'inlong.metric.labels' = 'groupId=1&streamId=1&nodeId=1',
+'migrate-all' = 'true',
+'connector' = 'mysql-cdc-inlong',
+'hostname' = 'localhost',
+'database-name' = 'test,test01',
+'username' = 'root',
+'password' = 'inlong',
+'table-name' = 'test01\.a{2}[0-9]$, test\.[\s\S]*'
+)
+```
\ No newline at end of file
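
The multi-table example in this commit reads `meta.data_canal`. As a hedged variation, the sketch below exposes `meta.data_debezium` instead and keeps only inserts and deletes through `row-kinds-filtered`. The table name `table_2` and all connection values are placeholders, and combining these options in a single table definition is an assumption drawn from the option descriptions in the diff, not something this commit demonstrates.

```sql
-- Hypothetical variation of the multi-table example; `table_2` and the
-- connection values are placeholders, not taken from the commit.
CREATE TABLE `table_2` (
    -- Whole-row payload in debezium-json format (requires 'migrate-all' = 'true').
    `data` STRING METADATA FROM 'meta.data_debezium' VIRTUAL
) WITH (
    'connector' = 'mysql-cdc-inlong',
    'hostname' = 'localhost',
    'port' = '3306',
    'username' = 'root',
    'password' = 'inlong',
    -- Comma-separated database and table patterns, as in the original example.
    'database-name' = 'test,test01',
    'table-name' = 'test01\.a{2}[0-9]$, test\.[\s\S]*',
    'migrate-all' = 'true',
    -- Keep only inserted (+I) and deleted (-D) rows; updates are dropped.
    'row-kinds-filtered' = '+I&-D'
);
```

Filtering at the source in this way would suit append-plus-delete sinks that cannot apply updates, at the cost of never seeing changed values for existing rows.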