You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@inlong.apache.org by yu...@apache.org on 2022/11/08 06:29:41 UTC
[inlong-website] branch master updated: [INLONG-579][Doc] Add doc for oracle connector for all migrate (#584)
This is an automated email from the ASF dual-hosted git repository.
yunqing pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/inlong-website.git
The following commit(s) were added to refs/heads/master by this push:
new f59f2a7186 [INLONG-579][Doc] Add doc for oracle connector for all migrate (#584)
f59f2a7186 is described below
commit f59f2a71867a85f6c447b7ea37fea67f12a64e06
Author: emhui <11...@users.noreply.github.com>
AuthorDate: Tue Nov 8 14:29:35 2022 +0800
[INLONG-579][Doc] Add doc for oracle connector for all migrate (#584)
* [INLONG-579][Doc] Add doc for oracle connector for all migrate
---
docs/data_node/extract_node/oracle-cdc.md | 109 ++++++++++++++++++-
.../current/data_node/extract_node/oracle-cdc.md | 117 +++++++++++++++++++--
2 files changed, 216 insertions(+), 10 deletions(-)
diff --git a/docs/data_node/extract_node/oracle-cdc.md b/docs/data_node/extract_node/oracle-cdc.md
index f861091bcf..e330965912 100644
--- a/docs/data_node/extract_node/oracle-cdc.md
+++ b/docs/data_node/extract_node/oracle-cdc.md
@@ -294,7 +294,7 @@ TODO: It will be supported in the future.
<td>required</td>
<td style={{wordWrap: 'break-word'}}>(none)</td>
<td>String</td>
- <td>Table name of the Oracle database to monitor.</td>
+ <td>Table name of the Oracle database to monitor. The value is of the form <i><schema_name>.<table_name></i></td>
</tr>
<tr>
<td>port</td>
@@ -328,6 +328,13 @@ TODO: It will be supported in the future.
<td>String</td>
<td>Inlong metric label, format of value is groupId=xxgroup&streamId=xxstream&nodeId=xxnode.</td>
</tr>
+ <tr>
+ <td>source.multiple.enable</td>
+ <td>optional</td>
+ <td style={{wordWrap: 'break-word'}}>false</td>
+ <td>Boolean</td>
+ <td>Whether to enable multiple schema and table migration. If it is' true ', Oracle Extract Node will compress the physical field of the table into a special meta field 'data_canal' in the format of 'canal json'.</td>
+ </tr>
</tbody>
</table>
</div>
@@ -378,6 +385,61 @@ The following format metadata can be exposed as read-only (VIRTUAL) columns in a
<td>TIMESTAMP_LTZ(3) NOT NULL</td>
<td>It indicates the time that the change was made in the database. <br/>If the record is read from snapshot of the table instead of the change stream, the value is always 0.</td>
</tr>
+ <tr>
+ <td>meta.table_name</td>
+ <td>STRING NOT NULL</td>
+ <td>Name of the table that contain the row.</td>
+ </tr>
+ <tr>
+ <td>meta.schema_name</td>
+ <td>STRING NOT NULL</td>
+ <td>Name of the schema that contain the row.</td>
+ </tr>
+ <tr>
+ <td>meta.database_name</td>
+ <td>STRING NOT NULL</td>
+ <td>Name of the database that contain the row.</td>
+ </tr>
+ <tr>
+ <td>meta.op_ts</td>
+ <td>TIMESTAMP_LTZ(3) NOT NULL</td>
+ <td>It indicates the time that the change was made in the database. <br/>If the record is read from snapshot of the table instead of the change stream, the value is always 0.</td>
+ </tr>
+ <tr>
+ <td>meta.op_type</td>
+ <td>STRING</td>
+ <td>Type of database operation, such as INSERT/DELETE, etc.</td>
+ </tr>
+ <tr>
+ <td>meta.data_canal</td>
+ <td>STRING/BYTES</td>
+ <td>Data for rows in `canal-json` format only exists when the `source.multiple.enable` option is 'true'.</td>
+ </tr>
+ <tr>
+ <td>meta.is_ddl</td>
+ <td>BOOLEAN</td>
+ <td>Whether the DDL statement.</td>
+ </tr>
+ <tr>
+ <td>meta.ts</td>
+ <td>TIMESTAMP_LTZ(3) NOT NULL</td>
+ <td>The current time when the row was received and processed.</td>
+ </tr>
+ <tr>
+ <td>meta.sql_type</td>
+ <td>MAP</td>
+ <td>Mapping of sql_type table fields to java data type IDs.</td>
+ </tr>
+ <tr>
+ <td>meta.oracle_type</td>
+ <td>MAP</td>
+ <td>Structure of the table.</td>
+ </tr>
+ <tr>
+ <td>meta.pk_names</td>
+ <td>ARRAY</td>
+ <td>Primay key name of the table.</td>
+ </tr>
</tbody>
</table>
@@ -387,7 +449,18 @@ CREATE TABLE products (
db_name STRING METADATA FROM 'database_name' VIRTUAL,
schema_name STRING METADATA FROM 'schema_name' VIRTUAL,
table_name STRING METADATA FROM 'table_name' VIRTUAL,
- operation_ts TIMESTAMP_LTZ(3) METADATA FROM 'op_ts' VIRTUAL,
+ op_ts TIMESTAMP_LTZ(3) METADATA FROM 'op_ts' VIRTUAL,
+ meta_db_name STRING METADATA FROM 'meta.database_name' VIRTUAL,
+ meta_schema_name STRING METADATA FROM 'meta.schema_name' VIRTUAL,
+ meta_table_name STRING METADATA FROM 'meta.table_name' VIRTUAL,
+ meat_op_ts TIMESTAMP_LTZ(3) METADATA FROM 'meta.op_ts' VIRTUAL,
+ meta_op_type STRING METADATA FROM 'meta.op_type' VIRTUAL,
+ meta_data_canal STRING METADATA FROM 'meta.data_canal' VIRTUAL,
+ meta_is_ddl BOOLEAN METADATA FROM 'meta.is_ddl' VIRTUAL,
+ meta_ts TIMESTAMP_LTZ(3) METADATA FROM 'meta.ts' VIRTUAL,
+ meta_sql_type MAP<STRING, INT> METADATA FROM 'meta.sql_type' VIRTUAL,
+ meat_oracle_type MAP<STRING, STRING> METADATA FROM 'meta.oracle_type' VIRTUAL,
+ meta_pk_names ARRAY<STRING> METADATA FROM 'meta.pk_names' VIRTUAL
ID INT NOT NULL,
NAME STRING,
DESCRIPTION STRING,
@@ -401,7 +474,7 @@ CREATE TABLE products (
'password' = 'flinkpw',
'database-name' = 'XE',
'schema-name' = 'inventory',
- 'table-name' = 'products'
+ 'table-name' = 'inventory.products'
);
```
@@ -427,6 +500,36 @@ _Note: the mechanism of `scan.startup.mode` option relying on Debezium's `snapsh
The Oracle Extract Node can't work in parallel reading, because there is only one task can receive change events.
+### Whole Database, Multiple Schemas, Multiple Tables Migration
+
+Oracle Extract Node supports the whole database, multiple schemas, multiple tables migration function. When you enable this function, Oracle Extract Node will compress the physical field of the table into a special meta field 'data_canal' in the format of 'canal json'.
+
+config options:
+
+| Option | Required | Default | Type | Description |
+| ---| ---| ---| ---| ---|
+|source.multiple.enable|optional| false|String|Specify `'source.multiple.enable' = 'true'` to enable the whole database, multiple schemas, multiple tables migration function |
+|schema-name|required|(none)|String| Schema name of the Oracle database to monitor. If you want to capture multiple schemas, you can use commas to separate them. For example: `'schema-name' = 'SCHEMA1,SCHEMA2'` |
+|table-name| required | (none) |String| Table name of the Oracle database to monitor. If you want to capture multiple tables, you can use commas to separate them. For example: `'table-name' = 'SCHEMA1.TB.*, SCHEMA2.TB1'`|
+
+The CREATE TABLE example demonstrates the syntax of this function:
+
+```sql
+CREATE TABLE node(
+ data STRING METADATA FROM 'meta.data_canal' VIRTUAL)
+ WITH (
+ 'connector' = 'oracle-cdc-inlong',
+ 'hostname' = 'localhost',
+ 'port' = '1521',
+ 'username' = 'flinkuser',
+ 'password' = 'flinkpw',
+ 'database-name' = 'XE',
+ 'schema-name' = 'inventory',
+ 'table-name' = 'inventory..*',
+ 'source.multiple.enable' = 'true'
+)
+```
+
## Data Type Mapping
<div class="wy-table-responsive">
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/data_node/extract_node/oracle-cdc.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/data_node/extract_node/oracle-cdc.md
index 565f3b4d53..fdf261a6f4 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/data_node/extract_node/oracle-cdc.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/data_node/extract_node/oracle-cdc.md
@@ -294,7 +294,7 @@ TODO: 将在未来支持此功能。
<td>required</td>
<td style={{wordWrap: 'break-word'}}>(none)</td>
<td>String</td>
- <td>要监视的 Oracle 数据库的表名。</td>
+ <td>要监视的 Oracle 数据库的表名。格式为<i><schema_name>.<table_name></i></td>
</tr>
<tr>
<td>port</td>
@@ -329,6 +329,13 @@ Oracle CDC 消费者的可选启动模式,有效枚举为"initial"
<td>String</td>
<td>inlong metric 的标签值,该值的构成为groupId&streamId&nodeId。</td>
</tr>
+ <tr>
+ <td>source.multiple.enable</td>
+ <td>可选</td>
+ <td style={{wordWrap: 'break-word'}}>false</td>
+ <td>Boolean</td>
+ <td>是否开启多模式、表同步功能,如果为 'true',Oracle Extract Node 则将表的物理字段压缩成 'canal-json' 格式的特殊元字段 'data_canal'。</td>
+ </tr>
</tbody>
</table>
</div>
@@ -361,22 +368,77 @@ restart-strategy.fixed-delay.attempts: 2147483647
<tr>
<td>table_name</td>
<td>STRING NOT NULL</td>
- <td>Name of the table that contain the row.</td>
+ <td>该行所属的表名。</td>
</tr>
<tr>
<td>schema_name</td>
<td>STRING NOT NULL</td>
- <td>Name of the schema that contain the row.</td>
+ <td>该行所属的模式名称。</td>
</tr>
<tr>
<td>database_name</td>
<td>STRING NOT NULL</td>
- <td>Name of the database that contain the row.</td>
+ <td>该行所属的数据库名称。</td>
</tr>
<tr>
<td>op_ts</td>
<td>TIMESTAMP_LTZ(3) NOT NULL</td>
- <td>It indicates the time that the change was made in the database. <br/>If the record is read from snapshot of the table instead of the change stream, the value is always 0.</td>
+ <td>它指示在数据库中进行更改的时间。<br/>如果记录从表的快照而不是change流中读取,则该值始终为0。</td>
+ </tr>
+ <tr>
+ <td>meta.table_name</td>
+ <td>STRING NOT NULL</td>
+ <td>该行所属的表名。</td>
+ </tr>
+ <tr>
+ <td>meta.schema_name</td>
+ <td>STRING NOT NULL</td>
+ <td>该行所属的模式名称。</td>
+ </tr>
+ <tr>
+ <td>meta.database_name</td>
+ <td>STRING NOT NULL</td>
+ <td>该行所属的数据库名称。</td>
+ </tr>
+ <tr>
+ <td>meta.op_ts</td>
+ <td>TIMESTAMP_LTZ(3) NOT NULL</td>
+ <td>它指示在数据库中进行更改的时间。<br/>如果记录从表的快照而不是change流中读取,则该值始终为0。</td>
+ </tr>
+ <tr>
+ <td>meta.op_type</td>
+ <td>STRING</td>
+ <td>数据库操作的类型,如 INSERT/DELETE 等。</td>
+ </tr>
+ <tr>
+ <td>meta.data_canal</td>
+ <td>STRING/BYTES</td>
+ <td>`canal-json` 格式化的行的数据只有在 `source.multiple.enable` 选项为 'true' 时才存在。</td>
+ </tr>
+ <tr>
+ <td>meta.is_ddl</td>
+ <td>BOOLEAN</td>
+ <td>是否是 DDL 语句。</td>
+ </tr>
+ <tr>
+ <td>meta.ts</td>
+ <td>TIMESTAMP_LTZ(3) NOT NULL</td>
+ <td>接收和处理行的当前时间。</td>
+ </tr>
+ <tr>
+ <td>meta.sql_type</td>
+ <td>MAP</td>
+ <td>将 Sql_type 表字段映射到 Java 数据类型 Id。</td>
+ </tr>
+ <tr>
+ <td>meta.oracle_type</td>
+ <td>MAP</td>
+ <td>表的结构。</td>
+ </tr>
+ <tr>
+ <td>meta.pk_names</td>
+ <td>ARRAY</td>
+ <td>表的主键名称。</td>
</tr>
</tbody>
</table>
@@ -388,7 +450,18 @@ CREATE TABLE products (
db_name STRING METADATA FROM 'database_name' VIRTUAL,
schema_name STRING METADATA FROM 'schema_name' VIRTUAL,
table_name STRING METADATA FROM 'table_name' VIRTUAL,
- operation_ts TIMESTAMP_LTZ(3) METADATA FROM 'op_ts' VIRTUAL,
+ op_ts TIMESTAMP_LTZ(3) METADATA FROM 'op_ts' VIRTUAL,
+ meta_db_name STRING METADATA FROM 'meta.database_name' VIRTUAL,
+ meta_schema_name STRING METADATA FROM 'meta.schema_name' VIRTUAL,
+ meta_table_name STRING METADATA FROM 'meta.table_name' VIRTUAL,
+ meat_op_ts TIMESTAMP_LTZ(3) METADATA FROM 'meta.op_ts' VIRTUAL,
+ meta_op_type STRING METADATA FROM 'meta.op_type' VIRTUAL,
+ meta_data_canal STRING METADATA FROM 'meta.data_canal' VIRTUAL,
+ meta_is_ddl BOOLEAN METADATA FROM 'meta.is_ddl' VIRTUAL,
+ meta_ts TIMESTAMP_LTZ(3) METADATA FROM 'meta.ts' VIRTUAL,
+ meta_sql_type MAP<STRING, INT> METADATA FROM 'meta.sql_type' VIRTUAL,
+ meat_oracle_type MAP<STRING, STRING> METADATA FROM 'meta.oracle_type' VIRTUAL,
+ meta_pk_names ARRAY<STRING> METADATA FROM 'meta.pk_names' VIRTUAL
ID INT NOT NULL,
NAME STRING,
DESCRIPTION STRING,
@@ -402,7 +475,7 @@ CREATE TABLE products (
'password' = 'flinkpw',
'database-name' = 'XE',
'schema-name' = 'inventory',
- 'table-name' = 'products'
+ 'table-name' = 'inventory.products'
);
```
@@ -429,6 +502,36 @@ _注意: `scan.startup.mode` 选项的机制依赖于 Debezium 的`snapshot.mode
Oracle Extract 节点不能并行读取,因为只有一个任务可以接收更改事件。
+### 整库、多模式、表同步
+
+Oracle Extract 节点支持整库、多模式、多表同步。开启该功能后,Oracel Extract 节点会将表的物理字段压缩成 'canal-json' 格式的特殊元字段 'data_canal'。
+
+配置参数:
+
+| 参数 | 是否必须 | 默认值 | 数据类型 | 描述 |
+| ---| ---| ---| ---| ---|
+|source.multiple.enable|optional| false|String| 指定`'source.multiple.enable' = 'true'`参数开启整库、多模式、多表同步功能 |
+|schema-name|required|(none)|String| 要监视的 Oracle 数据库的 Schema 名称。如果要捕获多个模式,可以使用逗号分割它们。例如:`'schema-name' = 'SCHEMA1,SCHEMA2'` |
+|table-name| required | (none) |String| 要监视的 Oracle 数据库的表名。如果要捕获多个表,可以使用逗号分割它们。例如:`'table-name' = 'SCHEMA1.TB.*, SCHEMA2.TB1'`|
+
+CREATE TABLE 示例演示该功能语法:
+
+```sql
+CREATE TABLE node(
+ data STRING METADATA FROM 'meta.data_canal' VIRTUAL)
+ WITH (
+ 'connector' = 'oracle-cdc-inlong',
+ 'hostname' = 'localhost',
+ 'port' = '1521',
+ 'username' = 'flinkuser',
+ 'password' = 'flinkpw',
+ 'database-name' = 'XE',
+ 'schema-name' = 'inventory',
+ 'table-name' = 'inventory..*',
+ 'source.multiple.enable' = 'true'
+)
+```
+
## 数据类型映射
<div class="wy-table-responsive">