You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@inlong.apache.org by yu...@apache.org on 2022/11/08 06:29:41 UTC
[inlong-website] branch master updated: [INLONG-579][Doc] Add doc for oracle connector for all migrate (#584)

This is an automated email from the ASF dual-hosted git repository.

yunqing pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/inlong-website.git


The following commit(s) were added to refs/heads/master by this push:
     new f59f2a7186 [INLONG-579][Doc] Add doc for oracle connector for all migrate (#584)
f59f2a7186 is described below

commit f59f2a71867a85f6c447b7ea37fea67f12a64e06
Author: emhui <11...@users.noreply.github.com>
AuthorDate: Tue Nov 8 14:29:35 2022 +0800

    [INLONG-579][Doc] Add doc for oracle connector for all migrate (#584)
    
    * [INLONG-579][Doc] Add doc for oracle connector for all migrate
---
 docs/data_node/extract_node/oracle-cdc.md          | 109 ++++++++++++++++++-
 .../current/data_node/extract_node/oracle-cdc.md   | 117 +++++++++++++++++++--
 2 files changed, 216 insertions(+), 10 deletions(-)

diff --git a/docs/data_node/extract_node/oracle-cdc.md b/docs/data_node/extract_node/oracle-cdc.md
index f861091bcf..e330965912 100644
--- a/docs/data_node/extract_node/oracle-cdc.md
+++ b/docs/data_node/extract_node/oracle-cdc.md
@@ -294,7 +294,7 @@ TODO: It will be supported in the future.
       <td>required</td>
       <td style={{wordWrap: 'break-word'}}>(none)</td>
       <td>String</td>
-      <td>Table name of the Oracle database to monitor.</td>
+      <td>Table name of the Oracle database to monitor. The value is of the form <i>&lt;schema_name&gt;.&lt;table_name&gt;</i></td>
     </tr>
     <tr>
       <td>port</td>
@@ -328,6 +328,13 @@ TODO: It will be supported in the future.
       <td>String</td>
       <td>Inlong metric label, format of value is groupId=xxgroup&streamId=xxstream&nodeId=xxnode.</td> 
     </tr>
+    <tr>
+       <td>source.multiple.enable</td>
+       <td>optional</td>
+       <td style={{wordWrap: 'break-word'}}>false</td>
+       <td>Boolean</td>
+       <td>Whether to enable multiple schema and table migration. If it is' true ', Oracle Extract Node will compress the physical field of the table into a special meta field 'data_canal' in the format of 'canal json'.</td> 
+     </tr>
     </tbody>
 </table>    
 </div>
@@ -378,6 +385,61 @@ The following format metadata can be exposed as read-only (VIRTUAL) columns in a
       <td>TIMESTAMP_LTZ(3) NOT NULL</td>
       <td>It indicates the time that the change was made in the database. <br/>If the record is read from snapshot of the table instead of the change stream, the value is always 0.</td>
     </tr>
+    <tr>
+      <td>meta.table_name</td>
+      <td>STRING NOT NULL</td>
+      <td>Name of the table that contain the row.</td>
+    </tr>
+    <tr>
+      <td>meta.schema_name</td>
+      <td>STRING NOT NULL</td>
+      <td>Name of the schema that contain the row.</td>
+    </tr>
+    <tr>
+      <td>meta.database_name</td>
+      <td>STRING NOT NULL</td>
+      <td>Name of the database that contain the row.</td>
+    </tr>
+    <tr>
+      <td>meta.op_ts</td>
+      <td>TIMESTAMP_LTZ(3) NOT NULL</td>
+      <td>It indicates the time that the change was made in the database. <br/>If the record is read from snapshot of the table instead of the change stream, the value is always 0.</td>
+    </tr>
+    <tr>
+      <td>meta.op_type</td>
+      <td>STRING</td>
+      <td>Type of database operation, such as INSERT/DELETE, etc.</td>
+    </tr>
+    <tr>
+      <td>meta.data_canal</td>
+      <td>STRING/BYTES</td>
+      <td>Data for rows in `canal-json` format only exists when the `source.multiple.enable` option is 'true'.</td>
+    </tr>
+    <tr>
+      <td>meta.is_ddl</td>
+      <td>BOOLEAN</td>
+      <td>Whether the DDL statement.</td>
+    </tr>
+    <tr>
+      <td>meta.ts</td>
+      <td>TIMESTAMP_LTZ(3) NOT NULL</td>
+      <td>The current time when the row was received and processed.</td>
+    </tr>
+    <tr>
+      <td>meta.sql_type</td>
+      <td>MAP</td>
+      <td>Mapping of sql_type table fields to java data type IDs.</td>
+    </tr>
+    <tr>
+      <td>meta.oracle_type</td>
+      <td>MAP</td>
+      <td>Structure of the table.</td>
+    </tr>
+    <tr>
+      <td>meta.pk_names</td>
+      <td>ARRAY</td>
+      <td>Primay key name of the table.</td>
+    </tr>
   </tbody>
 </table>
 
@@ -387,7 +449,18 @@ CREATE TABLE products (
     db_name STRING METADATA FROM 'database_name' VIRTUAL,
     schema_name STRING METADATA FROM 'schema_name' VIRTUAL, 
     table_name STRING METADATA  FROM 'table_name' VIRTUAL,
-    operation_ts TIMESTAMP_LTZ(3) METADATA FROM 'op_ts' VIRTUAL,
+    op_ts TIMESTAMP_LTZ(3) METADATA FROM 'op_ts' VIRTUAL,
+    meta_db_name STRING METADATA FROM 'meta.database_name' VIRTUAL,
+    meta_schema_name STRING METADATA FROM 'meta.schema_name' VIRTUAL, 
+    meta_table_name STRING METADATA  FROM 'meta.table_name' VIRTUAL,
+    meat_op_ts TIMESTAMP_LTZ(3) METADATA FROM 'meta.op_ts' VIRTUAL,
+    meta_op_type STRING METADATA  FROM 'meta.op_type' VIRTUAL,
+    meta_data_canal STRING METADATA  FROM 'meta.data_canal' VIRTUAL,
+    meta_is_ddl BOOLEAN METADATA FROM 'meta.is_ddl' VIRTUAL,
+    meta_ts TIMESTAMP_LTZ(3) METADATA FROM 'meta.ts' VIRTUAL,
+    meta_sql_type MAP<STRING, INT> METADATA FROM 'meta.sql_type' VIRTUAL,
+    meat_oracle_type MAP<STRING, STRING> METADATA FROM 'meta.oracle_type' VIRTUAL,
+    meta_pk_names ARRAY<STRING> METADATA FROM 'meta.pk_names' VIRTUAL
     ID INT NOT NULL,
     NAME STRING,
     DESCRIPTION STRING,
@@ -401,7 +474,7 @@ CREATE TABLE products (
     'password' = 'flinkpw',
     'database-name' = 'XE',
     'schema-name' = 'inventory',
-    'table-name' = 'products'
+    'table-name' = 'inventory.products'
 );
 ```
 
@@ -427,6 +500,36 @@ _Note: the mechanism of `scan.startup.mode` option relying on Debezium's `snapsh
 
 The Oracle Extract Node can't work in parallel reading, because there is only one task can receive change events.
 
+### Whole Database, Multiple Schemas, Multiple Tables Migration
+
+Oracle Extract Node supports the whole database, multiple schemas, multiple tables migration function. When you enable this function, Oracle Extract Node will compress the physical field of the table into a special meta field 'data_canal' in the format of 'canal json'.
+
+config options:
+
+| Option | Required | Default | Type | Description |
+| ---| ---| ---| ---| ---|
+|source.multiple.enable|optional| false|String|Specify `'source.multiple.enable' = 'true'` to enable the whole database, multiple schemas, multiple tables migration function | 
+|schema-name|required|(none)|String| Schema name of the Oracle database to monitor. If you want to capture multiple schemas, you can use commas to separate them. For example: `'schema-name' = 'SCHEMA1,SCHEMA2'` |
+|table-name| required | (none) |String| Table name of the Oracle database to monitor. If you want to capture multiple tables, you can use commas to separate them. For example: `'table-name' = 'SCHEMA1.TB.*, SCHEMA2.TB1'`|
+
+The CREATE TABLE example demonstrates the syntax of this function:
+
+```sql
+CREATE TABLE node(
+    data STRING METADATA FROM 'meta.data_canal' VIRTUAL)
+    WITH (
+    'connector' = 'oracle-cdc-inlong',
+    'hostname' = 'localhost',
+    'port' = '1521',
+    'username' = 'flinkuser',
+    'password' = 'flinkpw',
+    'database-name' = 'XE',
+    'schema-name' = 'inventory',
+    'table-name' = 'inventory..*',
+    'source.multiple.enable' = 'true'
+)
+```
+
 ## Data Type Mapping
 
 <div class="wy-table-responsive">
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/data_node/extract_node/oracle-cdc.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/data_node/extract_node/oracle-cdc.md
index 565f3b4d53..fdf261a6f4 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/data_node/extract_node/oracle-cdc.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/data_node/extract_node/oracle-cdc.md
@@ -294,7 +294,7 @@ TODO: 将在未来支持此功能。
       <td>required</td>
       <td style={{wordWrap: 'break-word'}}>(none)</td>
       <td>String</td>
-      <td>要监视的 Oracle 数据库的表名。</td>
+      <td>要监视的 Oracle 数据库的表名。格式为<i>&lt;schema_name&gt;.&lt;table_name&gt;</i></td>
     </tr>
     <tr>
       <td>port</td>
@@ -329,6 +329,13 @@ Oracle CDC 消费者的可选启动模式，有效枚举为"initial"
        <td>String</td>
        <td>inlong metric 的标签值，该值的构成为groupId&streamId&nodeId。</td> 
      </tr>
+     <tr>
+       <td>source.multiple.enable</td>
+       <td>可选</td>
+       <td style={{wordWrap: 'break-word'}}>false</td>
+       <td>Boolean</td>
+       <td>是否开启多模式、表同步功能，如果为 'true'，Oracle Extract Node 则将表的物理字段压缩成 'canal-json' 格式的特殊元字段 'data_canal'。</td> 
+     </tr>
     </tbody>
 </table>    
 </div>
@@ -361,22 +368,77 @@ restart-strategy.fixed-delay.attempts: 2147483647
     <tr>
       <td>table_name</td>
       <td>STRING NOT NULL</td>
-      <td>Name of the table that contain the row.</td>
+      <td>该行所属的表名。</td>
     </tr>
     <tr>
       <td>schema_name</td>
       <td>STRING NOT NULL</td>
-      <td>Name of the schema that contain the row.</td>
+      <td>该行所属的模式名称。</td>
     </tr>
     <tr>
       <td>database_name</td>
       <td>STRING NOT NULL</td>
-      <td>Name of the database that contain the row.</td>
+      <td>该行所属的数据库名称。</td>
     </tr>
     <tr>
       <td>op_ts</td>
       <td>TIMESTAMP_LTZ(3) NOT NULL</td>
-      <td>It indicates the time that the change was made in the database. <br/>If the record is read from snapshot of the table instead of the change stream, the value is always 0.</td>
+      <td>它指示在数据库中进行更改的时间。<br/>如果记录从表的快照而不是change流中读取，则该值始终为0。</td>
+    </tr>
+    <tr>
+      <td>meta.table_name</td>
+      <td>STRING NOT NULL</td>
+      <td>该行所属的表名。</td>
+    </tr>
+    <tr>
+      <td>meta.schema_name</td>
+      <td>STRING NOT NULL</td>
+      <td>该行所属的模式名称。</td>
+    </tr>
+    <tr>
+      <td>meta.database_name</td>
+      <td>STRING NOT NULL</td>
+      <td>该行所属的数据库名称。</td>
+    </tr>
+    <tr>
+      <td>meta.op_ts</td>
+      <td>TIMESTAMP_LTZ(3) NOT NULL</td>
+      <td>它指示在数据库中进行更改的时间。<br/>如果记录从表的快照而不是change流中读取，则该值始终为0。</td>
+    </tr>
+    <tr>
+      <td>meta.op_type</td>
+      <td>STRING</td>
+      <td>数据库操作的类型，如 INSERT/DELETE 等。</td>
+    </tr>
+    <tr>
+      <td>meta.data_canal</td>
+      <td>STRING/BYTES</td>
+      <td>`canal-json` 格式化的行的数据只有在 `source.multiple.enable` 选项为 'true' 时才存在。</td>
+    </tr>
+    <tr>
+      <td>meta.is_ddl</td>
+      <td>BOOLEAN</td>
+      <td>是否是 DDL 语句。</td>
+    </tr>
+    <tr>
+      <td>meta.ts</td>
+      <td>TIMESTAMP_LTZ(3) NOT NULL</td>
+      <td>接收和处理行的当前时间。</td>
+    </tr>
+    <tr>
+      <td>meta.sql_type</td>
+      <td>MAP</td>
+      <td>将 Sql_type 表字段映射到 Java 数据类型 Id。</td>
+    </tr>
+    <tr>
+      <td>meta.oracle_type</td>
+      <td>MAP</td>
+      <td>表的结构。</td>
+    </tr>
+    <tr>
+      <td>meta.pk_names</td>
+      <td>ARRAY</td>
+      <td>表的主键名称。</td>
     </tr>
   </tbody>
 </table>
@@ -388,7 +450,18 @@ CREATE TABLE products (
     db_name STRING METADATA FROM 'database_name' VIRTUAL,
     schema_name STRING METADATA FROM 'schema_name' VIRTUAL, 
     table_name STRING METADATA  FROM 'table_name' VIRTUAL,
-    operation_ts TIMESTAMP_LTZ(3) METADATA FROM 'op_ts' VIRTUAL,
+    op_ts TIMESTAMP_LTZ(3) METADATA FROM 'op_ts' VIRTUAL,
+    meta_db_name STRING METADATA FROM 'meta.database_name' VIRTUAL,
+    meta_schema_name STRING METADATA FROM 'meta.schema_name' VIRTUAL, 
+    meta_table_name STRING METADATA  FROM 'meta.table_name' VIRTUAL,
+    meat_op_ts TIMESTAMP_LTZ(3) METADATA FROM 'meta.op_ts' VIRTUAL,
+    meta_op_type STRING METADATA  FROM 'meta.op_type' VIRTUAL,
+    meta_data_canal STRING METADATA  FROM 'meta.data_canal' VIRTUAL,
+    meta_is_ddl BOOLEAN METADATA FROM 'meta.is_ddl' VIRTUAL,
+    meta_ts TIMESTAMP_LTZ(3) METADATA FROM 'meta.ts' VIRTUAL,
+    meta_sql_type MAP<STRING, INT> METADATA FROM 'meta.sql_type' VIRTUAL,
+    meat_oracle_type MAP<STRING, STRING> METADATA FROM 'meta.oracle_type' VIRTUAL,
+    meta_pk_names ARRAY<STRING> METADATA FROM 'meta.pk_names' VIRTUAL
     ID INT NOT NULL,
     NAME STRING,
     DESCRIPTION STRING,
@@ -402,7 +475,7 @@ CREATE TABLE products (
     'password' = 'flinkpw',
     'database-name' = 'XE',
     'schema-name' = 'inventory',
-    'table-name' = 'products'
+    'table-name' = 'inventory.products'
 );
 ```
 
@@ -429,6 +502,36 @@ _注意: `scan.startup.mode` 选项的机制依赖于 Debezium 的`snapshot.mode
 
 Oracle Extract 节点不能并行读取，因为只有一个任务可以接收更改事件。
 
+### 整库、多模式、表同步
+
+Oracle Extract 节点支持整库、多模式、多表同步。开启该功能后，Oracel Extract 节点会将表的物理字段压缩成 'canal-json' 格式的特殊元字段 'data_canal'。
+
+配置参数：
+
+| 参数 | 是否必须 | 默认值 | 数据类型 | 描述 |
+| ---| ---| ---| ---| ---|
+|source.multiple.enable|optional| false|String| 指定`'source.multiple.enable' = 'true'`参数开启整库、多模式、多表同步功能 | 
+|schema-name|required|(none)|String| 要监视的 Oracle 数据库的 Schema 名称。如果要捕获多个模式，可以使用逗号分割它们。例如：`'schema-name' = 'SCHEMA1,SCHEMA2'` |
+|table-name| required | (none) |String| 要监视的 Oracle 数据库的表名。如果要捕获多个表，可以使用逗号分割它们。例如：`'table-name' = 'SCHEMA1.TB.*, SCHEMA2.TB1'`|
+
+CREATE TABLE 示例演示该功能语法：
+
+```sql
+CREATE TABLE node(
+    data STRING METADATA FROM 'meta.data_canal' VIRTUAL)
+    WITH (
+    'connector' = 'oracle-cdc-inlong',
+    'hostname' = 'localhost',
+    'port' = '1521',
+    'username' = 'flinkuser',
+    'password' = 'flinkpw',
+    'database-name' = 'XE',
+    'schema-name' = 'inventory',
+    'table-name' = 'inventory..*',
+    'source.multiple.enable' = 'true'
+)
+```
+
 ## 数据类型映射
 
 <div class="wy-table-responsive">