Posted to commits@seatunnel.apache.org by fa...@apache.org on 2022/06/28 08:58:56 UTC

[incubator-seatunnel] branch api-draft updated: [API-Draft][DOC] Add jdbc connector doc (#2069)

This is an automated email from the ASF dual-hosted git repository.

fanjia pushed a commit to branch api-draft
in repository https://gitbox.apache.org/repos/asf/incubator-seatunnel.git


The following commit(s) were added to refs/heads/api-draft by this push:
     new def84d9e9 [API-Draft][DOC] Add jdbc connector doc (#2069)
def84d9e9 is described below

commit def84d9e94b852eb471d152316dd7c472c3798ca
Author: ic4y <83...@users.noreply.github.com>
AuthorDate: Tue Jun 28 16:58:49 2022 +0800

    [API-Draft][DOC] Add jdbc connector doc (#2069)
    
    * Add jdbc connector doc
    
    * Add jdbc connector doc
---
 docs/en/new-connector/sink/Jdbc.md   | 96 ++++++++++++++++++++++++++++++++++++
 docs/en/new-connector/source/Jdbc.md | 75 ++++++++++++++++++++++++++++
 2 files changed, 171 insertions(+)

diff --git a/docs/en/new-connector/sink/Jdbc.md b/docs/en/new-connector/sink/Jdbc.md
new file mode 100644
index 000000000..746b9fe4b
--- /dev/null
+++ b/docs/en/new-connector/sink/Jdbc.md
@@ -0,0 +1,96 @@
+# JDBC
+## Description
+Write data through JDBC. Supports batch mode and streaming mode, concurrent writing, and exactly-once semantics (guaranteed by XA transactions).
+## Options
+
+| name | type | required | default value |
+| --- | --- | --- | --- |
+| url | String | Yes | - |
+| driver | String | Yes | - |
+| user | String | No | - |
+| password | String | No | - |
+| query | String | Yes | - |
+| connection_check_timeout_sec | Int | No | 30 |
+| max_retries | Int | No | 3 |
+| batch_size | Int | No | 300 |
+| batch_interval_ms | Int | No | 1000 |
+| is_exactly_once | Boolean | No | false |
+| xa_data_source_class_name | String | No | - |
+| max_commit_attempts | Int | No | 3 |
+| transaction_timeout_sec | Int | No | -1 |
+
+### driver [string]
+The JDBC driver class name used to connect to the remote data source. For MySQL, the value is `com.mysql.cj.jdbc.Driver`.
+Warning: for license compliance, you must provide the MySQL JDBC driver yourself, e.g. copy mysql-connector-java-xxx.jar to $SEATUNNEL_HOME/lib for Standalone.
+
+### user [string]
+The user name for the connection.
+
+### password [string]
+The password for the connection.
+
+### url [string]
+The URL of the JDBC connection. For example: `jdbc:postgresql://localhost/test`
+
+### query [string]
+The SQL statement used to write data, typically an `insert` with `?` placeholders.
+
+### connection_check_timeout_sec [int]
+
+The time in seconds to wait for the database operation used to validate the connection to complete.
+
+### max_retries [int]
+The number of retries when submitting a batch (executeBatch) fails.
+
+### batch_size [int]
+For batch writing, buffered data is flushed to the database when the number of buffered records reaches `batch_size` or the elapsed time reaches `batch_interval_ms`.
+
+### batch_interval_ms [int]
+For batch writing, buffered data is flushed to the database when the number of buffered records reaches `batch_size` or the elapsed time reaches `batch_interval_ms`.
+
+### is_exactly_once [boolean]
+Whether to enable exactly-once semantics, which uses XA transactions. If enabled, you must set `xa_data_source_class_name`.
+
+### xa_data_source_class_name [string]
+The XA data source class name of the database driver. For example, MySQL uses `com.mysql.cj.jdbc.MysqlXADataSource` and PostgreSQL uses `org.postgresql.xa.PGXADataSource`.
+
+### max_commit_attempts [int]
+The number of retries when a transaction commit fails.
+
+### transaction_timeout_sec [int]
+The timeout, in seconds, after the transaction is opened. The default is -1 (never time out). Note that setting a timeout may affect exactly-once semantics.
+
+## Tips
+When `is_exactly_once = "true"`, XA transactions are used. This requires database support, and some databases need extra setup. For example, PostgreSQL requires `max_prepared_transactions > 1`,
+which can be set with `ALTER SYSTEM set max_prepared_transactions to 10`.
+
+## Example
+Simple
+```
+jdbc {
+    url = "jdbc:mysql://localhost/test"
+    driver = "com.mysql.cj.jdbc.Driver"
+    user = "root"
+    password = "123456"
+    query = "insert into test_table(name,age) values(?,?)"
+}
+
+```
+
+Exactly-once
+```
+jdbc {
+
+    url = "jdbc:mysql://localhost/test"
+    driver = "com.mysql.cj.jdbc.Driver"
+
+    max_retries = 0
+    user = "root"
+    password = "123456"
+    query = "insert into test_table(name,age) values(?,?)"
+
+    is_exactly_once = "true"
+
+    xa_data_source_class_name = "com.mysql.cj.jdbc.MysqlXADataSource"
+}
+```
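
Note on the sink doc above: the flush rule described for `batch_size` and `batch_interval_ms` can be sketched as follows. This is a minimal illustration in Python, not SeaTunnel's implementation. `BatchBuffer`, its `flush` callback and its `clock` parameter are hypothetical names introduced here, and the sketch only checks the interval when a row is added, whereas a real sink would presumably also flush from a timer.

```python
import time
from typing import Any, Callable, List


class BatchBuffer:
    """Hypothetical buffer: flushes on row count or elapsed time, whichever comes first."""

    def __init__(
        self,
        flush: Callable[[List[Any]], None],
        batch_size: int = 300,          # default taken from the options table
        batch_interval_ms: int = 1000,  # default taken from the options table
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._flush = flush
        self._batch_size = batch_size
        self._interval_s = batch_interval_ms / 1000.0
        self._clock = clock
        self._rows: List[Any] = []
        self._last_flush = clock()

    def add(self, row: Any) -> None:
        """Buffer one row, then flush if the batch is full or the interval has passed."""
        self._rows.append(row)
        full = len(self._rows) >= self._batch_size
        stale = self._clock() - self._last_flush >= self._interval_s
        if full or stale:
            self.flush()

    def flush(self) -> None:
        """Hand the buffered rows to the callback and restart the interval."""
        if self._rows:
            self._flush(self._rows)
            self._rows = []
        self._last_flush = self._clock()
```

In a real writer the `flush` callback would be where the prepared `query` is executed as a batch, and where `max_retries` would apply.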
diff --git a/docs/en/new-connector/source/Jdbc.md b/docs/en/new-connector/source/Jdbc.md
new file mode 100644
index 000000000..4a56c65f8
--- /dev/null
+++ b/docs/en/new-connector/source/Jdbc.md
@@ -0,0 +1,75 @@
+# JDBC
+## Description
+Read data from an external data source through JDBC. Currently supports MySQL and PostgreSQL databases in batch mode.
+
+## Options
+
+| name | type | required | default value |
+| --- | --- | --- | --- |
+| url | String | Yes | - |
+| driver | String | Yes | - |
+| user | String | No | - |
+| password | String | No | - |
+| query | String | Yes | - |
+| connection_check_timeout_sec | Int | No | 30 |
+| partition_column | String | No | - |
+| partition_upper_bound | Long | No | - |
+| partition_lower_bound | Long | No | - |
+
+### driver [string]
+The JDBC driver class name used to connect to the remote data source. For MySQL, the value is `com.mysql.cj.jdbc.Driver`.
+Warning: for license compliance, you must provide the MySQL JDBC driver yourself, e.g. copy mysql-connector-java-xxx.jar to $SEATUNNEL_HOME/lib for Standalone.
+
+### user [string]
+The user name for the connection.
+
+### password [string]
+The password for the connection.
+
+### url [string]
+The URL of the JDBC connection. For example: `jdbc:postgresql://localhost/test`
+
+### query [string]
+The SQL query used to read data.
+
+### connection_check_timeout_sec [int]
+
+The time in seconds to wait for the database operation used to validate the connection to complete.
+
+### partition_column [string]
+The name of the column used to partition the data for parallel reads. Only numeric columns are supported.
+
+
+### partition_upper_bound [long]
+The maximum value of `partition_column` for the scan. If not set, SeaTunnel queries the database for the maximum value.
+
+
+### partition_lower_bound [long]
+The minimum value of `partition_column` for the scan. If not set, SeaTunnel queries the database for the minimum value.
+
+## Tips
+If `partition_column` is not set, the read runs with a concurrency of one. If it is set, the read is executed in parallel according to the task concurrency.
+## Example
+simple:
+```
+    Jdbc {
+        url = "jdbc:mysql://localhost/test?serverTimezone=GMT%2b8"
+        driver = "com.mysql.cj.jdbc.Driver"
+        connection_check_timeout_sec = 100
+        user = "root"
+        password = "123456"
+        query = "select * from type_bin"
+    }
+```
+parallel:
+```
+    Jdbc {
+        url = "jdbc:mysql://localhost/test?serverTimezone=GMT%2b8"
+        driver = "com.mysql.cj.jdbc.Driver"
+        connection_check_timeout_sec = 100
+        user = "root"
+        password = "123456"
+        query = "select * from type_bin"
+        partition_column= "id"
+    }
+```
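
Note on the source doc above: a parallel read over `partition_column` between `partition_lower_bound` and `partition_upper_bound` can be pictured as splitting the numeric range into one sub-range per task. The doc does not say how SeaTunnel actually divides the range, so the even split below is only an assumption for illustration. `split_range` and `split_queries` are hypothetical helpers, and the wrapping `SELECT` is one possible way to apply the bounds to the configured `query`.

```python
from typing import List, Tuple


def split_range(lower: int, upper: int, num_splits: int) -> List[Tuple[int, int]]:
    """Split the inclusive range [lower, upper] into contiguous inclusive sub-ranges.

    Returns at most num_splits ranges; sizes differ by at most one.
    """
    if num_splits < 1:
        raise ValueError("num_splits must be at least 1")
    if lower > upper:
        raise ValueError("lower must not exceed upper")
    total = upper - lower + 1
    n = min(num_splits, total)      # never create empty ranges
    base, extra = divmod(total, n)  # the first `extra` ranges get one more value
    ranges = []
    start = lower
    for i in range(n):
        size = base + (1 if i < extra else 0)
        end = start + size - 1
        ranges.append((start, end))
        start = end + 1
    return ranges


def split_queries(query: str, column: str, lower: int, upper: int,
                  num_splits: int) -> List[str]:
    """Wrap the configured query once per sub-range (illustrative only)."""
    return [
        f"SELECT * FROM ({query}) AS t WHERE {column} BETWEEN {lo} AND {hi}"
        for lo, hi in split_range(lower, upper, num_splits)
    ]
```

With the parallel example above, `split_queries("select * from type_bin", "id", 1, 10, 3)` would yield three statements covering ids 1 to 4, 5 to 7 and 8 to 10, one per reader.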