Posted to commits@seatunnel.apache.org by GitBox <gi...@apache.org> on 2022/10/28 10:58:39 UTC

[GitHub] [incubator-seatunnel] EricJoy2048 commented on a diff in pull request #3164: [Feature][Connector-V2] Starrocks sink connector

EricJoy2048 commented on code in PR #3164:
URL: https://github.com/apache/incubator-seatunnel/pull/3164#discussion_r1007934366


##########
docs/en/connector-v2/sink/StarRocks.md:
##########
@@ -0,0 +1,122 @@
+# StarRocks
+
+> StarRocks sink connector
+
+## Description
+
+Used to send data to StarRocks. Both streaming and batch mode are supported.
+Internally, the StarRocks sink connector buffers incoming rows and imports them into StarRocks in batches via Stream Load.
+
+## Key features
+
+- [ ] [exactly-once](../../concept/connector-v2-features.md)
+- [ ] [schema projection](../../concept/connector-v2-features.md)
+
+## Options
+
+| name                        | type                         | required | default value   |
+|-----------------------------|------------------------------|----------|-----------------|
+| node_urls                   | list                         | yes      | -               |
+| username                    | string                       | yes      | -               |
+| password                    | string                       | yes      | -               |
+| database                    | string                       | yes      | -               |
+| table                       | string                       | no       | -               |
+| labelPrefix                 | string                       | no       | -               |
+| batch_max_rows              | long                         | no       | 1024            |
+| batch_max_bytes             | int                          | no       | 5 * 1024 * 1024 |
+| batch_interval_ms           | int                          | no       | -               |
+| max_retries                 | int                          | no       | -               |
+| retry_backoff_multiplier_ms | int                          | no       | -               |
+| max_retry_backoff_ms        | int                          | no       | -               |
+| sink.properties.*           | starrocks stream load config | no       | -               |
+
+### node_urls [list]
+
+`StarRocks` cluster address, the format is `["fe_ip:fe_http_port", ...]`
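+
+For example, pointing at two FE nodes (the addresses below are placeholders; 8030 is the default FE HTTP port):
+
+```
+node_urls = ["192.168.10.1:8030", "192.168.10.2:8030"]
+```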
+
+### username [string]
+
+`StarRocks` user's username
+
+### password [string]
+
+`StarRocks` user's password
+
+### database [string]
+
+The name of the StarRocks database
+
+### table [string]
+
+The name of the StarRocks table
+
+### labelPrefix [string]
+
+The prefix of the StarRocks Stream Load label
+
+### batch_max_rows [long]
+
+For batch writing, when the number of buffered rows reaches `batch_max_rows`, or the buffered size reaches `batch_max_bytes`, or the elapsed time reaches `batch_interval_ms`, the buffered data will be flushed into StarRocks.
+
+### batch_max_bytes [int]
+
+For batch writing, when the number of buffered rows reaches `batch_max_rows`, or the buffered size reaches `batch_max_bytes`, or the elapsed time reaches `batch_interval_ms`, the buffered data will be flushed into StarRocks.
+
+### batch_interval_ms [int]
+
+For batch writing, when the number of buffered rows reaches `batch_max_rows`, or the buffered size reaches `batch_max_bytes`, or the elapsed time reaches `batch_interval_ms`, the buffered data will be flushed into StarRocks.
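+
+A minimal sketch of how these three thresholds combine (the values below are illustrative, not recommended defaults); whichever threshold is reached first triggers the flush:
+
+```
+sink {
+    StarRocks {
+        # connection options (node urls, credentials, database, table) as in the examples below
+        # flush once 1000 rows are buffered ...
+        batch_max_rows = 1000
+        # ... or once the buffer reaches 8388608 bytes (8 MB) ...
+        batch_max_bytes = 8388608
+        # ... or once 5000 ms have elapsed, whichever comes first
+        batch_interval_ms = 5000
+    }
+}
+```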
+
+### max_retries [int]
+
+The number of times to retry a failed flush
+
+### retry_backoff_multiplier_ms [int]
+
+Used as a multiplier for generating the next backoff delay between retries
+
+### max_retry_backoff_ms [int]
+
+The maximum amount of time to wait before attempting to retry a request to `StarRocks`
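+
+A sketch of the three retry options used together (the values below are illustrative placeholders); a failed flush is retried up to `max_retries` times, with the backoff delay derived from `retry_backoff_multiplier_ms` and capped at `max_retry_backoff_ms`:
+
+```
+sink {
+    StarRocks {
+        # connection and batch options as in the examples below
+        # retry a failed flush up to 3 times
+        max_retries = 3
+        # grow the backoff delay using this multiplier ...
+        retry_backoff_multiplier_ms = 1000
+        # ... but never wait more than 10 seconds between attempts
+        max_retry_backoff_ms = 10000
+    }
+}
+```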
+
+### sink.properties.* [starrocks stream load config]
+
+The parameters of Stream Load `data_desc`.
+To specify such a parameter, add the prefix `sink.properties.` to the original Stream Load parameter name.
+For example, `strip_outer_array` is specified as `sink.properties.strip_outer_array`.
+
+#### Supported import data formats
+
+The supported formats include CSV and JSON. Default value: CSV
+
+## Example
+
+Use JSON format to import data
+```
+sink {
+    StarRocks {
+        nodeUrls = ["e2e_starRocksdb:8030"]
+        username = "root"
+        password = ""
+        database = "test"
+        table = "e2e_table_sink"
+        batch_max_rows = 10
+        sink.properties.format = "JSON"
+        sink.properties.strip_outer_array = true
+    }
+}
+
+```
+
+Use CSV format to import data
+```
+sink {
+    StarRocks {
+        nodeUrls = ["e2e_starRocksdb:8030"]
+        username = "root"
+        password = ""
+        database = "test"
+        table = "e2e_table_sink"
+        batch_max_rows = 10
+        sink.properties.format = "CSV"
+        sink.properties.column_separator = "\\x01"
+        sink.properties.row_delimiter = "\\x02"
+    }
+}
+```

Review Comment:
   Please add the `Change log` reference https://github.com/apache/incubator-seatunnel/blob/dev/docs/en/connector-v2/source/SftpFile.md


