Posted to commits@doris.apache.org by mo...@apache.org on 2022/10/27 01:37:04 UTC
[doris] branch master updated: [typo](doc) Add the description of json HDFS broker load (#13683)
This is an automated email from the ASF dual-hosted git repository.
morningman pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris.git
The following commit(s) were added to refs/heads/master by this push:
new 3e8cd0c669 [typo](doc) Add the description of json HDFS broker load (#13683)
3e8cd0c669 is described below
commit 3e8cd0c669f7911c0acc02e446f9499856537da1
Author: Tiewei Fang <43...@users.noreply.github.com>
AuthorDate: Thu Oct 27 09:36:57 2022 +0800
[typo](doc) Add the description of json HDFS broker load (#13683)
Add instructions for HDFS broker load with json format files.
---
.../Load/BROKER-LOAD.md | 54 +++++++++++++++++++++
.../Load/BROKER-LOAD.md | 55 ++++++++++++++++++++++
2 files changed, 109 insertions(+)
diff --git a/docs/en/docs/sql-manual/sql-reference/Data-Manipulation-Statements/Load/BROKER-LOAD.md b/docs/en/docs/sql-manual/sql-reference/Data-Manipulation-Statements/Load/BROKER-LOAD.md
index 57fb1003b1..6cf381b74f 100644
--- a/docs/en/docs/sql-manual/sql-reference/Data-Manipulation-Statements/Load/BROKER-LOAD.md
+++ b/docs/en/docs/sql-manual/sql-reference/Data-Manipulation-Statements/Load/BROKER-LOAD.md
@@ -72,6 +72,7 @@ WITH BROKER broker_name
[WHERE predicate]
[DELETE ON expr]
[ORDER BY source_sequence]
+ [PROPERTIES ("key1"="value1", ...)]
````
- `[MERGE|APPEND|DELETE]`
@@ -128,6 +129,10 @@ WITH BROKER broker_name
Only for tables of the Unique Key model. Specifies the column in the imported data that represents the Sequence Col. Mainly used to ensure data order during import.
+ - `PROPERTIES ("key1"="value1", ...)`
+
+ Specifies parameters of the imported format. For example, if the imported file is in `json` format, you can specify parameters such as `json_root`, `jsonpaths`, and `fuzzy_parse`.
+
- `WITH BROKER broker_name`
Specify the Broker service name to be used. In the public cloud Doris, the Broker service name is `bos`.
@@ -405,6 +410,55 @@ WITH BROKER broker_name
`my_table` must be a Unique Key model table with Sequence Col specified. The data will be ordered according to the value of the `source_sequence` column in the source data.
+10. Import a batch of data from HDFS, specify the file format as `json`, and set the `json_root` and `jsonpaths` parameters.
+
+ ```sql
+ LOAD LABEL example_db.label10
+ (
+ DATA INFILE("HDFS://test:port/input/file.json")
+ INTO TABLE `my_table`
+ FORMAT AS "json"
+ PROPERTIES(
+ "json_root" = "$.item",
+ "jsonpaths" = "[$.id, $.city, $.code]"
+ )
+ )
+ with HDFS (
+ "hadoop.username" = "user",
+ "password" = ""
+ )
+ PROPERTIES
+ (
+ "timeout"="1200",
+ "max_filter_ratio"="0.1"
+ );
+ ```
+
+ `jsonpaths` can be used with `column list` and `SET(column_mapping)`:
+
+ ```sql
+ LOAD LABEL example_db.label10
+ (
+ DATA INFILE("HDFS://test:port/input/file.json")
+ INTO TABLE `my_table`
+ FORMAT AS "json"
+ (id, code, city)
+ SET (id = id * 10)
+ PROPERTIES(
+ "json_root" = "$.item",
+ "jsonpaths" = "[$.id, $.code, $.city]"
+ )
+ )
+ with HDFS (
+ "hadoop.username" = "user",
+ "password" = ""
+ )
+ PROPERTIES
+ (
+ "timeout"="1200",
+ "max_filter_ratio"="0.1"
+ );
+ ```
### Keywords
BROKER, LOAD
diff --git a/docs/zh-CN/docs/sql-manual/sql-reference/Data-Manipulation-Statements/Load/BROKER-LOAD.md b/docs/zh-CN/docs/sql-manual/sql-reference/Data-Manipulation-Statements/Load/BROKER-LOAD.md
index 44b7d5fcee..51bab9d6a1 100644
--- a/docs/zh-CN/docs/sql-manual/sql-reference/Data-Manipulation-Statements/Load/BROKER-LOAD.md
+++ b/docs/zh-CN/docs/sql-manual/sql-reference/Data-Manipulation-Statements/Load/BROKER-LOAD.md
@@ -72,6 +72,7 @@ WITH BROKER broker_name
[WHERE predicate]
[DELETE ON expr]
[ORDER BY source_sequence]
+ [PROPERTIES ("key1"="value1", ...)]
```
- `[MERGE|APPEND|DELETE]`
@@ -128,6 +129,10 @@ WITH BROKER broker_name
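Only for tables of the Unique Key model. Specifies the column in the imported data that represents the Sequence Col. Mainly used to ensure data order during import.
+ - `PROPERTIES ("key1"="value1", ...)`
+
+ Specifies parameters of the imported format. For example, if the imported file is in `json` format, you can specify parameters such as `json_root`, `jsonpaths`, and `fuzzy_parse` here.
+
- `WITH BROKER broker_name`
Specify the name of the Broker service to be used. In the public cloud Doris, the Broker service name is `bos`.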
@@ -404,6 +409,56 @@ WITH BROKER broker_name
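`my_table` must be a Unique Key model table with Sequence Col specified. The data will be ordered according to the value of the `source_sequence` column in the source data.
+10. Import a batch of data from HDFS, specify the file format as `json`, and set the `json_root` and `jsonpaths` parameters.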
+
+ ```sql
+ LOAD LABEL example_db.label10
+ (
+ DATA INFILE("HDFS://test:port/input/file.json")
+ INTO TABLE `my_table`
+ FORMAT AS "json"
+ PROPERTIES(
+ "json_root" = "$.item",
+ "jsonpaths" = "[$.id, $.city, $.code]"
+ )
+ )
+ with HDFS (
+ "hadoop.username" = "user",
+ "password" = ""
+ )
+ PROPERTIES
+ (
+ "timeout"="1200",
+ "max_filter_ratio"="0.1"
+ );
+ ```
+
+ `jsonpaths` can be used with `column list` and `SET (column_mapping)`:
+
+ ```sql
+ LOAD LABEL example_db.label10
+ (
+ DATA INFILE("HDFS://test:port/input/file.json")
+ INTO TABLE `my_table`
+ FORMAT AS "json"
+ (id, code, city)
+ SET (id = id * 10)
+ PROPERTIES(
+ "json_root" = "$.item",
+ "jsonpaths" = "[$.id, $.code, $.city]"
+ )
+ )
+ with HDFS (
+ "hadoop.username" = "user",
+ "password" = ""
+ )
+ PROPERTIES
+ (
+ "timeout"="1200",
+ "max_filter_ratio"="0.1"
+ );
+ ```
+
### Keywords
BROKER, LOAD
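
The interaction of `json_root`, `jsonpaths`, the column list, and `SET` in example 10 can be illustrated outside Doris. The following is a minimal Python sketch, not Doris code: the helper names `resolve` and `extract_row`, the sample record, and the choice to return `None` for a missing field are all assumptions made for illustration. It only handles simple dotted paths such as `$.item.id`.

```python
import json


def resolve(doc, path):
    """Resolve a simple dotted path such as "$.item.id" (no arrays or filters)."""
    if not path.startswith("$"):
        raise ValueError(f"unsupported path: {path!r}")
    node = doc
    for key in path[1:].split("."):
        if not key:
            continue  # skip the empty piece before the first "."
        if not isinstance(node, dict) or key not in node:
            return None  # assumption: a missing field becomes None in this sketch
        node = node[key]
    return node


def extract_row(line, json_root, jsonpaths):
    """Parse one JSON record, descend to json_root, then pick each path in order."""
    root = resolve(json.loads(line), json_root)
    return [resolve(root, p) for p in jsonpaths]


# Hypothetical input record; only the object under "item" is used.
sample = '{"item": {"id": 1, "city": "beijing", "code": 100}, "meta": "ignored"}'

# Paths are matched to the column list (id, code, city) by position.
row = extract_row(sample, "$.item", ["$.id", "$.code", "$.city"])
columns = dict(zip(["id", "code", "city"], row))

# Counterpart of SET (id = id * 10) in the second statement.
columns["id"] = columns["id"] * 10

print(columns)  # {'id': 10, 'code': 100, 'city': 'beijing'}
```

The point of the sketch is the ordering: the i-th entry of `jsonpaths` feeds the i-th name in the column list, which is why the second statement lists `$.id, $.code, $.city` to match `(id, code, city)`.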