Posted to commits@doris.apache.org by mo...@apache.org on 2022/10/27 01:37:04 UTC

[doris] branch master updated: [typo](doc) Add the description of json HDFS broker load (#13683)

This is an automated email from the ASF dual-hosted git repository.

morningman pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris.git


The following commit(s) were added to refs/heads/master by this push:
     new 3e8cd0c669 [typo](doc) Add the description of json HDFS broker load (#13683)
3e8cd0c669 is described below

commit 3e8cd0c669f7911c0acc02e446f9499856537da1
Author: Tiewei Fang <43...@users.noreply.github.com>
AuthorDate: Thu Oct 27 09:36:57 2022 +0800

    [typo](doc) Add the description of json HDFS broker load (#13683)
    
    Add the instruction of HDFS broker load with json format file.
---
 .../Load/BROKER-LOAD.md                            | 54 +++++++++++++++++++++
 .../Load/BROKER-LOAD.md                            | 55 ++++++++++++++++++++++
 2 files changed, 109 insertions(+)

diff --git a/docs/en/docs/sql-manual/sql-reference/Data-Manipulation-Statements/Load/BROKER-LOAD.md b/docs/en/docs/sql-manual/sql-reference/Data-Manipulation-Statements/Load/BROKER-LOAD.md
index 57fb1003b1..6cf381b74f 100644
--- a/docs/en/docs/sql-manual/sql-reference/Data-Manipulation-Statements/Load/BROKER-LOAD.md
+++ b/docs/en/docs/sql-manual/sql-reference/Data-Manipulation-Statements/Load/BROKER-LOAD.md
@@ -72,6 +72,7 @@ WITH BROKER broker_name
   [WHERE predicate]
   [DELETE ON expr]
   [ORDER BY source_sequence]
+  [PROPERTIES ("key1"="value1", ...)]
   ````
 
   - `[MERGE|APPEND|DELETE]`
@@ -128,6 +129,10 @@ WITH BROKER broker_name
 
     Applies only to Unique Key model tables. Specifies the column in the imported data that represents the Sequence Col, mainly to ensure data order during import.
 
+  - `PROPERTIES ("key1"="value1", ...)`
+
+    Specify parameters of the imported file format. For example, if the imported file is in `json` format, you can specify parameters such as `json_root`, `jsonpaths`, and `fuzzy_parse` here.
+
 - `WITH BROKER broker_name`
 
   Specify the Broker service name to be used. In the public cloud Doris, the Broker service name is `bos`.
@@ -405,6 +410,55 @@ WITH BROKER broker_name
 
    `my_table` must be a Unique Key model table with Sequence Col specified. The data will be ordered according to the value of the `source_sequence` column in the source data.
 
+10. Import a batch of data from HDFS, specify the file format as `json`, and specify the `json_root` and `jsonpaths` parameters.
+
+    ```sql
+    LOAD LABEL example_db.label10
+    (
+        DATA INFILE("HDFS://test:port/input/file.json")
+        INTO TABLE `my_table`
+        FORMAT AS "json"
+        PROPERTIES(
+          "json_root" = "$.item",
+          "jsonpaths" = "[$.id, $.city, $.code]"
+        )       
+    )
+    with HDFS (
+    "hadoop.username" = "user",
+    "password" = ""
+    )
+    PROPERTIES
+    (
+    "timeout"="1200",
+    "max_filter_ratio"="0.1"
+    );
+    ```
+
+    `jsonpaths` can be used together with `column list` and `SET (column_mapping)`:
+
+    ```sql
+    LOAD LABEL example_db.label10
+    (
+        DATA INFILE("HDFS://test:port/input/file.json")
+        INTO TABLE `my_table`
+        FORMAT AS "json"
+        (id, code, city)
+        SET (id = id * 10)
+        PROPERTIES(
+          "json_root" = "$.item",
+          "jsonpaths" = "[$.id, $.code, $.city]"
+        )       
+    )
+    with HDFS (
+    "hadoop.username" = "user",
+    "password" = ""
+    )
+    PROPERTIES
+    (
+    "timeout"="1200",
+    "max_filter_ratio"="0.1"
+    );
+    ```
 ### Keywords
 
     BROKER, LOAD
diff --git a/docs/zh-CN/docs/sql-manual/sql-reference/Data-Manipulation-Statements/Load/BROKER-LOAD.md b/docs/zh-CN/docs/sql-manual/sql-reference/Data-Manipulation-Statements/Load/BROKER-LOAD.md
index 44b7d5fcee..51bab9d6a1 100644
--- a/docs/zh-CN/docs/sql-manual/sql-reference/Data-Manipulation-Statements/Load/BROKER-LOAD.md
+++ b/docs/zh-CN/docs/sql-manual/sql-reference/Data-Manipulation-Statements/Load/BROKER-LOAD.md
@@ -72,6 +72,7 @@ WITH BROKER broker_name
   [WHERE predicate]
   [DELETE ON expr]
   [ORDER BY source_sequence]
+  [PROPERTIES ("key1"="value1", ...)]
   ```
 
   - `[MERGE|APPEND|DELETE]`
@@ -128,6 +129,10 @@ WITH BROKER broker_name
 
     Applies only to Unique Key model tables. Specifies the column in the imported data that represents the Sequence Col, mainly to ensure data order during import.
 
+  - `PROPERTIES ("key1"="value1", ...)`
+
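+    Specify parameters of the imported file format. For example, if the imported file is in `json` format, you can specify parameters such as `json_root`, `jsonpaths`, and `fuzzy_parse` here.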
+
 - `WITH BROKER broker_name`
 
   Specify the Broker service name to be used. In the public cloud Doris, the Broker service name is `bos`.
@@ -404,6 +409,56 @@ WITH BROKER broker_name
 
    `my_table` must be a Unique Key model table with Sequence Col specified. The data is ordered according to the value of the `source_sequence` column in the source data.
 
+10. Import a batch of data from HDFS, specify the file format as `json`, and specify the `json_root` and `jsonpaths` parameters.
+
+    ```sql
+    LOAD LABEL example_db.label10
+    (
+        DATA INFILE("HDFS://test:port/input/file.json")
+        INTO TABLE `my_table`
+        FORMAT AS "json"
+        PROPERTIES(
+          "json_root" = "$.item",
+          "jsonpaths" = "[$.id, $.city, $.code]"
+        )       
+    )
+    with HDFS (
+    "hadoop.username" = "user",
+    "password" = ""
+    )
+    PROPERTIES
+    (
+    "timeout"="1200",
+    "max_filter_ratio"="0.1"
+    );
+    ```
+
+    `jsonpaths` can be used together with `column list` and `SET (column_mapping)`:
+
+    ```sql
+    LOAD LABEL example_db.label10
+    (
+        DATA INFILE("HDFS://test:port/input/file.json")
+        INTO TABLE `my_table`
+        FORMAT AS "json"
+        (id, code, city)
+        SET (id = id * 10)
+        PROPERTIES(
+          "json_root" = "$.item",
+          "jsonpaths" = "[$.id, $.code, $.city]"
+        )       
+    )
+    with HDFS (
+    "hadoop.username" = "user",
+    "password" = ""
+    )
+    PROPERTIES
+    (
+    "timeout"="1200",
+    "max_filter_ratio"="0.1"
+    );
+    ```
+
 ### Keywords
 
     BROKER, LOAD
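
Reader's note (not part of the commit): the sketch below is one way to picture how `json_root`, `jsonpaths`, the column list, and `SET` from example 10 fit together. It is not Doris's implementation. The sample document, the helper names `resolve` and `extract_row`, and the simplified path handling (object keys only, no array indexes) are all assumptions made for illustration, as is the reading that `jsonpaths` entries are matched to the column list by position.

```python
import json


def resolve(node, path):
    """Follow a simple dotted path such as "$.item" or "$.id".

    Hypothetical helper: only plain object keys are supported.
    """
    for key in path.lstrip("$").strip(".").split("."):
        if key:  # "$" alone resolves to the node itself
            node = node[key]
    return node


def extract_row(document, json_root, jsonpaths, columns):
    """Select the subtree at json_root, then pair each jsonpaths value
    with the column at the same position (assumed positional mapping)."""
    root = resolve(document, json_root)
    values = [resolve(root, p) for p in jsonpaths]
    return dict(zip(columns, values))


# Hypothetical input record; the commit does not show any sample data.
doc = json.loads('{"item": {"id": 7, "city": "beijing", "code": 100}}')

row = extract_row(
    doc,
    json_root="$.item",
    jsonpaths=["$.id", "$.code", "$.city"],
    columns=["id", "code", "city"],
)

# Mirrors SET (id = id * 10) from the second example.
row["id"] = row["id"] * 10
print(row)
```

Under this reading, reordering `jsonpaths` without reordering the column list would send values to the wrong columns, which is why the second example keeps `(id, code, city)` and `[$.id, $.code, $.city]` in the same order.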

