Posted to commits@doris.apache.org by mo...@apache.org on 2023/04/20 15:12:25 UTC

[doris] branch master updated: [Improvement](broker) support broker load from tencent Goose File System (#18745)

This is an automated email from the ASF dual-hosted git repository.

morningman pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris.git


The following commit(s) were added to refs/heads/master by this push:
     new c6b1b9de80 [Improvement](broker) support broker load from tencent Goose File System (#18745)
c6b1b9de80 is described below

commit c6b1b9de809e739b536dfdd034be2d06333c0ee4
Author: Yulei-Yang <yu...@gmail.com>
AuthorDate: Thu Apr 20 23:12:17 2023 +0800

    [Improvement](broker) support broker load from tencent Goose File System (#18745)
    
    This change covers the following functions:
    1. broker load
    2. export
    3. select into outfile
    4. create repository and backup to GFS
    After configuring the environment, use GFS like any other HDFS-compatible file system.
---
 docs/en/docs/advanced/broker.md                               |  1 +
 docs/en/docs/lakehouse/multi-catalog/hive.md                  | 11 +++++++++++
 docs/en/docs/lakehouse/multi-catalog/iceberg.md               |  8 ++++++++
 docs/zh-CN/docs/advanced/broker.md                            |  1 +
 docs/zh-CN/docs/lakehouse/multi-catalog/hive.md               | 11 +++++++++++
 docs/zh-CN/docs/lakehouse/multi-catalog/iceberg.md            |  8 ++++++++
 .../src/main/java/org/apache/doris/analysis/ExportStmt.java   |  3 ++-
 .../main/java/org/apache/doris/analysis/StorageBackend.java   |  1 +
 .../src/main/java/org/apache/doris/backup/BlobStorage.java    |  1 +
 9 files changed, 44 insertions(+), 1 deletion(-)

diff --git a/docs/en/docs/advanced/broker.md b/docs/en/docs/advanced/broker.md
index bcf9efdc0f..aa4ce1fe52 100644
--- a/docs/en/docs/advanced/broker.md
+++ b/docs/en/docs/advanced/broker.md
@@ -31,6 +31,7 @@ Broker is an optional process in the Doris cluster. It is mainly used to support
 - Apache HDFS
 - Aliyun OSS
 - Tencent Cloud CHDFS
+- Tencent Cloud GFS (since 1.2.0)
 - Huawei Cloud OBS (since 1.2.0)
 - Amazon S3 
 - JuiceFS (since 2.0.0)
diff --git a/docs/en/docs/lakehouse/multi-catalog/hive.md b/docs/en/docs/lakehouse/multi-catalog/hive.md
index c2a05ec622..11ba2ecc1f 100644
--- a/docs/en/docs/lakehouse/multi-catalog/hive.md
+++ b/docs/en/docs/lakehouse/multi-catalog/hive.md
@@ -38,6 +38,17 @@ When connnecting to Hive, Doris:
 2. Supports both Managed Table and External Table;
 3. Can identify metadata of Hive, Iceberg, and Hudi stored in Hive Metastore;
 4. Supports Hive tables with data stored in JuiceFS, which can be used the same way as normal Hive tables (put `juicefs-hadoop-x.x.x.jar` in `fe/lib/` and `apache_hdfs_broker/lib/`).
+5. Supports Hive tables with data stored in CHDFS, which can be used the same way as normal Hive tables. Follow below steps to prepare doris environment:
+    1. put chdfs_hadoop_plugin_network-x.x.jar in fe/lib/ and apache_hdfs_broker/lib/
+    2. copy core-site.xml and hdfs-site.xml from hive cluster to fe/conf/ and apache_hdfs_broker/conf
+
+<version since="dev">
+
+6. Supports Hive / Iceberg tables with data stored in GooseFS(GFS), which can be used the same way as normal Hive tables. Follow below steps to prepare doris environment:
+    1. put goosefs-x.x.x-client.jar in fe/lib/ and apache_hdfs_broker/lib/
+    2. add extra properties 'fs.AbstractFileSystem.gfs.impl' = 'com.qcloud.cos.goosefs.hadoop.GooseFileSystem', 'fs.gfs.impl' = 'com.qcloud.cos.goosefs.hadoop.FileSystem' when creating catalog
+
+</version>
 
 ## Create Catalog
 
diff --git a/docs/en/docs/lakehouse/multi-catalog/iceberg.md b/docs/en/docs/lakehouse/multi-catalog/iceberg.md
index f410ebac59..91af94462e 100644
--- a/docs/en/docs/lakehouse/multi-catalog/iceberg.md
+++ b/docs/en/docs/lakehouse/multi-catalog/iceberg.md
@@ -34,6 +34,14 @@ When connecting to Iceberg, Doris:
 1. Supports Iceberg V1/V2 table formats;
 2. Supports Position Delete but not Equality Delete for V2 format;
 
+<version since="dev">
+
+3. Supports Hive / Iceberg tables with data stored in GooseFS(GFS), which can be used the same way as normal Hive tables. Follow below steps to prepare doris environment:
+    1. put goosefs-x.x.x-client.jar in fe/lib/ and apache_hdfs_broker/lib/
+    2. add extra properties 'fs.AbstractFileSystem.gfs.impl' = 'com.qcloud.cos.goosefs.hadoop.GooseFileSystem', 'fs.gfs.impl' = 'com.qcloud.cos.goosefs.hadoop.FileSystem' when creating catalog
+
+</version>
+
 ## Create Catalog
 
 ### Hive Metastore Catalog
diff --git a/docs/zh-CN/docs/advanced/broker.md b/docs/zh-CN/docs/advanced/broker.md
index 2711bc62d9..e82123e726 100644
--- a/docs/zh-CN/docs/advanced/broker.md
+++ b/docs/zh-CN/docs/advanced/broker.md
@@ -31,6 +31,7 @@ Broker 是 Doris 集群中一种可选进程,主要用于支持 Doris 读写
 - Apache HDFS
 - 阿里云 OSS
 - 腾讯云 CHDFS
+- 腾讯云 GFS (1.2.0 版本支持)
 - 华为云 OBS (1.2.0 版本后支持)
 - 亚马逊 S3
 - JuiceFS (2.0.0 版本支持)
diff --git a/docs/zh-CN/docs/lakehouse/multi-catalog/hive.md b/docs/zh-CN/docs/lakehouse/multi-catalog/hive.md
index 99df43bafb..456d16cd34 100644
--- a/docs/zh-CN/docs/lakehouse/multi-catalog/hive.md
+++ b/docs/zh-CN/docs/lakehouse/multi-catalog/hive.md
@@ -36,6 +36,17 @@ under the License.
 2. 支持 Managed Table 和 External Table。
 3. 可以识别 Hive Metastore 中存储的 hive、iceberg、hudi 元数据。
 4. 支持数据存储在 Juicefs 上的 hive 表,用法如下(需要把juicefs-hadoop-x.x.x.jar放在 fe/lib/ 和 apache_hdfs_broker/lib/ 下)。
+5. 支持数据存储在 CHDFS 上的 hive 表。需配置环境:
+   1. 把chdfs_hadoop_plugin_network-x.x.jar 放在 fe/lib/ 和 apache_hdfs_broker/lib/ 下
+   2. 将 hive 所在 Hadoop 集群的 core-site.xml 和 hdfs-site.xml 复制到 fe/conf/ 和 apache_hdfs_broker/conf 目录下
+
+<version since="dev">
+
+6. 支持数据存在在 GooseFS(GFS) 上的 hive、iceberg表。需配置环境:
+   1. 把 goosefs-x.x.x-client.jar 放在 fe/lib/ 和 apache_hdfs_broker/lib/ 下
+   2. 创建 catalog 时增加属性:'fs.AbstractFileSystem.gfs.impl' = 'com.qcloud.cos.goosefs.hadoop.GooseFileSystem', 'fs.gfs.impl' = 'com.qcloud.cos.goosefs.hadoop.FileSystem'
+   
+</version>
 
 ## 创建 Catalog
 
diff --git a/docs/zh-CN/docs/lakehouse/multi-catalog/iceberg.md b/docs/zh-CN/docs/lakehouse/multi-catalog/iceberg.md
index 2d0bda6148..87f1ff429d 100644
--- a/docs/zh-CN/docs/lakehouse/multi-catalog/iceberg.md
+++ b/docs/zh-CN/docs/lakehouse/multi-catalog/iceberg.md
@@ -32,6 +32,14 @@ under the License.
 1. 支持 Iceberg V1/V2 表格式。
 2. V2 格式仅支持 Position Delete 方式,不支持 Equality Delete。
 
+<version since="dev">
+
+3. 支持数据存在在 GooseFS(GFS) 上的 iceberg表。需配置环境:
+    1. 把goosefs-x.x.x-client.jar 放在 fe/lib/ 和 apache_hdfs_broker/lib/ 下
+    2. 创建 catalog 时增加属性:'fs.AbstractFileSystem.gfs.impl' = 'com.qcloud.cos.goosefs.hadoop.GooseFileSystem', 'fs.gfs.impl' = 'com.qcloud.cos.goosefs.hadoop.FileSystem'
+
+</version>
+
 ## 创建 Catalog
 
 ### 基于Hive Metastore创建Catalog
diff --git a/fe/fe-core/src/main/java/org/apache/doris/analysis/ExportStmt.java b/fe/fe-core/src/main/java/org/apache/doris/analysis/ExportStmt.java
index d9a7e59554..c4096a7ffe 100644
--- a/fe/fe-core/src/main/java/org/apache/doris/analysis/ExportStmt.java
+++ b/fe/fe-core/src/main/java/org/apache/doris/analysis/ExportStmt.java
@@ -277,9 +277,10 @@ public class ExportStmt extends StatementBase {
                     && !schema.equalsIgnoreCase("oss")
                     && !schema.equalsIgnoreCase("s3a")
                     && !schema.equalsIgnoreCase("cosn")
+                    && !schema.equalsIgnoreCase("gfs")
                     && !schema.equalsIgnoreCase("jfs"))) {
                 throw new AnalysisException("Invalid broker path. please use valid 'hdfs://', 'afs://' , 'bos://',"
-                        + " 'ofs://', 'obs://', 'oss://', 's3a://', 'cosn://' or 'jfs://' path.");
+                        + " 'ofs://', 'obs://', 'oss://', 's3a://', 'cosn://', 'gfs://' or 'jfs://' path.");
             }
         } else if (type == StorageBackend.StorageType.S3) {
             if (schema == null || !schema.equalsIgnoreCase("s3")) {
diff --git a/fe/fe-core/src/main/java/org/apache/doris/analysis/StorageBackend.java b/fe/fe-core/src/main/java/org/apache/doris/analysis/StorageBackend.java
index b1aacb0aaf..5d6c33c45e 100644
--- a/fe/fe-core/src/main/java/org/apache/doris/analysis/StorageBackend.java
+++ b/fe/fe-core/src/main/java/org/apache/doris/analysis/StorageBackend.java
@@ -101,6 +101,7 @@ public class StorageBackend implements ParseNode {
         HDFS("Hadoop Distributed File System"),
         LOCAL("Local file system"),
         OFS("Tencent CHDFS"),
+        GFS("Tencent Goose File System"),
         JFS("Juicefs"),
         STREAM("Stream load pipe");
 
diff --git a/fe/fe-core/src/main/java/org/apache/doris/backup/BlobStorage.java b/fe/fe-core/src/main/java/org/apache/doris/backup/BlobStorage.java
index be02be0690..3bf7f50818 100644
--- a/fe/fe-core/src/main/java/org/apache/doris/backup/BlobStorage.java
+++ b/fe/fe-core/src/main/java/org/apache/doris/backup/BlobStorage.java
@@ -52,6 +52,7 @@ public abstract class BlobStorage implements Writable {
             return new S3Storage(properties);
         } else if (type == StorageBackend.StorageType.HDFS
                 || type == StorageBackend.StorageType.OFS
+                || type == StorageBackend.StorageType.GFS
                 || type == StorageBackend.StorageType.JFS) {
             BlobStorage storage = new HdfsStorage(properties);
             // as of ofs files, use hdfs storage, but it's type should be ofs
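
The scheme check touched in ExportStmt.java can be illustrated with a small standalone sketch. This is not the Doris implementation: the class name `BrokerSchemeCheck` and the method `isValidBrokerScheme` are hypothetical, and the accepted schemes are copied from the error message in the hunk above, so the real condition may accept schemes that the visible lines do not show.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of Doris: mirrors the idea of the
// case-insensitive scheme check for broker export paths.
final class BrokerSchemeCheck {
    // Assumption: this list is taken from the error message in the diff
    // ('hdfs://', 'afs://', 'bos://', 'ofs://', 'obs://', 'oss://',
    // 's3a://', 'cosn://', 'gfs://', 'jfs://').
    private static final List<String> BROKER_SCHEMES = Arrays.asList(
            "hdfs", "afs", "bos", "ofs", "obs", "oss", "s3a", "cosn", "gfs", "jfs");

    // Returns true when the path has a scheme from the list above.
    static boolean isValidBrokerScheme(String path) {
        final String scheme;
        try {
            scheme = new URI(path).getScheme();
        } catch (URISyntaxException e) {
            return false; // an unparsable path is treated as invalid
        }
        if (scheme == null) {
            return false; // e.g. a bare local path such as /tmp/out
        }
        return BROKER_SCHEMES.contains(scheme.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        // Placeholder host and paths, for illustration only.
        System.out.println(isValidBrokerScheme("gfs://namenode:9000/export/"));
        System.out.println(isValidBrokerScheme("s3://bucket/export/"));
    }
}
```

Keeping the schemes in one list means that supporting a new file system, as this commit does for `gfs`, would be a one-entry change in the sketch, whereas the actual code extends a chain of `equalsIgnoreCase` conditions.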

