Posted to commits@hudi.apache.org by le...@apache.org on 2020/10/18 09:27:03 UTC

[hudi] branch asf-site updated: [HUDI-1344] Documentation for IBM Cloud Object Storage Support (#2183)

This is an automated email from the ASF dual-hosted git repository.

leesf pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/hudi.git


The following commit(s) were added to refs/heads/asf-site by this push:
     new a992f86  [HUDI-1344] Documentation for IBM Cloud Object Storage Support  (#2183)
a992f86 is described below

commit a992f86f11e821e543cc7cfed38c0f5d1e96fe4d
Author: Guy Khazma <33...@users.noreply.github.com>
AuthorDate: Sun Oct 18 12:26:52 2020 +0300

    [HUDI-1344] Documentation for IBM Cloud Object Storage Support  (#2183)
---
 docs/_docs/0_8_ibm_cos_filesystem.cn.md | 79 +++++++++++++++++++++++++++++++++
 docs/_docs/0_8_ibm_cos_filesystem.md    | 78 ++++++++++++++++++++++++++++++++
 docs/_docs/2_7_cloud.cn.md              |  2 +
 docs/_docs/2_7_cloud.md                 |  2 +
 4 files changed, 161 insertions(+)

diff --git a/docs/_docs/0_8_ibm_cos_filesystem.cn.md b/docs/_docs/0_8_ibm_cos_filesystem.cn.md
new file mode 100644
index 0000000..65b2419
--- /dev/null
+++ b/docs/_docs/0_8_ibm_cos_filesystem.cn.md
@@ -0,0 +1,79 @@
+---
+title: IBM Cloud Object Storage Filesystem
+keywords: hudi, hive, ibm, cos, spark, presto
+permalink: /cn/docs/ibm_cos_hoodie.html
+summary: On this page, we go over how to configure Hudi with the IBM Cloud Object Storage filesystem.
+last_modified_at: 2020-10-01T11:38:24-10:00
+language: cn
+---
+On this page, we explain how to get your Hudi Spark job to store data in IBM Cloud Object Storage.
+
+## IBM COS configs
+
+There are two configurations required for Hudi-IBM Cloud Object Storage compatibility:
+
+- Adding IBM COS credentials for Hudi
+- Adding the required JARs to the classpath
+
+### IBM Cloud Object Storage Credentials
+
+The simplest way to use Hudi with IBM Cloud Object Storage is to configure your `SparkSession` or `SparkContext` with IBM Cloud Object Storage credentials using the [Stocator](https://github.com/CODAIT/stocator) storage connector for Spark. Hudi will automatically pick this up and talk to IBM Cloud Object Storage.
+
+Alternatively, add the required configs to your `core-site.xml`, from where Hudi can fetch them. Set `fs.defaultFS` to the `cos://` URI of your IBM Cloud Object Storage bucket, and Hudi should be able to read from and write to the bucket.
+
+For example, using HMAC keys and service name `myCOS`:
+```xml
+  <property>
+      <name>fs.defaultFS</name>
+      <value>cos://myBucket.myCOS</value>
+  </property>
+
+  <property>
+      <name>fs.cos.flat.list</name>
+      <value>true</value>
+  </property>
+
+  <property>
+      <name>fs.stocator.scheme.list</name>
+      <value>cos</value>
+  </property>
+
+  <property>
+      <name>fs.cos.impl</name>
+      <value>com.ibm.stocator.fs.ObjectStoreFileSystem</value>
+  </property>
+
+  <property>
+      <name>fs.stocator.cos.impl</name>
+      <value>com.ibm.stocator.fs.cos.COSAPIClient</value>
+  </property>
+
+  <property>
+      <name>fs.stocator.cos.scheme</name>
+      <value>cos</value>
+  </property>
+
+  <property>
+      <name>fs.cos.myCOS.access.key</name>
+      <value>ACCESS KEY</value>
+  </property>
+
+  <property>
+      <name>fs.cos.myCOS.endpoint</name>
+      <value>http://s3-api.us-geo.objectstorage.softlayer.net</value>
+  </property>
+
+  <property>
+      <name>fs.cos.myCOS.secret.key</name>
+      <value>SECRET KEY</value>
+  </property>
+
+```
+
+For more options, see the Stocator [documentation](https://github.com/CODAIT/stocator/blob/master/README.md).
+
+### IBM Cloud Object Storage Libs
+
+IBM Cloud Object Storage Hadoop libraries to add to the classpath:
+
+ - com.ibm.stocator:stocator:1.1.3
diff --git a/docs/_docs/0_8_ibm_cos_filesystem.md b/docs/_docs/0_8_ibm_cos_filesystem.md
new file mode 100644
index 0000000..c7ab8e8
--- /dev/null
+++ b/docs/_docs/0_8_ibm_cos_filesystem.md
@@ -0,0 +1,78 @@
+---
+title: IBM Cloud Object Storage Filesystem
+keywords: hudi, hive, ibm, cos, spark, presto
+permalink: /docs/ibm_cos_hoodie.html
+summary: On this page, we go over how to configure Hudi with the IBM Cloud Object Storage filesystem.
+last_modified_at: 2020-10-01T11:38:24-10:00
+---
+On this page, we explain how to get your Hudi Spark job to store data in IBM Cloud Object Storage.
+
+## IBM COS configs
+
+There are two configurations required for Hudi-IBM Cloud Object Storage compatibility:
+
+- Adding IBM COS credentials for Hudi
+- Adding the required JARs to the classpath
+
+### IBM Cloud Object Storage Credentials
+
+The simplest way to use Hudi with IBM Cloud Object Storage is to configure your `SparkSession` or `SparkContext` with IBM Cloud Object Storage credentials using the [Stocator](https://github.com/CODAIT/stocator) storage connector for Spark. Hudi will automatically pick this up and talk to IBM Cloud Object Storage.
+
+Alternatively, add the required configs to your `core-site.xml`, from where Hudi can fetch them. Set `fs.defaultFS` to the `cos://` URI of your IBM Cloud Object Storage bucket, and Hudi should be able to read from and write to the bucket.
+
+For example, using HMAC keys and service name `myCOS`:
+```xml
+  <property>
+      <name>fs.defaultFS</name>
+      <value>cos://myBucket.myCOS</value>
+  </property>
+
+  <property>
+      <name>fs.cos.flat.list</name>
+      <value>true</value>
+  </property>
+
+  <property>
+      <name>fs.stocator.scheme.list</name>
+      <value>cos</value>
+  </property>
+
+  <property>
+      <name>fs.cos.impl</name>
+      <value>com.ibm.stocator.fs.ObjectStoreFileSystem</value>
+  </property>
+
+  <property>
+      <name>fs.stocator.cos.impl</name>
+      <value>com.ibm.stocator.fs.cos.COSAPIClient</value>
+  </property>
+
+  <property>
+      <name>fs.stocator.cos.scheme</name>
+      <value>cos</value>
+  </property>
+
+  <property>
+      <name>fs.cos.myCOS.access.key</name>
+      <value>ACCESS KEY</value>
+  </property>
+
+  <property>
+      <name>fs.cos.myCOS.endpoint</name>
+      <value>http://s3-api.us-geo.objectstorage.softlayer.net</value>
+  </property>
+
+  <property>
+      <name>fs.cos.myCOS.secret.key</name>
+      <value>SECRET KEY</value>
+  </property>
+
+```
+
+For more options, see the Stocator [documentation](https://github.com/CODAIT/stocator/blob/master/README.md).
+
+### IBM Cloud Object Storage Libs
+
+IBM Cloud Object Storage Hadoop libraries to add to the classpath:
+
+ - com.ibm.stocator:stocator:1.1.3
diff --git a/docs/_docs/2_7_cloud.cn.md b/docs/_docs/2_7_cloud.cn.md
index fbd18b6..73b74b9 100644
--- a/docs/_docs/2_7_cloud.cn.md
+++ b/docs/_docs/2_7_cloud.cn.md
@@ -22,3 +22,5 @@ language: cn
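    Configurations required for Azure and Hudi to work together.
  * [Tencent Cloud Object Storage](/cn/docs/cos_hoodie.html) <br/>
    Configurations required for COS and Hudi to work together.
+ * [IBM Cloud Object Storage](/cn/docs/ibm_cos_hoodie.html) <br/>
+   Configurations required for IBM Cloud Object Storage and Hudi to work together.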
diff --git a/docs/_docs/2_7_cloud.md b/docs/_docs/2_7_cloud.md
index 226180c..6b82437 100644
--- a/docs/_docs/2_7_cloud.md
+++ b/docs/_docs/2_7_cloud.md
@@ -22,3 +22,5 @@ to cloud stores.
    Configurations required for Azure and Hudi co-operability.
 * [Tencent Cloud Object Storage](/docs/cos_hoodie.html) <br/>
    Configurations required for COS and Hudi co-operability.
+* [IBM Cloud Object Storage](/docs/ibm_cos_hoodie.html) <br/>
+   Configurations required for IBM Cloud Object Storage and Hudi co-operability.