Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2021/08/09 14:27:32 UTC

[GitHub] [hudi] xuzifu666 opened a new pull request #3436: feat(docs): add ks3 support doc for hudi

xuzifu666 opened a new pull request #3436:
URL: https://github.com/apache/hudi/pull/3436


   ## *Tips*
   - *Thank you very much for contributing to Apache Hudi.*
   - *Please review https://hudi.apache.org/contributing.html before opening a pull request.*
   
   ## What is the purpose of the pull request
   
   *(For example: This pull request adds quick-start document.)*
   
   ## Brief change log
   
   *(for example:)*
     - *Modify AnnotationLocation checkstyle rule in checkstyle.xml*
   
   ## Verify this pull request
   
   *(Please pick either of the following options)*
   
   This pull request is a trivial rework / code cleanup without any test coverage.
   
   *(or)*
   
   This pull request is already covered by existing tests, such as *(please describe tests)*.
   
   (or)
   
   This change added tests and can be verified as follows:
   
   *(example:)*
   
     - *Added integration tests for end-to-end.*
     - *Added HoodieClientWriteTest to verify the change.*
     - *Manually verified the change by running a job locally.*
   
   ## Committer checklist
   
    - [ ] Has a corresponding JIRA in PR title & commit
    
    - [ ] Commit message is descriptive of the change
    
    - [ ] CI is green
   
    - [ ] Necessary doc changes done or have another open PR
          
    - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] xuzifu666 edited a comment on pull request #3436: [HUDI-2289] Add ks3 support doc for hudi

Posted by GitBox <gi...@apache.org>.
xuzifu666 edited a comment on pull request #3436:
URL: https://github.com/apache/hudi/pull/3436#issuecomment-896609063


   ![Screenshot of the added MD file](https://user-images.githubusercontent.com/10645422/128994232-24cd1142-c974-4f5d-865b-dade2855c2ca.png)  The added MD file renders as shown in this screenshot. @yanghua @xushiyan
   





[GitHub] [hudi] xuzifu666 commented on a change in pull request #3436: [HUDI-2289] add ks3 support doc for hudi

Posted by GitBox <gi...@apache.org>.
xuzifu666 commented on a change in pull request #3436:
URL: https://github.com/apache/hudi/pull/3436#discussion_r685709028



##########
File path: docs/_docs/3_3_ks3_filesystem.md
##########
@@ -0,0 +1,52 @@
+---
+title: KS3 Filesystem
+keywords: hudi, hive, aws, s3, spark, presto, ks3
+permalink: /docs/ks3_hoodie.html
+summary: In this page, we go over how to configure Hudi with KS3 filesystem.
+last_modified_at: 2021-08-09T15:59:57-04:00
+---
+In this page, we explain how to get your Hudi spark job to store into KS3.
+
+## KS3 configs
+
+There are two configurations required for Hudi-KS3 compatibility:
+
+- Adding KS3 Credentials for Hudi
+- Adding required Jars to classpath
+
+### KS3 Credentials
+
+Simplest way to use Hudi with KS3, is to configure your `SparkSession` or `SparkContext` with KS3 credentials. Hudi will automatically pick this up and talk to KS3.
+
+Alternatively, add the required configs in your core-site.xml from where Hudi can fetch them. Replace the `fs.defaultFS` with your KS3 bucket name and Hudi should be able to read/write from the bucket.
+
+```xml
+  <property>
+      <name>fs.defaultFS</name>
+      <value>hdfs://ks3node</value>
+  </property>
+
+  <property>
+      <name>fs.ks3.impl</name>
+      <value>com.ksyun.kmr.hadoop.fs.Ks3FileSystem</value>
+  </property>
+
+  <property>
+      <name>fs.ks3.AccessKey</name>
+      <value>KS3_KEY</value>
+  </property>
+
+  <property>
+       <name>fs.ks3.AccessSecret</name>
+       <value>KS3_SECRET</value>
+  </property>
+
+```
+
+

Review comment:
       done
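
For readers following the quoted doc, the `SparkSession` route it mentions can be sketched roughly as below. This is a minimal sketch, not the method from the PR: it assumes the same Hadoop property names shown in the `core-site.xml` snippet above and relies on Spark's convention of copying any `spark.hadoop.<name>` setting into the Hadoop configuration as `<name>`. The helper name `ks3_spark_conf` and the `KS3_KEY` / `KS3_SECRET` environment variables are hypothetical.

```python
import os

# Hadoop property taken from the core-site.xml snippet quoted above.
KS3_HADOOP_PROPS = {
    "fs.ks3.impl": "com.ksyun.kmr.hadoop.fs.Ks3FileSystem",
}


def ks3_spark_conf(access_key: str, access_secret: str) -> dict:
    """Return Spark conf entries that forward the KS3 settings to Hadoop.

    Hypothetical helper for illustration; not part of Hudi or Spark.
    """
    hadoop_props = dict(KS3_HADOOP_PROPS)
    hadoop_props["fs.ks3.AccessKey"] = access_key
    hadoop_props["fs.ks3.AccessSecret"] = access_secret
    # Spark copies each "spark.hadoop.<name>" entry into the Hadoop
    # Configuration under the key "<name>".
    return {f"spark.hadoop.{name}": value for name, value in hadoop_props.items()}


if __name__ == "__main__":
    # Environment variable names are assumptions; placeholders are used if unset.
    conf = ks3_spark_conf(
        os.environ.get("KS3_KEY", "KS3_KEY"),
        os.environ.get("KS3_SECRET", "KS3_SECRET"),
    )
    # Print only the keys so that credentials never end up in logs.
    for key in sorted(conf):
        print(key)
    # With PySpark installed, the entries could be applied like this:
    #   builder = SparkSession.builder.appName("hudi-ks3")
    #   for k, v in conf.items():
    #       builder = builder.config(k, v)
    #   spark = builder.getOrCreate()
```

Keeping the credentials out of `core-site.xml` and reading them from the environment is one possible design choice here; the KS3 filesystem jar still has to be on the classpath, as the doc notes.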







[GitHub] [hudi] xuzifu666 commented on pull request #3436: feat(docs): add ks3 support doc for hudi

Posted by GitBox <gi...@apache.org>.
xuzifu666 commented on pull request #3436:
URL: https://github.com/apache/hudi/pull/3436#issuecomment-895327840


   Added an MD file introducing Hudi on the KS3 filesystem. @leesf





[GitHub] [hudi] xuzifu666 commented on pull request #3436: [HUDI-2289] add ks3 support doc for hudi

Posted by GitBox <gi...@apache.org>.
xuzifu666 commented on pull request #3436:
URL: https://github.com/apache/hudi/pull/3436#issuecomment-895785636


   Dear PMC,
   Azure CI currently times out (after more than one hour) every time it reaches the unit_tests_other_modules module, so the submitted PR cannot pass CI. Many other PRs appear to hit the same problem. Is there a way to fix this, or could someone check whether Azure CI itself has an issue? The failing build is here:
   
   https://dev.azure.com/apache-hudi-ci-org/apache-hudi-ci/_build/results?buildId=1599&view=logs&j=d3721143-1417-5e3d-cf04-c39c0756eab9
   
   Thank you for taking the time to read my email. Best wishes for your work!
   
   ------------------ Original message ------------------
   From: "apache/hudi" ***@***.***>;
   Sent: Tuesday, August 10, 2021, 1:46 PM
   ***@***.***>;
   ***@***.******@***.***>;
   Subject: Re: [apache/hudi] [HUDI-2289] add ks3 support doc for hudi (#3436)
   
   @yanghua commented on this pull request.
   
   In docs/_docs/3_3_ks3_filesystem.md:
   > + +
   duplicated empty lines





[GitHub] [hudi] xuzifu666 edited a comment on pull request #3436: [HUDI-2289] add ks3 support doc for hudi

Posted by GitBox <gi...@apache.org>.
xuzifu666 edited a comment on pull request #3436:
URL: https://github.com/apache/hudi/pull/3436#issuecomment-896604769


   > @xuzifu666 Azure CI is set to ignore PRs merging into `asf-site` branch, which are just website/docs updates. I suggest PR creator upload screenshots in the PR to show the changed part on the website, and for reviewers' manual review before merging.
   > 
   > @yanghua @vinothchandar Do you agree with the setting described above?
   > 
   > @xuzifu666 Travis CI is broken now (forever queue due to limited job quota in ASF org), and we're removing it soon, so just ignore travis failure for now.
   
   OK, I agree. I will upload a screenshot of the added MD file to the PR. @xushiyan





[GitHub] [hudi] yanghua commented on pull request #3436: [HUDI-2289] add ks3 support doc for hudi

Posted by GitBox <gi...@apache.org>.
yanghua commented on pull request #3436:
URL: https://github.com/apache/hudi/pull/3436#issuecomment-895804043


   > Dear PMC, Azure CI currently times out (after more than one hour) every time it reaches the unit_tests_other_modules module, so the submitted PR cannot pass CI. Many other PRs appear to hit the same problem. Is there a way to fix this, or could someone check whether Azure CI itself has an issue? The failing build is here: https://dev.azure.com/apache-hudi-ci-org/apache-hudi-ci/_build/results?buildId=1599&view=logs&j=d3721143-1417-5e3d-cf04-c39c0756eab9 Thank you for taking the time to read my email. Best wishes for your work!
   
   Thanks! There is a PR in the community that is trying to fix the problem you reported. Please be patient. cc @xushiyan





[GitHub] [hudi] xushiyan merged pull request #3436: [HUDI-2289] Add ks3 support doc for hudi

Posted by GitBox <gi...@apache.org>.
xushiyan merged pull request #3436:
URL: https://github.com/apache/hudi/pull/3436


   





[GitHub] [hudi] xuzifu666 commented on pull request #3436: [HUDI-2289] Add ks3 support doc for hudi

Posted by GitBox <gi...@apache.org>.
xuzifu666 commented on pull request #3436:
URL: https://github.com/apache/hudi/pull/3436#issuecomment-896773198


   The CI tooling is broken for asf-site. Could you help merge the doc PR into master? @xushiyan





[GitHub] [hudi] yanghua commented on pull request #3436: [HUDI-2289] add ks3 support doc for hudi

Posted by GitBox <gi...@apache.org>.
yanghua commented on pull request #3436:
URL: https://github.com/apache/hudi/pull/3436#issuecomment-896438183


   > sorry for the newbie questions. whats ks3? :)
   
   It's a Chinese public cloud vendor. More details: https://www.ksyun.com/





[GitHub] [hudi] yanghua commented on a change in pull request #3436: [HUDI-2289] add ks3 support doc for hudi

Posted by GitBox <gi...@apache.org>.
yanghua commented on a change in pull request #3436:
URL: https://github.com/apache/hudi/pull/3436#discussion_r685707469



##########
File path: docs/_docs/3_3_ks3_filesystem.md
##########
@@ -0,0 +1,52 @@
+---
+title: KS3 Filesystem
+keywords: hudi, hive, aws, s3, spark, presto, ks3
+permalink: /docs/ks3_hoodie.html
+summary: In this page, we go over how to configure Hudi with KS3 filesystem.
+last_modified_at: 2021-08-09T15:59:57-04:00
+---
+In this page, we explain how to get your Hudi spark job to store into KS3.
+
+## KS3 configs
+
+There are two configurations required for Hudi-KS3 compatibility:
+
+- Adding KS3 Credentials for Hudi
+- Adding required Jars to classpath
+
+### KS3 Credentials
+
+Simplest way to use Hudi with KS3, is to configure your `SparkSession` or `SparkContext` with KS3 credentials. Hudi will automatically pick this up and talk to KS3.
+
+Alternatively, add the required configs in your core-site.xml from where Hudi can fetch them. Replace the `fs.defaultFS` with your KS3 bucket name and Hudi should be able to read/write from the bucket.
+
+```xml
+  <property>
+      <name>fs.defaultFS</name>
+      <value>hdfs://ks3node</value>
+  </property>
+
+  <property>
+      <name>fs.ks3.impl</name>
+      <value>com.ksyun.kmr.hadoop.fs.Ks3FileSystem</value>
+  </property>
+
+  <property>
+      <name>fs.ks3.AccessKey</name>
+      <value>KS3_KEY</value>
+  </property>
+
+  <property>
+       <name>fs.ks3.AccessSecret</name>
+       <value>KS3_SECRET</value>
+  </property>
+
+```
+
+

Review comment:
       duplicated empty lines







[GitHub] [hudi] xuzifu666 commented on pull request #3436: feat(docs): add ks3 support doc for hudi

Posted by GitBox <gi...@apache.org>.
xuzifu666 commented on pull request #3436:
URL: https://github.com/apache/hudi/pull/3436#issuecomment-895271605


   Added a doc on KS3 storage support for Hudi. @leesf





[GitHub] [hudi] xushiyan commented on pull request #3436: [HUDI-2289] add ks3 support doc for hudi

Posted by GitBox <gi...@apache.org>.
xushiyan commented on pull request #3436:
URL: https://github.com/apache/hudi/pull/3436#issuecomment-896586180


   @xuzifu666 Azure CI is set to ignore PRs merging into `asf-site` branch, which are just website/docs updates. I suggest PR creator upload screenshots in the PR to show the changed part on the website, and for reviewers' manual review before merging. 
   
   @yanghua @vinothchandar Do you agree with the setting described above?
   
   @xuzifu666 Travis CI is broken now (forever queue due to limited job quota in ASF org), and we're removing it soon, so just ignore travis failure for now.





[GitHub] [hudi] xuzifu666 edited a comment on pull request #3436: [HUDI-2289] Add ks3 support doc for hudi

Posted by GitBox <gi...@apache.org>.
xuzifu666 edited a comment on pull request #3436:
URL: https://github.com/apache/hudi/pull/3436#issuecomment-896773198


   The CI tooling is broken for asf-site. Could you help merge the doc PR into asf-site? @xushiyan





[GitHub] [hudi] xushiyan commented on pull request #3436: [HUDI-2289] Add ks3 support doc for hudi

Posted by GitBox <gi...@apache.org>.
xushiyan commented on pull request #3436:
URL: https://github.com/apache/hudi/pull/3436#issuecomment-897104606


   @xuzifu666 As mentioned above, CI is skipped for `asf-site` PRs. This docs update looks good per your screenshot. Thanks, merging it now.




