You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2021/02/05 09:38:31 UTC

[GitHub] [hudi] ZhangChaoming opened a new pull request #2540: Hadoop env

ZhangChaoming opened a new pull request #2540:
URL: https://github.com/apache/hudi/pull/2540


   ## *Tips*
   - *Thank you very much for contributing to Apache Hudi.*
   - *Please review https://hudi.apache.org/contributing.html before opening a pull request.*
   
   ## What is the purpose of the pull request
   
   In my oppions, the hadoop configuration totally read from the actual environment. For example, the `fs.hdfs.impl` may be `org.apache.hadoop.fs.viewfs.ViewFileSyste` but not `org.apache.hadoop.hdfs.DistributedFileSystem`. Now, the implemtation was forced to be set with `org.apache.hadoop.hdfs.DistributedFileSystem`, which is not a graceful solution.
   Also, the hadoop dependency scope level should be 'test'.
   
   ## Brief change log
   
     - Remove codes to make the hadoop configuration totally read from the actual environment.
     - Set hadoop dependency scope level to 'test'.
   
   ## Verify this pull request
   
   This pull request is already covered by existing tests, such as *(please describe tests)*.
   
   This change added tests and can be verified as follows:
     - *Manually verified the change by running a job locally.*
   
   ## Committer checklist
   
    - [ ] Has a corresponding JIRA in PR title & commit
    
    - [ ] Commit message is descriptive of the change
    
    - [ ] CI is green
   
    - [ ] Necessary doc changes done or have another open PR
          
    - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] ZhangChaoming commented on a change in pull request #2540: [HUDI-1586] [Common Core] [Flink Integration] Reduce the coupling of hadoop.

Posted by GitBox <gi...@apache.org>.
ZhangChaoming commented on a change in pull request #2540:
URL: https://github.com/apache/hudi/pull/2540#discussion_r578079792



##########
File path: hudi-flink/pom.xml
##########
@@ -157,7 +157,7 @@
     <dependency>
       <groupId>org.apache.hadoop</groupId>
       <artifactId>hadoop-common</artifactId>
-      <scope>compile</scope>
+      <scope>provided</scope>

Review comment:
       It's redundant, but do no harm.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] garyli1019 commented on a change in pull request #2540: [HUDI-1586] [Common Core] [Flink Integration] Reduce the coupling of hadoop.

Posted by GitBox <gi...@apache.org>.
garyli1019 commented on a change in pull request #2540:
URL: https://github.com/apache/hudi/pull/2540#discussion_r578182457



##########
File path: hudi-common/pom.xml
##########
@@ -154,6 +155,7 @@
     <dependency>
       <groupId>org.apache.hadoop</groupId>
       <artifactId>hadoop-hdfs</artifactId>
+      <scope>provided</scope>

Review comment:
       +1, I think we can remove all the unnecessary Hadoop dependency from the sub-modules since they already exist in the parent pom. Do you mind removing those in this PR @ZhangChaoming ?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] ZhangChaoming commented on a change in pull request #2540: [HUDI-1586] [Common Core] [Flink Integration] Reduce the coupling of hadoop.

Posted by GitBox <gi...@apache.org>.
ZhangChaoming commented on a change in pull request #2540:
URL: https://github.com/apache/hudi/pull/2540#discussion_r578078710



##########
File path: hudi-common/src/main/java/org/apache/hudi/common/fs/FSUtils.java
##########
@@ -80,9 +80,6 @@
   private static final PathFilter ALLOW_ALL_FILTER = file -> true;
 
   public static Configuration prepareHadoopConf(Configuration conf) {
-    conf.set("fs.hdfs.impl", org.apache.hadoop.hdfs.DistributedFileSystem.class.getName());

Review comment:
       I wanna express that this hard code will ignore users configuration.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] codecov-io commented on pull request #2540: [HUDI-1586] [Common Core] [Flink Integration] Reduce the coupling of hadoop.

Posted by GitBox <gi...@apache.org>.
codecov-io commented on pull request #2540:
URL: https://github.com/apache/hudi/pull/2540#issuecomment-774638707


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/2540?src=pr&el=h1) Report
   > Merging [#2540](https://codecov.io/gh/apache/hudi/pull/2540?src=pr&el=desc) (9ef4f4a) into [master](https://codecov.io/gh/apache/hudi/commit/d74d8e208439df8cb2eb6c24019be55c002d00a5?el=desc) (d74d8e2) will **decrease** coverage by `40.82%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2540/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/2540?src=pr&el=tree)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master   #2540       +/-   ##
   ============================================
   - Coverage     50.52%   9.69%   -40.83%     
   + Complexity     3123      48     -3075     
   ============================================
     Files           430      53      -377     
     Lines         19597    1929    -17668     
     Branches       2008     229     -1779     
   ============================================
   - Hits           9901     187     -9714     
   + Misses         8887    1729     -7158     
   + Partials        809      13      -796     
   ```
   
   | Flag | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | hudicli | `?` | `?` | |
   | hudiclient | `?` | `?` | |
   | hudicommon | `?` | `?` | |
   | hudiflink | `?` | `?` | |
   | hudihadoopmr | `?` | `?` | |
   | hudisparkdatasource | `?` | `?` | |
   | hudisync | `?` | `?` | |
   | huditimelineservice | `?` | `?` | |
   | hudiutilities | `9.69% <ø> (-59.79%)` | `0.00 <ø> (ø)` | |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2540?src=pr&el=tree) | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | [...va/org/apache/hudi/utilities/IdentitySplitter.java](https://codecov.io/gh/apache/hudi/pull/2540/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0lkZW50aXR5U3BsaXR0ZXIuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-2.00%)` | |
   | [...va/org/apache/hudi/utilities/schema/SchemaSet.java](https://codecov.io/gh/apache/hudi/pull/2540/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFTZXQuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-3.00%)` | |
   | [...a/org/apache/hudi/utilities/sources/RowSource.java](https://codecov.io/gh/apache/hudi/pull/2540/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUm93U291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | |
   | [.../org/apache/hudi/utilities/sources/AvroSource.java](https://codecov.io/gh/apache/hudi/pull/2540/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | |
   | [.../org/apache/hudi/utilities/sources/JsonSource.java](https://codecov.io/gh/apache/hudi/pull/2540/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvblNvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | |
   | [...rg/apache/hudi/utilities/sources/CsvDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2540/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQ3N2REZTU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-10.00%)` | |
   | [...g/apache/hudi/utilities/sources/JsonDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2540/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkRGU1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | |
   | [...apache/hudi/utilities/sources/JsonKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2540/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkthZmthU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-6.00%)` | |
   | [...pache/hudi/utilities/sources/ParquetDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2540/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUGFycXVldERGU1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-5.00%)` | |
   | [...lities/schema/SchemaProviderWithPostProcessor.java](https://codecov.io/gh/apache/hudi/pull/2540/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFQcm92aWRlcldpdGhQb3N0UHJvY2Vzc29yLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | |
   | ... and [406 more](https://codecov.io/gh/apache/hudi/pull/2540/diff?src=pr&el=tree-more) | |
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] codecov-io edited a comment on pull request #2540: [HUDI-1586] [Common Core] [Flink Integration] Reduce the coupling of hadoop.

Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on pull request #2540:
URL: https://github.com/apache/hudi/pull/2540#issuecomment-774638707


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/2540?src=pr&el=h1) Report
   > Merging [#2540](https://codecov.io/gh/apache/hudi/pull/2540?src=pr&el=desc) (326399f) into [master](https://codecov.io/gh/apache/hudi/commit/d74d8e208439df8cb2eb6c24019be55c002d00a5?el=desc) (d74d8e2) will **decrease** coverage by `40.83%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2540/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/2540?src=pr&el=tree)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master   #2540       +/-   ##
   ============================================
   - Coverage     50.52%   9.68%   -40.84%     
   + Complexity     3123      48     -3075     
   ============================================
     Files           430      53      -377     
     Lines         19597    1931    -17666     
     Branches       2008     230     -1778     
   ============================================
   - Hits           9901     187     -9714     
   + Misses         8887    1731     -7156     
   + Partials        809      13      -796     
   ```
   
   | Flag | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | hudicli | `?` | `?` | |
   | hudiclient | `?` | `?` | |
   | hudicommon | `?` | `?` | |
   | hudiflink | `?` | `?` | |
   | hudihadoopmr | `?` | `?` | |
   | hudisparkdatasource | `?` | `?` | |
   | hudisync | `?` | `?` | |
   | huditimelineservice | `?` | `?` | |
   | hudiutilities | `9.68% <ø> (-59.80%)` | `0.00 <ø> (ø)` | |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2540?src=pr&el=tree) | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | [...va/org/apache/hudi/utilities/IdentitySplitter.java](https://codecov.io/gh/apache/hudi/pull/2540/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0lkZW50aXR5U3BsaXR0ZXIuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-2.00%)` | |
   | [...va/org/apache/hudi/utilities/schema/SchemaSet.java](https://codecov.io/gh/apache/hudi/pull/2540/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFTZXQuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-3.00%)` | |
   | [...a/org/apache/hudi/utilities/sources/RowSource.java](https://codecov.io/gh/apache/hudi/pull/2540/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUm93U291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | |
   | [.../org/apache/hudi/utilities/sources/AvroSource.java](https://codecov.io/gh/apache/hudi/pull/2540/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | |
   | [.../org/apache/hudi/utilities/sources/JsonSource.java](https://codecov.io/gh/apache/hudi/pull/2540/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvblNvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | |
   | [...rg/apache/hudi/utilities/sources/CsvDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2540/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQ3N2REZTU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-10.00%)` | |
   | [...g/apache/hudi/utilities/sources/JsonDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2540/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkRGU1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | |
   | [...apache/hudi/utilities/sources/JsonKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2540/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkthZmthU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-6.00%)` | |
   | [...pache/hudi/utilities/sources/ParquetDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2540/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUGFycXVldERGU1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-5.00%)` | |
   | [...lities/schema/SchemaProviderWithPostProcessor.java](https://codecov.io/gh/apache/hudi/pull/2540/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFQcm92aWRlcldpdGhQb3N0UHJvY2Vzc29yLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | |
   | ... and [406 more](https://codecov.io/gh/apache/hudi/pull/2540/diff?src=pr&el=tree-more) | |
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] garyli1019 merged pull request #2540: [HUDI-1586] [Common Core] [Flink Integration] Reduce the coupling of hadoop.

Posted by GitBox <gi...@apache.org>.
garyli1019 merged pull request #2540:
URL: https://github.com/apache/hudi/pull/2540


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] ZhangChaoming commented on a change in pull request #2540: [HUDI-1586] [Common Core] [Flink Integration] Reduce the coupling of hadoop.

Posted by GitBox <gi...@apache.org>.
ZhangChaoming commented on a change in pull request #2540:
URL: https://github.com/apache/hudi/pull/2540#discussion_r573396780



##########
File path: hudi-common/src/main/java/org/apache/hudi/common/fs/FSUtils.java
##########
@@ -80,9 +80,6 @@
   private static final PathFilter ALLOW_ALL_FILTER = file -> true;
 
   public static Configuration prepareHadoopConf(Configuration conf) {
-    conf.set("fs.hdfs.impl", org.apache.hadoop.hdfs.DistributedFileSystem.class.getName());

Review comment:
       @vinothchandar  On my machine, the implementation of filesyetem is `org.apache.hadoop.fs.viewfs.ViewFileSystem`. IMO, following user's actual configuration will be bertter.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] ZhangChaoming commented on a change in pull request #2540: [HUDI-1586] [Common Core] [Flink Integration] Reduce the coupling of hadoop.

Posted by GitBox <gi...@apache.org>.
ZhangChaoming commented on a change in pull request #2540:
URL: https://github.com/apache/hudi/pull/2540#discussion_r578238881



##########
File path: hudi-common/pom.xml
##########
@@ -154,6 +155,7 @@
     <dependency>
       <groupId>org.apache.hadoop</groupId>
       <artifactId>hadoop-hdfs</artifactId>
+      <scope>provided</scope>

Review comment:
       @garyli1019 My pleasure.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] vinothchandar commented on a change in pull request #2540: [HUDI-1586] [Common Core] [Flink Integration] Reduce the coupling of hadoop.

Posted by GitBox <gi...@apache.org>.
vinothchandar commented on a change in pull request #2540:
URL: https://github.com/apache/hudi/pull/2540#discussion_r573380111



##########
File path: hudi-common/pom.xml
##########
@@ -154,6 +155,7 @@
     <dependency>
       <groupId>org.apache.hadoop</groupId>
       <artifactId>hadoop-hdfs</artifactId>
+      <scope>provided</scope>

Review comment:
       wondering if the parent pom scope is already `provided` , is nt that inherited>?

##########
File path: hudi-flink/pom.xml
##########
@@ -157,7 +157,7 @@
     <dependency>
       <groupId>org.apache.hadoop</groupId>
       <artifactId>hadoop-common</artifactId>
-      <scope>compile</scope>
+      <scope>provided</scope>

Review comment:
       same questions on the need to repeat this again in the child modules

##########
File path: hudi-common/src/main/java/org/apache/hudi/common/fs/FSUtils.java
##########
@@ -80,9 +80,6 @@
   private static final PathFilter ALLOW_ALL_FILTER = file -> true;
 
   public static Configuration prepareHadoopConf(Configuration conf) {
-    conf.set("fs.hdfs.impl", org.apache.hadoop.hdfs.DistributedFileSystem.class.getName());

Review comment:
       this is some old old code. Have you ensured that this can run on a non HDFS filesystem now? (HDFS is testing in unit and integration tests already)

##########
File path: hudi-common/pom.xml
##########
@@ -144,6 +144,7 @@
           <artifactId>*</artifactId>
         </exclusion>
       </exclusions>
+      <scope>provided</scope>

Review comment:
       I think so. We explicitly whitelist in bundles anwyay, so does not matter




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] garyli1019 commented on pull request #2540: [HUDI-1586] [Common Core] [Flink Integration] Reduce the coupling of hadoop.

Posted by GitBox <gi...@apache.org>.
garyli1019 commented on pull request #2540:
URL: https://github.com/apache/hudi/pull/2540#issuecomment-774458353


   hi @ZhangChaoming , thanks for your contribution. the CI failed, would you take a look?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] ZhangChaoming commented on pull request #2540: [HUDI-1586] [Common Core] [Flink Integration] Reduce the coupling of hadoop.

Posted by GitBox <gi...@apache.org>.
ZhangChaoming commented on pull request #2540:
URL: https://github.com/apache/hudi/pull/2540#issuecomment-774650183


   @garyli1019 Done.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] vinothchandar commented on a change in pull request #2540: [HUDI-1586] [Common Core] [Flink Integration] Reduce the coupling of hadoop.

Posted by GitBox <gi...@apache.org>.
vinothchandar commented on a change in pull request #2540:
URL: https://github.com/apache/hudi/pull/2540#discussion_r573426169



##########
File path: hudi-common/src/main/java/org/apache/hudi/common/fs/FSUtils.java
##########
@@ -80,9 +80,6 @@
   private static final PathFilter ALLOW_ALL_FILTER = file -> true;
 
   public static Configuration prepareHadoopConf(Configuration conf) {
-    conf.set("fs.hdfs.impl", org.apache.hadoop.hdfs.DistributedFileSystem.class.getName());

Review comment:
       Just to be clear. This code basically says, if you use `hdfs:` or `file:` schemes, these classes are used. Of course, we honor the users configuration. otherwise we wont be able to write to S3 or GCS :) 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] garyli1019 commented on a change in pull request #2540: [HUDI-1586] [Common Core] [Flink Integration] Reduce the coupling of hadoop.

Posted by GitBox <gi...@apache.org>.
garyli1019 commented on a change in pull request #2540:
URL: https://github.com/apache/hudi/pull/2540#discussion_r571607360



##########
File path: hudi-common/pom.xml
##########
@@ -144,6 +144,7 @@
           <artifactId>*</artifactId>
         </exclusion>
       </exclusions>
+      <scope>provided</scope>

Review comment:
       I guess this should be ok. @n3nash agree? 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org