You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2021/01/06 05:02:38 UTC

[GitHub] [hudi] umehrot2 opened a new pull request #2410: [HUDI-1510] Move HoodieEngineContext and its dependencies to hudi-common

umehrot2 opened a new pull request #2410:
URL: https://github.com/apache/hudi/pull/2410


   ## What is the purpose of the pull request
   
   Moves HoodieEngineContext class and its dependencies to hudi-common, so that we can parallelize fetching of files and partitions in FileSystemBackedTableMetadata and FSUtils.
   
   ## Committer checklist
   
    - [ ] Has a corresponding JIRA in PR title & commit
    
    - [ ] Commit message is descriptive of the change
    
    - [ ] CI is green
   
    - [ ] Necessary doc changes done or have another open PR
          
    - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] codecov-io edited a comment on pull request #2410: [HUDI-1510] Move HoodieEngineContext and its dependencies to hudi-common

Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on pull request #2410:
URL: https://github.com/apache/hudi/pull/2410#issuecomment-755158309






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] vinothchandar commented on pull request #2410: [HUDI-1510] Move HoodieEngineContext and its dependencies to hudi-common

Posted by GitBox <gi...@apache.org>.
vinothchandar commented on pull request #2410:
URL: https://github.com/apache/hudi/pull/2410#issuecomment-755936804


   @yanghua I agree with you mostly. lets rethink our structure more holsitically. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] yanghua commented on pull request #2410: [HUDI-1510] Move HoodieEngineContext and its dependencies to hudi-common

Posted by GitBox <gi...@apache.org>.
yanghua commented on pull request #2410:
URL: https://github.com/apache/hudi/pull/2410#issuecomment-755391853


   > @yanghua Did mull this a lot, along similar lines. I was thinking about Engine as a general construct that provides parallel execution, rather than being tied to the client/writing. For e.g we can use parallelized listing even on the InputFormat implementations.
   > 
   > The core issue is we want to parallelize `FSUtils.getAllPartitionPaths()` (and the underlying call to the HoodieTableMetadata#getAllPartitionPaths()). if you can think of a better way, please let us know.
   
   Maybe I don't have a good way, but I personally tend to make common a little simpler, so that it does not blend with the engine. Engine can not be tied to client/writing, but can engine become a standalone module? For example, between common and client? `common <- engine <- client`. The operations to be parallelized are in the engine module, I don't know if it is feasible.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] umehrot2 edited a comment on pull request #2410: [HUDI-1510] Move HoodieEngineContext and its dependencies to hudi-common

Posted by GitBox <gi...@apache.org>.
umehrot2 edited a comment on pull request #2410:
URL: https://github.com/apache/hudi/pull/2410#issuecomment-755777542


   @yanghua Thank you for the feedback and bringing up a good discussion. Me and Vinoth had a quick call on this, and we still feel that option A of moving `HoodieEngineContext` over to `hudi-common` would be more reasonable. By moving this to `hudi-common` and having like a `HoodieMREngineContext` or `HoodieQueryEngineContext`, different query engines can potentially be able to use the parallelization optimizations. That is why we would want to move just the interface over to `hudi-common`. The query side engine contexts can sit in `hudi-common`, while the writer side engine contexts will continue to be in `hudi-client`. Please let us know if you have concerns with this.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] umehrot2 merged pull request #2410: [HUDI-1510] Move HoodieEngineContext and its dependencies to hudi-common

Posted by GitBox <gi...@apache.org>.
umehrot2 merged pull request #2410:
URL: https://github.com/apache/hudi/pull/2410


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] vinothchandar commented on pull request #2410: [HUDI-1510] Move HoodieEngineContext and its dependencies to hudi-common

Posted by GitBox <gi...@apache.org>.
vinothchandar commented on pull request #2410:
URL: https://github.com/apache/hudi/pull/2410#issuecomment-755372065


   @yanghua  Did mull this a lot, along similar lines. I was thinking about Engine as a general construct that provides parallel execution, rather than being tied to the client/writing. For e.g we can use parallelized listing even on the InputFormat implementations. 
   
   The core issue is we want to parallelize `FSUtils.getAllPartitionPaths()` (and the underlying call to the HoodieTableMetadata#getAllPartitionPaths()). if you can think of a better way, please let us know. 
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] vinothchandar commented on pull request #2410: [HUDI-1510] Move HoodieEngineContext and its dependencies to hudi-common

Posted by GitBox <gi...@apache.org>.
vinothchandar commented on pull request #2410:
URL: https://github.com/apache/hudi/pull/2410#issuecomment-755937041


   @umehrot2 should we first merge this and do the actual fixes on top as a separarte PR? if so, please feel free to land this


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] codecov-io edited a comment on pull request #2410: [HUDI-1510] Move HoodieEngineContext and its dependencies to hudi-common

Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on pull request #2410:
URL: https://github.com/apache/hudi/pull/2410#issuecomment-755158309


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/2410?src=pr&el=h1) Report
   > Merging [#2410](https://codecov.io/gh/apache/hudi/pull/2410?src=pr&el=desc) (0e81d6e) into [master](https://codecov.io/gh/apache/hudi/commit/698694a1571cdcc9848fc79aa34c8cbbf9662bc4?el=desc) (698694a) will **decrease** coverage by `40.19%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2410/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/2410?src=pr&el=tree)
   
   ```diff
   @@              Coverage Diff              @@
   ##             master    #2410       +/-   ##
   =============================================
   - Coverage     50.23%   10.04%   -40.20%     
   + Complexity     2985       48     -2937     
   =============================================
     Files           410       52      -358     
     Lines         18398     1852    -16546     
     Branches       1884      223     -1661     
   =============================================
   - Hits           9242      186     -9056     
   + Misses         8398     1653     -6745     
   + Partials        758       13      -745     
   ```
   
   | Flag | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | hudicli | `?` | `?` | |
   | hudiclient | `?` | `?` | |
   | hudicommon | `?` | `?` | |
   | hudiflink | `?` | `?` | |
   | hudihadoopmr | `?` | `?` | |
   | hudisparkdatasource | `?` | `?` | |
   | hudisync | `?` | `?` | |
   | huditimelineservice | `?` | `?` | |
   | hudiutilities | `10.04% <ø> (-59.62%)` | `0.00 <ø> (ø)` | |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2410?src=pr&el=tree) | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | [.../apache/hudi/utilities/HoodieSnapshotExporter.java](https://codecov.io/gh/apache/hudi/pull/2410/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0hvb2RpZVNuYXBzaG90RXhwb3J0ZXIuamF2YQ==) | `83.47% <ø> (-5.22%)` | `28.00 <0.00> (ø)` | |
   | [...va/org/apache/hudi/utilities/IdentitySplitter.java](https://codecov.io/gh/apache/hudi/pull/2410/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0lkZW50aXR5U3BsaXR0ZXIuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-2.00%)` | |
   | [...va/org/apache/hudi/utilities/schema/SchemaSet.java](https://codecov.io/gh/apache/hudi/pull/2410/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFTZXQuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-3.00%)` | |
   | [...a/org/apache/hudi/utilities/sources/RowSource.java](https://codecov.io/gh/apache/hudi/pull/2410/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUm93U291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | |
   | [.../org/apache/hudi/utilities/sources/AvroSource.java](https://codecov.io/gh/apache/hudi/pull/2410/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | |
   | [.../org/apache/hudi/utilities/sources/JsonSource.java](https://codecov.io/gh/apache/hudi/pull/2410/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvblNvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | |
   | [...rg/apache/hudi/utilities/sources/CsvDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2410/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQ3N2REZTU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-10.00%)` | |
   | [...g/apache/hudi/utilities/sources/JsonDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2410/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkRGU1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | |
   | [...apache/hudi/utilities/sources/JsonKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2410/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkthZmthU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-6.00%)` | |
   | [...pache/hudi/utilities/sources/ParquetDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2410/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUGFycXVldERGU1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-5.00%)` | |
   | ... and [385 more](https://codecov.io/gh/apache/hudi/pull/2410/diff?src=pr&el=tree-more) | |
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] vinothchandar edited a comment on pull request #2410: [HUDI-1510] Move HoodieEngineContext and its dependencies to hudi-common

Posted by GitBox <gi...@apache.org>.
vinothchandar edited a comment on pull request #2410:
URL: https://github.com/apache/hudi/pull/2410#issuecomment-755529282


   This points to a deeper discussion. Right now, hudi-common is depended upon by `hudi-hadoop-mr` which has all the Hive/InputFormat code. Should that also be redone as a client-module, if saw we want to support Hive writing ultimately? That is a large change though. 
   
   So, we can do either of the two options here for now.
   
   A) move `HoodieEngineContext` to hudi-common; add a `HoodieMREngineContext`
   B) Move FSUtils#getAllPartitionPaths to hudi-client-common and have another parallelized impl. 
   
   I still prefer A. But want to hear what you both think as well. Let's make a quick call, since we want this in soon :) 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] yanghua commented on pull request #2410: [HUDI-1510] Move HoodieEngineContext and its dependencies to hudi-common

Posted by GitBox <gi...@apache.org>.
yanghua commented on pull request #2410:
URL: https://github.com/apache/hudi/pull/2410#issuecomment-755261406


   @vinothchandar I have a concern about this operation. The common module seems not a good place to hold the context of the client engine. It will break the abstraction. The write client engine context should be hosted in the client common package.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] vinothchandar edited a comment on pull request #2410: [HUDI-1510] Move HoodieEngineContext and its dependencies to hudi-common

Posted by GitBox <gi...@apache.org>.
vinothchandar edited a comment on pull request #2410:
URL: https://github.com/apache/hudi/pull/2410#issuecomment-755529282


   This points to a deeper discussion. Right now, hudi-common is depended upon by `hudi-hadoop-mr` which has all the Hive/InputFormat code. Should that also be redone as a client-module, if saw we want to support Hive writing ultimately? That is a large change though. 
   
   So, we can do either of the two options here for now.
   
   A) move `HoodieEngineContext` to hudi-common; add a `HoodieMREngineContext` (very similar to `HoodieJavaEngineContext`)
   B) Move FSUtils#getAllPartitionPaths to hudi-client-common and have another parallelized impl. 
   
   I still prefer A. But want to hear what you both think as well. Let's make a quick call, since we want this in soon :) 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] yanghua commented on pull request #2410: [HUDI-1510] Move HoodieEngineContext and its dependencies to hudi-common

Posted by GitBox <gi...@apache.org>.
yanghua commented on pull request #2410:
URL: https://github.com/apache/hudi/pull/2410#issuecomment-755878176


   I agree with this operation so as not to block work progress. However, I always feel that in terms of project layout, if we treat writing or reading (or call it query) equally, it will be clearer. Currently, we clearly define the "client" as a writing client. On the query side, just use the ability of an external engine. However, if we hope to introduce a native java query client and a native Python query client in the future, how will we consider their location? Also, if you consider introducing file-level APIs. How to state more clearly and intuitively: `common`, `client`, `write`, `query`(`read`), `engine`...?
   
   I always think that introducing query-side things into common is not elegant enough. Although Hudi is just a library, we can also regard it as a "service" in the future. For any service, `Input` and `Output` are clear and equivalent.
   
   Of course, we can discuss these in-depth in the future. For now, let us merge it and complete the work.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] vinothchandar commented on pull request #2410: [HUDI-1510] Move HoodieEngineContext and its dependencies to hudi-common

Posted by GitBox <gi...@apache.org>.
vinothchandar commented on pull request #2410:
URL: https://github.com/apache/hudi/pull/2410#issuecomment-755529282


   This points to a deeper discussion. Right now, hudi-common is depended upon by `hudi-hadoop-mr` which has all the Hive/InputFormat code. Should that also be redone as a client-module, if saw we want to support Hive writing ultimately? That is a large change though. 
   
   So, we can do either of the two options here for now.
   
   A) move `HoodieEngineContext` to hudi-common
   B) Move FSUtils#getAllPartitionPaths to hudi-client-common and have another parallelized impl. 
   
   I still prefer A. But want to hear what you both think as well. Let's make a quick call, since we want this in soon :) 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] codecov-io edited a comment on pull request #2410: [HUDI-1510] Move HoodieEngineContext and its dependencies to hudi-common

Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on pull request #2410:
URL: https://github.com/apache/hudi/pull/2410#issuecomment-755158309


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/2410?src=pr&el=h1) Report
   > Merging [#2410](https://codecov.io/gh/apache/hudi/pull/2410?src=pr&el=desc) (0e81d6e) into [master](https://codecov.io/gh/apache/hudi/commit/698694a1571cdcc9848fc79aa34c8cbbf9662bc4?el=desc) (698694a) will **decrease** coverage by `40.19%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2410/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/2410?src=pr&el=tree)
   
   ```diff
   @@              Coverage Diff              @@
   ##             master    #2410       +/-   ##
   =============================================
   - Coverage     50.23%   10.04%   -40.20%     
   + Complexity     2985       48     -2937     
   =============================================
     Files           410       52      -358     
     Lines         18398     1852    -16546     
     Branches       1884      223     -1661     
   =============================================
   - Hits           9242      186     -9056     
   + Misses         8398     1653     -6745     
   + Partials        758       13      -745     
   ```
   
   | Flag | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | hudicli | `?` | `?` | |
   | hudiclient | `100.00% <ø> (ø)` | `0.00 <ø> (ø)` | |
   | hudicommon | `?` | `?` | |
   | hudiflink | `?` | `?` | |
   | hudihadoopmr | `?` | `?` | |
   | hudisparkdatasource | `?` | `?` | |
   | hudisync | `?` | `?` | |
   | huditimelineservice | `?` | `?` | |
   | hudiutilities | `10.04% <ø> (-59.62%)` | `0.00 <ø> (ø)` | |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2410?src=pr&el=tree) | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | [.../apache/hudi/utilities/HoodieSnapshotExporter.java](https://codecov.io/gh/apache/hudi/pull/2410/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0hvb2RpZVNuYXBzaG90RXhwb3J0ZXIuamF2YQ==) | `83.47% <ø> (-5.22%)` | `28.00 <0.00> (ø)` | |
   | [...va/org/apache/hudi/utilities/IdentitySplitter.java](https://codecov.io/gh/apache/hudi/pull/2410/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0lkZW50aXR5U3BsaXR0ZXIuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-2.00%)` | |
   | [...va/org/apache/hudi/utilities/schema/SchemaSet.java](https://codecov.io/gh/apache/hudi/pull/2410/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFTZXQuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-3.00%)` | |
   | [...a/org/apache/hudi/utilities/sources/RowSource.java](https://codecov.io/gh/apache/hudi/pull/2410/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUm93U291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | |
   | [.../org/apache/hudi/utilities/sources/AvroSource.java](https://codecov.io/gh/apache/hudi/pull/2410/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | |
   | [.../org/apache/hudi/utilities/sources/JsonSource.java](https://codecov.io/gh/apache/hudi/pull/2410/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvblNvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | |
   | [...rg/apache/hudi/utilities/sources/CsvDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2410/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQ3N2REZTU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-10.00%)` | |
   | [...g/apache/hudi/utilities/sources/JsonDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2410/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkRGU1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | |
   | [...apache/hudi/utilities/sources/JsonKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2410/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkthZmthU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-6.00%)` | |
   | [...pache/hudi/utilities/sources/ParquetDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2410/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUGFycXVldERGU1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-5.00%)` | |
   | ... and [385 more](https://codecov.io/gh/apache/hudi/pull/2410/diff?src=pr&el=tree-more) | |
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] vinothchandar commented on pull request #2410: [HUDI-1510] Move HoodieEngineContext and its dependencies to hudi-common

Posted by GitBox <gi...@apache.org>.
vinothchandar commented on pull request #2410:
URL: https://github.com/apache/hudi/pull/2410#issuecomment-755524721


   >Engine can not be tied to client/writing, but can engine become a standalone module? 
   
   Potentially. We can consider this again, when we think about the additional abstractions needed to make more code move from flink-client, java-client etc back into client-common? 
   
   That said, the problem here is that FSUtils is in `hudi-common`. Alternatively we can move `FSUtils` & the `FileSystemBackedTableMetadata` to client-common? Seems like FSUtils not really called in hudi-common at all? 
   
   
   ![image](https://user-images.githubusercontent.com/1179324/103809116-e6170f80-500d-11eb-9086-e00ef434775a.png)
   
   But, `FileSystemBackedTableMetadata` is used by the query side as well and has to be in `hudi-common`. So we can make a new class inside `client-common` that does parallelized `getAllPartitionPaths()`? 
   
   
   wdyt? @umehrot2 
   
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] codecov-io commented on pull request #2410: [HUDI-1510] Move HoodieEngineContext and its dependencies to hudi-common

Posted by GitBox <gi...@apache.org>.
codecov-io commented on pull request #2410:
URL: https://github.com/apache/hudi/pull/2410#issuecomment-755158309


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/2410?src=pr&el=h1) Report
   > Merging [#2410](https://codecov.io/gh/apache/hudi/pull/2410?src=pr&el=desc) (bd531fc) into [master](https://codecov.io/gh/apache/hudi/commit/698694a1571cdcc9848fc79aa34c8cbbf9662bc4?el=desc) (698694a) will **decrease** coverage by `40.19%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2410/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/2410?src=pr&el=tree)
   
   ```diff
   @@              Coverage Diff              @@
   ##             master    #2410       +/-   ##
   =============================================
   - Coverage     50.23%   10.04%   -40.20%     
   + Complexity     2985       48     -2937     
   =============================================
     Files           410       52      -358     
     Lines         18398     1852    -16546     
     Branches       1884      223     -1661     
   =============================================
   - Hits           9242      186     -9056     
   + Misses         8398     1653     -6745     
   + Partials        758       13      -745     
   ```
   
   | Flag | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | hudicli | `?` | `?` | |
   | hudiclient | `?` | `?` | |
   | hudicommon | `?` | `?` | |
   | hudiflink | `?` | `?` | |
   | hudihadoopmr | `?` | `?` | |
   | hudisparkdatasource | `?` | `?` | |
   | hudisync | `?` | `?` | |
   | huditimelineservice | `?` | `?` | |
   | hudiutilities | `10.04% <ø> (-59.62%)` | `0.00 <ø> (ø)` | |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2410?src=pr&el=tree) | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | [.../apache/hudi/utilities/HoodieSnapshotExporter.java](https://codecov.io/gh/apache/hudi/pull/2410/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0hvb2RpZVNuYXBzaG90RXhwb3J0ZXIuamF2YQ==) | `83.47% <ø> (-5.22%)` | `28.00 <0.00> (ø)` | |
   | [...va/org/apache/hudi/utilities/IdentitySplitter.java](https://codecov.io/gh/apache/hudi/pull/2410/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0lkZW50aXR5U3BsaXR0ZXIuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-2.00%)` | |
   | [...va/org/apache/hudi/utilities/schema/SchemaSet.java](https://codecov.io/gh/apache/hudi/pull/2410/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFTZXQuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-3.00%)` | |
   | [...a/org/apache/hudi/utilities/sources/RowSource.java](https://codecov.io/gh/apache/hudi/pull/2410/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUm93U291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | |
   | [.../org/apache/hudi/utilities/sources/AvroSource.java](https://codecov.io/gh/apache/hudi/pull/2410/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | |
   | [.../org/apache/hudi/utilities/sources/JsonSource.java](https://codecov.io/gh/apache/hudi/pull/2410/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvblNvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | |
   | [...rg/apache/hudi/utilities/sources/CsvDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2410/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQ3N2REZTU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-10.00%)` | |
   | [...g/apache/hudi/utilities/sources/JsonDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2410/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkRGU1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | |
   | [...apache/hudi/utilities/sources/JsonKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2410/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkthZmthU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-6.00%)` | |
   | [...pache/hudi/utilities/sources/ParquetDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2410/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUGFycXVldERGU1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-5.00%)` | |
   | ... and [385 more](https://codecov.io/gh/apache/hudi/pull/2410/diff?src=pr&el=tree-more) | |
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] umehrot2 commented on pull request #2410: [HUDI-1510] Move HoodieEngineContext and its dependencies to hudi-common

Posted by GitBox <gi...@apache.org>.
umehrot2 commented on pull request #2410:
URL: https://github.com/apache/hudi/pull/2410#issuecomment-755777542


   @yanghua Thank you for the feedback and bringing up a good discussion. Me and Vinoth had a quick call on this, and we still feel that option A of moving `HoodieEngineContext` over to `hudi-common` would be more reasonable. By moving this to `hudi-common` and having like a `HoodieMREngineContext` or `HoodieQueryEngineContext`, different query engines can potentially be able to use the parallelization optimizations. That is why we would want to move just the interface over to `hudi-common`. The query side engine contexts can sit in `hudi-common`, while the writer side engine contexts will continue to be in `hudi-client`.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org