Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2020/10/05 17:41:30 UTC

[GitHub] [hudi] rmpifer opened a new pull request #2147: [HUDI-1289] Remove shading pattern for hbase dependencies in hudi-spark-bundle

rmpifer opened a new pull request #2147:
URL: https://github.com/apache/hudi/pull/2147


   ## What is the purpose of the pull request
   
   The HBase index currently does not work when only hudi-spark-bundle is used, because the bundle relocates the HBase classes.
   
   ## Brief change log
   
   * Update the hudi-spark-bundle pom so that it no longer relocates the hbase and htrace patterns
   * Remove the codec relocation, which was causing an error because codec is not included in the bundle
   
   ## Verify this pull request
   
   * Manually verified the change by running an insert on a table that uses the HBase index, through spark-shell with hudi-spark-bundle
   
   ## Committer checklist
   
    - [ ] Has a corresponding JIRA in PR title & commit
    
    - [ ] Commit message is descriptive of the change
    
    - [ ] CI is green
   
    - [ ] Necessary doc changes done or have another open PR
          
    - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] umehrot2 commented on pull request #2147: [HUDI-1289] Remove shading pattern for hbase dependencies in hudi-spark-bundle

Posted by GitBox <gi...@apache.org>.
umehrot2 commented on pull request #2147:
URL: https://github.com/apache/hudi/pull/2147#issuecomment-707334174


   @rmpifer A couple of points:
   - As @vinothchandar mentioned, it would be worth exploring whether removing only the dependency relocation, while still continuing to shade, avoids the issues with the HBase index without breaking the bootstrap code.
   - If we do go ahead with removing the relocation for HBase, we may also want to remove it in `hudi-hadoop-mr-bundle` and `hudi-presto-bundle` to avoid any other issues this might cause. One such issue we ran into with bootstrap was that HBase writes the KeyValue comparator class name into the HFile footer, and at read time it expects to see exactly the same class. However, this was resolved by creating our own comparator class for HBase: https://github.com/apache/hudi/blob/master/hudi-common/src/main/java/org/apache/hudi/common/bootstrap/index/HFileBootstrapIndex.java#L584
   - Let's fix the commit message. We are not removing shading, but avoiding relocation as part of the shading process.
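
   The difference between shading and relocation can be sketched as a maven-shade-plugin configuration. This is a minimal illustration and not the actual hudi-spark-bundle pom: the included artifacts and the `org.apache.hudi.` shaded prefix are assumptions made for the example.

   ```xml
   <!-- Illustrative sketch only; artifact list and shaded prefix are assumptions. -->
   <plugin>
     <groupId>org.apache.maven.plugins</groupId>
     <artifactId>maven-shade-plugin</artifactId>
     <executions>
       <execution>
         <phase>package</phase>
         <goals>
           <goal>shade</goal>
         </goals>
         <configuration>
           <artifactSet>
             <includes>
               <!-- Shading: these classes are still copied into the bundle jar. -->
               <include>org.apache.hbase:hbase-client</include>
               <include>org.apache.hbase:hbase-common</include>
             </includes>
           </artifactSet>
           <relocations>
             <!-- Relocation: an entry like the one below would rewrite package names.
                  Leaving it out keeps the classes under org.apache.hadoop.hbase.*
             <relocation>
               <pattern>org.apache.hadoop.hbase.</pattern>
               <shadedPattern>org.apache.hudi.org.apache.hadoop.hbase.</shadedPattern>
             </relocation>
             -->
           </relocations>
         </configuration>
       </execution>
     </executions>
   </plugin>
   ```

   With the relocation left out, class names that HBase records or looks up by name (such as the comparator written into the HFile footer) stay the same inside and outside the bundle.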
   





[GitHub] [hudi] vinothchandar merged pull request #2147: [HUDI-1289] Remove shading pattern for hbase dependencies in hudi-spark-bundle

Posted by GitBox <gi...@apache.org>.
vinothchandar merged pull request #2147:
URL: https://github.com/apache/hudi/pull/2147


   





[GitHub] [hudi] bhasudha commented on pull request #2147: [HUDI-1289] Remove shading pattern for hbase dependencies in hudi-spark-bundle

Posted by GitBox <gi...@apache.org>.
bhasudha commented on pull request #2147:
URL: https://github.com/apache/hudi/pull/2147#issuecomment-709660443


   @rmpifer We might need to remove this HBase relocation from hudi-utilities-bundle as well. I ran into this issue when using DeltaStreamer without the HBase index on EMR 5.31.0 with security configs. The job failed as soon as it started, with errors similar to https://github.com/apache/hudi/issues/2100. I was able to quickly verify that after removing this relocation from hudi-utilities-bundle, the job ran fine.





[GitHub] [hudi] vinothchandar commented on pull request #2147: [HUDI-1289] Remove shading pattern for hbase dependencies in hudi-spark-bundle

Posted by GitBox <gi...@apache.org>.
vinothchandar commented on pull request #2147:
URL: https://github.com/apache/hudi/pull/2147#issuecomment-704437378


   @rmpifer as a quick check, is it possible to shade all the deps of HBase while leaving the HBase classes themselves unshaded? Our biggest concern is guava etc. conflicting with what Spark/Presto use.
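
   One way to read this suggestion is to relocate only the third-party packages, such as guava, and leave the HBase packages alone. The fragment below is a hedged sketch of the `<configuration>` section of maven-shade-plugin under that reading. The guava coordinates are real, but whether the bundle would include guava this way, and the `org.apache.hudi.` shaded prefix, are assumptions.

   ```xml
   <!-- Illustrative sketch only; not taken from any Hudi bundle pom. -->
   <configuration>
     <artifactSet>
       <includes>
         <!-- Each dependency to be relocated has to be bundled explicitly. -->
         <include>com.google.guava:guava</include>
         <include>org.apache.hbase:hbase-client</include>
       </includes>
     </artifactSet>
     <relocations>
       <!-- Only guava is moved; there is no relocation for org.apache.hadoop.hbase. -->
       <relocation>
         <pattern>com.google.common.</pattern>
         <shadedPattern>org.apache.hudi.com.google.common.</shadedPattern>
       </relocation>
     </relocations>
   </configuration>
   ```

   Since the shade plugin applies relocations to references in the classes it bundles, the bundled HBase classes would then use the relocated guava copy, while Spark or Presto keep using their own guava version.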





[GitHub] [hudi] vinothchandar commented on pull request #2147: [HUDI-1289] Remove shading pattern for hbase dependencies in hudi-spark-bundle

Posted by GitBox <gi...@apache.org>.
vinothchandar commented on pull request #2147:
URL: https://github.com/apache/hudi/pull/2147#issuecomment-706799665


   @rmpifer if you can confirm the above, we can land this. Otherwise, LGTM.





[GitHub] [hudi] rmpifer commented on pull request #2147: [HUDI-1289] Remove shading pattern for hbase dependencies in hudi-spark-bundle

Posted by GitBox <gi...@apache.org>.
rmpifer commented on pull request #2147:
URL: https://github.com/apache/hudi/pull/2147#issuecomment-708128371


   @vinothchandar Sorry, I've been caught up in some other obligations. We would have to explicitly add the dependencies we want to shade from HBase. If the biggest concern is around guava conflicts, I think we may just want to include this for now.
   
   @umehrot2 I'm OK with updating this in `hudi-hadoop-mr-bundle` and `hudi-presto-bundle` as well, as long as there are no other concerns there.

