Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2019/09/23 20:10:57 UTC

[GitHub] [incubator-hudi] vinothchandar commented on issue #915: [HUDI-268] Shade and relocate Avro dependency in hadoop-mr-bundle

vinothchandar commented on issue #915: [HUDI-268] Shade and relocate Avro dependency in hadoop-mr-bundle
URL: https://github.com/apache/incubator-hudi/pull/915#issuecomment-534263942
 
 
   >I think Hudi should strive to work with its own versions of parquet/avro irrespective of the consuming application
   I think we differ here. Speaking from experience of trying to do exactly that, we ran into multiple issues with that approach:
   
   - There is always a disparity between what works on a default Parquet table on Hive/Spark and what Hudi tables do.
   - Shading is not always a viable option, especially with Avro and the public interfaces. cc @bvaradar
   
   >>we can claim to always compile Hudi with the version of Spark that is actually writing the dataset.
   With Avro 1.7.7 and Spark 2.1, I think that's where we were. Bundling Avro 1.7.7 was the problem, since we are stuck with that version on higher Spark versions as well.
   
   >>  I can definitely make it configurable like you said.
   For now, I would recommend doing that, and we can document build instructions for different Spark/Hive combinations. We can also maintain and evolve these in the Hudi project itself.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services