Posted to commits@hudi.apache.org by "Raymond Xu (Jira)" <ji...@apache.org> on 2023/03/10 01:15:00 UTC

[jira] [Updated] (HUDI-3674) Remove unnecessary HBase-related dependencies from bundles if there is any

     [ https://issues.apache.org/jira/browse/HUDI-3674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Raymond Xu updated HUDI-3674:
-----------------------------
    Component/s: dependencies

> Remove unnecessary HBase-related dependencies from bundles if there is any
> --------------------------------------------------------------------------
>
>                 Key: HUDI-3674
>                 URL: https://issues.apache.org/jira/browse/HUDI-3674
>             Project: Apache Hudi
>          Issue Type: Improvement
>          Components: dependencies
>            Reporter: Ethan Guo
>            Priority: Blocker
>             Fix For: 0.13.1
>
>
> [https://github.com/apache/hudi/pull/5004/files] A follow-up of HUDI-1180. 
> vinothchandar 6 days ago Member
> Is this the absolute minimal set of artifacts needed?
>  
>  alexeykudinkin 6 days ago Contributor
> No need to take this on as part of this PR, but I actually want to suggest going one step further:
> Since we mostly rely on HFile and the classes it depends on, can we try to filter out the packages that can be removed without breaking it?
> My hunch is that we can greatly reduce the 16 MB overhead just by cleaning up all the stuff that is bolted onto HBase.
>  
>  codope 4 days ago Member
> That's a good idea. In fact, I've tried it, but verifying it is a very manual, time-consuming process, and I gave up after a few failures. We also need to keep future upgrades in mind. But I would be very happy to reduce the bundle size in any way we can, and we should take another stab at this idea in the future.
>  
>  yihua 4 days ago Author Member
> Yeah, that's good to have. The problem, as @codope pointed out, is that such a process is time-consuming. For now, what I can say is that the newly added artifacts are necessary: I started with the old pom and incrementally added new artifacts whenever I saw a NoClassDefFoundError, until every test passed.
> One thing we may try later is to add hudi-hbase-shaded, trim it by excluding transitive dependencies, and depend only on hudi-hbase-shaded here.
>  
>  alexeykudinkin 3 days ago Contributor
> Yeah, it's a tedious manual process for sure, but I think we can do it pretty fast: we just look at the packages imported by HFile, then at the files those import, and so on. After that, we can run the tests to check whether we collected everything properly.
> The hypothesis is that this set should be reasonably bounded (why wouldn't it be?), so this iteration should be pretty fast.
> Can you please create a task and link it here to follow up?
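
The "look at what HFile imports, then what those import, and so on" idea in the last comment amounts to a transitive closure over an import graph. Below is a minimal sketch of that walk, not anything taken from the Hudi or HBase code base. The class name ImportClosure, the method reachableFrom, and the sample graph (DepA, DepB, DepC, Unrelated) are all hypothetical, and the sketch assumes the import edges have already been collected by hand or by some other tool.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper: collects everything reachable from a root class via an import graph. */
final class ImportClosure {

    /**
     * Breadth-first walk over a pre-built import graph.
     * Keys are class (or package) names; values are the names they import directly.
     * Names with no entry in the map are treated as having no imports.
     */
    static Set<String> reachableFrom(String root, Map<String, List<String>> imports) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        seen.add(root);
        queue.add(root);
        while (!queue.isEmpty()) {
            String current = queue.remove();
            for (String dep : imports.getOrDefault(current, List.of())) {
                // Set.add returns false for names already visited, which also handles cycles.
                if (seen.add(dep)) {
                    queue.add(dep);
                }
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        // Illustrative graph only; these edges are NOT the real HBase import structure.
        Map<String, List<String>> imports = Map.of(
            "HFile", List.of("DepA", "DepB"),
            "DepA", List.of("DepC", "HFile"),
            "Unrelated", List.of("HFile"));
        System.out.println(reachableFrom("HFile", imports));
    }
}
```

Anything outside the returned set (here, Unrelated) would be a candidate for exclusion from the bundle, subject to the test run mentioned above. Classes loaded reflectively would not show up in an import graph, so the tests remain the real check.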



--
This message was sent by Atlassian Jira
(v8.20.10#820010)