Posted to common-issues@hadoop.apache.org by "Allen Wittenauer (JIRA)" <ji...@apache.org> on 2016/02/29 23:58:18 UTC
[jira] [Comment Edited] (HADOOP-12857) Rework hadoop-tools-dist
[ https://issues.apache.org/jira/browse/HADOOP-12857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15172799#comment-15172799 ]
Allen Wittenauer edited comment on HADOOP-12857 at 2/29/16 10:57 PM:
---------------------------------------------------------------------
FWIW, I've got some stupid/simple shell code that takes the output of mvn dependency:list and builds a shell profile script.
Some random notes:
* It currently looks for *ALL* of the depended-upon jars in the tools dir. This is less than efficient for what are hopefully obvious reasons.
* HADOOP-10115 pretty much means that the shell profiles will need to be built well after we've processed the hadoop-tools dir in order to know what is/isn't already bundled via hadoop-common.
So I'm contemplating two approaches in order to make the latter point (building the profiles late) work:
# Try to trigger mvn dependency:list in the build stage for those modules that need it. Push the output through the build process up until hadoop-dist gets triggered. Take that output and generate the profiles then.
# In hadoop-dist, run mvn dependency:list for all (except some blacklisted ones) modules under hadoop-tools (and thus effectively having mvn running mvn), and then generate profiles as in #1.
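For illustration only, the "generate profiles" step that both approaches end with might look roughly like the sketch below. It is not the actual code mentioned above. The helper names deps_to_jars and generate_profile and the TOOLS_LIB_DIR variable are made up for this example, and it assumes the usual groupId:artifactId:type[:classifier]:version:scope lines in the mvn dependency:list output. The emitted profile uses the hadoop_add_profile / _NAME_hadoop_classpath shell profile convention.

```shell
#!/usr/bin/env bash
# Sketch only: turn `mvn dependency:list` output into a shell profile.
# deps_to_jars, generate_profile and TOOLS_LIB_DIR are hypothetical names.

# Read `mvn dependency:list` output on stdin and print one jar file name
# per compile- or runtime-scoped jar dependency, de-duplicated.
deps_to_jars() {
  awk '
    /^\[INFO\] +[^ :]+:[^ :]+:jar:/ {
      n = split($2, f, ":")
      # 5 fields: group:artifact:type:version:scope
      # 6 fields: group:artifact:type:classifier:version:scope
      if (n == 5 && (f[5] == "compile" || f[5] == "runtime")) {
        print f[2] "-" f[4] ".jar"
      } else if (n == 6 && (f[6] == "compile" || f[6] == "runtime")) {
        print f[2] "-" f[5] "-" f[4] ".jar"
      }
    }
  ' | LC_ALL=C sort -u
}

# Read jar names on stdin and print a profile for module $1 that adds
# only the jars actually present in the tools dir $2. Anything not found
# there is skipped (e.g. because it is already bundled elsewhere).
generate_profile() {
  printf 'hadoop_add_profile "%s"\n\n' "$1"
  printf 'function _%s_hadoop_classpath\n{\n' "$1"
  while IFS= read -r jar; do
    if [ -f "$2/${jar}" ]; then
      # TOOLS_LIB_DIR is a placeholder to be resolved at runtime.
      printf '  hadoop_add_classpath "${TOOLS_LIB_DIR}/%s"\n' "${jar}"
    fi
  done
  printf '}\n'
}
```

Usage would be something along the lines of `mvn dependency:list | deps_to_jars | generate_profile distcp /path/to/tools/lib > distcp.sh`, where the module name and paths are placeholders. Checking each jar against the tools dir is the same brute-force lookup called out as inefficient in the notes above; it is kept here only to keep the sketch short.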
To make matters more complicated, I've been informed over the weekend that Bigtop-based distributions stupidly merge all of hadoop-tools into hadoop-common's lib dir. So they'll always have the perf hit and other issues that a flat dir structure causes.
was (Author: aw):
FWIW, I've got some stupid/simple shell code that takes the output of mvn dependency:list and builds a shell profile script.
Some random notes:
* It currently looks for *ALL* of the jars in the tools dir. This is less than efficient for what are hopefully obvious reasons.
* HADOOP-10115 pretty much means that the shell profiles will need to be built well after we've processed the hadoop-tools dir in order to know what is/isn't already bundled via hadoop-common.
So contemplating two approaches in order to make the latter option work:
# Try to trigger mvn dependency:list in the build stage for those modules that need it. Push the output through the build process up until hadoop-dist gets triggered. Take that output and generate the profiles then.
# In hadoop-dist, run mvn dependency:list for all (except some blacklisted ones) modules under hadoop-tools (and thus effectively having mvn running mvn), and then generate profiles as in #1.
To make matters more complicated, I've been informed over the weekend that Big Top based distributions stupidly merge all of hadoop-tools into hadoop-common's lib dir. So they'll always have the perf hit and other issues that having a flat dir structure causes.
> Rework hadoop-tools-dist
> ------------------------
>
> Key: HADOOP-12857
> URL: https://issues.apache.org/jira/browse/HADOOP-12857
> Project: Hadoop Common
> Issue Type: Improvement
> Components: build
> Affects Versions: 3.0.0
> Reporter: Allen Wittenauer
>
> As hadoop-tools grows bigger and bigger, it's becoming evident that having a single directory that gets sucked in is starting to become a big burden as the number of tools grows. Let's rework this to be smarter.