Posted to user@tez.apache.org by Nathaniel Braun <n....@criteo.com> on 2016/08/16 11:43:48 UTC

Extra JAR files in the minimal distribution package?

Hello,


I'm building the 0.8.3 version of Tez to test on my cluster.


Looking at the minimal distribution package, I can see the following two JAR files inside:

   - hadoop-mapreduce-client-common

   - hadoop-mapreduce-client-core


Aren't these supposed to be excluded from the build, the same way other Hadoop libraries are?
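
For anyone who wants to repeat this check, here is a minimal Python sketch that lists the Hadoop JARs found inside a tarball. The function name, the "hadoop-" prefix filter and the command-line usage are illustrative assumptions, not part of the Tez build.

```python
import os
import sys
import tarfile


def hadoop_jars_in_tarball(tarball_path, prefix="hadoop-"):
    """Return the sorted file names of .jar members whose name starts with prefix."""
    found = set()
    # tarfile.open auto-detects the compression (plain, gzip, bzip2, ...).
    with tarfile.open(tarball_path) as tar:
        for member in tar.getmembers():
            name = os.path.basename(member.name)
            if member.isfile() and name.startswith(prefix) and name.endswith(".jar"):
                found.add(name)
    return sorted(found)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python list_hadoop_jars.py <path to the minimal tarball>
    for jar in hadoop_jars_in_tarball(sys.argv[1]):
        print(jar)
```

Run against the minimal tarball, this should print the two hadoop-mapreduce-client JARs mentioned above, plus any other JAR whose name starts with "hadoop-".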


Thanks!


Regards,

Nathaniel



Re: Extra JAR files in the minimal distribution package?

Posted by Hitesh Shah <hi...@apache.org>.
Hello Nathaniel, 

You are probably right that they should not be included, as long as the cluster classpath in use contains the MR jars. I believe these jars were retained because, when Tez relies on the classpath from the cluster instead of the full tarball approach, yarn.application.classpath is used to augment the runtime classpath. By default, the yarn.application.classpath config brings in only the common, hdfs and yarn jars, and not necessarily the MR jars.
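
To make the classpath point concrete, here is a minimal Python sketch. The list below is the Linux default for yarn.application.classpath as documented in the Hadoop 2.x yarn-default.xml, and should be treated as an assumption to verify against your own cluster; the helper name and the "mapreduce" directory heuristic are purely illustrative.

```python
# Assumed Linux default of yarn.application.classpath (Hadoop 2.x yarn-default.xml).
# Verify against the yarn-site.xml / yarn-default.xml of your own cluster.
DEFAULT_YARN_APP_CLASSPATH = [
    "$HADOOP_CONF_DIR",
    "$HADOOP_COMMON_HOME/share/hadoop/common/*",
    "$HADOOP_COMMON_HOME/share/hadoop/common/lib/*",
    "$HADOOP_HDFS_HOME/share/hadoop/hdfs/*",
    "$HADOOP_HDFS_HOME/share/hadoop/hdfs/lib/*",
    "$HADOOP_YARN_HOME/share/hadoop/yarn/*",
    "$HADOOP_YARN_HOME/share/hadoop/yarn/lib/*",
]


def covers_mapreduce(entries):
    """Heuristic: True if any classpath entry has a 'mapreduce' path component."""
    return any("mapreduce" in entry.split("/") for entry in entries)


# The assumed default has no mapreduce directory, so the MR jars must come
# from somewhere else (for example, from the Tez tarball itself).
print(covers_mapreduce(DEFAULT_YARN_APP_CLASSPATH))

# A cluster that adds the MR directory to the classpath would be covered.
# $HADOOP_MAPRED_HOME/share/hadoop/mapreduce is the usual layout, assumed here.
extended = DEFAULT_YARN_APP_CLASSPATH + ["$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*"]
print(covers_mapreduce(extended))
```

The first check reports False and the second True, which is the situation described above: the MR jars are only redundant in the minimal tarball if the cluster classpath already supplies them.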
 
@Jon Eagles, @Jason Lowe - do you have any additional comments on how this is deployed at Yahoo? I believe you use a combination of the minimal tarball and the mapreduce/hadoop tarball - in this case, have you removed the MR jars from the minimal tarball? 

thanks
— Hitesh


> On Aug 16, 2016, at 4:43 AM, Nathaniel Braun <n....@criteo.com> wrote:
> 
> Hello,
> 
> I'm building the 0.8.3 version of Tez to test on my cluster.
> 
> Looking at the minimal distribution package, I can see the following two JAR files inside:
>    - hadoop-mapreduce-client-common
>    - hadoop-mapreduce-client-core
> 
> Aren't these supposed to be excluded from the build, the same way other Hadoop libraries are?
> 
> Thanks!
> 
> Regards,
> Nathaniel