Posted to user@tez.apache.org by Chris K Wensel <ch...@wensel.net> on 2014/09/11 21:09:24 UTC
noob local resource question
I'm setting what used to be called a hadoop job jar as a local resource, with APPLICATION visibility, of type PATTERN, with the pattern "(?:classes/|lib/).*" (right from the JobConf)
the good news is that when a remote tez client starts, the job jar is downloaded and unpacked using the pattern
proof:
find /tmp/hadoop-root/nm-local-dir/usercache/cwensel/appcache/application_1410456214816_0001/ | grep logparser.jar
/tmp/hadoop-root/nm-local-dir/usercache/cwensel/appcache/application_1410456214816_0001/filecache/14/.tmp_logparser.jar.crc
/tmp/hadoop-root/nm-local-dir/usercache/cwensel/appcache/application_1410456214816_0001/filecache/14/logparser.jar
/tmp/hadoop-root/nm-local-dir/usercache/cwensel/appcache/application_1410456214816_0001/filecache/14/logparser.jar/lib
/tmp/hadoop-root/nm-local-dir/usercache/cwensel/appcache/application_1410456214816_0001/filecache/14/logparser.jar/lib/jgraphx-2.0.0.1.jar
/tmp/hadoop-root/nm-local-dir/usercache/cwensel/appcache/application_1410456214816_0001/filecache/14/logparser.jar/lib/jgrapht-ext-0.9.0.jar
/tmp/hadoop-root/nm-local-dir/usercache/cwensel/appcache/application_1410456214816_0001/filecache/14/logparser.jar/lib/jgraph-5.13.0.0.jar
/tmp/hadoop-root/nm-local-dir/usercache/cwensel/appcache/application_1410456214816_0001/filecache/14/logparser.jar/lib/cascading-xml-3.0.0-wip-dev.jar
/tmp/hadoop-root/nm-local-dir/usercache/cwensel/appcache/application_1410456214816_0001/filecache/14/logparser.jar/lib/tagsoup-1.2.jar
/tmp/hadoop-root/nm-local-dir/usercache/cwensel/appcache/application_1410456214816_0001/filecache/14/logparser.jar/lib/riffle-0.1-dev.jar
/tmp/hadoop-root/nm-local-dir/usercache/cwensel/appcache/application_1410456214816_0001/filecache/14/logparser.jar/lib/jgrapht-core-0.9.0.jar
/tmp/hadoop-root/nm-local-dir/usercache/cwensel/appcache/application_1410456214816_0001/filecache/14/logparser.jar/lib/cascading-hadoop2-tez-3.0.0-wip-dev.jar
/tmp/hadoop-root/nm-local-dir/usercache/cwensel/appcache/application_1410456214816_0001/filecache/14/logparser.jar/lib/janino-2.6.1.jar
/tmp/hadoop-root/nm-local-dir/usercache/cwensel/appcache/application_1410456214816_0001/filecache/14/logparser.jar/lib/cascading-core-3.0.0-wip-dev.jar
/tmp/hadoop-root/nm-local-dir/usercache/cwensel/appcache/application_1410456214816_0001/filecache/14/logparser.jar/lib/commons-compiler-2.6.1.jar
/tmp/hadoop-root/nm-local-dir/usercache/cwensel/appcache/application_1410456214816_0001/filecache/14/logparser.jar/logparser.jar
the bad news is that the 'launch_container.sh' is only adding
ln -sf "/tmp/hadoop-root/nm-local-dir/usercache/cwensel/appcache/application_1410456214816_0001/filecache/14/logparser.jar" "logparser.jar"
so the containers only see
/tmp/hadoop-root/nm-local-dir/usercache/cwensel/appcache/application_1410456214816_0001/container_1410456214816_0001_01_000004/logparser.jar
which isn't terribly helpful, as it results in
2014-09-11 10:57:41,001 INFO [TezChild] org.apache.tez.runtime.task.TezTaskRunner: Encounted an error while executing task: attempt_1410456214816_0001_1_00_000000_2
org.apache.tez.dag.api.TezUncheckedException: Unable to load class: cascading.flow.tez.FlowProcessor
at org.apache.tez.common.ReflectionUtils.getClazz(ReflectionUtils.java:45)
at org.apache.tez.common.ReflectionUtils.createClazzInstance(ReflectionUtils.java:96)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.createProcessor(LogicalIOProcessorRuntimeTask.java:563)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.initialize(LogicalIOProcessorRuntimeTask.java:187)
i'm obviously missing something. or is it that classic job jars (with a lib folder) aren't really supported transparently anymore (as an option)? that will cause some grief.
this is hadoop 2.4.1
ckw
--
Chris K Wensel
chris@concurrentinc.com
http://concurrentinc.com
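For anyone following along, here is a small sketch of how a PATTERN-type resource might select which jar entries get unpacked. It assumes the regex has to match the whole entry name, which is what java.util.regex's matches() does. UnpackPatternDemo and the sample entry names (other than lib/tagsoup-1.2.jar from the listing above) are made up for illustration; this is not YARN code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical demo class; it only mimics the selection step, not the unpacking. */
final class UnpackPatternDemo {
    private UnpackPatternDemo() {}

    /** Returns the entry names that the pattern matches in full. */
    static List<String> selected(String regex, List<String> entryNames) {
        Pattern pattern = Pattern.compile(regex);
        List<String> kept = new ArrayList<>();
        for (String name : entryNames) {
            // Assumption: the whole entry name must match, not just a prefix.
            if (pattern.matcher(name).matches()) {
                kept.add(name);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> entries = Arrays.asList(
            "lib/tagsoup-1.2.jar",   // from the listing above
            "classes/Foo.class",     // hypothetical entry
            "META-INF/MANIFEST.MF",  // hypothetical entry
            "cascading/Main.class"); // hypothetical entry
        System.out.println(selected("(?:classes/|lib/).*", entries));
    }
}
```

Under that assumption only entries below classes/ and lib/ are kept, which is consistent with the find output showing just the lib/ jars next to the copy of the jar itself.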
RE: noob local resource question
Posted by Bikas Saha <bi...@hortonworks.com>.
We should probably update the javadocs to clarify that, depending on the
local resource configuration, YARN will unpack archives etc., but it will
not do anything to the classpath because it does not know the
semantics/structure of the archive. To get that behaviour, the user
needs to add files (based on the structure of the archive) to the
classpath using the setTaskEnvironment() API.
If you have a generic helper that does the classpath addition based on
archive structure, then we could consider adding it as a helper method in
TezUtils.
Bikas
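A rough sketch of the kind of generic helper mentioned above, for a job jar laid out like the one in the original listing. Everything named here is hypothetical: JobJarClasspath and its methods are not part of Tez or YARN. It also assumes that "$PWD" is expanded by the container launch script, that ":" is the right separator for the cluster's OS, and that the copy of the jar inside the unpacked directory has the same name as the link (as with logparser.jar/logparser.jar). The resulting map is meant to be handed to the setTaskEnvironment() API; how that value is merged with the container's existing CLASSPATH is not covered here and should be checked against the Tez version in use.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper; not part of Tez or YARN. */
final class JobJarClasspath {
    private JobJarClasspath() {}

    /** Classpath entries for a job jar unpacked under the given link name. */
    static List<String> entries(String linkName) {
        // Assumption: the launch script expands $PWD to the container directory.
        String root = "$PWD/" + linkName;
        List<String> result = new ArrayList<>();
        // Assumption: the jar copy inside the unpacked directory shares the link name.
        result.add(root + "/" + linkName);
        // May not exist for every jar; the JVM ignores missing classpath entries.
        result.add(root + "/classes/");
        // JVM wildcard: picks up every .jar file directly under lib/.
        result.add(root + "/lib/*");
        return result;
    }

    /** Joins the entries with ':' (assumes a Unix-like node). */
    static String classpath(String linkName) {
        return String.join(":", entries(linkName));
    }

    /** Environment map holding only the CLASSPATH entry built above. */
    static Map<String, String> taskEnvironment(String linkName) {
        Map<String, String> env = new LinkedHashMap<>();
        env.put("CLASSPATH", classpath(linkName));
        return env;
    }

    public static void main(String[] args) {
        System.out.println(taskEnvironment("logparser.jar"));
    }
}
```

The entries are kept in a separate method so a caller can prepend or append its own paths before joining them.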
Re: noob local resource question
Posted by Chris K Wensel <ch...@wensel.net>.
Thanks.
some of the confusion comes from DAG offering up commonTaskLocalFiles (which supports extraction patterns and magically makes the resources classpath-aware, though now obviously not the extracted bits) but not a 'commonTaskEnvironment', so some naive leaps were made.
ckw
Re: noob local resource question
Posted by Hitesh Shah <hi...@apache.org>.
Hi Chris,
Unlike MR and its support of distributed cache, Tez does not make any inferences about the structure of the LocalResources specified (i.e., the structure of a tarball, jar, etc.) and therefore expects the user to modify the classpath as needed.
It might be worth considering as a new feature (please file a jira), but the current implementation expects the user to set up the classpath as needed to handle tarballs, fat jars, etc. correctly.
- Hitesh