Posted to user@tez.apache.org by Sanjay Maurya <sa...@yahoo.com> on 2015/09/01 11:43:13 UTC

how to load jar for Tez DAG

Hi,
I am running a DAG, but at run time a vertex fails because it cannot load the class for my custom processor, which is in my jar.
My question is: how do I load custom classes in the Tez runtime? Is there an API in Tez similar to setJarByClass in org.apache.hadoop.mapreduce.Job?
Below is the exception I am getting:
15/09/01 09:26:05 INFO examples.MyAggregator: DAG diagnostics: [Vertex failed, vertexName=GroupBy, vertexId=vertex_1441084367785_0009_1_00, diagnostics=[Task failed, taskId=task_1441084367785_0009_1_00_000000, diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running task:org.apache.tez.dag.api.TezUncheckedException: Unable to load class: org.apache.tez.examples.MyAggregator$GroupByProcessor
        at org.apache.tez.common.ReflectionUtils.getClazz(ReflectionUtils.java:45)
        at org.apache.tez.common.ReflectionUtils.createClazzInstance(ReflectionUtils.java:96)
        at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.createProcessor(LogicalIOProcessorRuntimeTask.java:597)
        at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.initialize(LogicalIOProcessorRuntimeTask.java:210)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:176)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167)
        at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ClassNotFoundException: org.apache.tez.examples.MyAggregator$GroupByProcessor
        at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:278)
        at org.apache.tez.common.ReflectionUtils.getClazz(ReflectionUtils.java:43)
        ... 15 more
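
The trace shows the failure coming from Class.forName inside ReflectionUtils.getClazz: the task JVM looks the processor class up by name, and its classpath does not contain the jar. A minimal stand-alone sketch of that kind of lookup, using only the JDK, might look like the following. ClassLoadCheck and canLoad are hypothetical names chosen for illustration; they are not part of Tez.

```java
// Hypothetical helper, not part of Tez: mimics the by-name class lookup
// that fails in the stack trace above.
class ClassLoadCheck {

    // Returns true if the named class is visible to the given class loader.
    static boolean canLoad(String className, ClassLoader loader) {
        try {
            // initialize=false: only locate the class, do not run static initializers
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException e) {
            // Same exception type as the "Caused by" line in the trace
            return false;
        }
    }

    public static void main(String[] args) {
        ClassLoader loader = ClassLoadCheck.class.getClassLoader();
        // A JDK class is always visible
        System.out.println(canLoad("java.lang.String", loader));
        // The custom processor is visible only if its jar is on the classpath
        System.out.println(canLoad(
            "org.apache.tez.examples.MyAggregator$GroupByProcessor", loader));
    }
}
```

Run without the custom jar on the classpath, the second check should report false, which is the situation the task container is in until the jar is shipped to it.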

Re: how to load jar for Tez DAG

Posted by Sanjay Maurya <sa...@yahoo.com>.
Thanks Hitesh
-Sanjay 




  

Re: how to load jar for Tez DAG

Posted by Hitesh Shah <hi...@apache.org>.
Or if this jar is always going to be needed for all jobs, you can configure “tez.aux.uris” to point to the location of the jar on HDFS. 

— Hitesh
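
As a rough sketch of the setting above: to my understanding, tez.aux.uris takes a comma-separated list of file or directory URIs, so several jars can be listed in one value. The snippet below only builds that value with the JDK. AuxUris and auxUrisValue are hypothetical names, the second jar path is made up for illustration, and actually applying the value (for example in tez-site.xml or on the TezConfiguration used by the client) is left as a comment.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Tez: builds a value for tez.aux.uris.
class AuxUris {

    // Property name as given in the reply above
    static final String TEZ_AUX_URIS = "tez.aux.uris";

    // Joins jar locations into a single comma-separated property value.
    static String auxUrisValue(List<String> jarUris) {
        return String.join(",", jarUris);
    }

    public static void main(String[] args) {
        String value = auxUrisValue(Arrays.asList(
            "hdfs://nn:port/tmp/customJar.jar",    // jar path from this thread
            "hdfs://nn:port/tmp/otherDeps.jar"));  // assumed second jar, illustration only
        // With Tez on the classpath this would be applied as:
        //   tezConf.set(TEZ_AUX_URIS, value);
        System.out.println(TEZ_AUX_URIS + "=" + value);
    }
}
```

Setting this once in the cluster or client configuration avoids touching each DAG, at the cost of localizing the jar for every job, including jobs that do not need it.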




Re: how to load jar for Tez DAG

Posted by Rajesh Balamohan <rb...@apache.org>.
You can use DAG.addTaskLocalFiles() for this.

Rough example:

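    import java.util.Map;
    import com.google.common.collect.Maps;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.yarn.api.records.LocalResource;
    import org.apache.hadoop.yarn.api.records.LocalResourceType;
    import org.apache.hadoop.yarn.api.records.LocalResourceVisibility;
    import org.apache.hadoop.yarn.util.ConverterUtils;

    // Get the jar details ("nn:port" is a placeholder for the NameNode address)
    Path path = new Path("hdfs://nn:port/tmp/customJar.jar"); // custom jar path
    FileSystem fs = path.getFileSystem(tezConf);
    FileStatus fileStatus = fs.getFileStatus(path);
    long size = fileStatus.getLen();
    long modTime = fileStatus.getModificationTime();

    // Create the local resource; size and modTime must match the file on HDFS
    LocalResource localResource = LocalResource.newInstance(
        ConverterUtils.getYarnUrlFromPath(path),
        LocalResourceType.FILE, LocalResourceVisibility.APPLICATION, size, modTime);

    // Add resources to the DAG; the map key is the name the jar is localized under
    Map<String, LocalResource> additionalResources = Maps.newHashMap();
    additionalResources.put("MyCustom.jar", localResource);
    dag.addTaskLocalFiles(additionalResources);
    System.out.println("Added jars to DAG");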

