Posted to dev@kylin.apache.org by yu feng <ol...@gmail.com> on 2015/09/07 16:18:16 UTC

Cannot execute MapReduce job due to missing jars

After submitting a MapReduce job, we poll its status, but it reports that the
job failed. When we check the application on the RM web UI
(xxx:8088/cluster/apps), we find this log:

 Application application_1418904565842_3597024 failed 2 times due to AM Container for appattempt_1418904565842_3597024_000002 exited with exitCode: 1 due to: Exception from container-launch:
org.apache.hadoop.util.Shell$ExitCodeException:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
at org.apache.hadoop.util.Shell.run(Shell.java:379)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.launchContainer(LinuxContainerExecutor.java:252)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:283)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:79)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
at java.lang.Thread.run(Thread.java:662)
main : command provided 1

We check the YARN logs with this command: yarn logs -applicationId
application_1418904565842_3597024, and get this output:
Container: container_1418904565842_3597024_01_000001 on hadoop88.photo.163.org_56708
======================================================================================
LogType: stderr
LogLength: 664
Log Contents:
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/mapreduce/v2/app/MRAppMaster
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.mapreduce.v2.app.MRAppMaster
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
Could not find the main class: org.apache.hadoop.mapreduce.v2.app.MRAppMaster.  Program will exit.


I think this means the task cannot find the jar files, so I uploaded all the
jars that Kylin depends on to HDFS, and before submitting the job (in the
AbstractHadoopJob.attachKylinPropsAndMetadata function) I set "tmpjars" to
those files on HDFS (this avoids uploading all the files every time a
MapReduce job is submitted).
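For reference, a minimal sketch of that workaround is below; the HDFS paths
are placeholders and this is not the actual Kylin code, just an illustration
of setting "tmpjars" before submission:

        import org.apache.hadoop.conf.Configuration;
        import org.apache.hadoop.mapreduce.Job;

        public class TmpJarsWorkaround {
            // Point "tmpjars" at dependency jars already uploaded to HDFS, so
            // the client does not re-upload them on every job submission.
            public static void attachDependencyJars(Job job) {
                Configuration conf = job.getConfiguration();
                // placeholder paths
                String hdfsJars = "hdfs:///tmp/kylin/deps/kylin-job.jar,"
                                + "hdfs:///tmp/kylin/deps/some-dependency.jar";
                String existing = conf.get("tmpjars");
                conf.set("tmpjars", existing == null ? hdfsJars : existing + "," + hdfsJars);
            }
        }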

This workaround works in kylin-0.7.2; I get the same error in kylin-1.0, and I
guess the workaround would work there too, but I do not think it is a good
solution.

Any better ideas or suggestions would be highly appreciated. Thanks.

Re: Cannot execute MapReduce job due to missing jars

Posted by yu feng <ol...@gmail.com>.
I was just wondering how Kylin gets the other jars that the mappers and
reducers depend on (such as json/apache-commons) onto the classpath. A moment
ago I noticed the difference between kylin-job-xxx.jar in $KYLIN_HOME/lib and
kylin-job-xxx.jar in tomcat/webapps/kylin/WEB-INF/lib/; the former jar file is
much bigger than the latter one:

nrpt@classa-nrpt1:~/kylin-1.0-incubating$ ls -lah
 lib/kylin-job-1.0-incubating.jar
-rw-r--r-- 1 nrpt netease 9.6M Sep  7 20:54 lib/kylin-job-1.0-incubating.jar
nrpt@classa-nrpt1:~/kylin-1.0-incubating$ ls -lah
tomcat/webapps/kylin/WEB-INF/lib/kylin-job-1.0-incubating.jar
-rw-r--r-- 1 nrpt netease 325K Sep  8  2015
tomcat/webapps/kylin/WEB-INF/lib/kylin-job-1.0-incubating.jar

Opening the first one, I found that it is a combination of all the dependent
jar files, which resolves my doubt.
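(In case anyone else wants to double-check, a small throwaway snippet like the
one below lists a jar's entries and confirms whether the lib/ artifact really
bundles its dependencies; the class name here is just something I made up.)

        import java.io.IOException;
        import java.util.jar.JarFile;

        public class ListJarEntries {
            public static void main(String[] args) throws IOException {
                // e.g. args[0] = "lib/kylin-job-1.0-incubating.jar"
                try (JarFile jar = new JarFile(args[0])) {
                    System.out.println("total entries: " + jar.size());
                    jar.stream()
                       .limit(20)   // print only a sample of the contents
                       .forEach(e -> System.out.println(e.getName()));
                }
            }
        }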


2015-09-08 10:16 GMT+08:00 yu feng <ol...@gmail.com>:


Re: Cannot execute MapReduce job due to missing jars

Posted by yu feng <ol...@gmail.com>.
Added JIRA: https://issues.apache.org/jira/browse/KYLIN-1021

2015-09-08 13:53 GMT+08:00 Shi, Shaofeng <sh...@ebay.com>:


Re: Cannot execute MapReduce job due to missing jars

Posted by "Shi, Shaofeng" <sh...@ebay.com>.
Hi Feng, thanks for pointing out that the hive jar is repeated in the job
configuration; I just created a JIRA to fix it:
https://issues.apache.org/jira/browse/KYLIN-1015

Regarding your question about the jar files being located on local disk
instead of HDFS: yes, the hadoop/hive/hbase jars should exist on local disk on
every machine of the hadoop cluster, at the same locations. Kylin will not
upload those jars, so please check and ensure the consistency of your hadoop
cluster.
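As a quick, unofficial check, a tiny snippet like the one below, run on a node
with the Hadoop jars on the classpath, prints which local jar actually
provides the missing class:

        public class WhichJar {
            public static void main(String[] args) throws ClassNotFoundException {
                // Prints the jar (or directory) the class was loaded from, e.g.
                // .../share/hadoop/mapreduce/hadoop-mapreduce-client-app-<version>.jar
                Class<?> c = Class.forName("org.apache.hadoop.mapreduce.v2.app.MRAppMaster");
                System.out.println(c.getProtectionDomain().getCodeSource().getLocation());
            }
        }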

On 9/8/15, 10:16 AM, "yu feng" <ol...@gmail.com> wrote:



Re: Cannot execute MapReduce job due to missing jars

Posted by yu feng <ol...@gmail.com>.
I find log entries like this:
[pool-5-thread-1]:[2015-09-07 20:58:41,746][INFO][org.apache.kylin.job.hadoop.AbstractHadoopJob.setJobClasspath(AbstractHadoopJob.java:166)] - Hadoop job classpath is:
/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/etc/hadoop:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/common/lib/*:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/common/*:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/hdfs/lib/*:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/hdfs/*:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/yarn/lib/*:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/yarn/*:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/mapreduce/lib/*:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/mapreduce/*:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/contrib/capacity-scheduler/*.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/mapreduce/hadoop-mapreduce-client-app-2.2.0.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/mapreduce/hadoop-mapreduce-client-common-2.2.0.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/mapreduce/hadoop-mapreduce-client-core-2.2.0.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/mapreduce/hadoop-mapreduce-client-hs-2.2.0.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/mapreduce/hadoop-mapreduce-client-hs-plugins-2.2.0.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.2.0.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.2.0-tests.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/mapreduce/hadoop-mapreduce-client-shuffle-2.2.0.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/yarn/hadoop-yarn-api-2.2.0.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.2.0.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/yarn/hadoop-yarn-applications-unmanaged-am-launcher-2.2.0.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/yarn/hadoop-yarn-client-2.2.0.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/yarn/hadoop-yarn-common-2.2.0.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/yarn/hadoop-yarn-server-common-2.2.0.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/yarn/hadoop-yarn-server-nodemanager-2.2.0.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/yarn/hadoop-yarn-server-resourcemanager-2.2.0.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/yarn/hadoop-yarn-server-tests-2.2.0.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/yarn/hadoop-yarn-server-web-proxy-2.2.0.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/yarn/hadoop-yarn-site-2.2.0.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/mapreduce/lib/aopalliance-1.0.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/mapreduce/lib/asm-3.2.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/mapreduce/lib/avro-1.7.4.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/mapreduce/lib/commons-compress-1.4.1.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/mapreduce/lib/commons-io-2.1.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/mapreduce/lib/guice-3.0.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/mapreduce/lib/guice-servlet-3.0.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/mapreduce/lib/hadoop-annotations-2.2.0.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/mapreduce/lib/hadoop-lzo-0.4.20.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/mapreduce/li
b/hamcrest-core-1.1.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/mapreduce/lib/jackson-core-asl-1.8.8.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/mapreduce/lib/jackson-mapper-asl-1.8.8.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/mapreduce/lib/javax.inject-1.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/mapreduce/lib/jersey-core-1.9.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/mapreduce/lib/jersey-guice-1.9.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/mapreduce/lib/jersey-server-1.9.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/mapreduce/lib/junit-4.10.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/mapreduce/lib/log4j-1.2.17.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/mapreduce/lib/netty-3.6.2.Final.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/mapreduce/lib/paranamer-2.3.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/mapreduce/lib/protobuf-java-2.5.0.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/mapreduce/lib/snappy-java-1.0.4.1.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/share/hadoop/mapreduce/lib/xz-1.0.jar:/home/nrpt/hadoop_hive_hbase/hadoop-2.2.0/modules/*.jar
.....................


There are two lines in this log entry. The first is the default Hadoop
classpath output by the 'mapred classpath' command, and it ends with a newline
character '\n'; I think that character needs to be stripped.
What's more, I checked the source code that sets the job classpath:

        if (kylinHBaseDependency != null) {
            // yarn classpath is comma separated
            kylinHBaseDependency = kylinHBaseDependency.replace(":", ",");
            classpath = classpath + "," + kylinHBaseDependency;
        }

        if (kylinHiveDependency != null) {
            // yarn classpath is comma separated
            kylinHiveDependency = kylinHiveDependency.replace(":", ",");
            classpath = classpath + "," + kylinHiveDependency;
        }

        jobConf.set(MAP_REDUCE_CLASSPATH, classpath + "," + kylinHiveDependency);
        logger.info("Hadoop job classpath is: " + job.getConfiguration().get(MAP_REDUCE_CLASSPATH));

It looks like we append kylinHiveDependency to the classpath twice; I do not
know whether that is intentional.
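Just to illustrate what I mean (this is only a sketch in the same style as the
snippet above, not a patch), I would expect something like:

        // classpath starts as the output of 'mapred classpath' (or the value of
        // mapreduce.application.classpath); strip the trailing newline first.
        classpath = classpath.trim();

        if (kylinHBaseDependency != null) {
            // yarn classpath is comma separated
            classpath = classpath + "," + kylinHBaseDependency.replace(":", ",");
        }

        if (kylinHiveDependency != null) {
            // yarn classpath is comma separated
            classpath = classpath + "," + kylinHiveDependency.replace(":", ",");
        }

        // set the assembled classpath exactly once
        jobConf.set(MAP_REDUCE_CLASSPATH, classpath);
        logger.info("Hadoop job classpath is: " + job.getConfiguration().get(MAP_REDUCE_CLASSPATH));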

Lastly, the MAP_REDUCE_CLASSPATH property (mapreduce.application.classpath) is
set to jar files located on the local filesystem. I actually do not know
whether those files get uploaded to HDFS the way the files added via the
'tmpjars' property are.

2015-09-08 9:35 GMT+08:00 Shi, Shaofeng <sh...@ebay.com>:


Re: Cannot execute MapReduce job due to missing jars

Posted by "Shi, Shaofeng" <sh...@ebay.com>.
Hi Feng, your MapReduce classpath might not be correctly configured. Kylin
reads "mapreduce.application.classpath" from the default job configuration; if
it is not found, Kylin runs the "mapred classpath" command to get the
classpath and then appends the hive/hbase dependencies. Please check kylin.log
to see whether the final classpath includes the jar for this missing class.

The message in kylin.log looks like the line below; you can search for it:

Hadoop job classpath is:
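Roughly, the lookup works like this (a simplified sketch with imports and
error handling omitted, not the exact Kylin code):

        String classpath = jobConf.get("mapreduce.application.classpath");
        if (classpath == null || classpath.trim().isEmpty()) {
            // fall back to the output of the "mapred classpath" command
            Process p = Runtime.getRuntime().exec(new String[] { "mapred", "classpath" });
            BufferedReader reader =
                new BufferedReader(new InputStreamReader(p.getInputStream()));
            classpath = reader.readLine();   // one ':'-separated line
        }
        // the hive/hbase dependencies are then appended (comma separated for
        // yarn) and the result is written back to mapreduce.application.classpath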


On 9/7/15, 10:18 PM, "yu feng" <ol...@gmail.com> wrote:
