Posted to user@kylin.apache.org by Ashika Umanga Umagiliya <um...@gmail.com> on 2016/09/14 05:41:28 UTC

"kylin_sales_cube" sample cube generation fails at Step 3 "Extract Fact Table Distinct Columns"

Greetings,

Our Hadoop cluster runs HDP 2.4.2.
We installed Kylin on a separate edge node (client node).
I managed to create the sample data using the "sample.sh" script.

But when I try to build the cube, it stops at Step 4 (caused by a failed
MR job).


[screenshot]
The MapReduce job in the Hadoop UI looked fine:



[screenshot]
Somehow the MR job stopped in the middle, as shown in the following
screenshot:


[screenshot]

When I check the logs for the MR job, I see ClassNotFound exceptions:

MRJob log:
---------------------------------------------------------

2016-09-14 05:23:05,567 INFO [Socket Reader #1 for port 59775]
SecurityLogger.org.apache.hadoop.ipc.Server: Auth successful for
job_1472454550517_20461 (auth:SIMPLE)
2016-09-14 05:23:05,576 INFO [Socket Reader #1 for port 59775]
SecurityLogger.org.apache.hadoop.security.authorize.ServiceAuthorizationManager:
Authorization successful for job_1472454550517_20461 (auth:TOKEN) for
protocol=interface org.apache.hadoop.mapred.TaskUmbilicalProtocol
2016-09-14 05:23:05,588 INFO [IPC Server handler 2 on 59775]
org.apache.hadoop.mapred.TaskAttemptListenerImpl: JVM with ID :
jvm_1472454550517_20461_m_000002 asked for a task
2016-09-14 05:23:05,589 INFO [IPC Server handler 2 on 59775]
org.apache.hadoop.mapred.TaskAttemptListenerImpl: JVM with ID:
jvm_1472454550517_20461_m_000002 given task:
attempt_1472454550517_20461_m_000000_0
2016-09-14 05:23:06,481 ERROR [IPC Server handler 2 on 59775]
org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task:
attempt_1472454550517_20461_m_000000_0 - exited :
java.lang.ClassNotFoundException: org.apache.thrift.TBase
  at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
  at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
  at java.lang.ClassLoader.defineClass1(Native Method)
  at java.lang.ClassLoader.defineClass(ClassLoader.java:760)
  at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
  at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
  at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
  at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
  at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
  at java.security.AccessController.doPrivileged(Native Method)
  at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
  at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
  at java.lang.Class.getDeclaredFields0(Native Method)
  at java.lang.Class.privateGetDeclaredFields(Class.java:2583)
  at java.lang.Class.getDeclaredField(Class.java:2068)
  at java.io.ObjectStreamClass.getDeclaredSUID(ObjectStreamClass.java:1703)
  at java.io.ObjectStreamClass.access$700(ObjectStreamClass.java:72)
  at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:484)
  at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:472)
  at java.security.AccessController.doPrivileged(Native Method)
  at java.io.ObjectStreamClass.<init>(ObjectStreamClass.java:472)
  at java.io.ObjectStreamClass.lookup(ObjectStreamClass.java:369)
  at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:598)
  at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1623)
  at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1518)
  at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1774)
  at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
  at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2000)
  at java.io.ObjectInputStream.defaultReadObject(ObjectInputStream.java:501)
  at org.apache.hive.hcatalog.mapreduce.InputJobInfo.readObject(InputJobInfo.java:181)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:497)
  at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1058)
  at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1900)
  at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
  at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
  at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371)
  at org.apache.hive.hcatalog.common.HCatUtil.deserialize(HCatUtil.java:118)
  at org.apache.hive.hcatalog.mapreduce.HCatBaseInputFormat.createRecordReader(HCatBaseInputFormat.java:183)
  at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.<init>(MapTask.java:515)
  at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:758)
  at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
  at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
  at java.security.AccessController.doPrivileged(Native Method)
  at javax.security.auth.Subject.doAs(Subject.java:422)
  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
  at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)

2016-09-14 05:23:06,481 INFO [IPC Server handler 2 on 59775]
org.apache.hadoop.mapred.TaskAttemptListenerImpl: Diagnostics report from
attempt_1472454550517_20461_m_000000_0: Error:
java.lang.ClassNotFoundException: org.apache.thrift.TBase
  [stack trace identical to the one above]
----------------------------------------

Any tips?

Re: "kylin_sales_cube" sample cube generation fails at Step 3 "Extract Fact Table Distinct Columns"

Posted by ShaoFeng Shi <sh...@apache.org>.
Hi Ashika,

This "NoClassDefFoundError" occurs in Kylin node (when submitting a MR
job), so the "kylin.job.mr.lib.dir" doesn't have effect. Usually it was
caused by the hive jar wasn't added to classpath when starting Kylin.

As you know, Kylin uses "hbase" shell to startup, and in kylin.sh it will
append the dependency jars to env variable "HBASE_CLASSPATH". Ideally HBase
shell will check and use this variable to start the JVM. While in some
hadoop distributions, the hbase shell will reset that variable, instead of
append. To verify this issue, you can simply do a test like:

export HBASE_CLASSPATH=ABC
hbase classpath

..../etc/tez/conf/:/usr/hdp/2.2.4.2-2/tez/*:/usr/hdp/2.2.4.2-2/tez/lib/*:/etc/tez/conf:/usr/hdp/2.2.4.2-2/hadoop/conf:/usr/hdp/2.2.4.2-2/hadoop/*:/usr/hdp/2.2.4.2-2/hadoop/lib/*:/usr/hdp/2.2.4.2-2/zookeeper/*:/usr/hdp/2.2.4.2-2/zookeeper/lib/*:
ABC


If you see "ABC" at the end of the output, the hbase shell script behaves
as expected; otherwise you need to check the script (e.g., in HDP 2.2 it is
/usr/hdp/2.2.4.2-2/hbase/bin/hbase) and fix it so that it appends to the
variable instead of overwriting it:

export HBASE_CLASSPATH=$HADOOP_CONF:$HADOOP_HOME/*:$HADOOP_HOME/lib/*:$ZOOKEEPER_HOME/*:$ZOOKEEPER_HOME/lib/*:$HBASE_CLASSPATH
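
A scripted version of the same check (a sketch; any marker string works in
place of "ABC"):

# Export a marker, then see whether "hbase classpath" still carries it;
# if the marker is gone, the hbase script overwrites HBASE_CLASSPATH.
export HBASE_CLASSPATH=ABC
if hbase classpath | grep -q 'ABC'; then
    echo "hbase shell appends HBASE_CLASSPATH (OK)"
else
    echo "hbase shell drops HBASE_CLASSPATH (needs the fix above)"
fi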


2016-09-26 10:11 GMT+08:00 Ashika Umanga Umagiliya <um...@gmail.com>:

> [quoted message trimmed; Ashika's reply of 2016-09-26 10:11 appears in
> full as the next post in this thread]



-- 
Best regards,

Shaofeng Shi 史少锋

Re: "kylin_sales_cube" sample cube generation fails at Step 3 "Extract Fact Table Distinct Columns"

Posted by Ashika Umanga Umagiliya <um...@gmail.com>.
Thanks for the prompt reply,
I managed to fix the path issue.

Our Kylin node has a different installation folder structure compared to
the nodes in our Hadoop cluster,
so I set "kylin.job.mr.lib.dir" to point to a local folder on the Kylin
node.
This folder contains the following list of JARs.
But as you can see, I still get the java.lang.NoClassDefFoundError:
org/apache/hive/hcatalog/mapreduce/HCatInputFormat error (even though the
hive-hcatalog-core JAR is there).
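
For reference, this is the setting (a sketch; KYLIN_HOME is assumed to
point at the Kylin install on this node):

# show the property that points Kylin at the local jar folder
grep kylin.job.mr.lib.dir $KYLIN_HOME/conf/kylin.properties
# kylin.job.mr.lib.dir=/home/kylin/mr-libs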


JARS in "kylin.job.mr.lib.dir" folder :
----------------------------------------------

file:/home/kylin/mr-libs/hive-common-1.2.1000.2.4.2.0-258.jar,
file:/home/kylin/mr-libs/libthrift-0.9.2.jar,
file:/home/kylin/mr-libs/hive-jdbc-1.2.1000.2.4.2.0-258-standalone.jar,
file:/home/kylin/mr-libs/hive-hcatalog-pig-adapter-1.2.1000.2.4.2.0-258.jar,
file:/home/kylin/mr-libs/hive-hcatalog-streaming-1.2.1000.2.4.2.0-258.jar,
file:/home/kylin/mr-libs/hive-hwi-1.2.1000.2.4.2.0-258.jar,
file:/home/kylin/mr-libs/hive-shims-0.20S-1.2.1000.2.4.2.0-258.jar,
file:/home/kylin/mr-libs/hadoop-common-2.7.1.2.4.2.0-258.jar,
file:/home/kylin/mr-libs/hive-hcatalog-server-extensions-1.2.1000.2.4.2.0-258.jar,
file:/home/kylin/mr-libs/hadoop-common-2.7.1.2.4.2.0-258-tests.jar,
file:/home/kylin/mr-libs/hive-contrib-1.2.1000.2.4.2.0-258.jar,
file:/home/kylin/mr-libs/hive-jdbc-1.2.1000.2.4.2.0-258.jar,
file:/home/kylin/mr-libs/hive-service-1.2.1000.2.4.2.0-258.jar,
file:/home/kylin/mr-libs/hive-shims-0.23-1.2.1000.2.4.2.0-258.jar,
file:/home/kylin/mr-libs/hive-metastore-1.2.1000.2.4.2.0-258.jar,
file:/home/kylin/mr-libs/hive-hbase-handler-1.2.1000.2.4.2.0-258.jar,
file:/home/kylin/mr-libs/hive-hcatalog-core-1.2.1000.2.4.2.0-258.jar,
file:/home/kylin/mr-libs/hive-shims-scheduler-1.2.1000.2.4.2.0-258.jar,
file:/home/kylin/mr-libs/hive-ant-1.2.1000.2.4.2.0-258.jar,
file:/home/kylin/mr-libs/hadoop-aws-2.7.1.2.4.2.0-258.jar,
file:/home/kylin/mr-libs/hive-shims-1.2.1000.2.4.2.0-258.jar,
file:/home/kylin/mr-libs/hive-accumulo-handler-1.2.1000.2.4.2.0-258.jar,
file:/home/kylin/mr-libs/hive-exec-1.2.1000.2.4.2.0-258.jar,
file:/home/kylin/mr-libs/hive-beeline-1.2.1000.2.4.2.0-258.jar,
file:/home/kylin/mr-libs/hive-cli-1.2.1000.2.4.2.0-258.jar,
file:/home/kylin/mr-libs/hive-serde-1.2.1000.2.4.2.0-258.jar,
file:/home/kylin/mr-libs/hive-shims-common-1.2.1000.2.4.2.0-258.jar,
file:/home/kylin/mr-libs/hadoop-azure-2.7.1.2.4.2.0-258.jar,
file:/home/kylin/mr-libs/hadoop-nfs-2.7.1.2.4.2.0-258.jar,
file:/home/kylin/hdp_c5000/hive/hcatalog/share/hcatalog/hive-hcatalog-core-1.2.1000.2.4.2.0-258.jar,
file:/home/kylin/hdp_c5000/spark-2.10-1.6.1.2.4.2.0-258/lib/spark-examples-1.6.1.2.4.2.0-258-hadoop2.7.1.2.4.2.0-258.jar
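
To double-check that the missing class really is inside the
hive-hcatalog-core jar (a sketch; unzip -l just lists the archive entries):

unzip -l /home/kylin/mr-libs/hive-hcatalog-core-1.2.1000.2.4.2.0-258.jar | grep HCatInputFormat
# expected entry: org/apache/hive/hcatalog/mapreduce/HCatInputFormat.class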



---------------------------------
2016-09-26 02:03:36,824 ERROR [pool-8-thread-10]
threadpool.DefaultScheduler:140 : ExecuteException
job:6b1ffb55-6e8e-41f5-8940-da930527e3cd
org.apache.kylin.job.exception.ExecuteException: org.apache.kylin.job.exception.ExecuteException: java.lang.NoClassDefFoundError: org/apache/hive/hcatalog/mapreduce/HCatInputFormat
at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:123)
at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:136)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.kylin.job.exception.ExecuteException: java.lang.NoClassDefFoundError: org/apache/hive/hcatalog/mapreduce/HCatInputFormat
at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:123)
at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:57)
at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:113)
... 4 more
Caused by: java.lang.NoClassDefFoundError: org/apache/hive/hcatalog/mapreduce/HCatInputFormat
at org.apache.kylin.source.hive.HiveMRInput$HiveTableInputFormat.configureJob(HiveMRInput.java:89)
at org.apache.kylin.engine.mr.steps.FactDistinctColumnsJob.setupMapper(FactDistinctColumnsJob.java:123)
at org.apache.kylin.engine.mr.steps.FactDistinctColumnsJob.run(FactDistinctColumnsJob.java:103)
at org.apache.kylin.engine.mr.MRUtil.runMRJob(MRUtil.java:88)
at org.apache.kylin.engine.mr.common.MapReduceExecutable.doWork(MapReduceExecutable.java:120)
at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:113)
... 6 more
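
(For completeness: since Kylin itself starts through the hbase shell, a
quick way to see whether any hcatalog jar is visible to the Kylin server
JVM at all would be a sketch like this:)

hbase classpath | tr ':' '\n' | grep -i hcatalog
# empty output means no hcatalog jar is on the classpath Kylin starts with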


On Mon, Sep 26, 2016 at 10:36 AM, ShaoFeng Shi <sh...@apache.org>
wrote:

> [quoted message trimmed; ShaoFeng's reply of Sep 26 10:36 appears in full
> as the next post in this thread]


-- 
Umanga
http://jp.linkedin.com/in/umanga
http://umanga.ifreepages.com

Re: "kylin_sales_cube" sample cube generation fails at Step 3 "Extract Fact Table Distinct Columns"

Posted by ShaoFeng Shi <sh...@apache.org>.
The Kafka lib is only needed when building a cube from Kafka.

It seems an empty path is being passed in, which then causes this error.
Unfortunately there is no debug log for this. Please run the
bin/find-*-dependency.sh scripts and check their outputs to see whether
any path is invalid.
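
For example, something along these lines would surface an empty or missing
entry (a sketch only; it assumes the scripts export hive_dependency and
hbase_dependency, as the Kylin bin scripts of this vintage do, and that it
runs from KYLIN_HOME):

# Each find-*-dependency.sh computes a colon-separated list of jars/dirs;
# an empty segment in such a list is exactly what later becomes
# "Can not create a Path from an empty string".
source bin/find-hive-dependency.sh
source bin/find-hbase-dependency.sh
for var in hive_dependency hbase_dependency; do
    echo "--- $var ---"
    echo "${!var}" | tr ':' '\n' | while read -r p; do
        if [ -z "$p" ]; then
            echo "(EMPTY ENTRY)"
        elif [ ! -e "${p%\*}" ]; then   # strip a trailing * before testing existence
            echo "(MISSING)  $p"
        else
            echo "(ok)       $p"
        fi
    done
done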

2016-09-26 8:32 GMT+08:00 Ashika Umanga Umagiliya <um...@gmail.com>:

> Thanks,
> Setting the property seems to have solved the missing JAR issue.
> But now I am getting a new error in the same MR job:
> Any tips please?
> Also, we don't have Kafka libraries installed on the Kylin node. Does
> Kylin need Kafka libraries as well?
> I see in the log lines that it picks up the wrong JAR file for the Kafka
> libs.
>
>
> ------------------
>
>
> 2016-09-26 00:26:36,103 INFO  [pool-8-thread-3]
> common.AbstractHadoopJob:240 : No Kafka dependency jars set in the
> environment, will find them from jvm:
> 2016-09-26 00:26:36,110 INFO  [pool-8-thread-3]
> common.AbstractHadoopJob:246 : kafka jar file:
> /home/atscale/hdp_c5000/spark-2.10-1.6.1.2.4.2.0-258/lib/spark-examples-1.6.1.2.4.2.0-258-hadoop2.7.1.2.4.2.0-258.jar
> 2016-09-26 00:26:36,111 ERROR [pool-8-thread-3]
> steps.FactDistinctColumnsJob:111 : error in FactDistinctColumnsJob
> java.lang.IllegalArgumentException: Can not create a Path from an empty
> string
> at org.apache.hadoop.fs.Path.checkPathArg(Path.java:126)
> at org.apache.hadoop.fs.Path.<init>(Path.java:134)
> at org.apache.kylin.engine.mr.common.AbstractHadoopJob.setJobTmpJarsAndFiles(AbstractHadoopJob.java:309)
> at org.apache.kylin.engine.mr.common.AbstractHadoopJob.setJobClasspath(AbstractHadoopJob.java:266)
> at org.apache.kylin.engine.mr.steps.FactDistinctColumnsJob.run(FactDistinctColumnsJob.java:88)
> at org.apache.kylin.engine.mr.MRUtil.runMRJob(MRUtil.java:88)
> at org.apache.kylin.engine.mr.common.MapReduceExecutable.doWork(MapReduceExecutable.java:120)
> at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:113)
> at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:57)
> at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:113)
> at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:136)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> 2016-09-26 00:26:36,114 INFO  [pool-8-thread-3]
> common.AbstractHadoopJob:511 : tempMetaFileString is : null
> 2016-09-26 00:26:36,115 ERROR [pool-8-thread-3]
> common.MapReduceExecutable:127 : error execute
> MapReduceExecutable{id=dd0f11bc-d20f-47b3-a2b7-9a44393ed22a-02,
> name=Extract Fact Table Distinct Columns, state=RUNNING}
> java.lang.IllegalArgumentException: Can not create a Path from an empty
> string
> [stack trace identical to the one above]
> 2016-09-26 00:26:36,117 DEBUG [pool-8-thread-3]
> hbase.HBaseResourceStore:262 : Update row /execute_output/dd0f11bc-d20f-47b3-a2b7-9a44393ed22a-02
> from oldTs: 1474849595977, to newTs: 1474849596115, operation r
>
> On Mon, Sep 19, 2016 at 9:24 AM, Li Yang <li...@apache.org> wrote:
>
>> So the thrift jars are not on the MR classpath. There are many ways to
>> fix this. On the Kylin side, there is a property called
>> "kylin.job.mr.lib.dir" that you can use.
>>
>> See more here: https://issues.apache.org/jira/browse/KYLIN-1021
>>
>> Cheers
>> Yang
>>
>>
>> On Wed, Sep 14, 2016 at 1:41 PM, Ashika Umanga Umagiliya <
>> umanga.pdn@gmail.com> wrote:
>>
>>> [original message quoted in full; see the top of this thread]
>>>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAcce
>>> ssorImpl.java:62)
>>>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMe
>>> thodAccessorImpl.java:43)
>>>   at java.lang.reflect.Method.invoke(Method.java:497)
>>>   at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass
>>> .java:1058)
>>>   at java.io.ObjectInputStream.readSerialData(ObjectInputStream.j
>>> ava:1900)
>>>   at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStre
>>> am.java:1801)
>>>   at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
>>>   at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371)
>>>   at org.apache.hive.hcatalog.common.HCatUtil.deserialize(HCatUti
>>> l.java:118)
>>>   at org.apache.hive.hcatalog.mapreduce.HCatBaseInputFormat.creat
>>> eRecordReader(HCatBaseInputFormat.java:183)
>>>   at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.<in
>>> it>(MapTask.java:515)
>>>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:758)
>>>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>>>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
>>>   at java.security.AccessController.doPrivileged(Native Method)
>>>   at javax.security.auth.Subject.doAs(Subject.java:422)
>>>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGro
>>> upInformation.java:1709)
>>>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
>>>
>>> 2016-09-14 05:23:06,481 INFO [IPC Server handler 2 on 59775]
>>> org.apache.hadoop.mapred.TaskAttemptListenerImpl: Diagnostics report
>>> from attempt_1472454550517_20461_m_000000_0: Error:
>>> java.lang.ClassNotFoundException: org.apache.thrift.TBase
>>>   at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>>>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>>>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
>>>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>>>   at java.lang.ClassLoader.defineClass1(Native Method)
>>>   at java.lang.ClassLoader.defineClass(ClassLoader.java:760)
>>>   at java.security.SecureClassLoader.defineClass(SecureClassLoade
>>> r.java:142)
>>>   at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
>>>   at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
>>>   at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
>>>   at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
>>>   at java.security.AccessController.doPrivileged(Native Method)
>>>   at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
>>>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>>>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
>>>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>>>   at java.lang.Class.getDeclaredFields0(Native Method)
>>>   at java.lang.Class.privateGetDeclaredFields(Class.java:2583)
>>>   at java.lang.Class.getDeclaredField(Class.java:2068)
>>>   at java.io.ObjectStreamClass.getDeclaredSUID(ObjectStreamClass.
>>> java:1703)
>>>   at java.io.ObjectStreamClass.access$700(ObjectStreamClass.java:72)
>>>   at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:484)
>>>   at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:472)
>>>   at java.security.AccessController.doPrivileged(Native Method)
>>>   at java.io.ObjectStreamClass.<init>(ObjectStreamClass.java:472)
>>>   at java.io.ObjectStreamClass.lookup(ObjectStreamClass.java:369)
>>>   at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:598)
>>>   at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream
>>> .java:1623)
>>>   at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.ja
>>> va:1518)
>>>   at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStre
>>> am.java:1774)
>>>   at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
>>>   at java.io.ObjectInputStream.defaultReadFields(ObjectInputStrea
>>> m.java:2000)
>>>   at java.io.ObjectInputStream.defaultReadObject(ObjectInputStrea
>>> m.java:501)
>>>   at org.apache.hive.hcatalog.mapreduce.InputJobInfo.readObject(I
>>> nputJobInfo.java:181)
>>>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAcce
>>> ssorImpl.java:62)
>>>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMe
>>> thodAccessorImpl.java:43)
>>>   at java.lang.reflect.Method.invoke(Method.java:497)
>>>   at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass
>>> .java:1058)
>>>   at java.io.ObjectInputStream.readSerialData(ObjectInputStream.j
>>> ava:1900)
>>>   at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStre
>>> am.java:1801)
>>>   at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
>>>   at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371)
>>>   at org.apache.hive.hcatalog.common.HCatUtil.deserialize(HCatUti
>>> l.java:118)
>>>   at org.apache.hive.hcatalog.mapreduce.HCatBaseInputFormat.creat
>>> eRecordReader(HCatBaseInputFormat.java:183)
>>>   at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.<in
>>> it>(MapTask.java:515)
>>>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:758)
>>>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>>>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
>>>   at java.security.AccessController.doPrivileged(Native Method)
>>>   at javax.security.auth.Subject.doAs(Subject.java:422)
>>>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGro
>>> upInformation.java:1709)
>>>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
>>> ----------------------------------------
>>>
>>> Any tips ?
>>>
>>
>>
>
>
> --
> Umanga
> http://jp.linkedin.com/in/umanga
> http://umanga.ifreepages.com
>



-- 
Best regards,

Shaofeng Shi 史少锋

Re: "kylin_sales_cube" sample cube generation fails at Step 3 "Extract Fact Table Distinct Columns"

Posted by Ashika Umanga Umagiliya <um...@gmail.com>.
Thanks,

Setting that property seems to have solved the missing JAR issue.
But now I am getting a new error in the same MR job. Any tips, please?
Also, we don't have Kafka libraries installed on the Kylin node. Does
Kylin need the Kafka libraries as well? In the log lines below, it picks
up the wrong JAR file for the Kafka libs.
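
For reference, the property went into conf/kylin.properties roughly like
this (a sketch: the directory is only an example and should point at a
folder on the Kylin node that actually holds the missing jars, e.g. the
Hive client lib directory):

# conf/kylin.properties -- extra jars to ship with Kylin-submitted MR jobs
# (example path; adjust to your cluster's layout)
kylin.job.mr.lib.dir=/usr/hdp/current/hive-client/lib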


------------------


2016-09-26 00:26:36,103 INFO  [pool-8-thread-3] common.AbstractHadoopJob:240 : No Kafka dependency jars set in the environment, will find them from jvm:
2016-09-26 00:26:36,110 INFO  [pool-8-thread-3] common.AbstractHadoopJob:246 : kafka jar file: /home/atscale/hdp_c5000/spark-2.10-1.6.1.2.4.2.0-258/lib/spark-examples-1.6.1.2.4.2.0-258-hadoop2.7.1.2.4.2.0-258.jar
2016-09-26 00:26:36,111 ERROR [pool-8-thread-3] steps.FactDistinctColumnsJob:111 : error in FactDistinctColumnsJob
java.lang.IllegalArgumentException: Can not create a Path from an empty string
at org.apache.hadoop.fs.Path.checkPathArg(Path.java:126)
at org.apache.hadoop.fs.Path.<init>(Path.java:134)
at org.apache.kylin.engine.mr.common.AbstractHadoopJob.setJobTmpJarsAndFiles(AbstractHadoopJob.java:309)
at org.apache.kylin.engine.mr.common.AbstractHadoopJob.setJobClasspath(AbstractHadoopJob.java:266)
at org.apache.kylin.engine.mr.steps.FactDistinctColumnsJob.run(FactDistinctColumnsJob.java:88)
at org.apache.kylin.engine.mr.MRUtil.runMRJob(MRUtil.java:88)
at org.apache.kylin.engine.mr.common.MapReduceExecutable.doWork(MapReduceExecutable.java:120)
at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:113)
at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:57)
at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:113)
at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:136)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
2016-09-26 00:26:36,114 INFO  [pool-8-thread-3] common.AbstractHadoopJob:511 : tempMetaFileString is : null
2016-09-26 00:26:36,115 ERROR [pool-8-thread-3] common.MapReduceExecutable:127 : error execute MapReduceExecutable{id=dd0f11bc-d20f-47b3-a2b7-9a44393ed22a-02, name=Extract Fact Table Distinct Columns, state=RUNNING}
java.lang.IllegalArgumentException: Can not create a Path from an empty string
at org.apache.hadoop.fs.Path.checkPathArg(Path.java:126)
at org.apache.hadoop.fs.Path.<init>(Path.java:134)
at org.apache.kylin.engine.mr.common.AbstractHadoopJob.setJobTmpJarsAndFiles(AbstractHadoopJob.java:309)
at org.apache.kylin.engine.mr.common.AbstractHadoopJob.setJobClasspath(AbstractHadoopJob.java:266)
at org.apache.kylin.engine.mr.steps.FactDistinctColumnsJob.run(FactDistinctColumnsJob.java:88)
at org.apache.kylin.engine.mr.MRUtil.runMRJob(MRUtil.java:88)
at org.apache.kylin.engine.mr.common.MapReduceExecutable.doWork(MapReduceExecutable.java:120)
at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:113)
at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:57)
at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:113)
at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:136)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
2016-09-26 00:26:36,117 DEBUG [pool-8-thread-3] hbase.HBaseResourceStore:262 : Update row /execute_output/dd0f11bc-d20f-47b3-a2b7-9a44393ed22a-02 from oldTs: 1474849595977, to newTs: 1474849596115, operation r
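
Looking at the stack, setJobTmpJarsAndFiles() builds a Hadoop Path from
each entry of a comma-separated dependency list, and one of those entries
is apparently an empty string. This is how I checked on the Kylin node (a
sketch; only the property named in this thread is assumed):

# show the configured extra-lib setting; a trailing or doubled comma in
# this value, or in the auto-detected dependency string, leaves an empty
# segment and triggers "Can not create a Path from an empty string"
grep '^kylin.job.mr.lib.dir' conf/kylin.properties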

On Mon, Sep 19, 2016 at 9:24 AM, Li Yang <li...@apache.org> wrote:

> So the thrift jars are not on the MR classpath. There are many ways to
> fix this. From the Kylin side, there is a property called
> "kylin.job.mr.lib.dir" that you can use.
>
> See more here: https://issues.apache.org/jira/browse/KYLIN-1021
>
> Cheers
> Yang
>


-- 
Umanga
http://jp.linkedin.com/in/umanga
http://umanga.ifreepages.com

Re: "kylin_sales_cube" sample cube generation fails at Step 3 "Extract Fact Table Distinct Columns"

Posted by Li Yang <li...@apache.org>.
So the thrift jars are not on the MR classpath. There are many ways to fix
this. From the Kylin side, there is a property called "kylin.job.mr.lib.dir"
that you can use.

See more here: https://issues.apache.org/jira/browse/KYLIN-1021
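
To check whether a thrift jar is already visible to Hadoop on a node, a
quick look like this can help (a sketch using standard Hadoop tooling;
note that the classpath of an MR task can still differ from the client
classpath):

# print each classpath entry on its own line and look for libthrift
hadoop classpath | tr ':' '\n' | grep -i thrift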

Cheers
Yang

