Posted to dev@kylin.apache.org by ShaoFeng Shi <sh...@apache.org> on 2017/06/19 16:10:40 UTC
Re: Build sample error with spark on kylin 2.0.0
Are you running Kylin on Windows? If yes, check:
https://stackoverflow.com/questions/33211599/hadoop-error-on-windows-java-lang-unsatisfiedlinkerror
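For context, an UnsatisfiedLinkError on a `native` method means the JVM could not bind it to a loaded native library (on Windows this is typically a missing or mismatched hadoop.dll). A minimal, self-contained sketch of the failure mode follows; the class and method names here are invented for illustration and are not Hadoop's:

```java
// Minimal illustration of the error class seen above: a method declared
// `native` whose implementing library was never loaded fails only when it
// is first invoked, just as NativeCrc32.nativeComputeChunkedSumsByteArray
// does when libhadoop/hadoop.dll is absent.
public class NativeLinkDemo {
    // Hypothetical native method; we deliberately never call System.loadLibrary.
    private static native void computeChecksum();

    static String demo() {
        try {
            computeChecksum();
            return "linked";
        } catch (UnsatisfiedLinkError e) {
            // e.getMessage() names the unresolved symbol
            return "UnsatisfiedLinkError: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Note that class loading and compilation both succeed; only the call site fails, which is why the job above gets as far as copying files to HDFS before dying.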
2017-06-19 21:55 GMT+08:00 skyyws <sk...@163.com>:
> Hi all,
> I met an error when using spark engine build kylin sample on step "Build
> Cube with Spark", here is the exception log:
> -----------------------------------------------------------------------------------------
> Exception in thread "main" java.lang.UnsatisfiedLinkError: org.apache.hadoop.util.NativeCrc32.nativeComputeChunkedSumsByteArray(II[BI[BIILjava/lang/String;JZ)V
> at org.apache.hadoop.util.NativeCrc32.nativeComputeChunkedSumsByteArray(Native Method)
> at org.apache.hadoop.util.NativeCrc32.calculateChunkedSumsByteArray(NativeCrc32.java:86)
> at org.apache.hadoop.util.DataChecksum.calculateChunkedSums(DataChecksum.java:430)
> at org.apache.hadoop.fs.FSOutputSummer.writeChecksumChunks(FSOutputSummer.java:202)
> at org.apache.hadoop.fs.FSOutputSummer.write1(FSOutputSummer.java:124)
> at org.apache.hadoop.fs.FSOutputSummer.write(FSOutputSummer.java:110)
> at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:58)
> at java.io.DataOutputStream.write(DataOutputStream.java:107)
> at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:80)
> at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:52)
> at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:112)
> at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:366)
> at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:338)
> at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:289)
> at org.apache.spark.deploy.yarn.Client.copyFileToRemote(Client.scala:317)
> at org.apache.spark.deploy.yarn.Client.org$apache$spark$deploy$yarn$Client$$distribute$1(Client.scala:407)
> at org.apache.spark.deploy.yarn.Client$$anonfun$prepareLocalResources$5.apply(Client.scala:446)
> at org.apache.spark.deploy.yarn.Client$$anonfun$prepareLocalResources$5.apply(Client.scala:444)
> at scala.collection.immutable.List.foreach(List.scala:318)
> at org.apache.spark.deploy.yarn.Client.prepareLocalResources(Client.scala:444)
> at org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:727)
> at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:142)
> at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:57)
> at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:144)
> at org.apache.spark.SparkContext.<init>(SparkContext.scala:530)
> at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:59)
> at org.apache.kylin.engine.spark.SparkCubingByLayer.execute(SparkCubingByLayer.java:150)
> at org.apache.kylin.common.util.AbstractApplication.execute(AbstractApplication.java:37)
> at org.apache.kylin.common.util.SparkEntry.main(SparkEntry.java:44)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
> at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
> at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
> at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
> at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> 17/06/19 21:22:06 INFO storage.DiskBlockManager: Shutdown hook called
> 17/06/19 21:22:06 INFO util.ShutdownHookManager: Shutdown hook called
> 17/06/19 21:22:06 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-0d1d3709-86cd-446c-b728-5070f168de28
> 17/06/19 21:22:06 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-0d1d3709-86cd-446c-b728-5070f168de28/httpd-9bcb9a5d-569f-4f28-ad89-038a9020eda8
> 17/06/19 21:22:06 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-0d1d3709-86cd-446c-b728-5070f168de28/userFiles-2e9ff265-3d37-40e0-8894-6fd4d1a3ad8b
>
> at org.apache.kylin.common.util.CliCommandExecutor.execute(CliCommandExecutor.java:92)
> at org.apache.kylin.engine.spark.SparkExecutable.doWork(SparkExecutable.java:124)
> at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:124)
> at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:64)
> at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:124)
> at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:142)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
>
> -----------------------------------------------------------------------------------------
> I can use Kylin's built-in spark-shell to run some operations, like:
> -----------------------------------------------------------------------------------------
> var textFile = sc.textFile("hdfs://xxxx/xxxx/README.md")
> textFile.count()
> textFile.first()
> textFile.filter(line => line.contains("hello")).count()
> -----------------------------------------------------------------------------------------
> Here is the env info:
> kylin version is 2.0.0
> hadoop version is 2.7.*
> spark version is 1.6.*
> -----------------------------------------------------------------------------------------
> Can anyone help me? Thanks.
>
>
> 2017-06-19
> skyyws
--
Best regards,
Shaofeng Shi 史少锋
Re: Re: Re: Build sample error with spark on kylin 2.0.0
Posted by skyyws <sk...@163.com>.
Another question: when Kylin executes the code below (in SparkBatchCubingJobBuilder2.addLayerCubingSteps()):
------------------------------------------------
StringUtil.appendWithSeparator(jars, findJar("org.htrace.HTraceConfiguration", null)); // htrace-core.jar
StringUtil.appendWithSeparator(jars, findJar("org.apache.htrace.Trace", null)); // htrace-core.jar
StringUtil.appendWithSeparator(jars, findJar("org.cloudera.htrace.HTraceConfiguration", null)); // htrace-core.jar
------------------------------------------------
It found three related jars:
------------------------------------------------
/xxx/kylin-deploy/kylin-2.0.0/lib/kylin-job-2.0.0.jar
/xxx/hadoop-2.7.3/share/hadoop/common/lib/htrace-core-3.1.0-incubating.jar
/xxx/hbase-0.98.8-hadoop2/lib/htrace-core-2.04.jar
------------------------------------------------
And these three jars were added to the "--jars" parameter when executing spark-submit in the "Build Cube with Spark" step. Is that correct?
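For reference, a findJar-style lookup can be sketched as below. This is an assumed reconstruction of the idea (resolve a class, then report the jar or directory it was loaded from), not Kylin's actual implementation. Finding several htrace jars is expected: Hadoop and HBase each ship their own copy, and the three lookups search for different package names (org.htrace, org.apache.htrace, org.cloudera.htrace), so each can resolve to a different jar.

```java
public class FindJarDemo {
    // Hedged sketch (not Kylin's exact code): given a fully-qualified class
    // name, return the path of the jar/classes directory that provides it,
    // or null when the class is not on the current classpath.
    static String findJar(String className) {
        try {
            Class<?> clazz = Class.forName(className);
            java.security.CodeSource src = clazz.getProtectionDomain().getCodeSource();
            return src == null ? null : src.getLocation().getPath(); // null for bootstrap classes
        } catch (ClassNotFoundException e) {
            return null; // missing package variants are simply skipped
        }
    }

    public static void main(String[] args) {
        // Resolves to wherever this demo class itself was loaded from.
        System.out.println(findJar("FindJarDemo"));
        // A variant that is not present yields null rather than an error.
        System.out.println(findJar("org.cloudera.htrace.HTraceConfiguration"));
    }
}
```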
2017-06-22
skyyws
From: "skyyws"<sk...@163.com>
Sent: 2017-06-22 17:21
Subject: Re: Re: Re: Build sample error with spark on kylin 2.0.0
To: "dev"<de...@kylin.apache.org>
Cc:
Hi ShaoFeng, sorry to trouble you again. I fixed the "NoClassDefFoundError" (I just replaced the soft links with copies of these jars), but there is another problem:
--------------------------------------------------------------------------------------------------------------
17/06/22 17:04:40 INFO TaskSetManager: Starting task 1.3 in stage 0.0 (TID 7, hadoop759.lt.163.org, partition 1,RACK_LOCAL, 3275 bytes)
17/06/22 17:05:14 WARN TaskSetManager: Lost task 0.3 in stage 0.0 (TID 6, hadoop759.lt.163.org): java.lang.IllegalArgumentException: Failed to find metadata store by url: kylin_metadata_2_0_0@hbase
at org.apache.kylin.common.persistence.ResourceStore.createResourceStore(ResourceStore.java:99)
at org.apache.kylin.common.persistence.ResourceStore.getStore(ResourceStore.java:110)
at org.apache.kylin.cube.CubeDescManager.getStore(CubeDescManager.java:370)
at org.apache.kylin.cube.CubeDescManager.reloadAllCubeDesc(CubeDescManager.java:298)
at org.apache.kylin.cube.CubeDescManager.<init>(CubeDescManager.java:109)
at org.apache.kylin.cube.CubeDescManager.getInstance(CubeDescManager.java:81)
at org.apache.kylin.cube.CubeInstance.getDescriptor(CubeInstance.java:114)
at org.apache.kylin.cube.CubeSegment.getCubeDesc(CubeSegment.java:119)
at org.apache.kylin.cube.CubeSegment.isEnableSharding(CubeSegment.java:477)
at org.apache.kylin.cube.kv.RowKeyEncoder.<init>(RowKeyEncoder.java:48)
at org.apache.kylin.cube.kv.AbstractRowKeyEncoder.createInstance(AbstractRowKeyEncoder.java:48)
at org.apache.kylin.engine.spark.SparkCubingByLayer$2.call(SparkCubingByLayer.java:205)
at org.apache.kylin.engine.spark.SparkCubingByLayer$2.call(SparkCubingByLayer.java:193)
at org.apache.spark.api.java.JavaPairRDD$$anonfun$pairFunToScalaFun$1.apply(JavaPairRDD.scala:1018)
at org.apache.spark.api.java.JavaPairRDD$$anonfun$pairFunToScalaFun$1.apply(JavaPairRDD.scala:1018)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:191)
at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:64)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:227)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
17/06/22 17:05:14 ERROR TaskSetManager: Task 0 in stage 0.0 failed 4 times; aborting job
17/06/22 17:05:14 INFO YarnScheduler: Cancelling stage 0
17/06/22 17:05:14 INFO YarnScheduler: Stage 0 was cancelled
17/06/22 17:05:14 INFO DAGScheduler: ShuffleMapStage 0 (mapToPair at SparkCubingByLayer.java:193) failed in 265.116 s
17/06/22 17:05:14 INFO DAGScheduler: Job 0 failed: saveAsNewAPIHadoopFile at SparkCubingByLayer.java:288, took 265.278439 s
Exception in thread "main" java.lang.RuntimeException: error execute org.apache.kylin.engine.spark.SparkCubingByLayer
at org.apache.kylin.common.util.AbstractApplication.execute(AbstractApplication.java:42)
at org.apache.kylin.common.util.SparkEntry.main(SparkEntry.java:44)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 6, hadoop759.lt.163.org): java.lang.IllegalArgumentException: Failed to find metadata store by url: kylin_metadata_2_0_0@hbase
at org.apache.kylin.common.persistence.ResourceStore.createResourceStore(ResourceStore.java:99)
at org.apache.kylin.common.persistence.ResourceStore.getStore(ResourceStore.java:110)
at org.apache.kylin.cube.CubeDescManager.getStore(CubeDescManager.java:370)
at org.apache.kylin.cube.CubeDescManager.reloadAllCubeDesc(CubeDescManager.java:298)
at org.apache.kylin.cube.CubeDescManager.<init>(CubeDescManager.java:109)
at org.apache.kylin.cube.CubeDescManager.getInstance(CubeDescManager.java:81)
at org.apache.kylin.cube.CubeInstance.getDescriptor(CubeInstance.java:114)
at org.apache.kylin.cube.CubeSegment.getCubeDesc(CubeSegment.java:119)
at org.apache.kylin.cube.CubeSegment.isEnableSharding(CubeSegment.java:477)
at org.apache.kylin.cube.kv.RowKeyEncoder.<init>(RowKeyEncoder.java:48)
at org.apache.kylin.cube.kv.AbstractRowKeyEncoder.createInstance(AbstractRowKeyEncoder.java:48)
at org.apache.kylin.engine.spark.SparkCubingByLayer$2.call(SparkCubingByLayer.java:205)
at org.apache.kylin.engine.spark.SparkCubingByLayer$2.call(SparkCubingByLayer.java:193)
at org.apache.spark.api.java.JavaPairRDD$$anonfun$pairFunToScalaFun$1.apply(JavaPairRDD.scala:1018)
at org.apache.spark.api.java.JavaPairRDD$$anonfun$pairFunToScalaFun$1.apply(JavaPairRDD.scala:1018)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:191)
at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:64)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:227)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Driver stacktrace:
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1431)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1419)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1418)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1418)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:799)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:799)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:799)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1640)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1599)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1588)
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:620)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1832)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1845)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1922)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1.apply$mcV$sp(PairRDDFunctions.scala:1144)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1.apply(PairRDDFunctions.scala:1074)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1.apply(PairRDDFunctions.scala:1074)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:111)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:316)
at org.apache.spark.rdd.PairRDDFunctions.saveAsNewAPIHadoopDataset(PairRDDFunctions.scala:1074)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopFile$2.apply$mcV$sp(PairRDDFunctions.scala:994)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopFile$2.apply(PairRDDFunctions.scala:985)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopFile$2.apply(PairRDDFunctions.scala:985)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:111)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:316)
at org.apache.spark.rdd.PairRDDFunctions.saveAsNewAPIHadoopFile(PairRDDFunctions.scala:985)
at org.apache.spark.api.java.JavaPairRDD.saveAsNewAPIHadoopFile(JavaPairRDD.scala:800)
at org.apache.kylin.engine.spark.SparkCubingByLayer.saveToHDFS(SparkCubingByLayer.java:288)
at org.apache.kylin.engine.spark.SparkCubingByLayer.execute(SparkCubingByLayer.java:257)
at org.apache.kylin.common.util.AbstractApplication.execute(AbstractApplication.java:37)
... 10 more
Caused by: java.lang.IllegalArgumentException: Failed to find metadata store by url: kylin_metadata_2_0_0@hbase
at org.apache.kylin.common.persistence.ResourceStore.createResourceStore(ResourceStore.java:99)
at org.apache.kylin.common.persistence.ResourceStore.getStore(ResourceStore.java:110)
at org.apache.kylin.cube.CubeDescManager.getStore(CubeDescManager.java:370)
at org.apache.kylin.cube.CubeDescManager.reloadAllCubeDesc(CubeDescManager.java:298)
at org.apache.kylin.cube.CubeDescManager.<init>(CubeDescManager.java:109)
at org.apache.kylin.cube.CubeDescManager.getInstance(CubeDescManager.java:81)
at org.apache.kylin.cube.CubeInstance.getDescriptor(CubeInstance.java:114)
at org.apache.kylin.cube.CubeSegment.getCubeDesc(CubeSegment.java:119)
at org.apache.kylin.cube.CubeSegment.isEnableSharding(CubeSegment.java:477)
at org.apache.kylin.cube.kv.RowKeyEncoder.<init>(RowKeyEncoder.java:48)
at org.apache.kylin.cube.kv.AbstractRowKeyEncoder.createInstance(AbstractRowKeyEncoder.java:48)
at org.apache.kylin.engine.spark.SparkCubingByLayer$2.call(SparkCubingByLayer.java:205)
at org.apache.kylin.engine.spark.SparkCubingByLayer$2.call(SparkCubingByLayer.java:193)
at org.apache.spark.api.java.JavaPairRDD$$anonfun$pairFunToScalaFun$1.apply(JavaPairRDD.scala:1018)
at org.apache.spark.api.java.JavaPairRDD$$anonfun$pairFunToScalaFun$1.apply(JavaPairRDD.scala:1018)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:191)
at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:64)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:227)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
-------------------------------------------------------------------------------------------------------------------------------------------------
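The trace above shows the failure happening on executors: each task tries to open a ResourceStore from the metadata url and gives up. As a hedged reconstruction of that failure pattern (not Kylin's actual ResourceStore code), the lookup can be pictured as trying each registered store type in order, logging "Create new store instance failed" for each one, and throwing only when all of them fail:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

public class StoreLookupDemo {
    // Illustrative only: try each candidate store; swallow and log each
    // constructor failure; throw IllegalArgumentException when none works,
    // matching the "Failed to find metadata store by url" message in the log.
    static String createResourceStore(String url, Map<String, UnaryOperator<String>> stores) {
        for (Map.Entry<String, UnaryOperator<String>> entry : stores.entrySet()) {
            try {
                return entry.getKey() + " store opened for " + entry.getValue().apply(url);
            } catch (RuntimeException e) {
                System.out.println("Create new store instance failed: " + e.getMessage());
            }
        }
        throw new IllegalArgumentException("Failed to find metadata store by url: " + url);
    }

    public static void main(String[] args) {
        Map<String, UnaryOperator<String>> stores = new LinkedHashMap<>();
        // Simulate a file store that cannot see the metadata directory and an
        // HBase store whose client configuration is missing on the executor.
        stores.put("file", u -> { throw new RuntimeException("File not exist by '" + u + "'"); });
        stores.put("hbase", u -> { throw new RuntimeException("Error when open connection hbase"); });
        try {
            createResourceStore("kylin_metadata_2_0_0@hbase", stores);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

On the driver this lookup works because Kylin's conf directory and hbase-site.xml are on its classpath; the same lookup runs again inside every executor, so the HBase client configuration has to travel with the job too.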
I'm not sure whether this is a problem with my Kylin setup or with Spark.
2017-06-22
skyyws
From: ShaoFeng Shi <sh...@apache.org>
Sent: 2017-06-22 12:52
Subject: Re: Re: Re: Build sample error with spark on kylin 2.0.0
To: "dev"<de...@kylin.apache.org>
Cc:
The root cause should be "java.lang.NoClassDefFoundError: org/cloudera/htrace/Trace". Please locate "htrace-core.jar" on the local disk first, and then re-run the spark-submit command with this jar added to the "--jars" param. If it works this time, you can configure this jar path in kylin.properties:
kylin.engine.spark.additional-jars=/path/to/htrace-core.jar
Then restart Kylin and resume the job.
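Before wiring a jar into "--jars" or kylin.engine.spark.additional-jars, it can save a round trip to confirm the jar really contains the class the executor failed to load. The helper below is an assumption of this note, not a Kylin utility; its main method builds a throwaway jar purely to exercise the check, while a real run would point at the local htrace-core jar:

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;

public class JarCheckDemo {
    // Hypothetical helper: does this jar provide the given class?
    static boolean jarContains(String jarPath, String className) throws IOException {
        String entry = className.replace('.', '/') + ".class";
        try (JarFile jar = new JarFile(jarPath)) {
            return jar.getEntry(entry) != null;
        }
    }

    static String demo() throws IOException {
        // Build a throwaway jar holding one (empty) class entry for the demo.
        File tmp = File.createTempFile("htrace-demo", ".jar");
        try (JarOutputStream out = new JarOutputStream(new FileOutputStream(tmp))) {
            out.putNextEntry(new JarEntry("org/cloudera/htrace/Trace.class"));
            out.closeEntry();
        }
        String result = jarContains(tmp.getPath(), "org.cloudera.htrace.Trace")
                + "/" + jarContains(tmp.getPath(), "org.apache.htrace.Trace");
        tmp.delete();
        return result;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(demo());
    }
}
```

This matters here because the htrace package name changed across releases, so a jar named htrace-core-*.jar may or may not contain the org.cloudera.htrace variant that HBase 0.98 asks for.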
2017-06-22 12:02 GMT+08:00 skyyws <sk...@163.com>:
> Hi ShaoFeng, there is no other error message before or after this one in kylin.log, but I tried to execute the command with spark-submit directly, like this:
> --------------------------------------------------------------------
> ./spark-submit --class org.apache.kylin.common.util.SparkEntry \
>   --conf spark.executor.instances=1 --conf spark.yarn.queue=default \
>   --conf spark.history.fs.logDirectory=hdfs://xxx/user/user1/kylin_2_0_0_test/spark-history \
>   --conf spark.master=yarn --conf spark.executor.memory=1G \
>   --conf spark.eventLog.enabled=true \
>   --conf spark.eventLog.dir=hdfs://xxx/user/user1/kylin_2_0_0_test/spark-history \
>   --conf spark.executor.cores=2 --conf spark.submit.deployMode=cluster \
>   --files /xxx/hbase-0.98.8-hadoop2/conf/hbase-site.xml \
>   --jars /xxx/kylin-deploy/kylin-2.0.0/lib/kylin-job-2.0.0.jar,/user/user1/ext_lib/htrace-core-2.04.jar,/user/user1/ext_lib/hbase-client-0.98.8-hadoop2.jar,/user/user1/ext_lib/hbase-common-0.98.8-hadoop2.jar,/user/user1/ext_lib/hbase-protocol-0.98.8-hadoop2.jar,/user/user1/ext_lib/metrics-core-2.2.0.jar,/user/user1/ext_lib/guava-12.0.1.jar,/xxx/kylin-deploy/kylin-2.0.0/lib/kylin-job-2.0.0.jar \
>   -className org.apache.kylin.engine.spark.SparkCubingByLayer \
>   -hiveTable default.kylin_intermediate_kylin_sales_test_cube_a0cd9950_cddc_4c3b_aaa5_fddf87d1fdaa \
>   -segmentId a0cd9950-cddc-4c3b-aaa5-fddf87d1fdaa \
>   -confPath /xxx/kylin-deploy/kylin-2.0.0/conf \
>   -output hdfs:///user/user1/kylin_2_0_0_test/kylin_metadata_2_0_0/kylin-7a376cb7-7ee7-43fd-95dd-79c2c1999f40/kylin_sales_test_cube/cuboid/ \
>   -cubename kylin_sales_test_cube
> --------------------------------------------------------------------
> Then I got some other messages from Spark; here is the full error message:
> --------------------------------------------------------------------
> 17/06/22 11:35:19 ERROR HBaseConnection: Error when open connection hbase
> java.io.IOException: java.lang.reflect.InvocationTargetException
> at org.apache.hadoop.hbase.client.HConnectionManager.createConnection(HConnectionManager.java:413)
> at org.apache.hadoop.hbase.client.HConnectionManager.createConnection(HConnectionManager.java:306)
> at org.apache.kylin.storage.hbase.HBaseConnection.get(HBaseConnection.java:229)
> at org.apache.kylin.storage.hbase.HBaseResourceStore.getConnection(HBaseResourceStore.java:72)
> at org.apache.kylin.storage.hbase.HBaseResourceStore.createHTableIfNeeded(HBaseResourceStore.java:89)
> at org.apache.kylin.storage.hbase.HBaseResourceStore.<init>(HBaseResourceStore.java:85)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
> at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
> at org.apache.kylin.common.persistence.ResourceStore.createResourceStore(ResourceStore.java:91)
> at org.apache.kylin.common.persistence.ResourceStore.getStore(ResourceStore.java:110)
> at org.apache.kylin.cube.CubeManager.getStore(CubeManager.java:820)
> at org.apache.kylin.cube.CubeManager.loadAllCubeInstance(CubeManager.java:740)
> at org.apache.kylin.cube.CubeManager.<init>(CubeManager.java:145)
> at org.apache.kylin.cube.CubeManager.getInstance(CubeManager.java:109)
> at org.apache.kylin.engine.spark.SparkCubingByLayer.execute(SparkCubingByLayer.java:160)
> at org.apache.kylin.common.util.AbstractApplication.execute(AbstractApplication.java:37)
> at org.apache.kylin.common.util.SparkEntry.main(SparkEntry.java:44)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
> at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
> at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
> at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
> at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> Caused by: java.lang.reflect.InvocationTargetException
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
> at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
> at org.apache.hadoop.hbase.client.HConnectionManager.createConnection(HConnectionManager.java:411)
> ... 27 more
> Caused by: java.lang.NoClassDefFoundError: org/cloudera/htrace/Trace
> at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:218)
> at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:479)
> at org.apache.hadoop.hbase.zookeeper.ZKClusterId.readClusterIdZNode(ZKClusterId.java:65)
> at org.apache.hadoop.hbase.client.ZooKeeperRegistry.getClusterId(ZooKeeperRegistry.java:83)
> at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.retrieveClusterId(HConnectionManager.java:839)
> at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.<init>(HConnectionManager.java:642)
> ... 32 more
> Caused by: java.lang.ClassNotFoundException: org.cloudera.htrace.Trace
> at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
> at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
> at java.security.AccessController.doPrivileged(Native Method)
> at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
> ... 38 more
> 17/06/22 11:35:19 ERROR ResourceStore: Create new store instance failed
> java.lang.reflect.InvocationTargetException
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
> at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
> at org.apache.kylin.common.persistence.ResourceStore.createResourceStore(ResourceStore.java:91)
> at org.apache.kylin.common.persistence.ResourceStore.getStore(ResourceStore.java:110)
> at org.apache.kylin.cube.CubeManager.getStore(CubeManager.java:820)
> at org.apache.kylin.cube.CubeManager.loadAllCubeInstance(CubeManager.java:740)
> at org.apache.kylin.cube.CubeManager.<init>(CubeManager.java:145)
> at org.apache.kylin.cube.CubeManager.getInstance(CubeManager.java:109)
> at org.apache.kylin.engine.spark.SparkCubingByLayer.execute(SparkCubingByLayer.java:160)
> at org.apache.kylin.common.util.AbstractApplication.execute(AbstractApplication.java:37)
> at org.apache.kylin.common.util.SparkEntry.main(SparkEntry.java:44)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
> at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
> at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
> at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
> at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> Caused by: java.lang.IllegalArgumentException: File not exist by 'kylin_metadata_2_0_0@hbase': /xxx/spark-1.6.2-bin-hadoop2.7/kylin_metadata_2_0_0@hbase
> at org.apache.kylin.common.persistence.FileResourceStore.<init>(FileResourceStore.java:49)
> ... 22 more
> 17/06/22 11:35:19 ERROR ResourceStore: Create new store instance failed
> java.lang.reflect.InvocationTargetException
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
> at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
> at org.apache.kylin.common.persistence.ResourceStore.createResourceStore(ResourceStore.java:91)
> at org.apache.kylin.common.persistence.ResourceStore.getStore(ResourceStore.java:110)
> at org.apache.kylin.cube.CubeManager.getStore(CubeManager.java:820)
> at org.apache.kylin.cube.CubeManager.loadAllCubeInstance(CubeManager.java:740)
> at org.apache.kylin.cube.CubeManager.<init>(CubeManager.java:145)
> at org.apache.kylin.cube.CubeManager.getInstance(CubeManager.java:109)
> at org.apache.kylin.engine.spark.SparkCubingByLayer.execute(SparkCubingByLayer.java:160)
> at org.apache.kylin.common.util.AbstractApplication.execute(AbstractApplication.java:37)
> at org.apache.kylin.common.util.SparkEntry.main(SparkEntry.java:44)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
> at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
> at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
> at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
> at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> Caused by: org.apache.kylin.common.persistence.StorageException: Error when open connection hbase
> at org.apache.kylin.storage.hbase.HBaseConnection.get(HBaseConnection.java:242)
> at org.apache.kylin.storage.hbase.HBaseResourceStore.getConnection(HBaseResourceStore.java:72)
> at org.apache.kylin.storage.hbase.HBaseResourceStore.createHTableIfNeeded(HBaseResourceStore.java:89)
> at org.apache.kylin.storage.hbase.HBaseResourceStore.<init>(HBaseResourceStore.java:85)
> ... 22 more
> Caused by: java.io.IOException: java.lang.reflect.InvocationTargetException
> at org.apache.hadoop.hbase.client.HConnectionManager.createConnection(HConnectionManager.java:413)
> at org.apache.hadoop.hbase.client.HConnectionManager.createConnection(HConnectionManager.java:306)
> at org.apache.kylin.storage.hbase.HBaseConnection.get(HBaseConnection.java:229)
> ... 25 more
> Caused by: java.lang.reflect.InvocationTargetException
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
> at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
> at org.apache.hadoop.hbase.client.HConnectionManager.createConnection(HConnectionManager.java:411)
> ... 27 more
> Caused by: java.lang.NoClassDefFoundError: org/cloudera/htrace/Trace
> at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:218)
> at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:479)
> at org.apache.hadoop.hbase.zookeeper.ZKClusterId.readClusterIdZNode(ZKClusterId.java:65)
> at org.apache.hadoop.hbase.client.ZooKeeperRegistry.getClusterId(ZooKeeperRegistry.java:83)
> at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.retrieveClusterId(HConnectionManager.java:839)
> at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.<init>(HConnectionManager.java:642)
> ... 32 more
> Caused by: java.lang.ClassNotFoundException: org.cloudera.htrace.Trace
> at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
> at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
> at java.security.AccessController.doPrivileged(Native Method)
> at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
> ... 38 more
> 17/06/22 11:35:19 INFO ClientCnxn: Opening socket connection to server 127.0.0.1/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
> Exception in thread "main" java.lang.RuntimeException: error execute org.apache.kylin.engine.spark.SparkCubingByLayer
> at org.apache.kylin.common.util.AbstractApplication.execute(AbstractApplication.java:42)
> at org.apache.kylin.common.util.SparkEntry.main(SparkEntry.java:44)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
> at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
> at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
> at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
> at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> Caused by: java.lang.IllegalArgumentException: Failed to find metadata store by url: kylin_metadata_2_0_0@hbase
> at org.apache.kylin.common.persistence.ResourceStore.createResourceStore(ResourceStore.java:99)
> at org.apache.kylin.common.persistence.ResourceStore.getStore(ResourceStore.java:110)
> at org.apache.kylin.cube.CubeManager.getStore(CubeManager.java:820)
> at org.apache.kylin.cube.CubeManager.loadAllCubeInstance(
> CubeManager.java:740)
> at org.apache.kylin.cube.CubeManager.<init>(CubeManager.java:145)
> at org.apache.kylin.cube.CubeManager.getInstance(CubeManager.java:109)
> at org.apache.kylin.engine.spark.SparkCubingByLayer.execute(
> SparkCubingByLayer.java:160)
> at org.apache.kylin.common.util.AbstractApplication.execute(
> AbstractApplication.java:37)
> ... 10 more
> 17/06/22 11:35:19 INFO ClientCnxn: Socket connection established to
> 127.0.0.1/127.0.0.1:2181, initiating session
> 17/06/22 11:35:19 INFO SparkContext: Invoking stop() from shutdown hook
> --------------------------------------------------------------------
> Thanks for your attention!
> 2017-06-22
>
> skyyws
>
>
>
> From: ShaoFeng Shi <sh...@apache.org>
> Sent: 2017-06-22 11:42
> Subject: Re: Re: Re: Build sample error with spark on kylin 2.0.0
> To: "dev"<de...@kylin.apache.org>
> Cc:
>
> Hi Sky, glad to see it moving forward. The "Failed to find metadata store
> by url: kylin_metadata_2_0_0@hbase" is not the root cause. Could you check
> the log files further; is there any other error before or after it?
>
> 2017-06-21 20:43 GMT+08:00 skyyws <sk...@163.com>:
>
> > Thank you for your suggestion, Shaofeng Shi. I tried using Hadoop client
> > 2.7.3 and it worked. But I met another problem:
> > ----------------------------------------------------
> > 17/06/21 20:20:39 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0,
> > hadoop645.lt.163.org): java.lang.IllegalArgumentException: Failed to
> find
> > metadata store by url: kylin_metadata_2_0
> > _0@hbase
> > at org.apache.kylin.common.persistence.ResourceStore.
> > createResourceStore(ResourceStore.java:99)
> > at org.apache.kylin.common.persistence.ResourceStore.
> > getStore(ResourceStore.java:110)
> > at org.apache.kylin.cube.CubeDescManager.getStore(
> > CubeDescManager.java:370)
> > at org.apache.kylin.cube.CubeDescManager.reloadAllCubeDesc(
> > CubeDescManager.java:298)
> > at org.apache.kylin.cube.CubeDescManager.<init>(
> > CubeDescManager.java:109)
> > at org.apache.kylin.cube.CubeDescManager.getInstance(
> > CubeDescManager.java:81)
> > at org.apache.kylin.cube.CubeInstance.getDescriptor(
> > CubeInstance.java:114)
> > at org.apache.kylin.cube.CubeSegment.getCubeDesc(
> > CubeSegment.java:119)
> > at org.apache.kylin.cube.kv.RowKeyEncoder.<init>(
> > RowKeyEncoder.java:50)
> > at org.apache.kylin.cube.kv.AbstractRowKeyEncoder.
> createInstance(
> > AbstractRowKeyEncoder.java:48)
> > at org.apache.kylin.engine.spark.SparkCubingByLayer$2.call(
> > SparkCubingByLayer.java:205)
> > at org.apache.kylin.engine.spark.SparkCubingByLayer$2.call(
> > SparkCubingByLayer.java:193)
> > at org.apache.spark.api.java.JavaPairRDD$$anonfun$
> > pairFunToScalaFun$1.apply(JavaPairRDD.scala:1018)
> > at org.apache.spark.api.java.JavaPairRDD$$anonfun$
> > pairFunToScalaFun$1.apply(JavaPairRDD.scala:1018)
> > at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
> > at org.apache.spark.util.collection.ExternalSorter.
> > insertAll(ExternalSorter.scala:191)
> > at org.apache.spark.shuffle.sort.SortShuffleWriter.write(
> > SortShuffleWriter.scala:64)
> > at org.apache.spark.scheduler.ShuffleMapTask.runTask(
> > ShuffleMapTask.scala:73)
> > at org.apache.spark.scheduler.ShuffleMapTask.runTask(
> > ShuffleMapTask.scala:41)
> > at org.apache.spark.scheduler.Task.run(Task.scala:89)
> > at org.apache.spark.executor.Executor$TaskRunner.run(
> > Executor.scala:227)
> > at java.util.concurrent.ThreadPoolExecutor.runWorker(
> > ThreadPoolExecutor.java:1145)
> > at java.util.concurrent.ThreadPoolExecutor$Worker.run(
> > ThreadPoolExecutor.java:615)
> > at java.lang.Thread.run(Thread.java:745)
> > 17/06/21 20:20:39 INFO TaskSetManager: Starting task 0.1 in stage 0.0
> (TID
> > 2, hadoop645.lt.163.org, partition 0,RACK_LOCAL, 3276 bytes)
> > 17/06/21 20:21:14 WARN TaskSetManager: Lost task 1.0 in stage 0.0 (TID 1,
> > hadoop645.lt.163.org): java.lang.IllegalArgumentException: Failed to
> find
> > metadata store by url: kylin_metadata_2_0
> > _0@hbase
> > at org.apache.kylin.common.persistence.ResourceStore.
> > createResourceStore(ResourceStore.java:99)
> > at org.apache.kylin.common.persistence.ResourceStore.
> > getStore(ResourceStore.java:110)
> > at org.apache.kylin.cube.CubeDescManager.getStore(
> > CubeDescManager.java:370)
> > at org.apache.kylin.cube.CubeDescManager.reloadAllCubeDesc(
> > CubeDescManager.java:298)
> > at org.apache.kylin.cube.CubeDescManager.<init>(
> > CubeDescManager.java:109)
> > at org.apache.kylin.cube.CubeDescManager.getInstance(
> > CubeDescManager.java:81)
> > at org.apache.kylin.cube.CubeInstance.getDescriptor(
> > CubeInstance.java:114)
> > at org.apache.kylin.cube.CubeSegment.getCubeDesc(
> > CubeSegment.java:119)
> > at org.apache.kylin.cube.kv.RowKeyEncoder.<init>(
> > RowKeyEncoder.java:50)
> > at org.apache.kylin.cube.kv.AbstractRowKeyEncoder.
> createInstance(
> > AbstractRowKeyEncoder.java:48)
> > at org.apache.kylin.engine.spark.SparkCubingByLayer$2.call(
> > SparkCubingByLayer.java:205)
> > at org.apache.kylin.engine.spark.SparkCubingByLayer$2.call(
> > SparkCubingByLayer.java:193)
> > at org.apache.spark.api.java.JavaPairRDD$$anonfun$
> > pairFunToScalaFun$1.apply(JavaPairRDD.scala:1018)
> > at org.apache.spark.api.java.JavaPairRDD$$anonfun$
> > pairFunToScalaFun$1.apply(JavaPairRDD.scala:1018)
> > at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
> > at org.apache.spark.util.collection.ExternalSorter.
> > insertAll(ExternalSorter.scala:191)
> > at org.apache.spark.shuffle.sort.SortShuffleWriter.write(
> > SortShuffleWriter.scala:64)
> > at org.apache.spark.scheduler.ShuffleMapTask.runTask(
> > ShuffleMapTask.scala:73)
> > at org.apache.spark.scheduler.ShuffleMapTask.runTask(
> > ShuffleMapTask.scala:41)
> > at org.apache.spark.scheduler.Task.run(Task.scala:89)
> > at org.apache.spark.executor.Executor$TaskRunner.run(
> > Executor.scala:227)
> > at java.util.concurrent.ThreadPoolExecutor.runWorker(
> > ThreadPoolExecutor.java:1145)
> > at java.util.concurrent.ThreadPoolExecutor$Worker.run(
> > ThreadPoolExecutor.java:615)
> > at java.lang.Thread.run(Thread.java:745)
> > ----------------------------------------------------
> > But I can use the built-in Kylin spark-shell to read data from Hive and
> > HBase successfully, like this:
> > ----------------------------------------------------
> > sqlContext.sql("show tables").take(1)
> > ----------------------------------------------------
> > import org.apache.spark._
> > import org.apache.spark.rdd.NewHadoopRDD
> > import org.apache.hadoop.fs.Path
> > import org.apache.hadoop.hbase.util.Bytes
> > import org.apache.hadoop.hbase.HColumnDescriptor
> > import org.apache.hadoop.hbase.{HBaseConfiguration, HTableDescriptor}
> > import org.apache.hadoop.hbase.client.{HBaseAdmin, Put, HTable, Result}
> > import org.apache.hadoop.hbase.mapreduce.TableInputFormat
> > import org.apache.hadoop.hbase.io.ImmutableBytesWritable
> > val conf = HBaseConfiguration.create()
> > conf.set("hbase.zookeeper.quorum", "localhost")
> > conf.set(TableInputFormat.INPUT_TABLE, "test_table")
> > val hBaseRDD = sc.newAPIHadoopRDD(conf, classOf[TableInputFormat],
> > classOf[org.apache.hadoop.hbase.io.ImmutableBytesWritable],
> > classOf[org.apache.hadoop.hbase.client.Result])
> >
> > val res = hBaseRDD.take(1)
> > val rs = res(0)._2
> > val kv = rs.raw
> > for(keyvalue <- kv) println("rowkey:"+ new String(keyvalue.getRow)+ "
> > cf:"+new String(keyvalue.getFamily()) + " column:" + new
> > String(keyvalue.getQualifier) + " " + "value:"+new
> > String(keyvalue.getValue()))
> > ----------------------------------------------------
> > By the way, I've already put hive-site.xml and hbase-site.xml into
> > HADOOP_CONF_DIR and $SPARK_HOME/conf (which is actually
> > $KYLIN_HOME/spark/conf), and I also set spark.driver.extraClassPath in
> > spark-defaults.conf to attach some related jars (hbase-client.jar,
> > hbase-common.jar and so on).
> > I don't know why this fails; could anyone give me some advice?
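The classpath setup described above can be sketched as a spark-defaults.conf fragment. This is an illustration only: the jar paths below are placeholders, not the poster's actual locations.

```properties
# $SPARK_HOME/conf/spark-defaults.conf -- hypothetical paths, adjust to your cluster
spark.driver.extraClassPath   /path/to/hbase-client.jar:/path/to/hbase-common.jar:/path/to/htrace-core.jar
spark.executor.extraClassPath /path/to/hbase-client.jar:/path/to/hbase-common.jar:/path/to/htrace-core.jar
```

Note that spark.driver.extraClassPath affects only the driver JVM; executors need spark.executor.extraClassPath (or the jars passed via --jars) as well, which may be relevant to the executor-side failures shown above.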
> > 2017-06-21
> >
> > skyyws
> >
> >
> >
> > From: ShaoFeng Shi <sh...@apache.org>
> > Sent: 2017-06-20 15:13
> > Subject: Re: Re: Build sample error with spark on kylin 2.0.0
> > To: "dev"<de...@kylin.apache.org>
> > Cc:
> >
> > Or you can check whether there is old hadoop jars on your cluster,
> > according to https://issues.apache.org/jira/browse/HADOOP-11064
> >
> >
> > 2017-06-20 9:33 GMT+08:00 skyyws <sk...@163.com>:
> >
> > > No, I deploy Kylin on Linux; this is my machine info:
> > > --------------------------
> > > 3.2.0-4-amd64 #1 SMP Debian 3.2.82-1 x86_64 GNU/Linux
> > > -------------------------
> > >
> > > 2017-06-20
> > >
> > > skyyws
> > >
> > >
> > >
> > > From: ShaoFeng Shi <sh...@apache.org>
> > > Sent: 2017-06-20 00:10
> > > Subject: Re: Build sample error with spark on kylin 2.0.0
> > > To: "dev"<de...@kylin.apache.org>
> > > Cc:
> > >
> > > Are you running Kylin on windows? If yes, check:
> > > https://stackoverflow.com/questions/33211599/hadoop-
> > > error-on-windows-java-lang-unsatisfiedlinkerror
> > >
> > > 2017-06-19 21:55 GMT+08:00 skyyws <sk...@163.com>:
> > >
> > > > Hi all,
> > > > I met an error when using spark engine build kylin sample on step
> > "Build
> > > > Cube with Spark", here is the exception log:
> > > > ------------------------------------------------------------
> > > > -----------------------------
> > > > Exception in thread "main" java.lang.UnsatisfiedLinkError:
> > > > org.apache.hadoop.util.NativeCrc32.nativeComputeChunkedSumsByteAr
> > > > ray(II[BI[BIILjava/lang/String;JZ)V
> > > > at org.apache.hadoop.util.NativeCrc32.
> > > > nativeComputeChunkedSumsByteArray(Native Method)
> > > > at org.apache.hadoop.util.NativeCrc32.
> > > > calculateChunkedSumsByteArray(NativeCrc32.java:86)
> > > > at org.apache.hadoop.util.DataChecksum.calculateChunkedSums(
> > > > DataChecksum.java:430)
> > > > at org.apache.hadoop.fs.FSOutputSummer.writeChecksumChunks(
> > > > FSOutputSummer.java:202)
> > > > at org.apache.hadoop.fs.FSOutputSummer.write1(
> > > > FSOutputSummer.java:124)
> > > > at org.apache.hadoop.fs.FSOutputSummer.write(
> > > > FSOutputSummer.java:110)
> > > > at org.apache.hadoop.fs.FSDataOutputStream$
> > PositionCache.write(
> > > > FSDataOutputStream.java:58)
> > > > at java.io.DataOutputStream.write(DataOutputStream.java:107)
> > > > at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:80)
> > > > at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:52)
> > > > at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:112)
> > > > at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:366)
> > > > at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:338)
> > > > at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:289)
> > > > at org.apache.spark.deploy.yarn.Client.copyFileToRemote(
> > > > Client.scala:317)
> > > > at org.apache.spark.deploy.yarn.Client.org$apache$spark$
> > > > deploy$yarn$Client$$distribute$1(Client.scala:407)
> > > > at org.apache.spark.deploy.yarn.Client$$anonfun$
> > > > prepareLocalResources$5.apply(Client.scala:446)
> > > > at org.apache.spark.deploy.yarn.Client$$anonfun$
> > > > prepareLocalResources$5.apply(Client.scala:444)
> > > > at scala.collection.immutable.List.foreach(List.scala:318)
> > > > at org.apache.spark.deploy.yarn.
> Client.prepareLocalResources(
> > > > Client.scala:444)
> > > > at org.apache.spark.deploy.yarn.Client.
> > > > createContainerLaunchContext(Client.scala:727)
> > > > at org.apache.spark.deploy.yarn.Client.submitApplication(
> > > > Client.scala:142)
> > > > at org.apache.spark.scheduler.cluster.
> > > YarnClientSchedulerBackend.
> > > > start(YarnClientSchedulerBackend.scala:57)
> > > > at org.apache.spark.scheduler.TaskSchedulerImpl.start(
> > > > TaskSchedulerImpl.scala:144)
> > > > at org.apache.spark.SparkContext.
> > <init>(SparkContext.scala:530)
> > > > at org.apache.spark.api.java.JavaSparkContext.<init>(
> > > > JavaSparkContext.scala:59)
> > > > at org.apache.kylin.engine.spark.SparkCubingByLayer.execute(
> > > > SparkCubingByLayer.java:150)
> > > > at org.apache.kylin.common.util.AbstractApplication.execute(
> > > > AbstractApplication.java:37)
> > > > at org.apache.kylin.common.util.SparkEntry.main(SparkEntry.
> > > > java:44)
> > > > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native
> Method)
> > > > at sun.reflect.NativeMethodAccessorImpl.invoke(
> > > > NativeMethodAccessorImpl.java:57)
> > > > at sun.reflect.DelegatingMethodAccessorImpl.invoke(
> > > > DelegatingMethodAccessorImpl.java:43)
> > > > at java.lang.reflect.Method.invoke(Method.java:606)
> > > > at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$
> > > > deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
> > > > at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(
> > > > SparkSubmit.scala:181)
> > > > at org.apache.spark.deploy.SparkSubmit$.submit(
> > > > SparkSubmit.scala:206)
> > > > at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.
> > > > scala:121)
> > > > at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.
> scala)
> > > > 17/06/19 21:22:06 INFO storage.DiskBlockManager: Shutdown hook called
> > > > 17/06/19 21:22:06 INFO util.ShutdownHookManager: Shutdown hook called
> > > > 17/06/19 21:22:06 INFO util.ShutdownHookManager: Deleting directory
> > > > /tmp/spark-0d1d3709-86cd-446c-b728-5070f168de28
> > > > 17/06/19 21:22:06 INFO util.ShutdownHookManager: Deleting directory
> > > > /tmp/spark-0d1d3709-86cd-446c-b728-5070f168de28/httpd-
> > > > 9bcb9a5d-569f-4f28-ad89-038a9020eda8
> > > > 17/06/19 21:22:06 INFO util.ShutdownHookManager: Deleting directory
> > > > /tmp/spark-0d1d3709-86cd-446c-b728-5070f168de28/userFiles-
> > > > 2e9ff265-3d37-40e0-8894-6fd4d1a3ad8b
> > > >
> > > > at org.apache.kylin.common.util.CliCommandExecutor.execute(
> > > > CliCommandExecutor.java:92)
> > > > at org.apache.kylin.engine.spark.SparkExecutable.doWork(
> > > > SparkExecutable.java:124)
> > > > at org.apache.kylin.job.execution.AbstractExecutable.
> > > > execute(AbstractExecutable.java:124)
> > > > at org.apache.kylin.job.execution.DefaultChainedExecutable.
> > > doWork(
> > > > DefaultChainedExecutable.java:64)
> > > > at org.apache.kylin.job.execution.AbstractExecutable.
> > > > execute(AbstractExecutable.java:124)
> > > > at org.apache.kylin.job.impl.threadpool.DefaultScheduler$
> > > > JobRunner.run(DefaultScheduler.java:142)
> > > > at java.util.concurrent.ThreadPoolExecutor.runWorker(
> > > > ThreadPoolExecutor.java:1145)
> > > > at java.util.concurrent.ThreadPoolExecutor$Worker.run(
> > > > ThreadPoolExecutor.java:615)
> > > > at java.lang.Thread.run(Thread.java:745)
> > > >
> > > > ------------------------------------------------------------
> > > > -----------------------------
> > > > I can use the built-in Kylin spark-shell to do some operations, like:
> > > > ------------------------------------------------------------
> > > > -----------------------------
> > > > var textFile = sc.textFile("hdfs://xxxx/xxxx/README.md")
> > > > textFile.count()
> > > > textFile.first()
> > > > textFile.filter(line => line.contains("hello")).count()
> > > > ------------------------------------------------------------
> > > > -----------------------------
> > > > Here is the env info:
> > > > kylin version is 2.0.0
> > > > hadoop version is 2.7.*
> > > > spark version is 1.6.*
> > > > ------------------------------------------------------------
> > > > -----------------------------
> > > > Can anyone help me? Thanks!
> > > >
> > > >
> > > > 2017-06-19
> > > > skyyws
> > >
> > >
> > >
> > >
> > > --
> > > Best regards,
> > >
> > > Shaofeng Shi 史少锋
> > >
> >
> >
> >
--
Best regards,
Shaofeng Shi 史少锋
Re: Re: Re: Build sample error with spark on kylin 2.0.0
Posted by skyyws <sk...@163.com>.
Hi ShaoFeng, sorry to trouble you again. I fixed the "NoClassDefFoundError" (I replaced the soft links with copies of the jars), but there is another problem:
--------------------------------------------------------------------------------------------------------------
17/06/22 17:04:40 INFO TaskSetManager: Starting task 1.3 in stage 0.0 (TID 7, hadoop759.lt.163.org, partition 1,RACK_LOCAL, 3275 bytes)
17/06/22 17:05:14 WARN TaskSetManager: Lost task 0.3 in stage 0.0 (TID 6, hadoop759.lt.163.org): java.lang.IllegalArgumentException: Failed to find metadata store by url: kylin_metadata_2_0_0@hbase
at org.apache.kylin.common.persistence.ResourceStore.createResourceStore(ResourceStore.java:99)
at org.apache.kylin.common.persistence.ResourceStore.getStore(ResourceStore.java:110)
at org.apache.kylin.cube.CubeDescManager.getStore(CubeDescManager.java:370)
at org.apache.kylin.cube.CubeDescManager.reloadAllCubeDesc(CubeDescManager.java:298)
at org.apache.kylin.cube.CubeDescManager.<init>(CubeDescManager.java:109)
at org.apache.kylin.cube.CubeDescManager.getInstance(CubeDescManager.java:81)
at org.apache.kylin.cube.CubeInstance.getDescriptor(CubeInstance.java:114)
at org.apache.kylin.cube.CubeSegment.getCubeDesc(CubeSegment.java:119)
at org.apache.kylin.cube.CubeSegment.isEnableSharding(CubeSegment.java:477)
at org.apache.kylin.cube.kv.RowKeyEncoder.<init>(RowKeyEncoder.java:48)
at org.apache.kylin.cube.kv.AbstractRowKeyEncoder.createInstance(AbstractRowKeyEncoder.java:48)
at org.apache.kylin.engine.spark.SparkCubingByLayer$2.call(SparkCubingByLayer.java:205)
at org.apache.kylin.engine.spark.SparkCubingByLayer$2.call(SparkCubingByLayer.java:193)
at org.apache.spark.api.java.JavaPairRDD$$anonfun$pairFunToScalaFun$1.apply(JavaPairRDD.scala:1018)
at org.apache.spark.api.java.JavaPairRDD$$anonfun$pairFunToScalaFun$1.apply(JavaPairRDD.scala:1018)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:191)
at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:64)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:227)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
17/06/22 17:05:14 ERROR TaskSetManager: Task 0 in stage 0.0 failed 4 times; aborting job
17/06/22 17:05:14 INFO YarnScheduler: Cancelling stage 0
17/06/22 17:05:14 INFO YarnScheduler: Stage 0 was cancelled
17/06/22 17:05:14 INFO DAGScheduler: ShuffleMapStage 0 (mapToPair at SparkCubingByLayer.java:193) failed in 265.116 s
17/06/22 17:05:14 INFO DAGScheduler: Job 0 failed: saveAsNewAPIHadoopFile at SparkCubingByLayer.java:288, took 265.278439 s
Exception in thread "main" java.lang.RuntimeException: error execute org.apache.kylin.engine.spark.SparkCubingByLayer
at org.apache.kylin.common.util.AbstractApplication.execute(AbstractApplication.java:42)
at org.apache.kylin.common.util.SparkEntry.main(SparkEntry.java:44)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 6, hadoop759.lt.163.org): java.lang.IllegalArgumentException: Failed to find metadata store by url: kylin_metadata_2_0_0@hbase
at org.apache.kylin.common.persistence.ResourceStore.createResourceStore(ResourceStore.java:99)
at org.apache.kylin.common.persistence.ResourceStore.getStore(ResourceStore.java:110)
at org.apache.kylin.cube.CubeDescManager.getStore(CubeDescManager.java:370)
at org.apache.kylin.cube.CubeDescManager.reloadAllCubeDesc(CubeDescManager.java:298)
at org.apache.kylin.cube.CubeDescManager.<init>(CubeDescManager.java:109)
at org.apache.kylin.cube.CubeDescManager.getInstance(CubeDescManager.java:81)
at org.apache.kylin.cube.CubeInstance.getDescriptor(CubeInstance.java:114)
at org.apache.kylin.cube.CubeSegment.getCubeDesc(CubeSegment.java:119)
at org.apache.kylin.cube.CubeSegment.isEnableSharding(CubeSegment.java:477)
at org.apache.kylin.cube.kv.RowKeyEncoder.<init>(RowKeyEncoder.java:48)
at org.apache.kylin.cube.kv.AbstractRowKeyEncoder.createInstance(AbstractRowKeyEncoder.java:48)
at org.apache.kylin.engine.spark.SparkCubingByLayer$2.call(SparkCubingByLayer.java:205)
at org.apache.kylin.engine.spark.SparkCubingByLayer$2.call(SparkCubingByLayer.java:193)
at org.apache.spark.api.java.JavaPairRDD$$anonfun$pairFunToScalaFun$1.apply(JavaPairRDD.scala:1018)
at org.apache.spark.api.java.JavaPairRDD$$anonfun$pairFunToScalaFun$1.apply(JavaPairRDD.scala:1018)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:191)
at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:64)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:227)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Driver stacktrace:
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1431)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1419)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1418)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1418)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:799)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:799)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:799)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1640)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1599)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1588)
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:620)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1832)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1845)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1922)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1.apply$mcV$sp(PairRDDFunctions.scala:1144)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1.apply(PairRDDFunctions.scala:1074)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1.apply(PairRDDFunctions.scala:1074)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:111)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:316)
at org.apache.spark.rdd.PairRDDFunctions.saveAsNewAPIHadoopDataset(PairRDDFunctions.scala:1074)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopFile$2.apply$mcV$sp(PairRDDFunctions.scala:994)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopFile$2.apply(PairRDDFunctions.scala:985)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopFile$2.apply(PairRDDFunctions.scala:985)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:111)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:316)
at org.apache.spark.rdd.PairRDDFunctions.saveAsNewAPIHadoopFile(PairRDDFunctions.scala:985)
at org.apache.spark.api.java.JavaPairRDD.saveAsNewAPIHadoopFile(JavaPairRDD.scala:800)
at org.apache.kylin.engine.spark.SparkCubingByLayer.saveToHDFS(SparkCubingByLayer.java:288)
at org.apache.kylin.engine.spark.SparkCubingByLayer.execute(SparkCubingByLayer.java:257)
at org.apache.kylin.common.util.AbstractApplication.execute(AbstractApplication.java:37)
... 10 more
Caused by: java.lang.IllegalArgumentException: Failed to find metadata store by url: kylin_metadata_2_0_0@hbase
at org.apache.kylin.common.persistence.ResourceStore.createResourceStore(ResourceStore.java:99)
at org.apache.kylin.common.persistence.ResourceStore.getStore(ResourceStore.java:110)
at org.apache.kylin.cube.CubeDescManager.getStore(CubeDescManager.java:370)
at org.apache.kylin.cube.CubeDescManager.reloadAllCubeDesc(CubeDescManager.java:298)
at org.apache.kylin.cube.CubeDescManager.<init>(CubeDescManager.java:109)
at org.apache.kylin.cube.CubeDescManager.getInstance(CubeDescManager.java:81)
at org.apache.kylin.cube.CubeInstance.getDescriptor(CubeInstance.java:114)
at org.apache.kylin.cube.CubeSegment.getCubeDesc(CubeSegment.java:119)
at org.apache.kylin.cube.CubeSegment.isEnableSharding(CubeSegment.java:477)
at org.apache.kylin.cube.kv.RowKeyEncoder.<init>(RowKeyEncoder.java:48)
at org.apache.kylin.cube.kv.AbstractRowKeyEncoder.createInstance(AbstractRowKeyEncoder.java:48)
at org.apache.kylin.engine.spark.SparkCubingByLayer$2.call(SparkCubingByLayer.java:205)
at org.apache.kylin.engine.spark.SparkCubingByLayer$2.call(SparkCubingByLayer.java:193)
at org.apache.spark.api.java.JavaPairRDD$$anonfun$pairFunToScalaFun$1.apply(JavaPairRDD.scala:1018)
at org.apache.spark.api.java.JavaPairRDD$$anonfun$pairFunToScalaFun$1.apply(JavaPairRDD.scala:1018)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:191)
at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:64)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:227)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
-------------------------------------------------------------------------------------------------------------------------------------------------
I'm not sure whether the problem is with my Kylin setup or with Spark?
2017-06-22
skyyws
From: ShaoFeng Shi <sh...@apache.org>
Sent: 2017-06-22 12:52
Subject: Re: Re: Re: Build sample error with spark on kylin 2.0.0
To: "dev"<de...@kylin.apache.org>
Cc:
The root cause should be "java.lang.NoClassDefFoundError:
org/cloudera/htrace/Trace". Please locate htrace-core.jar on the local
disk first, and then re-run the spark-submit command, adding this jar
to the "--jars" parameter. If it works this time, you can configure the
jar path in kylin.properties:
kylin.engine.spark.additional-jars=/path/to/htrace-core.jar
Then restart Kylin and resume the job.
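The suggestion above amounts to a one-line kylin.properties change. The jar path below is a hypothetical example; the actual location of htrace-core.jar depends on your HBase/Hadoop distribution, so locate it on your cluster first (e.g. under the HBase lib directory).

```properties
# kylin.properties -- register extra jars for the Spark cubing engine
# /opt/hbase/lib/htrace-core-2.04.jar is an assumed path for illustration
kylin.engine.spark.additional-jars=/opt/hbase/lib/htrace-core-2.04.jar
```

After changing this property, restart Kylin and resume (not restart) the failed job so the build continues from the failed step.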
2017-06-22 12:02 GMT+08:00 skyyws <sk...@163.com>:
> Hi ShaoFeng, there is no other error message before or after this in
> kylin.log, but I tried executing the command via spark-submit directly, like this:
> --------------------------------------------------------------------
> ./spark-submit --class org.apache.kylin.common.util.SparkEntry \
>   --conf spark.executor.instances=1 --conf spark.yarn.queue=default \
>   --conf spark.history.fs.logDirectory=hdfs://xxx/user/user1/kylin_2_0_0_test/spark-history \
>   --conf spark.master=yarn --conf spark.executor.memory=1G \
>   --conf spark.eventLog.enabled=true \
>   --conf spark.eventLog.dir=hdfs://xxx/user/user1/kylin_2_0_0_test/spark-history \
>   --conf spark.executor.cores=2 --conf spark.submit.deployMode=cluster \
>   --files /xxx/hbase-0.98.8-hadoop2/conf/hbase-site.xml \
>   --jars /xxx/kylin-deploy/kylin-2.0.0/lib/kylin-job-2.0.0.jar,/user/user1/ext_lib/htrace-core-2.04.jar,/user/user1/ext_lib/hbase-client-0.98.8-hadoop2.jar,/user/user1/ext_lib/hbase-common-0.98.8-hadoop2.jar,/user/user1/ext_lib/hbase-protocol-0.98.8-hadoop2.jar,/user/user1/ext_lib/metrics-core-2.2.0.jar,/user/user1/ext_lib/guava-12.0.1.jar,/xxx/kylin-deploy/kylin-2.0.0/lib/kylin-job-2.0.0.jar \
>   -className org.apache.kylin.engine.spark.SparkCubingByLayer \
>   -hiveTable default.kylin_intermediate_kylin_sales_test_cube_a0cd9950_cddc_4c3b_aaa5_fddf87d1fdaa \
>   -segmentId a0cd9950-cddc-4c3b-aaa5-fddf87d1fdaa \
>   -confPath /xxx/kylin-deploy/kylin-2.0.0/conf \
>   -output hdfs:///user/user1/kylin_2_0_0_test/kylin_metadata_2_0_0/kylin-7a376cb7-7ee7-43fd-95dd-79c2c1999f40/kylin_sales_test_cube/cuboid/ \
>   -cubename kylin_sales_test_cube
> --------------------------------------------------------------------
> Then I got some other messages from Spark; here is the full error message:
> --------------------------------------------------------------------
> 17/06/22 11:35:19 ERROR HBaseConnection: Error when open connection hbase
> java.io.IOException: java.lang.reflect.InvocationTargetException
> at org.apache.hadoop.hbase.client.HConnectionManager.createConnection(
> HConnectionManager.java:413)
> at org.apache.hadoop.hbase.client.HConnectionManager.createConnection(
> HConnectionManager.java:306)
> at org.apache.kylin.storage.hbase.HBaseConnection.get(
> HBaseConnection.java:229)
> at org.apache.kylin.storage.hbase.HBaseResourceStore.getConnection(
> HBaseResourceStore.java:72)
> at org.apache.kylin.storage.hbase.HBaseResourceStore.
> createHTableIfNeeded(HBaseResourceStore.java:89)
> at org.apache.kylin.storage.hbase.HBaseResourceStore.<
> init>(HBaseResourceStore.java:85)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance(
> NativeConstructorAccessorImpl.java:57)
> at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(
> DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
> at org.apache.kylin.common.persistence.ResourceStore.createResourceStore(
> ResourceStore.java:91)
> at org.apache.kylin.common.persistence.ResourceStore.
> getStore(ResourceStore.java:110)
> at org.apache.kylin.cube.CubeManager.getStore(CubeManager.java:820)
> at org.apache.kylin.cube.CubeManager.loadAllCubeInstance(
> CubeManager.java:740)
> at org.apache.kylin.cube.CubeManager.<init>(CubeManager.java:145)
> at org.apache.kylin.cube.CubeManager.getInstance(CubeManager.java:109)
> at org.apache.kylin.engine.spark.SparkCubingByLayer.execute(
> SparkCubingByLayer.java:160)
> at org.apache.kylin.common.util.AbstractApplication.execute(
> AbstractApplication.java:37)
> at org.apache.kylin.common.util.SparkEntry.main(SparkEntry.java:44)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(
> NativeMethodAccessorImpl.java:57)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(
> DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$
> deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
> at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(
> SparkSubmit.scala:181)
> at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
> at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
> at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> Caused by: java.lang.reflect.InvocationTargetException
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance(
> NativeConstructorAccessorImpl.java:57)
> at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(
> DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
> at org.apache.hadoop.hbase.client.HConnectionManager.createConnection(
> HConnectionManager.java:411)
> ... 27 more
> Caused by: java.lang.NoClassDefFoundError: org/cloudera/htrace/Trace
> at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(
> RecoverableZooKeeper.java:218)
> at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:479)
> at org.apache.hadoop.hbase.zookeeper.ZKClusterId.readClusterIdZNode(
> ZKClusterId.java:65)
> at org.apache.hadoop.hbase.client.ZooKeeperRegistry.getClusterId(
> ZooKeeperRegistry.java:83)
> at org.apache.hadoop.hbase.client.HConnectionManager$
> HConnectionImplementation.retrieveClusterId(HConnectionManager.java:839)
> at org.apache.hadoop.hbase.client.HConnectionManager$
> HConnectionImplementation.<init>(HConnectionManager.java:642)
> ... 32 more
> Caused by: java.lang.ClassNotFoundException: org.cloudera.htrace.Trace
> at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
> at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
> at java.security.AccessController.doPrivileged(Native Method)
> at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
> ... 38 more
> 17/06/22 11:35:19 ERROR ResourceStore: Create new store instance failed
> java.lang.reflect.InvocationTargetException
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance(
> NativeConstructorAccessorImpl.java:57)
> at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(
> DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
> at org.apache.kylin.common.persistence.ResourceStore.createResourceStore(
> ResourceStore.java:91)
> at org.apache.kylin.common.persistence.ResourceStore.
> getStore(ResourceStore.java:110)
> at org.apache.kylin.cube.CubeManager.getStore(CubeManager.java:820)
> at org.apache.kylin.cube.CubeManager.loadAllCubeInstance(
> CubeManager.java:740)
> at org.apache.kylin.cube.CubeManager.<init>(CubeManager.java:145)
> at org.apache.kylin.cube.CubeManager.getInstance(CubeManager.java:109)
> at org.apache.kylin.engine.spark.SparkCubingByLayer.execute(
> SparkCubingByLayer.java:160)
> at org.apache.kylin.common.util.AbstractApplication.execute(
> AbstractApplication.java:37)
> at org.apache.kylin.common.util.SparkEntry.main(SparkEntry.java:44)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(
> NativeMethodAccessorImpl.java:57)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(
> DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$
> deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
> at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(
> SparkSubmit.scala:181)
> at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
> at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
> at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> Caused by: java.lang.IllegalArgumentException: File not exist by
> 'kylin_metadata_2_0_0@hbase': /xxx/spark-1.6.2-bin-hadoop2.
> 7/kylin_metadata_2_0_0@hbase
> at org.apache.kylin.common.persistence.FileResourceStore.
> <init>(FileResourceStore.java:49)
> ... 22 more
> 17/06/22 11:35:19 ERROR ResourceStore: Create new store instance failed
> java.lang.reflect.InvocationTargetException
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance(
> NativeConstructorAccessorImpl.java:57)
> at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(
> DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
> at org.apache.kylin.common.persistence.ResourceStore.createResourceStore(
> ResourceStore.java:91)
> at org.apache.kylin.common.persistence.ResourceStore.
> getStore(ResourceStore.java:110)
> at org.apache.kylin.cube.CubeManager.getStore(CubeManager.java:820)
> at org.apache.kylin.cube.CubeManager.loadAllCubeInstance(
> CubeManager.java:740)
> at org.apache.kylin.cube.CubeManager.<init>(CubeManager.java:145)
> at org.apache.kylin.cube.CubeManager.getInstance(CubeManager.java:109)
> at org.apache.kylin.engine.spark.SparkCubingByLayer.execute(
> SparkCubingByLayer.java:160)
> at org.apache.kylin.common.util.AbstractApplication.execute(
> AbstractApplication.java:37)
> at org.apache.kylin.common.util.SparkEntry.main(SparkEntry.java:44)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(
> NativeMethodAccessorImpl.java:57)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(
> DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$
> deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
> at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(
> SparkSubmit.scala:181)
> at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
> at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
> at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> Caused by: org.apache.kylin.common.persistence.StorageException: Error
> when open connection hbase
> at org.apache.kylin.storage.hbase.HBaseConnection.get(
> HBaseConnection.java:242)
> at org.apache.kylin.storage.hbase.HBaseResourceStore.getConnection(
> HBaseResourceStore.java:72)
> at org.apache.kylin.storage.hbase.HBaseResourceStore.
> createHTableIfNeeded(HBaseResourceStore.java:89)
> at org.apache.kylin.storage.hbase.HBaseResourceStore.<
> init>(HBaseResourceStore.java:85)
> ... 22 more
> Caused by: java.io.IOException: java.lang.reflect.
> InvocationTargetException
> at org.apache.hadoop.hbase.client.HConnectionManager.createConnection(
> HConnectionManager.java:413)
> at org.apache.hadoop.hbase.client.HConnectionManager.createConnection(
> HConnectionManager.java:306)
> at org.apache.kylin.storage.hbase.HBaseConnection.get(
> HBaseConnection.java:229)
> ... 25 more
> Caused by: java.lang.reflect.InvocationTargetException
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance(
> NativeConstructorAccessorImpl.java:57)
> at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(
> DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
> at org.apache.hadoop.hbase.client.HConnectionManager.createConnection(
> HConnectionManager.java:411)
> ... 27 more
> Caused by: java.lang.NoClassDefFoundError: org/cloudera/htrace/Trace
> at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(
> RecoverableZooKeeper.java:218)
> at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:479)
> at org.apache.hadoop.hbase.zookeeper.ZKClusterId.readClusterIdZNode(
> ZKClusterId.java:65)
> at org.apache.hadoop.hbase.client.ZooKeeperRegistry.getClusterId(
> ZooKeeperRegistry.java:83)
> at org.apache.hadoop.hbase.client.HConnectionManager$
> HConnectionImplementation.retrieveClusterId(HConnectionManager.java:839)
> at org.apache.hadoop.hbase.client.HConnectionManager$
> HConnectionImplementation.<init>(HConnectionManager.java:642)
> ... 32 more
> Caused by: java.lang.ClassNotFoundException: org.cloudera.htrace.Trace
> at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
> at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
> at java.security.AccessController.doPrivileged(Native Method)
> at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
> ... 38 more
> 17/06/22 11:35:19 INFO ClientCnxn: Opening socket connection to server
> 127.0.0.1/127.0.0.1:2181. Will not attempt to authenticate using SASL
> (unknown error)
> Exception in thread "main" java.lang.RuntimeException: error execute
> org.apache.kylin.engine.spark.SparkCubingByLayer
> at org.apache.kylin.common.util.AbstractApplication.execute(
> AbstractApplication.java:42)
> at org.apache.kylin.common.util.SparkEntry.main(SparkEntry.java:44)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(
> NativeMethodAccessorImpl.java:57)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(
> DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$
> deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
> at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(
> SparkSubmit.scala:181)
> at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
> at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
> at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> Caused by: java.lang.IllegalArgumentException: Failed to find metadata
> store by url: kylin_metadata_2_0_0@hbase
> at org.apache.kylin.common.persistence.ResourceStore.createResourceStore(
> ResourceStore.java:99)
> at org.apache.kylin.common.persistence.ResourceStore.
> getStore(ResourceStore.java:110)
> at org.apache.kylin.cube.CubeManager.getStore(CubeManager.java:820)
> at org.apache.kylin.cube.CubeManager.loadAllCubeInstance(
> CubeManager.java:740)
> at org.apache.kylin.cube.CubeManager.<init>(CubeManager.java:145)
> at org.apache.kylin.cube.CubeManager.getInstance(CubeManager.java:109)
> at org.apache.kylin.engine.spark.SparkCubingByLayer.execute(
> SparkCubingByLayer.java:160)
> at org.apache.kylin.common.util.AbstractApplication.execute(
> AbstractApplication.java:37)
> ... 10 more
> 17/06/22 11:35:19 INFO ClientCnxn: Socket connection established to
> 127.0.0.1/127.0.0.1:2181, initiating session
> 17/06/22 11:35:19 INFO SparkContext: Invoking stop() from shutdown hook
> --------------------------------------------------------------------
> Thanks for your attention!
> 2017-06-22
>
> skyyws
>
>
>
> From: ShaoFeng Shi <sh...@apache.org>
> Sent: 2017-06-22 11:42
> Subject: Re: Re: Re: Build sample error with spark on kylin 2.0.0
> To: "dev" <de...@kylin.apache.org>
> Cc:
>
> Hi Sky, glad to see it moving forward. The "Failed to find metadata store
> by url: kylin_metadata_2_0_0@hbase" is not the root cause. Could you check
> the log files further: is there any other error before or after this?
>
> 2017-06-21 20:43 GMT+08:00 skyyws <sk...@163.com>:
>
> > Thank you for your suggestion, Shaofeng Shi. I tried using Hadoop client
> > 2.7.3 and it worked. But I met another problem:
> > -------------------------- --------------------------
> > -------------------------- --------------------------
> > 17/06/21 20:20:39 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0,
> > hadoop645.lt.163.org): java.lang.IllegalArgumentException: Failed to
> find
> > metadata store by url: kylin_metadata_2_0
> > _0@hbase
> > at org.apache.kylin.common.persistence.ResourceStore.
> > createResourceStore(ResourceStore.java:99)
> > at org.apache.kylin.common.persistence.ResourceStore.
> > getStore(ResourceStore.java:110)
> > at org.apache.kylin.cube.CubeDescManager.getStore(
> > CubeDescManager.java:370)
> > at org.apache.kylin.cube.CubeDescManager.reloadAllCubeDesc(
> > CubeDescManager.java:298)
> > at org.apache.kylin.cube.CubeDescManager.<init>(
> > CubeDescManager.java:109)
> > at org.apache.kylin.cube.CubeDescManager.getInstance(
> > CubeDescManager.java:81)
> > at org.apache.kylin.cube.CubeInstance.getDescriptor(
> > CubeInstance.java:114)
> > at org.apache.kylin.cube.CubeSegment.getCubeDesc(
> > CubeSegment.java:119)
> > at org.apache.kylin.cube.kv.RowKeyEncoder.<init>(
> > RowKeyEncoder.java:50)
> > at org.apache.kylin.cube.kv.AbstractRowKeyEncoder.
> createInstance(
> > AbstractRowKeyEncoder.java:48)
> > at org.apache.kylin.engine.spark.SparkCubingByLayer$2.call(
> > SparkCubingByLayer.java:205)
> > at org.apache.kylin.engine.spark.SparkCubingByLayer$2.call(
> > SparkCubingByLayer.java:193)
> > at org.apache.spark.api.java.JavaPairRDD$$anonfun$
> > pairFunToScalaFun$1.apply(JavaPairRDD.scala:1018)
> > at org.apache.spark.api.java.JavaPairRDD$$anonfun$
> > pairFunToScalaFun$1.apply(JavaPairRDD.scala:1018)
> > at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
> > at org.apache.spark.util.collection.ExternalSorter.
> > insertAll(ExternalSorter.scala:191)
> > at org.apache.spark.shuffle.sort.SortShuffleWriter.write(
> > SortShuffleWriter.scala:64)
> > at org.apache.spark.scheduler.ShuffleMapTask.runTask(
> > ShuffleMapTask.scala:73)
> > at org.apache.spark.scheduler.ShuffleMapTask.runTask(
> > ShuffleMapTask.scala:41)
> > at org.apache.spark.scheduler.Task.run(Task.scala:89)
> > at org.apache.spark.executor.Executor$TaskRunner.run(
> > Executor.scala:227)
> > at java.util.concurrent.ThreadPoolExecutor.runWorker(
> > ThreadPoolExecutor.java:1145)
> > at java.util.concurrent.ThreadPoolExecutor$Worker.run(
> > ThreadPoolExecutor.java:615)
> > at java.lang.Thread.run(Thread.java:745)
> > 17/06/21 20:20:39 INFO TaskSetManager: Starting task 0.1 in stage 0.0
> (TID
> > 2, hadoop645.lt.163.org, partition 0,RACK_LOCAL, 3276 bytes)
> > 17/06/21 20:21:14 WARN TaskSetManager: Lost task 1.0 in stage 0.0 (TID 1,
> > hadoop645.lt.163.org): java.lang.IllegalArgumentException: Failed to
> find
> > metadata store by url: kylin_metadata_2_0
> > _0@hbase
> > at org.apache.kylin.common.persistence.ResourceStore.
> > createResourceStore(ResourceStore.java:99)
> > at org.apache.kylin.common.persistence.ResourceStore.
> > getStore(ResourceStore.java:110)
> > at org.apache.kylin.cube.CubeDescManager.getStore(
> > CubeDescManager.java:370)
> > at org.apache.kylin.cube.CubeDescManager.reloadAllCubeDesc(
> > CubeDescManager.java:298)
> > at org.apache.kylin.cube.CubeDescManager.<init>(
> > CubeDescManager.java:109)
> > at org.apache.kylin.cube.CubeDescManager.getInstance(
> > CubeDescManager.java:81)
> > at org.apache.kylin.cube.CubeInstance.getDescriptor(
> > CubeInstance.java:114)
> > at org.apache.kylin.cube.CubeSegment.getCubeDesc(
> > CubeSegment.java:119)
> > at org.apache.kylin.cube.kv.RowKeyEncoder.<init>(
> > RowKeyEncoder.java:50)
> > at org.apache.kylin.cube.kv.AbstractRowKeyEncoder.
> createInstance(
> > AbstractRowKeyEncoder.java:48)
> > at org.apache.kylin.engine.spark.SparkCubingByLayer$2.call(
> > SparkCubingByLayer.java:205)
> > at org.apache.kylin.engine.spark.SparkCubingByLayer$2.call(
> > SparkCubingByLayer.java:193)
> > at org.apache.spark.api.java.JavaPairRDD$$anonfun$
> > pairFunToScalaFun$1.apply(JavaPairRDD.scala:1018)
> > at org.apache.spark.api.java.JavaPairRDD$$anonfun$
> > pairFunToScalaFun$1.apply(JavaPairRDD.scala:1018)
> > at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
> > at org.apache.spark.util.collection.ExternalSorter.
> > insertAll(ExternalSorter.scala:191)
> > at org.apache.spark.shuffle.sort.SortShuffleWriter.write(
> > SortShuffleWriter.scala:64)
> > at org.apache.spark.scheduler.ShuffleMapTask.runTask(
> > ShuffleMapTask.scala:73)
> > at org.apache.spark.scheduler.ShuffleMapTask.runTask(
> > ShuffleMapTask.scala:41)
> > at org.apache.spark.scheduler.Task.run(Task.scala:89)
> > at org.apache.spark.executor.Executor$TaskRunner.run(
> > Executor.scala:227)
> > at java.util.concurrent.ThreadPoolExecutor.runWorker(
> > ThreadPoolExecutor.java:1145)
> > at java.util.concurrent.ThreadPoolExecutor$Worker.run(
> > ThreadPoolExecutor.java:615)
> > at java.lang.Thread.run(Thread.java:745)
> > -------------------------- --------------------------
> > -------------------------- --------------------------
> > But I can use Kylin's built-in spark-shell to get data from Hive and
> > HBase successfully, just like this:
> > -------------------------- --------------------------
> > -------------------------- --------------------------
> > sqlContext.sql("show tables").take(1)
> > -------------------------- --------------------------
> > -------------------------- --------------------------
> > import org.apache.spark._
> > import org.apache.spark.rdd.NewHadoopRDD
> > import org.apache.hadoop.fs.Path
> > import org.apache.hadoop.hbase.util.Bytes
> > import org.apache.hadoop.hbase.HColumnDescriptor
> > import org.apache.hadoop.hbase.{HBaseConfiguration, HTableDescriptor}
> > import org.apache.hadoop.hbase.client.{HBaseAdmin, Put, HTable, Result}
> > import org.apache.hadoop.hbase.mapreduce.TableInputFormat
> > import org.apache.hadoop.hbase.io.ImmutableBytesWritable
> > val conf = HBaseConfiguration.create()
> > conf.set("hbase.zookeeper.quorum", "localhost")
> > conf.set(TableInputFormat.INPUT_TABLE, "test_table")
> > val hBaseRDD = sc.newAPIHadoopRDD(conf, classOf[TableInputFormat],
> > classOf[org.apache.hadoop.hbase.io.ImmutableBytesWritable],
> > classOf[org.apache.hadoop.hbase.client.Result])
> >
> > val res = hBaseRDD.take(1)
> > val rs = res(0)._2
> > val kv = rs.raw
> > for(keyvalue <- kv) println("rowkey:"+ new String(keyvalue.getRow)+ "
> > cf:"+new String(keyvalue.getFamily()) + " column:" + new
> > String(keyvalue.getQualifier) + " " + "value:"+new
> > String(keyvalue.getValue()))
> > -------------------------- --------------------------
> > -------------------------- --------------------------
> > By the way, I've already put hive-site.xml and hbase-site.xml into
> > HADOOP_CONF_DIR and $SPARK_HOME/conf (which is actually
> > $KYLIN_HOME/spark/conf), and I also set spark.driver.extraClassPath in
> > spark-defaults.conf to attach some related jars (hbase-client.jar,
> > hbase-common.jar and so on).
> > I don't know why; could anyone give me some advice?
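[Editor's note: the classpath setup described above can be sketched as follows; the jar names and directories are placeholders, not the poster's actual values.]

```shell
#!/bin/sh
# Build a spark.driver.extraClassPath value from a list of jars
# (colon-separated, as Spark expects for driver classpath entries).
join_classpath() {
  out=""
  for jar in "$@"; do
    if [ -z "$out" ]; then out="$jar"; else out="$out:$jar"; fi
  done
  printf '%s\n' "$out"
}

# Placeholder jars; point these at the real HBase client jars.
CP=$(join_classpath /ext_lib/hbase-client.jar /ext_lib/hbase-common.jar)
echo "spark.driver.extraClassPath  $CP"
```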
> > 2017-06-21
> >
> > skyyws
> >
> >
> >
> > From: ShaoFeng Shi <sh...@apache.org>
> > Sent: 2017-06-20 15:13
> > Subject: Re: Re: Build sample error with spark on kylin 2.0.0
> > To: "dev" <de...@kylin.apache.org>
> > Cc:
> >
> > Or you can check whether there is old hadoop jars on your cluster,
> > according to https://issues.apache.org/jira/browse/HADOOP-11064
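[Editor's note: one way to spot the stale-jar situation HADOOP-11064 describes is to compare the hadoop-common versions visible to Spark against the cluster's. The directories below are assumptions; point them at the real installs.]

```shell
#!/bin/sh
# Pull the version suffix out of a hadoop-common jar filename so versions
# from different directories can be compared at a glance.
jar_version() {
  basename "$1" .jar | sed 's/^hadoop-common-//'
}

# Directories are assumptions; adjust to the actual Spark and Hadoop libs.
for dir in /xxx/spark-1.6.2-bin-hadoop2.7/lib /usr/lib/hadoop; do
  find "$dir" -name 'hadoop-common-*.jar' 2>/dev/null | while read -r jar; do
    echo "$dir -> $(jar_version "$jar")"
  done
done
```

A version mismatch between the two directories is the symptom to look for.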
> >
> >
> > 2017-06-20 9:33 GMT+08:00 skyyws <sk...@163.com>:
> >
> > > No, I deploy kylin on linux, this is my machine info:
> > > --------------------------
> > > 3.2.0-4-amd64 #1 SMP Debian 3.2.82-1 x86_64 GNU/Linux
> > > -------------------------
> > >
> > > 2017-06-20
> > >
> > > skyyws
> > >
> > >
> > >
> > > From: ShaoFeng Shi <sh...@apache.org>
> > > Sent: 2017-06-20 00:10
> > > Subject: Re: Build sample error with spark on kylin 2.0.0
> > > To: "dev" <de...@kylin.apache.org>
> > > Cc:
> > >
> > > Are you running Kylin on windows? If yes, check:
> > > https://stackoverflow.com/questions/33211599/hadoop-
> > > error-on-windows-java-lang-unsatisfiedlinkerror
> > >
> > > 2017-06-19 21:55 GMT+08:00 skyyws <sk...@163.com>:
> > >
> > > > Hi all,
> > > > I met an error when using spark engine build kylin sample on step
> > "Build
> > > > Cube with Spark", here is the exception log:
> > > > ------------------------------------------------------------
> > > > -----------------------------
> > > > Exception in thread "main" java.lang.UnsatisfiedLinkError:
> > > > org.apache.hadoop.util.NativeCrc32.nativeComputeChunkedSumsByteAr
> > > > ray(II[BI[BIILjava/lang/String;JZ)V
> > > > at org.apache.hadoop.util.NativeCrc32.
> > > > nativeComputeChunkedSumsByteArray(Native Method)
> > > > at org.apache.hadoop.util.NativeCrc32.
> > > > calculateChunkedSumsByteArray(NativeCrc32.java:86)
> > > > at org.apache.hadoop.util.DataChecksum.calculateChunkedSums(
> > > > DataChecksum.java:430)
> > > > at org.apache.hadoop.fs.FSOutputSummer.writeChecksumChunks(
> > > > FSOutputSummer.java:202)
> > > > at org.apache.hadoop.fs.FSOutputSummer.write1(
> > > > FSOutputSummer.java:124)
> > > > at org.apache.hadoop.fs.FSOutputSummer.write(
> > > > FSOutputSummer.java:110)
> > > > at org.apache.hadoop.fs.FSDataOutputStream$
> > PositionCache.write(
> > > > FSDataOutputStream.java:58)
> > > > at java.io.DataOutputStream.write(DataOutputStream.java:107)
> > > > at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:80)
> > > > at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:52)
> > > > at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:112)
> > > > at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:366)
> > > > at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:338)
> > > > at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:289)
> > > > at org.apache.spark.deploy.yarn.Client.copyFileToRemote(
> > > > Client.scala:317)
> > > > at org.apache.spark.deploy.yarn.Client.org$apache$spark$
> > > > deploy$yarn$Client$$distribute$1(Client.scala:407)
> > > > at org.apache.spark.deploy.yarn.Client$$anonfun$
> > > > prepareLocalResources$5.apply(Client.scala:446)
> > > > at org.apache.spark.deploy.yarn.Client$$anonfun$
> > > > prepareLocalResources$5.apply(Client.scala:444)
> > > > at scala.collection.immutable.List.foreach(List.scala:318)
> > > > at org.apache.spark.deploy.yarn.
> Client.prepareLocalResources(
> > > > Client.scala:444)
> > > > at org.apache.spark.deploy.yarn.Client.
> > > > createContainerLaunchContext(Client.scala:727)
> > > > at org.apache.spark.deploy.yarn.Client.submitApplication(
> > > > Client.scala:142)
> > > > at org.apache.spark.scheduler.cluster.
> > > YarnClientSchedulerBackend.
> > > > start(YarnClientSchedulerBackend.scala:57)
> > > > at org.apache.spark.scheduler.TaskSchedulerImpl.start(
> > > > TaskSchedulerImpl.scala:144)
> > > > at org.apache.spark.SparkContext.
> > <init>(SparkContext.scala:530)
> > > > at org.apache.spark.api.java.JavaSparkContext.<init>(
> > > > JavaSparkContext.scala:59)
> > > > at org.apache.kylin.engine.spark.SparkCubingByLayer.execute(
> > > > SparkCubingByLayer.java:150)
> > > > at org.apache.kylin.common.util.AbstractApplication.execute(
> > > > AbstractApplication.java:37)
> > > > at org.apache.kylin.common.util.SparkEntry.main(SparkEntry.
> > > > java:44)
> > > > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native
> Method)
> > > > at sun.reflect.NativeMethodAccessorImpl.invoke(
> > > > NativeMethodAccessorImpl.java:57)
> > > > at sun.reflect.DelegatingMethodAccessorImpl.invoke(
> > > > DelegatingMethodAccessorImpl.java:43)
> > > > at java.lang.reflect.Method.invoke(Method.java:606)
> > > > at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$
> > > > deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
> > > > at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(
> > > > SparkSubmit.scala:181)
> > > > at org.apache.spark.deploy.SparkSubmit$.submit(
> > > > SparkSubmit.scala:206)
> > > > at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.
> > > > scala:121)
> > > > at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.
> scala)
> > > > 17/06/19 21:22:06 INFO storage.DiskBlockManager: Shutdown hook called
> > > > 17/06/19 21:22:06 INFO util.ShutdownHookManager: Shutdown hook called
> > > > 17/06/19 21:22:06 INFO util.ShutdownHookManager: Deleting directory
> > > > /tmp/spark-0d1d3709-86cd-446c-b728-5070f168de28
> > > > 17/06/19 21:22:06 INFO util.ShutdownHookManager: Deleting directory
> > > > /tmp/spark-0d1d3709-86cd-446c-b728-5070f168de28/httpd-
> > > > 9bcb9a5d-569f-4f28-ad89-038a9020eda8
> > > > 17/06/19 21:22:06 INFO util.ShutdownHookManager: Deleting directory
> > > > /tmp/spark-0d1d3709-86cd-446c-b728-5070f168de28/userFiles-
> > > > 2e9ff265-3d37-40e0-8894-6fd4d1a3ad8b
> > > >
> > > > at org.apache.kylin.common.util.CliCommandExecutor.execute(
> > > > CliCommandExecutor.java:92)
> > > > at org.apache.kylin.engine.spark.SparkExecutable.doWork(
> > > > SparkExecutable.java:124)
> > > > at org.apache.kylin.job.execution.AbstractExecutable.
> > > > execute(AbstractExecutable.java:124)
> > > > at org.apache.kylin.job.execution.DefaultChainedExecutable.
> > > doWork(
> > > > DefaultChainedExecutable.java:64)
> > > > at org.apache.kylin.job.execution.AbstractExecutable.
> > > > execute(AbstractExecutable.java:124)
> > > > at org.apache.kylin.job.impl.threadpool.DefaultScheduler$
> > > > JobRunner.run(DefaultScheduler.java:142)
> > > > at java.util.concurrent.ThreadPoolExecutor.runWorker(
> > > > ThreadPoolExecutor.java:1145)
> > > > at java.util.concurrent.ThreadPoolExecutor$Worker.run(
> > > > ThreadPoolExecutor.java:615)
> > > > at java.lang.Thread.run(Thread.java:745)
> > > >
> > > > ------------------------------------------------------------
> > > > -----------------------------
> > > > I can use Kylin's built-in spark-shell to do some operations, like:
> > > > ------------------------------------------------------------
> > > > -----------------------------
> > > > var textFile = sc.textFile("hdfs://xxxx/xxxx/README.md")
> > > > textFile.count()
> > > > textFile.first()
> > > > textFile.filter(line => line.contains("hello")).count()
> > > > ------------------------------------------------------------
> > > > -----------------------------
> > > > Here is the env info:
> > > > kylin version is 2.0.0
> > > > hadoop version is 2.7.*
> > > > spark version is 1.6.*
> > > > ------------------------------------------------------------
> > > > -----------------------------
> > > > Can anyone help me? Thanks!
> > > >
> > > >
> > > > 2017-06-19
> > > > skyyws
> > >
> > >
> > >
> > >
> > > --
> > > Best regards,
> > >
> > > Shaofeng Shi 史少锋
> > >
> >
> >
> >
> > --
> > Best regards,
> >
> > Shaofeng Shi 史少锋
> >
>
>
>
> --
> Best regards,
>
> Shaofeng Shi 史少锋
>
--
Best regards,
Shaofeng Shi 史少锋
Re: Re: Re: Build sample error with spark on kylin 2.0.0
Posted by ShaoFeng Shi <sh...@apache.org>.
The root cause should be " java.lang.NoClassDefFoundError:
org/cloudera/htrace/Trace". Please locate the "htrace-core.jar" in local
disk first, and then re-run the spark-submit command with adding this jar
in the param "--jars". If it works this time, then you can configure this
jar path in kylin.properties:
kylin.engine.spark.additional-jars=/path/to/htrace-core.jar
Then restart Kylin and resume the job.
2017-06-22 12:02 GMT+08:00 skyyws <sk...@163.com>:
> Hi ShaoFeng , no more other error msg before or after this in kylin.log,
> but I try to execute cmd by spart-submit directly like this:
> --------------------------------------------------------------------
> ./spark-submit --class org.apache.kylin.common.util.SparkEntry --conf
> spark.executor.instances=1 --conf spark.yarn.queue=default --conf
> spark.history.fs.logDirectory=hdfs://xxx/user/user1/kylin_2_0_0_test/spark-history
> --conf spark.master=yarn --conf spark.executor.memory=1G --conf
> spark.eventLog.enabled=true --conf spark.eventLog.dir=hdfs://xxx/
> user/user1/kylin_2_0_0_test/spark-history --conf spark.executor.cores=2
> --conf spark.submit.deployMode=cluster --files /xxx/hbase-0.98.8-hadoop2/conf/hbase-site.xml
> --jars /xxx/kylin-deploy/kylin-2.0.0/lib/kylin-job-2.0.0.jar,/user/
> user1/ext_lib/htrace-core-2.04.jar,/user/user1/ext_lib/
> hbase-client-0.98.8-hadoop2.jar,/user/user1/ext_lib/hbase-
> common-0.98.8-hadoop2.jar,/user/user1/ext_lib/hbase-
> protocol-0.98.8-hadoop2.jar,/user/user1/ext_lib/metrics-
> core-2.2.0.jar,/user/user1/ext_lib/guava-12.0.1.jar,
> /xxx/kylin-deploy/kylin-2.0.0/lib/kylin-job-2.0.0.jar -className
> org.apache.kylin.engine.spark.SparkCubingByLayer -hiveTable
> default.kylin_intermediate_kylin_sales_test_cube_a0cd9950_cddc_4c3b_aaa5_fddf87d1fdaa
> -segmentId a0cd9950-cddc-4c3b-aaa5-fddf87d1fdaa -confPath
> /xxx/kylin-deploy/kylin-2.0.0/conf -output hdfs:///user/user1/kylin_2_0_
> 0_test/kylin_metadata_2_0_0/kylin-7a376cb7-7ee7-43fd-95dd-
> 79c2c1999f40/kylin_sales_test_cube/cuboid/ -cubename kylin_sales_test_cube
> --------------------------------------------------------------------
> Then, I got some other msg from spark, here is the full error msg:
> --------------------------------------------------------------------
> 17/06/22 11:35:19 ERROR HBaseConnection: Error when open connection hbase
> java.io.IOException: java.lang.reflect.InvocationTargetException
> at org.apache.hadoop.hbase.client.HConnectionManager.createConnection(HConnectionManager.java:413)
> at org.apache.hadoop.hbase.client.HConnectionManager.createConnection(HConnectionManager.java:306)
> at org.apache.kylin.storage.hbase.HBaseConnection.get(HBaseConnection.java:229)
> at org.apache.kylin.storage.hbase.HBaseResourceStore.getConnection(HBaseResourceStore.java:72)
> at org.apache.kylin.storage.hbase.HBaseResourceStore.createHTableIfNeeded(HBaseResourceStore.java:89)
> at org.apache.kylin.storage.hbase.HBaseResourceStore.<init>(HBaseResourceStore.java:85)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
> at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
> at org.apache.kylin.common.persistence.ResourceStore.createResourceStore(ResourceStore.java:91)
> at org.apache.kylin.common.persistence.ResourceStore.getStore(ResourceStore.java:110)
> at org.apache.kylin.cube.CubeManager.getStore(CubeManager.java:820)
> at org.apache.kylin.cube.CubeManager.loadAllCubeInstance(CubeManager.java:740)
> at org.apache.kylin.cube.CubeManager.<init>(CubeManager.java:145)
> at org.apache.kylin.cube.CubeManager.getInstance(CubeManager.java:109)
> at org.apache.kylin.engine.spark.SparkCubingByLayer.execute(SparkCubingByLayer.java:160)
> at org.apache.kylin.common.util.AbstractApplication.execute(AbstractApplication.java:37)
> at org.apache.kylin.common.util.SparkEntry.main(SparkEntry.java:44)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
> at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
> at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
> at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
> at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> Caused by: java.lang.reflect.InvocationTargetException
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
> at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
> at org.apache.hadoop.hbase.client.HConnectionManager.createConnection(HConnectionManager.java:411)
> ... 27 more
> Caused by: java.lang.NoClassDefFoundError: org/cloudera/htrace/Trace
> at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:218)
> at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:479)
> at org.apache.hadoop.hbase.zookeeper.ZKClusterId.readClusterIdZNode(ZKClusterId.java:65)
> at org.apache.hadoop.hbase.client.ZooKeeperRegistry.getClusterId(ZooKeeperRegistry.java:83)
> at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.retrieveClusterId(HConnectionManager.java:839)
> at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.<init>(HConnectionManager.java:642)
> ... 32 more
> Caused by: java.lang.ClassNotFoundException: org.cloudera.htrace.Trace
> at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
> at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
> at java.security.AccessController.doPrivileged(Native Method)
> at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
> ... 38 more
> 17/06/22 11:35:19 ERROR ResourceStore: Create new store instance failed
> java.lang.reflect.InvocationTargetException
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
> at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
> at org.apache.kylin.common.persistence.ResourceStore.createResourceStore(ResourceStore.java:91)
> at org.apache.kylin.common.persistence.ResourceStore.getStore(ResourceStore.java:110)
> at org.apache.kylin.cube.CubeManager.getStore(CubeManager.java:820)
> at org.apache.kylin.cube.CubeManager.loadAllCubeInstance(CubeManager.java:740)
> at org.apache.kylin.cube.CubeManager.<init>(CubeManager.java:145)
> at org.apache.kylin.cube.CubeManager.getInstance(CubeManager.java:109)
> at org.apache.kylin.engine.spark.SparkCubingByLayer.execute(SparkCubingByLayer.java:160)
> at org.apache.kylin.common.util.AbstractApplication.execute(AbstractApplication.java:37)
> at org.apache.kylin.common.util.SparkEntry.main(SparkEntry.java:44)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
> at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
> at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
> at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
> at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> Caused by: java.lang.IllegalArgumentException: File not exist by 'kylin_metadata_2_0_0@hbase': /xxx/spark-1.6.2-bin-hadoop2.7/kylin_metadata_2_0_0@hbase
> at org.apache.kylin.common.persistence.FileResourceStore.<init>(FileResourceStore.java:49)
> ... 22 more
> 17/06/22 11:35:19 ERROR ResourceStore: Create new store instance failed
> java.lang.reflect.InvocationTargetException
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
> at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
> at org.apache.kylin.common.persistence.ResourceStore.createResourceStore(ResourceStore.java:91)
> at org.apache.kylin.common.persistence.ResourceStore.getStore(ResourceStore.java:110)
> at org.apache.kylin.cube.CubeManager.getStore(CubeManager.java:820)
> at org.apache.kylin.cube.CubeManager.loadAllCubeInstance(CubeManager.java:740)
> at org.apache.kylin.cube.CubeManager.<init>(CubeManager.java:145)
> at org.apache.kylin.cube.CubeManager.getInstance(CubeManager.java:109)
> at org.apache.kylin.engine.spark.SparkCubingByLayer.execute(SparkCubingByLayer.java:160)
> at org.apache.kylin.common.util.AbstractApplication.execute(AbstractApplication.java:37)
> at org.apache.kylin.common.util.SparkEntry.main(SparkEntry.java:44)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
> at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
> at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
> at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
> at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> Caused by: org.apache.kylin.common.persistence.StorageException: Error when open connection hbase
> at org.apache.kylin.storage.hbase.HBaseConnection.get(HBaseConnection.java:242)
> at org.apache.kylin.storage.hbase.HBaseResourceStore.getConnection(HBaseResourceStore.java:72)
> at org.apache.kylin.storage.hbase.HBaseResourceStore.createHTableIfNeeded(HBaseResourceStore.java:89)
> at org.apache.kylin.storage.hbase.HBaseResourceStore.<init>(HBaseResourceStore.java:85)
> ... 22 more
> Caused by: java.io.IOException: java.lang.reflect.InvocationTargetException
> at org.apache.hadoop.hbase.client.HConnectionManager.createConnection(HConnectionManager.java:413)
> at org.apache.hadoop.hbase.client.HConnectionManager.createConnection(HConnectionManager.java:306)
> at org.apache.kylin.storage.hbase.HBaseConnection.get(HBaseConnection.java:229)
> ... 25 more
> Caused by: java.lang.reflect.InvocationTargetException
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
> at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
> at org.apache.hadoop.hbase.client.HConnectionManager.createConnection(HConnectionManager.java:411)
> ... 27 more
> Caused by: java.lang.NoClassDefFoundError: org/cloudera/htrace/Trace
> at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:218)
> at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:479)
> at org.apache.hadoop.hbase.zookeeper.ZKClusterId.readClusterIdZNode(ZKClusterId.java:65)
> at org.apache.hadoop.hbase.client.ZooKeeperRegistry.getClusterId(ZooKeeperRegistry.java:83)
> at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.retrieveClusterId(HConnectionManager.java:839)
> at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.<init>(HConnectionManager.java:642)
> ... 32 more
> Caused by: java.lang.ClassNotFoundException: org.cloudera.htrace.Trace
> at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
> at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
> at java.security.AccessController.doPrivileged(Native Method)
> at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
> ... 38 more
> 17/06/22 11:35:19 INFO ClientCnxn: Opening socket connection to server 127.0.0.1/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
> Exception in thread "main" java.lang.RuntimeException: error execute org.apache.kylin.engine.spark.SparkCubingByLayer
> at org.apache.kylin.common.util.AbstractApplication.execute(AbstractApplication.java:42)
> at org.apache.kylin.common.util.SparkEntry.main(SparkEntry.java:44)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
> at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
> at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
> at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
> at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> Caused by: java.lang.IllegalArgumentException: Failed to find metadata store by url: kylin_metadata_2_0_0@hbase
> at org.apache.kylin.common.persistence.ResourceStore.createResourceStore(ResourceStore.java:99)
> at org.apache.kylin.common.persistence.ResourceStore.getStore(ResourceStore.java:110)
> at org.apache.kylin.cube.CubeManager.getStore(CubeManager.java:820)
> at org.apache.kylin.cube.CubeManager.loadAllCubeInstance(CubeManager.java:740)
> at org.apache.kylin.cube.CubeManager.<init>(CubeManager.java:145)
> at org.apache.kylin.cube.CubeManager.getInstance(CubeManager.java:109)
> at org.apache.kylin.engine.spark.SparkCubingByLayer.execute(SparkCubingByLayer.java:160)
> at org.apache.kylin.common.util.AbstractApplication.execute(AbstractApplication.java:37)
> ... 10 more
> 17/06/22 11:35:19 INFO ClientCnxn: Socket connection established to 127.0.0.1/127.0.0.1:2181, initiating session
> 17/06/22 11:35:19 INFO SparkContext: Invoking stop() from shutdown hook
> --------------------------------------------------------------------
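[Editor's note, not part of the original thread.] The chain above bottoms out in `ClassNotFoundException: org.cloudera.htrace.Trace`, meaning HBase 0.98's htrace dependency was not on the classpath that the driver actually ran with, even though htrace-core-2.04.jar appears in `--jars`. One quick way to see which jar (if any) carries a class: zip archives store entry names uncompressed, so a plain grep over each jar file finds entries like `org/cloudera/htrace/Trace.class`. The `find_class_jar` helper below is hypothetical, a diagnostic sketch only:

```shell
#!/bin/sh
# find_class_jar CLASS_PATH_STRING JAR... : print every jar whose archive
# contains the given class path string. Works because zip entry names are
# stored uncompressed inside the archive, so grep sees them directly.
find_class_jar() {
  cls="$1"; shift
  for j in "$@"; do
    grep -q "$cls" "$j" 2>/dev/null && echo "$j"
  done
}

# Example (paths from the thread, adjust to your environment):
# find_class_jar 'org/cloudera/htrace/Trace' /user/user1/ext_lib/*.jar
```

If none of the submitted jars prints, the class simply never reached the JVM; if one does, the problem is more likely classpath ordering or cluster-vs-client path resolution.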
> Thanks for your attention!
> 2017-06-22
>
> skyyws
>
>
>
> From: ShaoFeng Shi <sh...@apache.org>
> Date: 2017-06-22 11:42
> Subject: Re: Re: Re: Build sample error with spark on kylin 2.0.0
> To: "dev" <de...@kylin.apache.org>
> Cc:
>
> hi Sky, glad to see it moves forward. The "Failed to find metadata store by url: kylin_metadata_2_0_0@hbase" is not the root cause. Could you check the log files further: is there any other error before or after this?
>
> 2017-06-21 20:43 GMT+08:00 skyyws <sk...@163.com>:
>
> > Thank you for your suggestion, Shaofeng Shi. I tried hadoop client 2.7.3 and it worked. But I met another problem:
> > -------------------------- --------------------------
> > -------------------------- --------------------------
> > 17/06/21 20:20:39 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, hadoop645.lt.163.org): java.lang.IllegalArgumentException: Failed to find metadata store by url: kylin_metadata_2_0_0@hbase
> > at org.apache.kylin.common.persistence.ResourceStore.createResourceStore(ResourceStore.java:99)
> > at org.apache.kylin.common.persistence.ResourceStore.getStore(ResourceStore.java:110)
> > at org.apache.kylin.cube.CubeDescManager.getStore(CubeDescManager.java:370)
> > at org.apache.kylin.cube.CubeDescManager.reloadAllCubeDesc(CubeDescManager.java:298)
> > at org.apache.kylin.cube.CubeDescManager.<init>(CubeDescManager.java:109)
> > at org.apache.kylin.cube.CubeDescManager.getInstance(CubeDescManager.java:81)
> > at org.apache.kylin.cube.CubeInstance.getDescriptor(CubeInstance.java:114)
> > at org.apache.kylin.cube.CubeSegment.getCubeDesc(CubeSegment.java:119)
> > at org.apache.kylin.cube.kv.RowKeyEncoder.<init>(RowKeyEncoder.java:50)
> > at org.apache.kylin.cube.kv.AbstractRowKeyEncoder.createInstance(AbstractRowKeyEncoder.java:48)
> > at org.apache.kylin.engine.spark.SparkCubingByLayer$2.call(SparkCubingByLayer.java:205)
> > at org.apache.kylin.engine.spark.SparkCubingByLayer$2.call(SparkCubingByLayer.java:193)
> > at org.apache.spark.api.java.JavaPairRDD$$anonfun$pairFunToScalaFun$1.apply(JavaPairRDD.scala:1018)
> > at org.apache.spark.api.java.JavaPairRDD$$anonfun$pairFunToScalaFun$1.apply(JavaPairRDD.scala:1018)
> > at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
> > at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:191)
> > at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:64)
> > at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
> > at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
> > at org.apache.spark.scheduler.Task.run(Task.scala:89)
> > at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:227)
> > at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> > at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> > at java.lang.Thread.run(Thread.java:745)
> > 17/06/21 20:20:39 INFO TaskSetManager: Starting task 0.1 in stage 0.0 (TID 2, hadoop645.lt.163.org, partition 0,RACK_LOCAL, 3276 bytes)
> > 17/06/21 20:21:14 WARN TaskSetManager: Lost task 1.0 in stage 0.0 (TID 1, hadoop645.lt.163.org): java.lang.IllegalArgumentException: Failed to find metadata store by url: kylin_metadata_2_0_0@hbase
> > at org.apache.kylin.common.persistence.ResourceStore.createResourceStore(ResourceStore.java:99)
> > at org.apache.kylin.common.persistence.ResourceStore.getStore(ResourceStore.java:110)
> > at org.apache.kylin.cube.CubeDescManager.getStore(CubeDescManager.java:370)
> > at org.apache.kylin.cube.CubeDescManager.reloadAllCubeDesc(CubeDescManager.java:298)
> > at org.apache.kylin.cube.CubeDescManager.<init>(CubeDescManager.java:109)
> > at org.apache.kylin.cube.CubeDescManager.getInstance(CubeDescManager.java:81)
> > at org.apache.kylin.cube.CubeInstance.getDescriptor(CubeInstance.java:114)
> > at org.apache.kylin.cube.CubeSegment.getCubeDesc(CubeSegment.java:119)
> > at org.apache.kylin.cube.kv.RowKeyEncoder.<init>(RowKeyEncoder.java:50)
> > at org.apache.kylin.cube.kv.AbstractRowKeyEncoder.createInstance(AbstractRowKeyEncoder.java:48)
> > at org.apache.kylin.engine.spark.SparkCubingByLayer$2.call(SparkCubingByLayer.java:205)
> > at org.apache.kylin.engine.spark.SparkCubingByLayer$2.call(SparkCubingByLayer.java:193)
> > at org.apache.spark.api.java.JavaPairRDD$$anonfun$pairFunToScalaFun$1.apply(JavaPairRDD.scala:1018)
> > at org.apache.spark.api.java.JavaPairRDD$$anonfun$pairFunToScalaFun$1.apply(JavaPairRDD.scala:1018)
> > at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
> > at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:191)
> > at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:64)
> > at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
> > at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
> > at org.apache.spark.scheduler.Task.run(Task.scala:89)
> > at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:227)
> > at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> > at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> > at java.lang.Thread.run(Thread.java:745)
> > -------------------------- --------------------------
> > -------------------------- --------------------------
> > But I can use the Kylin built-in spark-shell to get data from Hive and HBase successfully, just like this:
> > -------------------------- --------------------------
> > -------------------------- --------------------------
> > sqlContext.sql("show tables").take(1)
> > -------------------------- --------------------------
> > -------------------------- --------------------------
> > import org.apache.spark._
> > import org.apache.spark.rdd.NewHadoopRDD
> > import org.apache.hadoop.fs.Path
> > import org.apache.hadoop.hbase.util.Bytes
> > import org.apache.hadoop.hbase.HColumnDescriptor
> > import org.apache.hadoop.hbase.{HBaseConfiguration, HTableDescriptor}
> > import org.apache.hadoop.hbase.client.{HBaseAdmin, Put, HTable, Result}
> > import org.apache.hadoop.hbase.mapreduce.TableInputFormat
> > import org.apache.hadoop.hbase.io.ImmutableBytesWritable
> > val conf = HBaseConfiguration.create()
> > conf.set("hbase.zookeeper.quorum", "localhost")
> > conf.set(TableInputFormat.INPUT_TABLE, "test_table")
> > val hBaseRDD = sc.newAPIHadoopRDD(conf, classOf[TableInputFormat], classOf[org.apache.hadoop.hbase.io.ImmutableBytesWritable], classOf[org.apache.hadoop.hbase.client.Result])
> >
> > val res = hBaseRDD.take(1)
> > val rs = res(0)._2
> > val kv = rs.raw
> > for (keyvalue <- kv) println("rowkey:" + new String(keyvalue.getRow) + " cf:" + new String(keyvalue.getFamily()) + " column:" + new String(keyvalue.getQualifier) + " value:" + new String(keyvalue.getValue()))
> > -------------------------- --------------------------
> > -------------------------- --------------------------
> > By the way, I've already put hive-site.xml and hbase-site.xml into HADOOP_CONF_DIR and $SPARK_HOME/conf (which is actually $KYLIN_HOME/spark/conf), and I also set spark.driver.extraClassPath in spark-defaults.conf to attach some related jars (hbase-client.jar, hbase-common.jar and so on).
> > I don't know why; could anyone give me some advice?
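[Editor's note, not from the original thread.] Since the "Failed to find metadata store" above is thrown inside executor tasks, a point worth checking is that driver-only settings such as `spark.driver.extraClassPath` do not reach executors; files and jars must be shipped explicitly. A command-line sketch only, under the assumption that the placeholder paths from the thread are real, of how config files and jars could be distributed to both sides:

```shell
# Sketch: distribute HBase/Kylin config and dependency jars to executors too.
# All /xxx/... paths are placeholders copied from the thread, not real paths.
spark-submit \
  --master yarn --deploy-mode cluster \
  --files /xxx/hbase-0.98.8-hadoop2/conf/hbase-site.xml,/xxx/kylin-deploy/kylin-2.0.0/conf/kylin.properties \
  --jars /xxx/kylin-deploy/kylin-2.0.0/lib/kylin-job-2.0.0.jar \
  --conf spark.executor.extraClassPath=kylin-job-2.0.0.jar \
  --class org.apache.kylin.common.util.SparkEntry \
  /xxx/kylin-deploy/kylin-2.0.0/lib/kylin-job-2.0.0.jar
```

Files passed via `--files` land in each container's working directory, which is why `spark.executor.extraClassPath` can use the bare jar name.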
> > 2017-06-21
> >
> > skyyws
> >
> >
> >
> > From: ShaoFeng Shi <sh...@apache.org>
> > Date: 2017-06-20 15:13
> > Subject: Re: Re: Build sample error with spark on kylin 2.0.0
> > To: "dev" <de...@kylin.apache.org>
> > Cc:
> >
> > Or you can check whether there are old hadoop jars on your cluster; see https://issues.apache.org/jira/browse/HADOOP-11064
> >
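[Editor's note, not part of the original thread.] Mixed Hadoop versions on the classpath, the HADOOP-11064 scenario referenced above, show up as two different `hadoop-*` jar versions reaching the same JVM, which can trigger exactly this kind of native-method `UnsatisfiedLinkError`. A small sketch; `list_jar_versions` is a hypothetical helper name:

```shell
#!/bin/sh
# list_jar_versions PATTERN CLASSPATH : print the distinct basenames of
# classpath entries matching PATTERN. Two different hadoop-common versions
# appearing here indicates a mixed-version classpath.
list_jar_versions() {
  pattern="$1" classpath="$2"
  printf '%s\n' "$classpath" | tr ':' '\n' | grep -E "$pattern" | while read -r p; do
    basename "$p"
  done | sort -u
}

# Example: inspect what the Hadoop client itself would put on the classpath
# list_jar_versions 'hadoop-(common|hdfs)' "$(hadoop classpath)"
```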
> >
> > 2017-06-20 9:33 GMT+08:00 skyyws <sk...@163.com>:
> >
> > > No, I deploy kylin on linux, this is my machine info:
> > > --------------------------
> > > 3.2.0-4-amd64 #1 SMP Debian 3.2.82-1 x86_64 GNU/Linux
> > > -------------------------
> > >
> > > 2017-06-20
> > >
> > > skyyws
> > >
> > >
> > >
> > > From: ShaoFeng Shi <sh...@apache.org>
> > > Date: 2017-06-20 00:10
> > > Subject: Re: Build sample error with spark on kylin 2.0.0
> > > To: "dev" <de...@kylin.apache.org>
> > > Cc:
> > >
> > > Are you running Kylin on windows? If yes, check:
> > > https://stackoverflow.com/questions/33211599/hadoop-error-on-windows-java-lang-unsatisfiedlinkerror
> > >
> > > 2017-06-19 21:55 GMT+08:00 skyyws <sk...@163.com>:
> > >
> > > > Hi all,
> > > > I met an error when using spark engine build kylin sample on step "Build Cube with Spark", here is the exception log:
> > > > -----------------------------------------------------------------------------------------
> > > > Exception in thread "main" java.lang.UnsatisfiedLinkError: org.apache.hadoop.util.NativeCrc32.nativeComputeChunkedSumsByteArray(II[BI[BIILjava/lang/String;JZ)V
> > > > at org.apache.hadoop.util.NativeCrc32.nativeComputeChunkedSumsByteArray(Native Method)
> > > > at org.apache.hadoop.util.NativeCrc32.calculateChunkedSumsByteArray(NativeCrc32.java:86)
> > > > at org.apache.hadoop.util.DataChecksum.calculateChunkedSums(DataChecksum.java:430)
> > > > at org.apache.hadoop.fs.FSOutputSummer.writeChecksumChunks(FSOutputSummer.java:202)
> > > > at org.apache.hadoop.fs.FSOutputSummer.write1(FSOutputSummer.java:124)
> > > > at org.apache.hadoop.fs.FSOutputSummer.write(FSOutputSummer.java:110)
> > > > at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:58)
> > > > at java.io.DataOutputStream.write(DataOutputStream.java:107)
> > > > at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:80)
> > > > at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:52)
> > > > at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:112)
> > > > at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:366)
> > > > at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:338)
> > > > at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:289)
> > > > at org.apache.spark.deploy.yarn.Client.copyFileToRemote(Client.scala:317)
> > > > at org.apache.spark.deploy.yarn.Client.org$apache$spark$deploy$yarn$Client$$distribute$1(Client.scala:407)
> > > > at org.apache.spark.deploy.yarn.Client$$anonfun$prepareLocalResources$5.apply(Client.scala:446)
> > > > at org.apache.spark.deploy.yarn.Client$$anonfun$prepareLocalResources$5.apply(Client.scala:444)
> > > > at scala.collection.immutable.List.foreach(List.scala:318)
> > > > at org.apache.spark.deploy.yarn.Client.prepareLocalResources(Client.scala:444)
> > > > at org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:727)
> > > > at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:142)
> > > > at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:57)
> > > > at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:144)
> > > > at org.apache.spark.SparkContext.<init>(SparkContext.scala:530)
> > > > at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:59)
> > > > at org.apache.kylin.engine.spark.SparkCubingByLayer.execute(SparkCubingByLayer.java:150)
> > > > at org.apache.kylin.common.util.AbstractApplication.execute(AbstractApplication.java:37)
> > > > at org.apache.kylin.common.util.SparkEntry.main(SparkEntry.java:44)
> > > > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > > > at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> > > > at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > > > at java.lang.reflect.Method.invoke(Method.java:606)
> > > > at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
> > > > at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
> > > > at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
> > > > at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
> > > > at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> > > > 17/06/19 21:22:06 INFO storage.DiskBlockManager: Shutdown hook called
> > > > 17/06/19 21:22:06 INFO util.ShutdownHookManager: Shutdown hook called
> > > > 17/06/19 21:22:06 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-0d1d3709-86cd-446c-b728-5070f168de28
> > > > 17/06/19 21:22:06 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-0d1d3709-86cd-446c-b728-5070f168de28/httpd-9bcb9a5d-569f-4f28-ad89-038a9020eda8
> > > > 17/06/19 21:22:06 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-0d1d3709-86cd-446c-b728-5070f168de28/userFiles-2e9ff265-3d37-40e0-8894-6fd4d1a3ad8b
> > > >
> > > > at org.apache.kylin.common.util.CliCommandExecutor.execute(CliCommandExecutor.java:92)
> > > > at org.apache.kylin.engine.spark.SparkExecutable.doWork(SparkExecutable.java:124)
> > > > at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:124)
> > > > at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:64)
> > > > at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:124)
> > > > at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:142)
> > > > at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> > > > at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> > > > at java.lang.Thread.run(Thread.java:745)
> > > >
> > > > -----------------------------------------------------------------------------------------
> > > > I can use the Kylin built-in spark-shell to do some operations like:
> > > > -----------------------------------------------------------------------------------------
> > > > var textFile = sc.textFile("hdfs://xxxx/xxxx/README.md")
> > > > textFile.count()
> > > > textFile.first()
> > > > textFile.filter(line => line.contains("hello")).count()
> > > > -----------------------------------------------------------------------------------------
> > > > Here is the env info:
> > > > kylin version is 2.0.0
> > > > hadoop version is 2.7.*
> > > > spark version is 1.6.*
> > > > -----------------------------------------------------------------------------------------
> > > > Can anyone help me? Thanks!
> > > >
> > > >
> > > > 2017-06-19
> > > > skyyws
> > >
> > >
> > >
> > >
> > > --
> > > Best regards,
> > >
> > > Shaofeng Shi 史少锋
> > >
> >
> >
> >
> > --
> > Best regards,
> >
> > Shaofeng Shi 史少锋
> >
>
>
>
> --
> Best regards,
>
> Shaofeng Shi 史少锋
>
--
Best regards,
Shaofeng Shi 史少锋
Re: Re: Re: Build sample error with spark on kylin 2.0.0
Posted by skyyws <sk...@163.com>.
Hi ShaoFeng, there is no other error message before or after this in kylin.log, but I tried to execute the command with spark-submit directly, like this:
--------------------------------------------------------------------
./spark-submit --class org.apache.kylin.common.util.SparkEntry --conf spark.executor.instances=1 --conf spark.yarn.queue=default --conf spark.history.fs.logDirectory=hdfs://xxx/user/user1/kylin_2_0_0_test/spark-history --conf spark.master=yarn --conf spark.executor.memory=1G --conf spark.eventLog.enabled=true --conf spark.eventLog.dir=hdfs://xxx/user/user1/kylin_2_0_0_test/spark-history --conf spark.executor.cores=2 --conf spark.submit.deployMode=cluster --files /xxx/hbase-0.98.8-hadoop2/conf/hbase-site.xml --jars /xxx/kylin-deploy/kylin-2.0.0/lib/kylin-job-2.0.0.jar,/user/user1/ext_lib/htrace-core-2.04.jar,/user/user1/ext_lib/hbase-client-0.98.8-hadoop2.jar,/user/user1/ext_lib/hbase-common-0.98.8-hadoop2.jar,/user/user1/ext_lib/hbase-protocol-0.98.8-hadoop2.jar,/user/user1/ext_lib/metrics-core-2.2.0.jar,/user/user1/ext_lib/guava-12.0.1.jar, /xxx/kylin-deploy/kylin-2.0.0/lib/kylin-job-2.0.0.jar -className org.apache.kylin.engine.spark.SparkCubingByLayer -hiveTable default.kylin_intermediate_kylin_sales_test_cube_a0cd9950_cddc_4c3b_aaa5_fddf87d1fdaa -segmentId a0cd9950-cddc-4c3b-aaa5-fddf87d1fdaa -confPath /xxx/kylin-deploy/kylin-2.0.0/conf -output hdfs:///user/user1/kylin_2_0_0_test/kylin_metadata_2_0_0/kylin-7a376cb7-7ee7-43fd-95dd-79c2c1999f40/kylin_sales_test_cube/cuboid/ -cubename kylin_sales_test_cube
--------------------------------------------------------------------
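[Editor's note, not from the original thread.] One thing worth double-checking in a submit line like the above: as far as I know, scheme-less `--jars` entries such as `/user/user1/ext_lib/htrace-core-2.04.jar` are read from the local filesystem of the submitting machine, so an HDFS-style path passed without an `hdfs://` scheme may silently point at nothing. A hypothetical `check_jars` helper to validate a comma-separated `--jars` list locally:

```shell
#!/bin/sh
# check_jars COMMA_SEPARATED_LIST : print each entry of a --jars style list
# that does not exist as a local file.
check_jars() {
  printf '%s\n' "$1" | tr ',' '\n' | while read -r j; do
    if [ -n "$j" ] && [ ! -e "$j" ]; then
      echo "MISSING: $j"
    fi
  done
}

# Example (jar list copied from the command above):
# check_jars "/xxx/kylin-deploy/kylin-2.0.0/lib/kylin-job-2.0.0.jar,/user/user1/ext_lib/htrace-core-2.04.jar"
```

Any `MISSING:` line means that jar never made it onto the driver classpath, which would match the `ClassNotFoundException` seen below.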
Then, I got some other msg from spark, here is the full error msg:
--------------------------------------------------------------------
17/06/22 11:35:19 ERROR HBaseConnection: Error when open connection hbase
java.io.IOException: java.lang.reflect.InvocationTargetException
at org.apache.hadoop.hbase.client.HConnectionManager.createConnection(HConnectionManager.java:413)
at org.apache.hadoop.hbase.client.HConnectionManager.createConnection(HConnectionManager.java:306)
at org.apache.kylin.storage.hbase.HBaseConnection.get(HBaseConnection.java:229)
at org.apache.kylin.storage.hbase.HBaseResourceStore.getConnection(HBaseResourceStore.java:72)
at org.apache.kylin.storage.hbase.HBaseResourceStore.createHTableIfNeeded(HBaseResourceStore.java:89)
at org.apache.kylin.storage.hbase.HBaseResourceStore.<init>(HBaseResourceStore.java:85)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.apache.kylin.common.persistence.ResourceStore.createResourceStore(ResourceStore.java:91)
at org.apache.kylin.common.persistence.ResourceStore.getStore(ResourceStore.java:110)
at org.apache.kylin.cube.CubeManager.getStore(CubeManager.java:820)
at org.apache.kylin.cube.CubeManager.loadAllCubeInstance(CubeManager.java:740)
at org.apache.kylin.cube.CubeManager.<init>(CubeManager.java:145)
at org.apache.kylin.cube.CubeManager.getInstance(CubeManager.java:109)
at org.apache.kylin.engine.spark.SparkCubingByLayer.execute(SparkCubingByLayer.java:160)
at org.apache.kylin.common.util.AbstractApplication.execute(AbstractApplication.java:37)
at org.apache.kylin.common.util.SparkEntry.main(SparkEntry.java:44)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.apache.hadoop.hbase.client.HConnectionManager.createConnection(HConnectionManager.java:411)
... 27 more
Caused by: java.lang.NoClassDefFoundError: org/cloudera/htrace/Trace
at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:218)
at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:479)
at org.apache.hadoop.hbase.zookeeper.ZKClusterId.readClusterIdZNode(ZKClusterId.java:65)
at org.apache.hadoop.hbase.client.ZooKeeperRegistry.getClusterId(ZooKeeperRegistry.java:83)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.retrieveClusterId(HConnectionManager.java:839)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.<init>(HConnectionManager.java:642)
... 32 more
Caused by: java.lang.ClassNotFoundException: org.cloudera.htrace.Trace
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
... 38 more
17/06/22 11:35:19 ERROR ResourceStore: Create new store instance failed
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.apache.kylin.common.persistence.ResourceStore.createResourceStore(ResourceStore.java:91)
at org.apache.kylin.common.persistence.ResourceStore.getStore(ResourceStore.java:110)
at org.apache.kylin.cube.CubeManager.getStore(CubeManager.java:820)
at org.apache.kylin.cube.CubeManager.loadAllCubeInstance(CubeManager.java:740)
at org.apache.kylin.cube.CubeManager.<init>(CubeManager.java:145)
at org.apache.kylin.cube.CubeManager.getInstance(CubeManager.java:109)
at org.apache.kylin.engine.spark.SparkCubingByLayer.execute(SparkCubingByLayer.java:160)
at org.apache.kylin.common.util.AbstractApplication.execute(AbstractApplication.java:37)
at org.apache.kylin.common.util.SparkEntry.main(SparkEntry.java:44)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.IllegalArgumentException: File not exist by 'kylin_metadata_2_0_0@hbase': /xxx/spark-1.6.2-bin-hadoop2.7/kylin_metadata_2_0_0@hbase
at org.apache.kylin.common.persistence.FileResourceStore.<init>(FileResourceStore.java:49)
... 22 more
17/06/22 11:35:19 ERROR ResourceStore: Create new store instance failed
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.apache.kylin.common.persistence.ResourceStore.createResourceStore(ResourceStore.java:91)
at org.apache.kylin.common.persistence.ResourceStore.getStore(ResourceStore.java:110)
at org.apache.kylin.cube.CubeManager.getStore(CubeManager.java:820)
at org.apache.kylin.cube.CubeManager.loadAllCubeInstance(CubeManager.java:740)
at org.apache.kylin.cube.CubeManager.<init>(CubeManager.java:145)
at org.apache.kylin.cube.CubeManager.getInstance(CubeManager.java:109)
at org.apache.kylin.engine.spark.SparkCubingByLayer.execute(SparkCubingByLayer.java:160)
at org.apache.kylin.common.util.AbstractApplication.execute(AbstractApplication.java:37)
at org.apache.kylin.common.util.SparkEntry.main(SparkEntry.java:44)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: org.apache.kylin.common.persistence.StorageException: Error when open connection hbase
at org.apache.kylin.storage.hbase.HBaseConnection.get(HBaseConnection.java:242)
at org.apache.kylin.storage.hbase.HBaseResourceStore.getConnection(HBaseResourceStore.java:72)
at org.apache.kylin.storage.hbase.HBaseResourceStore.createHTableIfNeeded(HBaseResourceStore.java:89)
at org.apache.kylin.storage.hbase.HBaseResourceStore.<init>(HBaseResourceStore.java:85)
... 22 more
Caused by: java.io.IOException: java.lang.reflect.InvocationTargetException
at org.apache.hadoop.hbase.client.HConnectionManager.createConnection(HConnectionManager.java:413)
at org.apache.hadoop.hbase.client.HConnectionManager.createConnection(HConnectionManager.java:306)
at org.apache.kylin.storage.hbase.HBaseConnection.get(HBaseConnection.java:229)
... 25 more
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.apache.hadoop.hbase.client.HConnectionManager.createConnection(HConnectionManager.java:411)
... 27 more
Caused by: java.lang.NoClassDefFoundError: org/cloudera/htrace/Trace
at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:218)
at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:479)
at org.apache.hadoop.hbase.zookeeper.ZKClusterId.readClusterIdZNode(ZKClusterId.java:65)
at org.apache.hadoop.hbase.client.ZooKeeperRegistry.getClusterId(ZooKeeperRegistry.java:83)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.retrieveClusterId(HConnectionManager.java:839)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.<init>(HConnectionManager.java:642)
... 32 more
Caused by: java.lang.ClassNotFoundException: org.cloudera.htrace.Trace
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
... 38 more
17/06/22 11:35:19 INFO ClientCnxn: Opening socket connection to server 127.0.0.1/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
Exception in thread "main" java.lang.RuntimeException: error execute org.apache.kylin.engine.spark.SparkCubingByLayer
at org.apache.kylin.common.util.AbstractApplication.execute(AbstractApplication.java:42)
at org.apache.kylin.common.util.SparkEntry.main(SparkEntry.java:44)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.IllegalArgumentException: Failed to find metadata store by url: kylin_metadata_2_0_0@hbase
at org.apache.kylin.common.persistence.ResourceStore.createResourceStore(ResourceStore.java:99)
at org.apache.kylin.common.persistence.ResourceStore.getStore(ResourceStore.java:110)
at org.apache.kylin.cube.CubeManager.getStore(CubeManager.java:820)
at org.apache.kylin.cube.CubeManager.loadAllCubeInstance(CubeManager.java:740)
at org.apache.kylin.cube.CubeManager.<init>(CubeManager.java:145)
at org.apache.kylin.cube.CubeManager.getInstance(CubeManager.java:109)
at org.apache.kylin.engine.spark.SparkCubingByLayer.execute(SparkCubingByLayer.java:160)
at org.apache.kylin.common.util.AbstractApplication.execute(AbstractApplication.java:37)
... 10 more
17/06/22 11:35:19 INFO ClientCnxn: Socket connection established to 127.0.0.1/127.0.0.1:2181, initiating session
17/06/22 11:35:19 INFO SparkContext: Invoking stop() from shutdown hook
--------------------------------------------------------------------
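The two "Create new store instance failed" errors above can be read together: when the HBase-backed store fails to initialize (the missing org.cloudera.htrace.Trace class), Kylin's ResourceStore also tries a file-based store, which treats the whole URL "kylin_metadata_2_0_0@hbase" as a local path — hence the "File not exist" message. A rough sketch (illustrative only, not Kylin's actual code) of how such a metadata URL splits into a table name and a storage scheme:

```scala
// Illustrative sketch, not Kylin's real implementation: split a metadata
// URL such as "kylin_metadata_2_0_0@hbase" into (tableName, scheme).
// A value with no '@' suffix would be treated as a plain file path.
def parseMetadataUrl(url: String): (String, String) = {
  val idx = url.lastIndexOf('@')
  if (idx < 0) (url, "file") // no scheme suffix: file-based store
  else (url.substring(0, idx), url.substring(idx + 1))
}

val (table, scheme) = parseMetadataUrl("kylin_metadata_2_0_0@hbase")
// table = "kylin_metadata_2_0_0", scheme = "hbase"
```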
Thanks for your attention!
2017-06-22
skyyws
From: ShaoFeng Shi <sh...@apache.org>
Date: 2017-06-22 11:42
Subject: Re: Re: Re: Build sample error with spark on kylin 2.0.0
To: "dev"<de...@kylin.apache.org>
Cc:
Hi Sky, glad to see it moves forward. The "Failed to find metadata store by
url: kylin_metadata_2_0_0@hbase" message is not the root cause. Could you
check the log files further: is there any other error before or after this?
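One quick way to narrow this kind of problem down (a sketch; the class name below is only an example) is to probe whether the classes a stack trace complains about are actually visible on the classpath, both in the driver and inside an executor task, since the two can resolve different jars:

```scala
// Probe whether a class can be loaded. Run in spark-shell to check the
// driver classpath, or inside rdd.map { ... } to check an executor's.
def classAvailable(name: String): Boolean =
  try { Class.forName(name); true }
  catch { case _: Throwable => false }

classAvailable("org.apache.hadoop.hbase.HBaseConfiguration") // example class name
```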
2017-06-21 20:43 GMT+08:00 skyyws <sk...@163.com>:
> Thank you for your suggestion, Shaofeng Shi. I tried using hadoop client
> 2.7.3 and it worked. But I met another problem:
> -------------------------- --------------------------
> -------------------------- --------------------------
> 17/06/21 20:20:39 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0,
> hadoop645.lt.163.org): java.lang.IllegalArgumentException: Failed to find
> metadata store by url: kylin_metadata_2_0
> _0@hbase
> at org.apache.kylin.common.persistence.ResourceStore.
> createResourceStore(ResourceStore.java:99)
> at org.apache.kylin.common.persistence.ResourceStore.
> getStore(ResourceStore.java:110)
> at org.apache.kylin.cube.CubeDescManager.getStore(
> CubeDescManager.java:370)
> at org.apache.kylin.cube.CubeDescManager.reloadAllCubeDesc(
> CubeDescManager.java:298)
> at org.apache.kylin.cube.CubeDescManager.<init>(
> CubeDescManager.java:109)
> at org.apache.kylin.cube.CubeDescManager.getInstance(
> CubeDescManager.java:81)
> at org.apache.kylin.cube.CubeInstance.getDescriptor(
> CubeInstance.java:114)
> at org.apache.kylin.cube.CubeSegment.getCubeDesc(
> CubeSegment.java:119)
> at org.apache.kylin.cube.kv.RowKeyEncoder.<init>(
> RowKeyEncoder.java:50)
> at org.apache.kylin.cube.kv.AbstractRowKeyEncoder.createInstance(
> AbstractRowKeyEncoder.java:48)
> at org.apache.kylin.engine.spark.SparkCubingByLayer$2.call(
> SparkCubingByLayer.java:205)
> at org.apache.kylin.engine.spark.SparkCubingByLayer$2.call(
> SparkCubingByLayer.java:193)
> at org.apache.spark.api.java.JavaPairRDD$$anonfun$
> pairFunToScalaFun$1.apply(JavaPairRDD.scala:1018)
> at org.apache.spark.api.java.JavaPairRDD$$anonfun$
> pairFunToScalaFun$1.apply(JavaPairRDD.scala:1018)
> at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
> at org.apache.spark.util.collection.ExternalSorter.
> insertAll(ExternalSorter.scala:191)
> at org.apache.spark.shuffle.sort.SortShuffleWriter.write(
> SortShuffleWriter.scala:64)
> at org.apache.spark.scheduler.ShuffleMapTask.runTask(
> ShuffleMapTask.scala:73)
> at org.apache.spark.scheduler.ShuffleMapTask.runTask(
> ShuffleMapTask.scala:41)
> at org.apache.spark.scheduler.Task.run(Task.scala:89)
> at org.apache.spark.executor.Executor$TaskRunner.run(
> Executor.scala:227)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(
> ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(
> ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> 17/06/21 20:20:39 INFO TaskSetManager: Starting task 0.1 in stage 0.0 (TID
> 2, hadoop645.lt.163.org, partition 0,RACK_LOCAL, 3276 bytes)
> 17/06/21 20:21:14 WARN TaskSetManager: Lost task 1.0 in stage 0.0 (TID 1,
> hadoop645.lt.163.org): java.lang.IllegalArgumentException: Failed to find
> metadata store by url: kylin_metadata_2_0
> _0@hbase
> at org.apache.kylin.common.persistence.ResourceStore.
> createResourceStore(ResourceStore.java:99)
> at org.apache.kylin.common.persistence.ResourceStore.
> getStore(ResourceStore.java:110)
> at org.apache.kylin.cube.CubeDescManager.getStore(
> CubeDescManager.java:370)
> at org.apache.kylin.cube.CubeDescManager.reloadAllCubeDesc(
> CubeDescManager.java:298)
> at org.apache.kylin.cube.CubeDescManager.<init>(
> CubeDescManager.java:109)
> at org.apache.kylin.cube.CubeDescManager.getInstance(
> CubeDescManager.java:81)
> at org.apache.kylin.cube.CubeInstance.getDescriptor(
> CubeInstance.java:114)
> at org.apache.kylin.cube.CubeSegment.getCubeDesc(
> CubeSegment.java:119)
> at org.apache.kylin.cube.kv.RowKeyEncoder.<init>(
> RowKeyEncoder.java:50)
> at org.apache.kylin.cube.kv.AbstractRowKeyEncoder.createInstance(
> AbstractRowKeyEncoder.java:48)
> at org.apache.kylin.engine.spark.SparkCubingByLayer$2.call(
> SparkCubingByLayer.java:205)
> at org.apache.kylin.engine.spark.SparkCubingByLayer$2.call(
> SparkCubingByLayer.java:193)
> at org.apache.spark.api.java.JavaPairRDD$$anonfun$
> pairFunToScalaFun$1.apply(JavaPairRDD.scala:1018)
> at org.apache.spark.api.java.JavaPairRDD$$anonfun$
> pairFunToScalaFun$1.apply(JavaPairRDD.scala:1018)
> at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
> at org.apache.spark.util.collection.ExternalSorter.
> insertAll(ExternalSorter.scala:191)
> at org.apache.spark.shuffle.sort.SortShuffleWriter.write(
> SortShuffleWriter.scala:64)
> at org.apache.spark.scheduler.ShuffleMapTask.runTask(
> ShuffleMapTask.scala:73)
> at org.apache.spark.scheduler.ShuffleMapTask.runTask(
> ShuffleMapTask.scala:41)
> at org.apache.spark.scheduler.Task.run(Task.scala:89)
> at org.apache.spark.executor.Executor$TaskRunner.run(
> Executor.scala:227)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(
> ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(
> ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> -------------------------- --------------------------
> -------------------------- --------------------------
> But I can use Kylin's built-in spark-shell to fetch data from Hive and
> HBase successfully, like this:
> -------------------------- --------------------------
> -------------------------- --------------------------
> sqlContext.sql("show tables").take(1)
> -------------------------- --------------------------
> -------------------------- --------------------------
> import org.apache.spark._
> import org.apache.spark.rdd.NewHadoopRDD
> import org.apache.hadoop.fs.Path
> import org.apache.hadoop.hbase.util.Bytes
> import org.apache.hadoop.hbase.HColumnDescriptor
> import org.apache.hadoop.hbase.{HBaseConfiguration, HTableDescriptor}
> import org.apache.hadoop.hbase.client.{HBaseAdmin, Put, HTable, Result}
> import org.apache.hadoop.hbase.mapreduce.TableInputFormat
> import org.apache.hadoop.hbase.io.ImmutableBytesWritable
> val conf = HBaseConfiguration.create()
> conf.set("hbase.zookeeper.quorum", "localhost")
> conf.set(TableInputFormat.INPUT_TABLE, "test_table")
> val hBaseRDD = sc.newAPIHadoopRDD(conf, classOf[TableInputFormat],
> classOf[org.apache.hadoop.hbase.io.ImmutableBytesWritable],
> classOf[org.apache.hadoop.hbase.client.Result])
>
> val res = hBaseRDD.take(1)
> val rs = res(0)._2
> val kv = rs.raw
> for(keyvalue <- kv) println("rowkey:"+ new String(keyvalue.getRow)+ "
> cf:"+new String(keyvalue.getFamily()) + " column:" + new
> String(keyvalue.getQualifier) + " " + "value:"+new
> String(keyvalue.getValue()))
> -------------------------- --------------------------
> -------------------------- --------------------------
> By the way, I've already put hive-site.xml and hbase-site.xml into
> HADOOP_CONF_DIR and $SPARK_HOME/conf (which is actually
> $KYLIN_HOME/spark/conf), and I also set spark.driver.extraClassPath in
> spark-defaults.conf to attach some related jars (hbase-client.jar,
> hbase-common.jar and so on).
> I don't know why this happens; could anyone give me some advice?
> 2017-06-21
>
> skyyws
>
>
>
> From: ShaoFeng Shi <sh...@apache.org>
> Date: 2017-06-20 15:13
> Subject: Re: Re: Build sample error with spark on kylin 2.0.0
> To: "dev"<de...@kylin.apache.org>
> Cc:
>
> Or you can check whether there are old hadoop jars on your cluster,
> according to https://issues.apache.org/jira/browse/HADOOP-11064
>
>
> 2017-06-20 9:33 GMT+08:00 skyyws <sk...@163.com>:
>
> > No, I deploy Kylin on Linux; this is my machine info:
> > --------------------------
> > 3.2.0-4-amd64 #1 SMP Debian 3.2.82-1 x86_64 GNU/Linux
> > -------------------------
> >
> > 2017-06-20
> >
> > skyyws
> >
> >
> >
> > From: ShaoFeng Shi <sh...@apache.org>
> > Date: 2017-06-20 00:10
> > Subject: Re: Build sample error with spark on kylin 2.0.0
> > To: "dev"<de...@kylin.apache.org>
> > Cc:
> >
> > Are you running Kylin on Windows? If yes, check:
> > https://stackoverflow.com/questions/33211599/hadoop-error-on-windows-java-lang-unsatisfiedlinkerror
> >
> > 2017-06-19 21:55 GMT+08:00 skyyws <sk...@163.com>:
> >
> > > Hi all,
> > > I met an error when using the Spark engine to build the Kylin sample on
> > > step "Build Cube with Spark"; here is the exception log:
> > > ------------------------------------------------------------
> > > -----------------------------
> > > Exception in thread "main" java.lang.UnsatisfiedLinkError:
> > > org.apache.hadoop.util.NativeCrc32.nativeComputeChunkedSumsByteAr
> > > ray(II[BI[BIILjava/lang/String;JZ)V
> > > at org.apache.hadoop.util.NativeCrc32.
> > > nativeComputeChunkedSumsByteArray(Native Method)
> > > at org.apache.hadoop.util.NativeCrc32.
> > > calculateChunkedSumsByteArray(NativeCrc32.java:86)
> > > at org.apache.hadoop.util.DataChecksum.calculateChunkedSums(
> > > DataChecksum.java:430)
> > > at org.apache.hadoop.fs.FSOutputSummer.writeChecksumChunks(
> > > FSOutputSummer.java:202)
> > > at org.apache.hadoop.fs.FSOutputSummer.write1(
> > > FSOutputSummer.java:124)
> > > at org.apache.hadoop.fs.FSOutputSummer.write(
> > > FSOutputSummer.java:110)
> > > at org.apache.hadoop.fs.FSDataOutputStream$
> PositionCache.write(
> > > FSDataOutputStream.java:58)
> > > at java.io.DataOutputStream.write(DataOutputStream.java:107)
> > > at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:80)
> > > at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:52)
> > > at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:112)
> > > at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:366)
> > > at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:338)
> > > at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:289)
> > > at org.apache.spark.deploy.yarn.Client.copyFileToRemote(
> > > Client.scala:317)
> > > at org.apache.spark.deploy.yarn.Client.org$apache$spark$
> > > deploy$yarn$Client$$distribute$1(Client.scala:407)
> > > at org.apache.spark.deploy.yarn.Client$$anonfun$
> > > prepareLocalResources$5.apply(Client.scala:446)
> > > at org.apache.spark.deploy.yarn.Client$$anonfun$
> > > prepareLocalResources$5.apply(Client.scala:444)
> > > at scala.collection.immutable.List.foreach(List.scala:318)
> > > at org.apache.spark.deploy.yarn.Client.prepareLocalResources(
> > > Client.scala:444)
> > > at org.apache.spark.deploy.yarn.Client.
> > > createContainerLaunchContext(Client.scala:727)
> > > at org.apache.spark.deploy.yarn.Client.submitApplication(
> > > Client.scala:142)
> > > at org.apache.spark.scheduler.cluster.
> > YarnClientSchedulerBackend.
> > > start(YarnClientSchedulerBackend.scala:57)
> > > at org.apache.spark.scheduler.TaskSchedulerImpl.start(
> > > TaskSchedulerImpl.scala:144)
> > > at org.apache.spark.SparkContext.
> <init>(SparkContext.scala:530)
> > > at org.apache.spark.api.java.JavaSparkContext.<init>(
> > > JavaSparkContext.scala:59)
> > > at org.apache.kylin.engine.spark.SparkCubingByLayer.execute(
> > > SparkCubingByLayer.java:150)
> > > at org.apache.kylin.common.util.AbstractApplication.execute(
> > > AbstractApplication.java:37)
> > > at org.apache.kylin.common.util.SparkEntry.main(SparkEntry.
> > > java:44)
> > > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > > at sun.reflect.NativeMethodAccessorImpl.invoke(
> > > NativeMethodAccessorImpl.java:57)
> > > at sun.reflect.DelegatingMethodAccessorImpl.invoke(
> > > DelegatingMethodAccessorImpl.java:43)
> > > at java.lang.reflect.Method.invoke(Method.java:606)
> > > at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$
> > > deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
> > > at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(
> > > SparkSubmit.scala:181)
> > > at org.apache.spark.deploy.SparkSubmit$.submit(
> > > SparkSubmit.scala:206)
> > > at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.
> > > scala:121)
> > > at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> > > 17/06/19 21:22:06 INFO storage.DiskBlockManager: Shutdown hook called
> > > 17/06/19 21:22:06 INFO util.ShutdownHookManager: Shutdown hook called
> > > 17/06/19 21:22:06 INFO util.ShutdownHookManager: Deleting directory
> > > /tmp/spark-0d1d3709-86cd-446c-b728-5070f168de28
> > > 17/06/19 21:22:06 INFO util.ShutdownHookManager: Deleting directory
> > > /tmp/spark-0d1d3709-86cd-446c-b728-5070f168de28/httpd-
> > > 9bcb9a5d-569f-4f28-ad89-038a9020eda8
> > > 17/06/19 21:22:06 INFO util.ShutdownHookManager: Deleting directory
> > > /tmp/spark-0d1d3709-86cd-446c-b728-5070f168de28/userFiles-
> > > 2e9ff265-3d37-40e0-8894-6fd4d1a3ad8b
> > >
> > > at org.apache.kylin.common.util.CliCommandExecutor.execute(
> > > CliCommandExecutor.java:92)
> > > at org.apache.kylin.engine.spark.SparkExecutable.doWork(
> > > SparkExecutable.java:124)
> > > at org.apache.kylin.job.execution.AbstractExecutable.
> > > execute(AbstractExecutable.java:124)
> > > at org.apache.kylin.job.execution.DefaultChainedExecutable.
> > doWork(
> > > DefaultChainedExecutable.java:64)
> > > at org.apache.kylin.job.execution.AbstractExecutable.
> > > execute(AbstractExecutable.java:124)
> > > at org.apache.kylin.job.impl.threadpool.DefaultScheduler$
> > > JobRunner.run(DefaultScheduler.java:142)
> > > at java.util.concurrent.ThreadPoolExecutor.runWorker(
> > > ThreadPoolExecutor.java:1145)
> > > at java.util.concurrent.ThreadPoolExecutor$Worker.run(
> > > ThreadPoolExecutor.java:615)
> > > at java.lang.Thread.run(Thread.java:745)
> > >
> > > ------------------------------------------------------------
> > > -----------------------------
> > > I can use Kylin's built-in spark-shell to do some operations like:
> > > ------------------------------------------------------------
> > > -----------------------------
> > > var textFile = sc.textFile("hdfs://xxxx/xxxx/README.md")
> > > textFile.count()
> > > textFile.first()
> > > textFile.filter(line => line.contains("hello")).count()
> > > ------------------------------------------------------------
> > > -----------------------------
> > > Here is the env info:
> > > kylin version is 2.0.0
> > > hadoop version is 2.7.*
> > > spark version is 1.6.*
> > > ------------------------------------------------------------
> > > -----------------------------
> > > Can anyone help me? Thanks!
> > >
> > >
> > > 2017-06-19
> > > skyyws
> >
> >
> >
> >
> > --
> > Best regards,
> >
> > Shaofeng Shi 史少锋
> >
>
>
>
> --
> Best regards,
>
> Shaofeng Shi 史少锋
>
--
Best regards,
Shaofeng Shi 史少锋
Re: Re: Re: Build sample error with spark on kylin 2.0.0
Posted by ShaoFeng Shi <sh...@apache.org>.
hi Sky, glad to see it moves forward. The "Failed to find metadata store by
url: kylin_metadata_2_0
_0@hbase" is not root cause. Could you check more with the log files, is
there any other error before or after this?
2017-06-21 20:43 GMT+08:00 skyyws <sk...@163.com>:
> Thank you for your suggestion, Shaofeng Shi, I try to use hadoop client
> 2.7.3, it worked. But I met another probelm:
> -------------------------- --------------------------
> -------------------------- --------------------------
> 17/06/21 20:20:39 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0,
> hadoop645.lt.163.org): java.lang.IllegalArgumentException: Failed to find
> metadata store by url: kylin_metadata_2_0
> _0@hbase
> at org.apache.kylin.common.persistence.ResourceStore.
> createResourceStore(ResourceStore.java:99)
> at org.apache.kylin.common.persistence.ResourceStore.
> getStore(ResourceStore.java:110)
> at org.apache.kylin.cube.CubeDescManager.getStore(
> CubeDescManager.java:370)
> at org.apache.kylin.cube.CubeDescManager.reloadAllCubeDesc(
> CubeDescManager.java:298)
> at org.apache.kylin.cube.CubeDescManager.<init>(
> CubeDescManager.java:109)
> at org.apache.kylin.cube.CubeDescManager.getInstance(
> CubeDescManager.java:81)
> at org.apache.kylin.cube.CubeInstance.getDescriptor(
> CubeInstance.java:114)
> at org.apache.kylin.cube.CubeSegment.getCubeDesc(
> CubeSegment.java:119)
> at org.apache.kylin.cube.kv.RowKeyEncoder.<init>(
> RowKeyEncoder.java:50)
> at org.apache.kylin.cube.kv.AbstractRowKeyEncoder.createInstance(
> AbstractRowKeyEncoder.java:48)
> at org.apache.kylin.engine.spark.SparkCubingByLayer$2.call(
> SparkCubingByLayer.java:205)
> at org.apache.kylin.engine.spark.SparkCubingByLayer$2.call(
> SparkCubingByLayer.java:193)
> at org.apache.spark.api.java.JavaPairRDD$$anonfun$
> pairFunToScalaFun$1.apply(JavaPairRDD.scala:1018)
> at org.apache.spark.api.java.JavaPairRDD$$anonfun$
> pairFunToScalaFun$1.apply(JavaPairRDD.scala:1018)
> at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
> at org.apache.spark.util.collection.ExternalSorter.
> insertAll(ExternalSorter.scala:191)
> at org.apache.spark.shuffle.sort.SortShuffleWriter.write(
> SortShuffleWriter.scala:64)
> at org.apache.spark.scheduler.ShuffleMapTask.runTask(
> ShuffleMapTask.scala:73)
> at org.apache.spark.scheduler.ShuffleMapTask.runTask(
> ShuffleMapTask.scala:41)
> at org.apache.spark.scheduler.Task.run(Task.scala:89)
> at org.apache.spark.executor.Executor$TaskRunner.run(
> Executor.scala:227)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(
> ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(
> ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> 17/06/21 20:20:39 INFO TaskSetManager: Starting task 0.1 in stage 0.0 (TID
> 2, hadoop645.lt.163.org, partition 0,RACK_LOCAL, 3276 bytes)
> 17/06/21 20:21:14 WARN TaskSetManager: Lost task 1.0 in stage 0.0 (TID 1,
> hadoop645.lt.163.org): java.lang.IllegalArgumentException: Failed to find
> metadata store by url: kylin_metadata_2_0
> _0@hbase
> at org.apache.kylin.common.persistence.ResourceStore.
> createResourceStore(ResourceStore.java:99)
> at org.apache.kylin.common.persistence.ResourceStore.
> getStore(ResourceStore.java:110)
> at org.apache.kylin.cube.CubeDescManager.getStore(
> CubeDescManager.java:370)
> at org.apache.kylin.cube.CubeDescManager.reloadAllCubeDesc(
> CubeDescManager.java:298)
> at org.apache.kylin.cube.CubeDescManager.<init>(
> CubeDescManager.java:109)
> at org.apache.kylin.cube.CubeDescManager.getInstance(
> CubeDescManager.java:81)
> at org.apache.kylin.cube.CubeInstance.getDescriptor(
> CubeInstance.java:114)
> at org.apache.kylin.cube.CubeSegment.getCubeDesc(
> CubeSegment.java:119)
> at org.apache.kylin.cube.kv.RowKeyEncoder.<init>(
> RowKeyEncoder.java:50)
> at org.apache.kylin.cube.kv.AbstractRowKeyEncoder.createInstance(
> AbstractRowKeyEncoder.java:48)
> at org.apache.kylin.engine.spark.SparkCubingByLayer$2.call(
> SparkCubingByLayer.java:205)
> at org.apache.kylin.engine.spark.SparkCubingByLayer$2.call(
> SparkCubingByLayer.java:193)
> at org.apache.spark.api.java.JavaPairRDD$$anonfun$
> pairFunToScalaFun$1.apply(JavaPairRDD.scala:1018)
> at org.apache.spark.api.java.JavaPairRDD$$anonfun$
> pairFunToScalaFun$1.apply(JavaPairRDD.scala:1018)
> at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
> at org.apache.spark.util.collection.ExternalSorter.
> insertAll(ExternalSorter.scala:191)
> at org.apache.spark.shuffle.sort.SortShuffleWriter.write(
> SortShuffleWriter.scala:64)
> at org.apache.spark.scheduler.ShuffleMapTask.runTask(
> ShuffleMapTask.scala:73)
> at org.apache.spark.scheduler.ShuffleMapTask.runTask(
> ShuffleMapTask.scala:41)
> at org.apache.spark.scheduler.Task.run(Task.scala:89)
> at org.apache.spark.executor.Executor$TaskRunner.run(
> Executor.scala:227)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(
> ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(
> ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> -------------------------- --------------------------
> -------------------------- --------------------------
> But I can use the kylin in-build spark-shell to get data from hive and
> hbase successfully, just like this:
> -------------------------- --------------------------
> -------------------------- --------------------------
> sqlContext.sql("show tables").take(1)
> -------------------------- --------------------------
> -------------------------- --------------------------
> import org.apache.spark._
> import org.apache.spark.rdd.NewHadoopRDD
> import org.apache.hadoop.fs.Path
> import org.apache.hadoop.hbase.util.Bytes
> import org.apache.hadoop.hbase.HColumnDescriptor
> import org.apache.hadoop.hbase.{HBaseConfiguration, HTableDescriptor}
> import org.apache.hadoop.hbase.client.{HBaseAdmin, Put, HTable, Result}
> import org.apache.hadoop.hbase.mapreduce.TableInputFormat
> import org.apache.hadoop.hbase.io.ImmutableBytesWritable
> val conf = HBaseConfiguration.create()
> conf.set("hbase.zookeeper.quorum", "localhost")
> conf.set(TableInputFormat.INPUT_TABLE, "test_table")
> val hBaseRDD = sc.newAPIHadoopRDD(conf, classOf[TableInputFormat],
> classOf[org.apache.hadoop.hbase.io.ImmutableBytesWritable],
> classOf[org.apache.hadoop.hbase.client.Result])
>
> val res = hBaseRDD.take(1)
> val rs = res(0)._2
> val kv = rs.raw
> for(keyvalue <- kv) println("rowkey:"+ new String(keyvalue.getRow)+ "
> cf:"+new String(keyvalue.getFamily()) + " column:" + new
> String(keyvalue.getQualifier) + " " + "value:"+new
> String(keyvalue.getValue()))
> -------------------------- --------------------------
> -------------------------- --------------------------
> By the way, I've already put hive-site.xml and hbase-site.xml into
> HADOOP_CONF_DIR and $SPARK_HOME/conf (which is actually
> $KYLIN_HOME/spark/conf), and I also set spark.driver.extraClassPath in
> spark-defaults.conf to attach some related jars (hbase-client.jar,
> hbase-common.jar, and so on).
> I don't know why; could anyone give me some advice?
> 2017-06-21
>
> skyyws
>
>
>
> From: ShaoFeng Shi <sh...@apache.org>
> Date: 2017-06-20 15:13
> Subject: Re: Re: Build sample error with spark on kylin 2.0.0
> To: "dev"<de...@kylin.apache.org>
> Cc:
>
> Or you can check whether there is old hadoop jars on your cluster,
> according to https://issues.apache.org/jira/browse/HADOOP-11064
>
>
> 2017-06-20 9:33 GMT+08:00 skyyws <sk...@163.com>:
>
> > No, I deploy kylin on linux, this is my machine info:
> > --------------------------
> > 3.2.0-4-amd64 #1 SMP Debian 3.2.82-1 x86_64 GNU/Linux
> > -------------------------
> >
> > 2017-06-20
> >
> > skyyws
> >
> >
> >
> > From: ShaoFeng Shi <sh...@apache.org>
> > Date: 2017-06-20 00:10
> > Subject: Re: Build sample error with spark on kylin 2.0.0
> > To: "dev"<de...@kylin.apache.org>
> > Cc:
> >
> > Are you running Kylin on windows? If yes, check:
> > https://stackoverflow.com/questions/33211599/hadoop-
> > error-on-windows-java-lang-unsatisfiedlinkerror
> >
> > 2017-06-19 21:55 GMT+08:00 skyyws <sk...@163.com>:
> >
> > > [...]
> >
> >
> >
> >
> > --
> > Best regards,
> >
> > Shaofeng Shi 史少锋
> >
>
>
>
> --
> Best regards,
>
> Shaofeng Shi 史少锋
>
--
Best regards,
Shaofeng Shi 史少锋
Re: Re: Re: Build sample error with spark on kylin 2.0.0
Posted by skyyws <sk...@163.com>.
Thank you for your suggestion, Shaofeng Shi. I tried the Hadoop 2.7.3 client and it worked, but then I hit another problem:
------------------------------------------------------------
17/06/21 20:20:39 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, hadoop645.lt.163.org): java.lang.IllegalArgumentException: Failed to find metadata store by url: kylin_metadata_2_0_0@hbase
at org.apache.kylin.common.persistence.ResourceStore.createResourceStore(ResourceStore.java:99)
at org.apache.kylin.common.persistence.ResourceStore.getStore(ResourceStore.java:110)
at org.apache.kylin.cube.CubeDescManager.getStore(CubeDescManager.java:370)
at org.apache.kylin.cube.CubeDescManager.reloadAllCubeDesc(CubeDescManager.java:298)
at org.apache.kylin.cube.CubeDescManager.<init>(CubeDescManager.java:109)
at org.apache.kylin.cube.CubeDescManager.getInstance(CubeDescManager.java:81)
at org.apache.kylin.cube.CubeInstance.getDescriptor(CubeInstance.java:114)
at org.apache.kylin.cube.CubeSegment.getCubeDesc(CubeSegment.java:119)
at org.apache.kylin.cube.kv.RowKeyEncoder.<init>(RowKeyEncoder.java:50)
at org.apache.kylin.cube.kv.AbstractRowKeyEncoder.createInstance(AbstractRowKeyEncoder.java:48)
at org.apache.kylin.engine.spark.SparkCubingByLayer$2.call(SparkCubingByLayer.java:205)
at org.apache.kylin.engine.spark.SparkCubingByLayer$2.call(SparkCubingByLayer.java:193)
at org.apache.spark.api.java.JavaPairRDD$$anonfun$pairFunToScalaFun$1.apply(JavaPairRDD.scala:1018)
at org.apache.spark.api.java.JavaPairRDD$$anonfun$pairFunToScalaFun$1.apply(JavaPairRDD.scala:1018)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:191)
at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:64)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:227)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
17/06/21 20:20:39 INFO TaskSetManager: Starting task 0.1 in stage 0.0 (TID 2, hadoop645.lt.163.org, partition 0,RACK_LOCAL, 3276 bytes)
17/06/21 20:21:14 WARN TaskSetManager: Lost task 1.0 in stage 0.0 (TID 1, hadoop645.lt.163.org): java.lang.IllegalArgumentException: Failed to find metadata store by url: kylin_metadata_2_0_0@hbase
at org.apache.kylin.common.persistence.ResourceStore.createResourceStore(ResourceStore.java:99)
at org.apache.kylin.common.persistence.ResourceStore.getStore(ResourceStore.java:110)
at org.apache.kylin.cube.CubeDescManager.getStore(CubeDescManager.java:370)
at org.apache.kylin.cube.CubeDescManager.reloadAllCubeDesc(CubeDescManager.java:298)
at org.apache.kylin.cube.CubeDescManager.<init>(CubeDescManager.java:109)
at org.apache.kylin.cube.CubeDescManager.getInstance(CubeDescManager.java:81)
at org.apache.kylin.cube.CubeInstance.getDescriptor(CubeInstance.java:114)
at org.apache.kylin.cube.CubeSegment.getCubeDesc(CubeSegment.java:119)
at org.apache.kylin.cube.kv.RowKeyEncoder.<init>(RowKeyEncoder.java:50)
at org.apache.kylin.cube.kv.AbstractRowKeyEncoder.createInstance(AbstractRowKeyEncoder.java:48)
at org.apache.kylin.engine.spark.SparkCubingByLayer$2.call(SparkCubingByLayer.java:205)
at org.apache.kylin.engine.spark.SparkCubingByLayer$2.call(SparkCubingByLayer.java:193)
at org.apache.spark.api.java.JavaPairRDD$$anonfun$pairFunToScalaFun$1.apply(JavaPairRDD.scala:1018)
at org.apache.spark.api.java.JavaPairRDD$$anonfun$pairFunToScalaFun$1.apply(JavaPairRDD.scala:1018)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:191)
at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:64)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:227)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
------------------------------------------------------------
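The url in the error presumably comes from the metadata setting in my kylin.properties, which looks like this (a sketch; the value matches the one in the error message):
------------------------------------------------------------
kylin.metadata.url=kylin_metadata_2_0_0@hbase
------------------------------------------------------------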
But I can use Kylin's built-in spark-shell to read data from Hive and HBase successfully, like this:
------------------------------------------------------------
sqlContext.sql("show tables").take(1)
------------------------------------------------------------
import org.apache.spark._
import org.apache.spark.rdd.NewHadoopRDD
import org.apache.hadoop.fs.Path
import org.apache.hadoop.hbase.util.Bytes
import org.apache.hadoop.hbase.HColumnDescriptor
import org.apache.hadoop.hbase.{HBaseConfiguration, HTableDescriptor}
import org.apache.hadoop.hbase.client.{HBaseAdmin, Put, HTable, Result}
import org.apache.hadoop.hbase.mapreduce.TableInputFormat
import org.apache.hadoop.hbase.io.ImmutableBytesWritable
val conf = HBaseConfiguration.create()
conf.set("hbase.zookeeper.quorum", "localhost")
conf.set(TableInputFormat.INPUT_TABLE, "test_table")
val hBaseRDD = sc.newAPIHadoopRDD(conf, classOf[TableInputFormat],
  classOf[org.apache.hadoop.hbase.io.ImmutableBytesWritable],
  classOf[org.apache.hadoop.hbase.client.Result])
val res = hBaseRDD.take(1)
val rs = res(0)._2
val kv = rs.raw
for (keyvalue <- kv) {
  println("rowkey:" + new String(keyvalue.getRow) +
    " cf:" + new String(keyvalue.getFamily) +
    " column:" + new String(keyvalue.getQualifier) +
    " value:" + new String(keyvalue.getValue))
}
------------------------------------------------------------
By the way, I've already put hive-site.xml and hbase-site.xml into HADOOP_CONF_DIR and $SPARK_HOME/conf (which is actually $KYLIN_HOME/spark/conf), and I also set spark.driver.extraClassPath in spark-defaults.conf to attach some related jars (hbase-client.jar, hbase-common.jar, and so on).
I don't know why; could anyone give me some advice?
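For reference, the spark.driver.extraClassPath line I mentioned looks roughly like this in spark-defaults.conf (the jar paths are simplified examples, not my exact layout):
------------------------------------------------------------
spark.driver.extraClassPath  /opt/hbase/lib/hbase-client.jar:/opt/hbase/lib/hbase-common.jar:/etc/hbase/conf
------------------------------------------------------------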
2017-06-21
skyyws
From: ShaoFeng Shi <sh...@apache.org>
Date: 2017-06-20 15:13
Subject: Re: Re: Build sample error with spark on kylin 2.0.0
To: "dev"<de...@kylin.apache.org>
Cc:
Or you can check whether there is old hadoop jars on your cluster,
according to https://issues.apache.org/jira/browse/HADOOP-11064
2017-06-20 9:33 GMT+08:00 skyyws <sk...@163.com>:
> No, I deploy kylin on linux, this is my machine info:
> --------------------------
> 3.2.0-4-amd64 #1 SMP Debian 3.2.82-1 x86_64 GNU/Linux
> -------------------------
>
> 2017-06-20
>
> skyyws
>
>
>
> From: ShaoFeng Shi <sh...@apache.org>
> Date: 2017-06-20 00:10
> Subject: Re: Build sample error with spark on kylin 2.0.0
> To: "dev"<de...@kylin.apache.org>
> Cc:
>
> Are you running Kylin on windows? If yes, check:
> https://stackoverflow.com/questions/33211599/hadoop-
> error-on-windows-java-lang-unsatisfiedlinkerror
>
> 2017-06-19 21:55 GMT+08:00 skyyws <sk...@163.com>:
>
> > [...]
>
>
>
>
> --
> Best regards,
>
> Shaofeng Shi 史少锋
>
--
Best regards,
Shaofeng Shi 史少锋
Re: Re: Build sample error with spark on kylin 2.0.0
Posted by ShaoFeng Shi <sh...@apache.org>.
Or you can check whether there is old hadoop jars on your cluster,
according to https://issues.apache.org/jira/browse/HADOOP-11064
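One quick way to check is to list the hadoop-common jars that end up on the runtime classpath and see whether two versions show up; a rough sketch (the classpath string below is a made-up example, on a real node you would use the output of `hadoop classpath`):

```shell
# Made-up classpath for illustration; on a real node use: CP=$(hadoop classpath)
CP="/opt/hadoop/share/hadoop/common/hadoop-common-2.7.3.jar:/old/lib/hadoop-common-2.2.0.jar"
# Split on ':' and keep only hadoop-common jars; two different versions
# showing up here would explain the NativeCrc32 UnsatisfiedLinkError
printf '%s\n' "$CP" | tr ':' '\n' | grep 'hadoop-common'
```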
2017-06-20 9:33 GMT+08:00 skyyws <sk...@163.com>:
> No, I deploy kylin on linux, this is my machine info:
> --------------------------
> 3.2.0-4-amd64 #1 SMP Debian 3.2.82-1 x86_64 GNU/Linux
> -------------------------
>
> 2017-06-20
>
> skyyws
>
>
>
> From: ShaoFeng Shi <sh...@apache.org>
> Date: 2017-06-20 00:10
> Subject: Re: Build sample error with spark on kylin 2.0.0
> To: "dev"<de...@kylin.apache.org>
> Cc:
>
> Are you running Kylin on windows? If yes, check:
> https://stackoverflow.com/questions/33211599/hadoop-
> error-on-windows-java-lang-unsatisfiedlinkerror
>
> 2017-06-19 21:55 GMT+08:00 skyyws <sk...@163.com>:
>
> > [...]
>
>
>
>
> --
> Best regards,
>
> Shaofeng Shi 史少锋
>
--
Best regards,
Shaofeng Shi 史少锋
Re: Re: Build sample error with spark on kylin 2.0.0
Posted by skyyws <sk...@163.com>.
No, I deploy Kylin on Linux; this is my machine info:
--------------------------
3.2.0-4-amd64 #1 SMP Debian 3.2.82-1 x86_64 GNU/Linux
-------------------------
2017-06-20
skyyws
From: ShaoFeng Shi <sh...@apache.org>
Date: 2017-06-20 00:10
Subject: Re: Build sample error with spark on kylin 2.0.0
To: "dev"<de...@kylin.apache.org>
Cc:
Are you running Kylin on windows? If yes, check:
https://stackoverflow.com/questions/33211599/hadoop-error-on-windows-java-lang-unsatisfiedlinkerror
2017-06-19 21:55 GMT+08:00 skyyws <sk...@163.com>:
> Hi all,
> I met an error when using spark engine build kylin sample on step "Build
> Cube with Spark", here is the exception log:
> ------------------------------------------------------------
> -----------------------------
> Exception in thread "main" java.lang.UnsatisfiedLinkError:
> org.apache.hadoop.util.NativeCrc32.nativeComputeChunkedSumsByteAr
> ray(II[BI[BIILjava/lang/String;JZ)V
> at org.apache.hadoop.util.NativeCrc32.
> nativeComputeChunkedSumsByteArray(Native Method)
> at org.apache.hadoop.util.NativeCrc32.
> calculateChunkedSumsByteArray(NativeCrc32.java:86)
> at org.apache.hadoop.util.DataChecksum.calculateChunkedSums(
> DataChecksum.java:430)
> at org.apache.hadoop.fs.FSOutputSummer.writeChecksumChunks(
> FSOutputSummer.java:202)
> at org.apache.hadoop.fs.FSOutputSummer.write1(
> FSOutputSummer.java:124)
> at org.apache.hadoop.fs.FSOutputSummer.write(
> FSOutputSummer.java:110)
> at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(
> FSDataOutputStream.java:58)
> at java.io.DataOutputStream.write(DataOutputStream.java:107)
> at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:80)
> at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:52)
> at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:112)
> at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:366)
> at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:338)
> at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:289)
> at org.apache.spark.deploy.yarn.Client.copyFileToRemote(
> Client.scala:317)
> at org.apache.spark.deploy.yarn.Client.org$apache$spark$
> deploy$yarn$Client$$distribute$1(Client.scala:407)
> at org.apache.spark.deploy.yarn.Client$$anonfun$
> prepareLocalResources$5.apply(Client.scala:446)
> at org.apache.spark.deploy.yarn.Client$$anonfun$
> prepareLocalResources$5.apply(Client.scala:444)
> at scala.collection.immutable.List.foreach(List.scala:318)
> at org.apache.spark.deploy.yarn.Client.prepareLocalResources(
> Client.scala:444)
> at org.apache.spark.deploy.yarn.Client.
> createContainerLaunchContext(Client.scala:727)
> at org.apache.spark.deploy.yarn.Client.submitApplication(
> Client.scala:142)
> at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.
> start(YarnClientSchedulerBackend.scala:57)
> at org.apache.spark.scheduler.TaskSchedulerImpl.start(
> TaskSchedulerImpl.scala:144)
> at org.apache.spark.SparkContext.<init>(SparkContext.scala:530)
> at org.apache.spark.api.java.JavaSparkContext.<init>(
> JavaSparkContext.scala:59)
> at org.apache.kylin.engine.spark.SparkCubingByLayer.execute(
> SparkCubingByLayer.java:150)
> at org.apache.kylin.common.util.AbstractApplication.execute(
> AbstractApplication.java:37)
> at org.apache.kylin.common.util.SparkEntry.main(SparkEntry.
> java:44)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(
> NativeMethodAccessorImpl.java:57)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(
> DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$
> deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
> at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(
> SparkSubmit.scala:181)
> at org.apache.spark.deploy.SparkSubmit$.submit(
> SparkSubmit.scala:206)
> at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.
> scala:121)
> at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> 17/06/19 21:22:06 INFO storage.DiskBlockManager: Shutdown hook called
> 17/06/19 21:22:06 INFO util.ShutdownHookManager: Shutdown hook called
> 17/06/19 21:22:06 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-0d1d3709-86cd-446c-b728-5070f168de28
> 17/06/19 21:22:06 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-0d1d3709-86cd-446c-b728-5070f168de28/httpd-9bcb9a5d-569f-4f28-ad89-038a9020eda8
> 17/06/19 21:22:06 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-0d1d3709-86cd-446c-b728-5070f168de28/userFiles-2e9ff265-3d37-40e0-8894-6fd4d1a3ad8b
>
> at org.apache.kylin.common.util.CliCommandExecutor.execute(CliCommandExecutor.java:92)
> at org.apache.kylin.engine.spark.SparkExecutable.doWork(SparkExecutable.java:124)
> at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:124)
> at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:64)
> at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:124)
> at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:142)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
>
> -----------------------------------------------------------------------------------------
> I can use the Kylin built-in spark-shell to do some operations like:
> -----------------------------------------------------------------------------------------
> var textFile = sc.textFile("hdfs://xxxx/xxxx/README.md")
> textFile.count()
> textFile.first()
> textFile.filter(line => line.contains("hello")).count()
> -----------------------------------------------------------------------------------------
> Here is the env info:
> Kylin version is 2.0.0
> Hadoop version is 2.7.*
> Spark version is 1.6.*
> -----------------------------------------------------------------------------------------
> Can anyone help me? Thanks!
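
[Editorial note: since the `UnsatisfiedLinkError` on `NativeCrc32` usually means the Spark processes launched by Kylin cannot load the Hadoop native library, one common workaround is to pass the native library path through Kylin's `kylin.engine.spark-conf.*` passthrough in `kylin.properties`. The sketch below assumes a typical on-cluster layout; the `/usr/lib/hadoop/lib/native` path is only an example and must be adjusted to the actual Hadoop installation:]

```properties
# Hypothetical kylin.properties fragment: every kylin.engine.spark-conf.X
# property is forwarded as "--conf X" to the spark-submit that Kylin runs
# for the "Build Cube with Spark" step. Point both driver and executors at
# the directory containing libhadoop.so (example path, adjust as needed).
kylin.engine.spark-conf.spark.driver.extraLibraryPath=/usr/lib/hadoop/lib/native
kylin.engine.spark-conf.spark.executor.extraLibraryPath=/usr/lib/hadoop/lib/native
```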
>
>
> 2017-06-19
> skyyws
--
Best regards,
Shaofeng Shi 史少锋