Posted to issues@kylin.apache.org by "Chen Xi (JIRA)" <ji...@apache.org> on 2018/11/01 03:37:00 UTC

[jira] [Commented] (KYLIN-3604) Can't build cube with spark in HBase standalone mode

    [ https://issues.apache.org/jira/browse/KYLIN-3604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16671071#comment-16671071 ] 

Chen Xi commented on KYLIN-3604:
--------------------------------

[~colinmjj] Hi, Colin! This is a good patch! I applied it to kylin-2.5.0 and it solved the problem.

However, I then encountered another problem at step 8, Convert Cuboid Data to HFile.

I'm using CDH 5.11.0 and Spark 2.1.1. The HBase cluster uses Kerberos and NameNode HA.

I configured Spark to obtain delegation tokens from the HBase cluster's NameNodes and submitted the job in cluster deploy mode.
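
For reference, the corresponding part of my kylin.properties looks roughly like this (a sketch assuming the standard kylin.engine.spark-conf.* passthrough, which forwards each entry to spark-submit as a --conf flag; not the complete file):
{code:java}
# Run the Spark driver inside YARN and point Spark at the HBase cluster's
# NameNodes so that it requests HDFS delegation tokens for them at submit time.
kylin.engine.spark-conf.spark.submit.deployMode=cluster
kylin.engine.spark-conf.spark.yarn.access.namenodes=hdfs://namenode01-foo-kylin.abcde.hadoop,hdfs://namenode02-foo-kylin.abcde.hadoop
{code}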

The command for step 8 is:
{code:java}
export HADOOP_CONF_DIR=/home/someguy/kylin/hadoop_conf && \
/home/someguy/kylin/spark/bin/spark-submit \
  --class org.apache.kylin.common.util.SparkEntry \
  --conf spark.executor.instances=40 \
  --conf spark.yarn.archive=viewfs://hadoop-foo/distcache/kylin/spark-libs/spark2.1.1-hadoop2.6.0.tar \
  --conf spark.yarn.queue=root.cupid.kylin \
  --conf spark.history.fs.logDirectory=viewfs://hadoop-foo/kylin/spark-history \
  --conf spark.master=yarn \
  --conf spark.hadoop.yarn.timeline-service.enabled=false \
  --conf spark.executor.memory=4G \
  --conf spark.eventLog.enabled=true \
  --conf spark.eventLog.dir=viewfs://hadoop-foo/kylin/spark-history \
  --conf spark.yarn.executor.memoryOverhead=1024 \
  --conf spark.yarn.access.namenodes=hdfs://namenode01-foo-kylin.abcde.hadoop,hdfs://namenode02-foo-kylin.abcde.hadoop \
  --conf spark.driver.memory=2G \
  --conf spark.submit.deployMode=cluster \
  --conf spark.shuffle.service.enabled=false \
  --jars /usr/lib/hbase/lib/hbase-common-1.2.0-cdh5.11.0.jar,/usr/lib/hbase/lib/hbase-server-1.2.0-cdh5.11.0.jar,/usr/lib/hbase/lib/hbase-client-1.2.0-cdh5.11.0.jar,/usr/lib/hbase/lib/hbase-protocol-1.2.0-cdh5.11.0.jar,/usr/lib/hbase/lib/hbase-hadoop-compat-1.2.0-cdh5.11.0.jar,/usr/lib/hbase/lib/htrace-core-3.2.0-incubating.jar,/usr/lib/hbase/lib/metrics-core-2.2.0.jar \
  /home/someguy/kylin/lib/kylin-job-2.5.0.jar \
  -className org.apache.kylin.storage.hbase.steps.SparkCubeHFile \
  -partitions hdfs://hadoop-foo-kylin/kylin/kylin_metadata_someguy/kylin-dbc59708-bc41-cc18-0799-0b1cf5e2efa0/my_cube/rowkey_stats/part-r-00000_hfile \
  -counterOutput viewfs://hadoop-foo/kylin/kylin_metadata_someguy/kylin-dbc59708-bc41-cc18-0799-0b1cf5e2efa0/my_cube/counter \
  -cubename my_cube \
  -output hdfs://hadoop-foo-kylin/kylin/kylin_metadata_someguy/kylin-dbc59708-bc41-cc18-0799-0b1cf5e2efa0/my_cube/hfile \
  -input viewfs://hadoop-foo/kylin/kylin_metadata_someguy/kylin-dbc59708-bc41-cc18-0799-0b1cf5e2efa0/my_cube/cuboid/ \
  -segmentId 103639e1-a24d-d4fc-a2f3-37267d299898 \
  -metaUrl kylin_metadata_someguy@hdfs,path=viewfs://hadoop-foo/kylin/kylin_metadata_someguy/kylin-dbc59708-bc41-cc18-0799-0b1cf5e2efa0/my_cube/metadata \
  -hbaseConfPath viewfs://hadoop-foo/kylin/kylin_metadata_someguy/kylin-dbc59708-bc41-cc18-0799-0b1cf5e2efa0/hbase-conf.xml
{code}
The stack trace is:
{code:java}
18/10/31 22:11:45 ERROR yarn.ApplicationMaster: User class threw exception: java.lang.RuntimeException: error execute org.apache.kylin.storage.hbase.steps.SparkCubeHFile. Root cause: Delegation Token can be issued only with kerberos or web authentication
java.lang.RuntimeException: error execute org.apache.kylin.storage.hbase.steps.SparkCubeHFile. Root cause: Delegation Token can be issued only with kerberos or web authentication
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getDelegationToken(FSNamesystem.java:7521)
	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getDelegationToken(NameNodeRpcServer.java:548)
	at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.getDelegationToken(AuthorizationProviderProxyClientProtocol.java:663)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getDelegationToken(ClientNamenodeProtocolServerSideTranslatorPB.java:981)
	at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2220)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2216)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2214)

	at org.apache.kylin.common.util.AbstractApplication.execute(AbstractApplication.java:42)
	at org.apache.kylin.common.util.SparkEntry.main(SparkEntry.java:44)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:637)
Caused by: org.apache.hadoop.ipc.RemoteException(java.io.IOException): Delegation Token can be issued only with kerberos or web authentication
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getDelegationToken(FSNamesystem.java:7521)
	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getDelegationToken(NameNodeRpcServer.java:548)
	at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.getDelegationToken(AuthorizationProviderProxyClientProtocol.java:663)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getDelegationToken(ClientNamenodeProtocolServerSideTranslatorPB.java:981)
	at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2220)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2216)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2214)

	at org.apache.hadoop.ipc.Client.call(Client.java:1472)
	at org.apache.hadoop.ipc.Client.call(Client.java:1409)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
	at com.sun.proxy.$Proxy10.getDelegationToken(Unknown Source)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getDelegationToken(ClientNamenodeProtocolTranslatorPB.java:928)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
	at com.sun.proxy.$Proxy11.getDelegationToken(Unknown Source)
	at org.apache.hadoop.hdfs.DFSClient.getDelegationToken(DFSClient.java:1082)
	at org.apache.hadoop.hdfs.DistributedFileSystem.getDelegationToken(DistributedFileSystem.java:1499)
	at org.apache.hadoop.fs.FileSystem.collectDelegationTokens(FileSystem.java:546)
	at org.apache.hadoop.fs.FileSystem.addDelegationTokens(FileSystem.java:524)
	at org.apache.hadoop.hdfs.DistributedFileSystem.addDelegationTokens(DistributedFileSystem.java:2283)
	at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:140)
	at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:100)
	at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodes(TokenCache.java:80)
	at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:142)
	at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1.apply$mcV$sp(PairRDDFunctions.scala:1099)
	at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1.apply(PairRDDFunctions.scala:1085)
	at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsNewAPIHadoopDataset$1.apply(PairRDDFunctions.scala:1085)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
	at org.apache.spark.rdd.RDD.withScope(RDD.scala:362)
	at org.apache.spark.rdd.PairRDDFunctions.saveAsNewAPIHadoopDataset(PairRDDFunctions.scala:1085)
	at org.apache.spark.api.java.JavaPairRDD.saveAsNewAPIHadoopDataset(JavaPairRDD.scala:831)
	at org.apache.kylin.storage.hbase.steps.SparkCubeHFile.execute(SparkCubeHFile.java:236)
	at org.apache.kylin.common.util.AbstractApplication.execute(AbstractApplication.java:37)
	... 6 more
{code}
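
From the trace, FileOutputFormat.checkOutputSpecs is collecting delegation tokens for the -output path, which lives on the HBase cluster's HDFS (hdfs://hadoop-foo-kylin), and that NameNode rejects the request. As far as I know, the NameNode throws "Delegation Token can be issued only with kerberos or web authentication" when the incoming RPC connection itself was not Kerberos-authenticated. If I read it correctly, the failing call boils down to something like this (a hypothetical minimal probe using only the public Hadoop API; the URI is one of my NameNodes, and "yarn" stands in for the real token renewer):
{code:java}
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.security.Credentials;

public class TokenProbe {
    public static void main(String[] args) throws Exception {
        // Loads core-site.xml/hdfs-site.xml from HADOOP_CONF_DIR; if that
        // configuration says hadoop.security.authentication=simple, the
        // connection to the NameNode below is unauthenticated.
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(
                new URI("hdfs://namenode01-foo-kylin.abcde.hadoop"), conf);
        Credentials creds = new Credentials();
        // The same call TokenCache makes on the job's output filesystem; a
        // non-kerberized connection makes the NameNode throw the error above.
        fs.addDelegationTokens("yarn", creds);
        System.out.println("Fetched tokens: " + creds.getAllTokens());
    }
}
{code}
So my guess is that the HADOOP_CONF_DIR I hand to spark-submit doesn't carry the HBase cluster's security settings, but I'm not sure.
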
Did you run into this problem as well? I wonder whether I missed something in my configuration.

Or maybe this is a bug and I should open a new issue for it.

> Can't build cube with spark in HBase standalone mode
> ----------------------------------------------------
>
>                 Key: KYLIN-3604
>                 URL: https://issues.apache.org/jira/browse/KYLIN-3604
>             Project: Kylin
>          Issue Type: Bug
>          Components: Storage - HBase
>    Affects Versions: v2.5.0
>            Reporter: Colin Ma
>            Assignee: Colin Ma
>            Priority: Blocker
>             Fix For: v2.5.1
>
>         Attachments: KYLIN-3604.001.patch
>
>
> With HBase standalone mode, the cube can't be built at step 8, Convert Cuboid Data to HFile; the following is the related exception:
> 18/09/29 11:13:21 INFO steps.SparkCubeHFile: Input path: hdfs://nameservice1/kylin/kylin_metadata/kylin-b65c0e62-69e9-bb11-9d7d-e6e5abc7ef8e/test_spark_cube/cuboid/
> 18/09/29 11:13:21 INFO steps.SparkCubeHFile: Output path: hdfs://nameservice3/kylin/kylin_metadata/kylin-b65c0e62-69e9-bb11-9d7d-e6e5abc7ef8e/test_spark_cube/hfile
> 18/09/29 11:13:21 INFO steps.SparkCubeHFile: Loading HBase configuration from:hdfs://nameservice1/kylin/kylin_metadata/kylin-b65c0e62-69e9-bb11-9d7d-e6e5abc7ef8e/hbase-conf.xml
> 18/09/29 11:13:21 ERROR yarn.ApplicationMaster: User class threw exception: java.lang.RuntimeException: error execute org.apache.kylin.storage.hbase.steps.SparkCubeHFile. Root cause: Wrong FS: hdfs://nameservice1/kylin/kylin_metadata/kylin-b65c0e62-69e9-bb11-9d7d-e6e5abc7ef8e/hbase-conf.xml, expected: hdfs://nameservice3
> java.lang.RuntimeException: error execute org.apache.kylin.storage.hbase.steps.SparkCubeHFile. Root cause: Wrong FS: hdfs://nameservice1/kylin/kylin_metadata/kylin-b65c0e62-69e9-bb11-9d7d-e6e5abc7ef8e/hbase-conf.xml, expected: hdfs://nameservice3
> 	at org.apache.kylin.common.util.AbstractApplication.execute(AbstractApplication.java:42)
> 	at org.apache.kylin.common.util.SparkEntry.main(SparkEntry.java:44)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:606)
> 	at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:637)
> Caused by: java.lang.IllegalArgumentException: Wrong FS: hdfs://nameservice1/kylin/kylin_metadata/kylin-b65c0e62-69e9-bb11-9d7d-e6e5abc7ef8e/hbase-conf.xml, expected: hdfs://nameservice3
> 	at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:645)
> 	at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:193)
> 	at org.apache.hadoop.hdfs.DistributedFileSystem.access$000(DistributedFileSystem.java:105)
> 	at org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:302)
> 	at org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:298)
> 	at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> 	at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:298)
> 	at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:766)
> 	at org.apache.kylin.storage.hbase.steps.SparkCubeHFile.execute(SparkCubeHFile.java:183)
> 	at org.apache.kylin.common.util.AbstractApplication.execute(AbstractApplication.java:37)
> 	... 6 more
> 18/09/29 11:13:21 INFO yarn.ApplicationMaster: Final app status: FAILED, exitCode: 15, (reason: User class threw exception: java.lang.RuntimeException: error execute org.apache.kylin.storage.hbase.steps.SparkCubeHFile. Root cause: Wrong FS: hdfs://nameservice1/kylin/kylin_metadata/kylin-b65c0e62-69e9-bb11-9d7d-e6e5abc7ef8e/hbase-conf.xml, expected: hdfs://nameservice3)
> 18/09/29 11:13:21 INFO spark.SparkContext: Invoking stop() from shutdown hook
> 18/09/29 11:13:21 INFO server.ServerConnector: Stopped Spark@1785d078



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)