Posted to dev@hive.apache.org by "Xuefu Zhang (JIRA)" <ji...@apache.org> on 2014/11/24 21:42:13 UTC

[jira] [Comment Edited] (HIVE-8951) Spark remote context doesn't work with local-cluster [Spark Branch]

    [ https://issues.apache.org/jira/browse/HIVE-8951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14223450#comment-14223450 ] 

Xuefu Zhang edited comment on HIVE-8951 at 11/24/14 8:41 PM:
-------------------------------------------------------------

[~vanzin], yes. I realized that later on, and clearly it's called. The problem is with local-cluster, which doesn't work even for the first query and gives the exceptions shown above. Could you please take a look? It's blocking the current integration effort. Thanks.


was (Author: xuefuz):
[~vanzin], yes. I realized that later on, and clearly it's called. The problem is about local-cluster, which doesn't work, giving exceptions as shown above. Could you please take a look? It's blocking the current integration effort. Thanks.

> Spark remote context doesn't work with local-cluster [Spark Branch]
> -------------------------------------------------------------------
>
>                 Key: HIVE-8951
>                 URL: https://issues.apache.org/jira/browse/HIVE-8951
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>            Reporter: Xuefu Zhang
>
> What I did:
> {code}
> set spark.home=/home/xzhang/apache/spark;
> set spark.master=local-cluster[2,1,2048];
> set hive.execution.engine=spark; 
> set spark.executor.memory=2g;
> set spark.serializer=org.apache.spark.serializer.KryoSerializer;
> set spark.io.compression.codec=org.apache.spark.io.LZFCompressionCodec;
> select name, avg(value) as v from dec group by name order by v;
> {code}
> Exceptions seen:
> {code}
> 14/11/23 10:42:15 INFO Worker: Spark home: /home/xzhang/apache/spark
> 14/11/23 10:42:15 INFO AppClient$ClientActor: Connecting to master spark://xzdt.local:55151...
> 14/11/23 10:42:15 INFO Master: Registering app Hive on Spark
> 14/11/23 10:42:15 INFO Master: Registered app Hive on Spark with ID app-20141123104215-0000
> 14/11/23 10:42:15 INFO SparkDeploySchedulerBackend: Connected to Spark cluster with app ID app-20141123104215-0000
> 14/11/23 10:42:15 INFO NettyBlockTransferService: Server created on 41676
> 14/11/23 10:42:15 INFO BlockManagerMaster: Trying to register BlockManager
> 14/11/23 10:42:15 INFO BlockManagerMasterActor: Registering block manager xzdt.local:41676 with 265.0 MB RAM, BlockManagerId(<driver>, xzdt.local, 41676)
> 14/11/23 10:42:15 INFO BlockManagerMaster: Registered BlockManager
> 14/11/23 10:42:15 INFO SparkDeploySchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.0
> 14/11/23 10:42:20 WARN AbstractLifeCycle: FAILED SelectChannelConnector@0.0.0.0:4040: java.net.BindException: Address already in use
> java.net.BindException: Address already in use
> 	at sun.nio.ch.Net.bind0(Native Method)
> 	at sun.nio.ch.Net.bind(Net.java:174)
> 	at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:139)
> 	at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:77)
> 	at org.eclipse.jetty.server.nio.SelectChannelConnector.open(SelectChannelConnector.java:187)
> 	at org.eclipse.jetty.server.AbstractConnector.doStart(AbstractConnector.java:316)
> 	at org.eclipse.jetty.server.nio.SelectChannelConnector.doStart(SelectChannelConnector.java:265)
> 	at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64)
> 	at org.eclipse.jetty.server.Server.doStart(Server.java:293)
> 	at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64)
> 	at org.apache.spark.ui.JettyUtils$.org$apache$spark$ui$JettyUtils$$connect$1(JettyUtils.scala:194)
> 	at org.apache.spark.ui.JettyUtils$$anonfun$2.apply(JettyUtils.scala:204)
> 	at org.apache.spark.ui.JettyUtils$$anonfun$2.apply(JettyUtils.scala:204)
> 	at org.apache.spark.util.Utils$$anonfun$startServiceOnPort$1.apply$mcVI$sp(Utils.scala:1676)
> 	at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141)
> 	at org.apache.spark.util.Utils$.startServiceOnPort(Utils.scala:1667)
> 	at org.apache.spark.ui.JettyUtils$.startJettyServer(JettyUtils.scala:204)
> 	at org.apache.spark.ui.WebUI.bind(WebUI.scala:102)
> 	at org.apache.spark.SparkContext$$anonfun$10.apply(SparkContext.scala:267)
> 	at org.apache.spark.SparkContext$$anonfun$10.apply(SparkContext.scala:267)
> 	at scala.Option.foreach(Option.scala:236)
> 	at org.apache.spark.SparkContext.<init>(SparkContext.scala:267)
> 	at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:61)
> 	at org.apache.hive.spark.client.RemoteDriver.<init>(RemoteDriver.java:106)
> 	at org.apache.hive.spark.client.RemoteDriver.main(RemoteDriver.java:362)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:616)
> 	at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:353)
> 	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:75)
> 	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> 14/11/23 10:42:20 WARN AbstractLifeCycle: FAILED org.eclipse.jetty.server.Server@4c9fd062: java.net.BindException: Address already in use
> java.net.BindException: Address already in use
> 	at sun.nio.ch.Net.bind0(Native Method)
> 	at sun.nio.ch.Net.bind(Net.java:174)
> 	at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:139)
> 	at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:77)
> 	at org.eclipse.jetty.server.nio.SelectChannelConnector.open(SelectChannelConnector.java:187)
> 	at org.eclipse.jetty.server.AbstractConnector.doStart(AbstractConnector.java:316)
> 	at org.eclipse.jetty.server.nio.SelectChannelConnector.doStart(SelectChannelConnector.java:265)
> 	at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64)
> 	at org.eclipse.jetty.server.Server.doStart(Server.java:293)
> 	at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64)
> 	at org.apache.spark.ui.JettyUtils$.org$apache$spark$ui$JettyUtils$$connect$1(JettyUtils.scala:194)
> 	at org.apache.spark.ui.JettyUtils$$anonfun$2.apply(JettyUtils.scala:204)
> 	at org.apache.spark.ui.JettyUtils$$anonfun$2.apply(JettyUtils.scala:204)
> 	at org.apache.spark.util.Utils$$anonfun$startServiceOnPort$1.apply$mcVI$sp(Utils.scala:1676)
> 	at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141)
> 	at org.apache.spark.util.Utils$.startServiceOnPort(Utils.scala:1667)
> 	at org.apache.spark.ui.JettyUtils$.startJettyServer(JettyUtils.scala:204)
> 	at org.apache.spark.ui.WebUI.bind(WebUI.scala:102)
> 	at org.apache.spark.SparkContext$$anonfun$10.apply(SparkContext.scala:267)
> 	at org.apache.spark.SparkContext$$anonfun$10.apply(SparkContext.scala:267)
> 	at scala.Option.foreach(Option.scala:236)
> 	at org.apache.spark.SparkContext.<init>(SparkContext.scala:267)
> 	at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:61)
> 	at org.apache.hive.spark.client.RemoteDriver.<init>(RemoteDriver.java:106)
> 	at org.apache.hive.spark.client.RemoteDriver.main(RemoteDriver.java:362)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:616)
> 	at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:353)
> 	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:75)
> 	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> {code}
> I also saw that the SparkSubmit process was busy launching other processes:
> {code}
> xzhang@xzdt:~/apache/spark$ jps
> 12731 CoarseGrainedExecutorBackend
> 11746 RunJar
> 25974 TaskTracker
> 12067 SparkSubmit
> 25524 SecondaryNameNode
> 25771 JobTracker
> 25280 DataNode
> 25108 NameNode
> 12885 Jps
> 12742 CoarseGrainedExecutorBackend
> 12408 CoarseGrainedExecutorBackend
> 12409 CoarseGrainedExecutorBackend
> 11879 SparkSubmit
> {code}
> If I change spark.master to point to a standalone cluster, it works fine.
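
Note on the BindException in the log above: 4040 is the default port of the Spark web UI (spark.ui.port), and the trace passes through Utils.startServiceOnPort, which normally retries on the next port. The WARN by itself may therefore not be the cause of the failure. A minimal sketch for ruling it out, assuming spark.ui.port is passed through to the remote driver like the other spark.* settings in the description (the value 4050 is an arbitrary port chosen for illustration, not taken from the issue):
{code}
-- hypothetical: move the Spark UI off the default port 4040 (4050 is an arbitrary choice)
set spark.ui.port=4050;
set spark.master=local-cluster[2,1,2048];
set hive.execution.engine=spark;
select name, avg(value) as v from dec group by name order by v;
{code}
If the query still fails with the UI bound to a free port, the port conflict is likely incidental rather than the root cause.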


