Posted to user@spark.apache.org by Markus Gierich <Ma...@fiduciagad.de> on 2021/08/27 08:07:26 UTC

Spark submit on openshift

Hi!

I created a Spark cluster on OpenShift using radanalytics.io. I'm trying to execute the SparkPi sample using:

spark-submit --name sparkpi-2 \
--master spark://hans:7077 \
--deploy-mode cluster \
--class org.apache.spark.examples.SparkPi \
/opt/spark/examples/jars/spark-examples_2.11-2.4.5.jar 1


The task is submitted and I see the RUNNING state in the UI, but it never finishes; the logs show an UnknownHostException:

21/08/27 07:45:44 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(1017200000); groups with view permissions: Set(); users  with modify permissions: Set(1017200000); groups with modify permissions: Set()
Exception in thread "main" java.lang.reflect.UndeclaredThrowableException
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1713)
        at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:64)
        at org.apache.spark.executor.CoarseGrainedExecutorBackend$.run(CoarseGrainedExecutorBackend.scala:188)
        at org.apache.spark.executor.CoarseGrainedExecutorBackend$.main(CoarseGrainedExecutorBackend.scala:285)
        at org.apache.spark.executor.CoarseGrainedExecutorBackend.main(CoarseGrainedExecutorBackend.scala)
Caused by: org.apache.spark.SparkException: Exception thrown in awaitResult:
        at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:226)
        at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
        at org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:101)
        at org.apache.spark.executor.CoarseGrainedExecutorBackend$$anonfun$run$1.apply$mcV$sp(CoarseGrainedExecutorBackend.scala:201)
        at org.apache.spark.deploy.SparkHadoopUtil$$anon$2.run(SparkHadoopUtil.scala:65)
        at org.apache.spark.deploy.SparkHadoopUtil$$anon$2.run(SparkHadoopUtil.scala:64)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
        ... 4 more
Caused by: java.io.IOException: Failed to connect to hans-w-1-7jgtb:44139
        at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:245)
        at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:187)
        at org.apache.spark.rpc.netty.NettyRpcEnv.createClient(NettyRpcEnv.scala:198)
        at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:194)
        at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:190)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.UnknownHostException: hans-w-1-7jgtb
        at java.net.InetAddress.getAllByName0(InetAddress.java:1281)
        at java.net.InetAddress.getAllByName(InetAddress.java:1193)
        at java.net.InetAddress.getAllByName(InetAddress.java:1127)
        at java.net.InetAddress.getByName(InetAddress.java:1077)
        at io.netty.util.internal.SocketUtils$8.run(SocketUtils.java:146)
        at io.netty.util.internal.SocketUtils$8.run(SocketUtils.java:143)
        at java.security.AccessController.doPrivileged(Native Method)
        at io.netty.util.internal.SocketUtils.addressByName(SocketUtils.java:143)
        at io.netty.resolver.DefaultNameResolver.doResolve(DefaultNameResolver.java:43)
        at io.netty.resolver.SimpleNameResolver.resolve(SimpleNameResolver.java:63)
        at io.netty.resolver.SimpleNameResolver.resolve(SimpleNameResolver.java:55)
        at io.netty.resolver.InetSocketAddressResolver.doResolve(InetSocketAddressResolver.java:57)
        at io.netty.resolver.InetSocketAddressResolver.doResolve(InetSocketAddressResolver.java:32)
        at io.netty.resolver.AbstractAddressResolver.resolve(AbstractAddressResolver.java:108)
        at io.netty.bootstrap.Bootstrap.doResolveAndConnect0(Bootstrap.java:202)
        at io.netty.bootstrap.Bootstrap.access$000(Bootstrap.java:48)
        at io.netty.bootstrap.Bootstrap$1.operationComplete(Bootstrap.java:182)
        at io.netty.bootstrap.Bootstrap$1.operationComplete(Bootstrap.java:168)
        at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:577)
        at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:551)
        at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:490)
        at io.netty.util.concurrent.DefaultPromise.setValue0(DefaultPromise.java:615)
        at io.netty.util.concurrent.DefaultPromise.setSuccess0(DefaultPromise.java:604)
        at io.netty.util.concurrent.DefaultPromise.trySuccess(DefaultPromise.java:104)
        at io.netty.channel.DefaultChannelPromise.trySuccess(DefaultChannelPromise.java:84)
        at io.netty.channel.AbstractChannel$AbstractUnsafe.safeSetSuccess(AbstractChannel.java:985)
        at io.netty.channel.AbstractChannel$AbstractUnsafe.register0(AbstractChannel.java:505)
        at io.netty.channel.AbstractChannel$AbstractUnsafe.access$200(AbstractChannel.java:416)
        at io.netty.channel.AbstractChannel$AbstractUnsafe$1.run(AbstractChannel.java:475)
        at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
        at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:510)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:518)
        at io.netty.util.concurrent.SingleThreadEventExecutor$6.run(SingleThreadEventExecutor.java:1044)
        at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
        ... 1 more
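
The innermost cause in the trace is a failed name lookup for the worker pod's hostname (hans-w-1-7jgtb). To confirm that this is purely a name-resolution problem, a check along the following lines could be run from inside the pod that logs the exception. This is only a minimal sketch: the helper name `can_resolve` is hypothetical, and it assumes a Python interpreter is available in the container image, which may not be the case.

```python
import socket


def can_resolve(hostname: str) -> bool:
    """Return True if the local resolver can map hostname to an address.

    Hypothetical helper for illustration; it performs the same kind of
    lookup that fails in the stack trace (InetAddress.getByName on the JVM).
    """
    try:
        socket.getaddrinfo(hostname, None)
        return True
    except socket.gaierror:
        # Raised when the name cannot be resolved (the Python counterpart
        # of java.net.UnknownHostException).
        return False


if __name__ == "__main__":
    # Hostname taken from the log above; replace it with the pod name
    # that appears in your own logs.
    for name in ("localhost", "hans-w-1-7jgtb"):
        status = "resolves" if can_resolve(name) else "does NOT resolve"
        print(f"{name}: {status}")
```

If the pod name does not resolve here either, the connection failure is caused by DNS inside the cluster rather than by the port or by Spark itself.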


I could resolve this by giving the pods fixed hostnames and creating headless services for them, but that makes it difficult to scale the Spark cluster. Is there another solution?


Best Regards

MARKUS




