Posted to user@spark.apache.org by Rachana Srivastava <Ra...@markmonitor.com> on 2015/10/12 05:49:52 UTC

yarn-cluster mode throwing NullPointerException

I am trying to submit a job in yarn-cluster mode using the spark-submit command.  My code works fine when I use yarn-client mode.

Cloudera Version:
CDH-5.4.7-1.cdh5.4.7.p0.3

Command Submitted:
spark-submit --class "com.markmonitor.antifraud.ce.KafkaURLStreaming"  \
--driver-java-options "-Dlog4j.configuration=file:///etc/spark/myconf/log4j.sample.properties" \
--conf "spark.driver.extraJavaOptions=-Dlog4j.configuration=file:///etc/spark/myconf/log4j.sample.properties" \
--conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=file:///etc/spark/myconf/log4j.sample.properties" \
--num-executors 2 \
--executor-cores 2 \
../target/mm-XXX-ce-0.0.1-SNAPSHOT-jar-with-dependencies.jar \
yarn-cluster 10 "XXX:2181" "XXX:9092" groups kafkaurl 5 \
"hdfs://ip-10-0-0-XXX.us-west-2.compute.internal:8020/user/ec2-user/urlFeature.properties" \
"hdfs://ip-10-0-0-XXX.us-west-2.compute.internal:8020/user/ec2-user/urlFeatureContent.properties" \
"hdfs://ip-10-0-0-XXX.us-west-2.compute.internal:8020/user/ec2-user/hdfsOutputNEWScript/OUTPUTYarn2"  false


Log Details:
INFO : org.apache.spark.SparkContext - Running Spark version 1.3.0
INFO : org.apache.spark.SecurityManager - Changing view acls to: ec2-user
INFO : org.apache.spark.SecurityManager - Changing modify acls to: ec2-user
INFO : org.apache.spark.SecurityManager - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(ec2-user); users with modify permissions: Set(ec2-user)
INFO : akka.event.slf4j.Slf4jLogger - Slf4jLogger started
INFO : Remoting - Starting remoting
INFO : Remoting - Remoting started; listening on addresses :[akka.tcp://sparkDriver@ip-10-0-0-XXX.us-west-2.compute.internal:49579]
INFO : Remoting - Remoting now listens on addresses: [akka.tcp://sparkDriver@ip-10-0-0-XXX.us-west-2.compute.internal:49579]
INFO : org.apache.spark.util.Utils - Successfully started service 'sparkDriver' on port 49579.
INFO : org.apache.spark.SparkEnv - Registering MapOutputTracker
INFO : org.apache.spark.SparkEnv - Registering BlockManagerMaster
INFO : org.apache.spark.storage.DiskBlockManager - Created local directory at /tmp/spark-1c805495-c7c4-471d-973f-b1ae0e2c8ff9/blockmgr-fff1946f-a716-40fc-a62d-bacba5b17638
INFO : org.apache.spark.storage.MemoryStore - MemoryStore started with capacity 265.4 MB
INFO : org.apache.spark.HttpFileServer - HTTP File server directory is /tmp/spark-8ed6f513-854f-4ee4-95ea-87185364eeaf/httpd-75cee1e7-af7a-4c82-a9ff-a124ce7ca7ae
INFO : org.apache.spark.HttpServer - Starting HTTP Server
INFO : org.spark-project.jetty.server.Server - jetty-8.y.z-SNAPSHOT
INFO : org.spark-project.jetty.server.AbstractConnector - Started SocketConnector@0.0.0.0:46671
INFO : org.apache.spark.util.Utils - Successfully started service 'HTTP file server' on port 46671.
INFO : org.apache.spark.SparkEnv - Registering OutputCommitCoordinator
INFO : org.spark-project.jetty.server.Server - jetty-8.y.z-SNAPSHOT
INFO : org.spark-project.jetty.server.AbstractConnector - Started SelectChannelConnector@0.0.0.0:4040
INFO : org.apache.spark.util.Utils - Successfully started service 'SparkUI' on port 4040.
INFO : org.apache.spark.ui.SparkUI - Started SparkUI at http://ip-10-0-0-XXX.us-west-2.compute.internal:4040
INFO : org.apache.spark.SparkContext - Added JAR file:/home/ec2-user/CE/correlationengine/scripts/../target/mm-anti-fraud-ce-0.0.1-SNAPSHOT-jar-with-dependencies.jar at http://10.0.0.XXX:46671/jars/mm-anti-fraud-ce-0.0.1-SNAPSHOT-jar-with-dependencies.jar with timestamp 1444620509463
INFO : org.apache.spark.scheduler.cluster.YarnClusterScheduler - Created YarnClusterScheduler
ERROR: org.apache.spark.scheduler.cluster.YarnClusterSchedulerBackend - Application ID is not set.
INFO : org.apache.spark.network.netty.NettyBlockTransferService - Server created on 33880
INFO : org.apache.spark.storage.BlockManagerMaster - Trying to register BlockManager
INFO : org.apache.spark.storage.BlockManagerMasterActor - Registering block manager ip-10-0-0-XXX.us-west-2.compute.internal:33880 with 265.4 MB RAM, BlockManagerId(<driver>, ip-10-0-0-XXX.us-west-2.compute.internal, 33880)
INFO : org.apache.spark.storage.BlockManagerMaster - Registered BlockManager
INFO : org.apache.spark.scheduler.EventLoggingListener - Logging events to hdfs://ip-10-0-0-XXX.us-west-2.compute.internal:8020/user/spark/applicationHistory/spark-application-1444620509497
Exception in thread "main" java.lang.NullPointerException
    at org.apache.spark.deploy.yarn.ApplicationMaster$.sparkContextInitialized(ApplicationMaster.scala:580)
    at org.apache.spark.scheduler.cluster.YarnClusterScheduler.postStartHook(YarnClusterScheduler.scala:32)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:541)
    at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:61)
    at com.markmonitor.antifraud.ce.KafkaURLStreaming.main(KafkaURLStreaming.java:91)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:569)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:166)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:189)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:110)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
WARN : org.apache.hadoop.hdfs.DFSClient - Unable to persist blocks in hflush for /user/spark/applicationHistory/spark-application-1444620509497.inprogress
java.io.IOException: Failed on local exception: java.io.InterruptedIOException: Interruped while waiting for IO on channel java.nio.channels.SocketChannel[connected local=/10.0.0.XXX:43929 remote=ip-10-0-0-XXX.us-west-2.compute.internal/10.0.0.XXX:8020]. 59998 millis timeout left.; Host Details : local host is: "ip-10-0-0-XXX.us-west-2.compute.internal/10.0.0.XXX"; destination host is: "ip-10-0-0-XXX.us-west-2.compute.internal":8020;
    at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772)
    at org.apache.hadoop.ipc.Client.call(Client.java:1472)
    at org.apache.hadoop.ipc.Client.call(Client.java:1399)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
    at com.sun.proxy.$Proxy18.fsync(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.fsync(ClientNamenodeProtocolTranslatorPB.java:814)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
    at com.sun.proxy.$Proxy19.fsync(Unknown Source)
    at org.apache.hadoop.hdfs.DFSOutputStream.flushOrSync(DFSOutputStream.java:2067)
    at org.apache.hadoop.hdfs.DFSOutputStream.hflush(DFSOutputStream.java:1959)
    at org.apache.hadoop.fs.FSDataOutputStream.hflush(FSDataOutputStream.java:130)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.spark.scheduler.EventLoggingListener$$anonfun$logEvent$3.apply(EventLoggingListener.scala:144)
    at org.apache.spark.scheduler.EventLoggingListener$$anonfun$logEvent$3.apply(EventLoggingListener.scala:144)
    at scala.Option.foreach(Option.scala:236)
    at org.apache.spark.scheduler.EventLoggingListener.logEvent(EventLoggingListener.scala:144)
    at org.apache.spark.scheduler.EventLoggingListener.onBlockManagerAdded(EventLoggingListener.scala:171)
    at org.apache.spark.scheduler.SparkListenerBus$class.onPostEvent(SparkListenerBus.scala:46)
    at org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)
    at org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)
    at org.apache.spark.util.ListenerBus$class.postToAll(ListenerBus.scala:53)
    at org.apache.spark.util.AsynchronousListenerBus.postToAll(AsynchronousListenerBus.scala:36)
    at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1.apply$mcV$sp(AsynchronousListenerBus.scala:76)
    at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1.apply(AsynchronousListenerBus.scala:61)
    at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1.apply(AsynchronousListenerBus.scala:61)
    at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1617)
    at org.apache.spark.util.AsynchronousListenerBus$$anon$1.run(AsynchronousListenerBus.scala:60)
Caused by: java.io.InterruptedIOException: Interruped while waiting for IO on channel java.nio.channels.SocketChannel[connected local=/10.0.0.XXX:43929 remote=ip-10-0-0-XXX.us-west-2.compute.internal/10.0.0.XXX:8020]. 59998 millis timeout left.
    at org.apache.hadoop.net.SocketIOWithTimeout$SelectorPool.select(SocketIOWithTimeout.java:352)
    at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:157)
    at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161)
    at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:131)
    at java.io.FilterInputStream.read(FilterInputStream.java:133)
    at java.io.FilterInputStream.read(FilterInputStream.java:133)
    at org.apache.hadoop.ipc.Client$Connection$PingInputStream.read(Client.java:513)
    at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:254)
    at java.io.DataInputStream.readInt(DataInputStream.java:387)
    at org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:1071)
    at org.apache.hadoop.ipc.Client$Connection.run(Client.java:966)
WARN : org.apache.hadoop.hdfs.DFSClient - Error while syncing
java.nio.channels.ClosedChannelException
    at org.apache.hadoop.hdfs.DFSOutputStream.checkClosed(DFSOutputStream.java:1635)
    at org.apache.hadoop.hdfs.DFSOutputStream.flushOrSync(DFSOutputStream.java:2074)
    at org.apache.hadoop.hdfs.DFSOutputStream.hflush(DFSOutputStream.java:1959)
    at org.apache.hadoop.fs.FSDataOutputStream.hflush(FSDataOutputStream.java:130)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.spark.scheduler.EventLoggingListener$$anonfun$logEvent$3.apply(EventLoggingListener.scala:144)
    at org.apache.spark.scheduler.EventLoggingListener$$anonfun$logEvent$3.apply(EventLoggingListener.scala:144)
    at scala.Option.foreach(Option.scala:236)
    at org.apache.spark.scheduler.EventLoggingListener.logEvent(EventLoggingListener.scala:144)
    at org.apache.spark.scheduler.EventLoggingListener.onBlockManagerAdded(EventLoggingListener.scala:171)
    at org.apache.spark.scheduler.SparkListenerBus$class.onPostEvent(SparkListenerBus.scala:46)
    at org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)
    at org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)
    at org.apache.spark.util.ListenerBus$class.postToAll(ListenerBus.scala:53)
    at org.apache.spark.util.AsynchronousListenerBus.postToAll(AsynchronousListenerBus.scala:36)
    at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1.apply$mcV$sp(AsynchronousListenerBus.scala:76)
    at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1.apply(AsynchronousListenerBus.scala:61)
    at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1.apply(AsynchronousListenerBus.scala:61)
    at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1617)
    at org.apache.spark.util.AsynchronousListenerBus$$anon$1.run(AsynchronousListenerBus.scala:60)
ERROR: org.apache.spark.scheduler.LiveListenerBus - Listener EventLoggingListener threw an exception
java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.spark.scheduler.EventLoggingListener$$anonfun$logEvent$3.apply(EventLoggingListener.scala:144)
    at org.apache.spark.scheduler.EventLoggingListener$$anonfun$logEvent$3.apply(EventLoggingListener.scala:144)
    at scala.Option.foreach(Option.scala:236)
    at org.apache.spark.scheduler.EventLoggingListener.logEvent(EventLoggingListener.scala:144)
    at org.apache.spark.scheduler.EventLoggingListener.onBlockManagerAdded(EventLoggingListener.scala:171)
    at org.apache.spark.scheduler.SparkListenerBus$class.onPostEvent(SparkListenerBus.scala:46)
    at org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)
    at org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)
    at org.apache.spark.util.ListenerBus$class.postToAll(ListenerBus.scala:53)
    at org.apache.spark.util.AsynchronousListenerBus.postToAll(AsynchronousListenerBus.scala:36)
    at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1.apply$mcV$sp(AsynchronousListenerBus.scala:76)
    at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1.apply(AsynchronousListenerBus.scala:61)
    at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1.apply(AsynchronousListenerBus.scala:61)
    at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1617)
    at org.apache.spark.util.AsynchronousListenerBus$$anon$1.run(AsynchronousListenerBus.scala:60)
Caused by: java.nio.channels.ClosedChannelException
    at org.apache.hadoop.hdfs.DFSOutputStream.checkClosed(DFSOutputStream.java:1635)
    at org.apache.hadoop.hdfs.DFSOutputStream.flushOrSync(DFSOutputStream.java:2074)
    at org.apache.hadoop.hdfs.DFSOutputStream.hflush(DFSOutputStream.java:1959)
    at org.apache.hadoop.fs.FSDataOutputStream.hflush(FSDataOutputStream.java:130)
    ... 19 more
ERROR: org.apache.spark.scheduler.LiveListenerBus - Listener EventLoggingListener threw an exception
java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.spark.scheduler.EventLoggingListener$$anonfun$logEvent$3.apply(EventLoggingListener.scala:144)
    at org.apache.spark.scheduler.EventLoggingListener$$anonfun$logEvent$3.apply(EventLoggingListener.scala:144)
    at scala.Option.foreach(Option.scala:236)
    at org.apache.spark.scheduler.EventLoggingListener.logEvent(EventLoggingListener.scala:144)
    at org.apache.spark.scheduler.EventLoggingListener.onApplicationStart(EventLoggingListener.scala:177)
    at org.apache.spark.scheduler.SparkListenerBus$class.onPostEvent(SparkListenerBus.scala:52)
    at org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)
    at org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)
    at org.apache.spark.util.ListenerBus$class.postToAll(ListenerBus.scala:53)
    at org.apache.spark.util.AsynchronousListenerBus.postToAll(AsynchronousListenerBus.scala:36)
    at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1.apply$mcV$sp(AsynchronousListenerBus.scala:76)
    at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1.apply(AsynchronousListenerBus.scala:61)
    at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1.apply(AsynchronousListenerBus.scala:61)
    at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1617)
    at org.apache.spark.util.AsynchronousListenerBus$$anon$1.run(AsynchronousListenerBus.scala:60)
Caused by: java.io.IOException: Filesystem closed
    at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:794)
    at org.apache.hadoop.hdfs.DFSOutputStream.flushOrSync(DFSOutputStream.java:1998)
    at org.apache.hadoop.hdfs.DFSOutputStream.hflush(DFSOutputStream.java:1959)
    at org.apache.hadoop.fs.FSDataOutputStream.hflush(FSDataOutputStream.java:130)
    ... 19 more
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/jars/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/jars/avro-tools-1.7.6-cdh5.4.7.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]

Thanks,

Rachana

Re: yarn-cluster mode throwing NullPointerException

Posted by Venkatakrishnan Sowrirajan <vs...@asu.edu>.
Hi Rachana,


Are you by any chance setting something like this in your code?

"sparkConf.setMaster("yarn-cluster");"

Setting the master to "yarn-cluster" programmatically like this is not supported in yarn-cluster mode.

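If so, the fix is to drop that call and let spark-submit supply the master. A minimal sketch of what I mean (the class name and the JavaSparkContext line come from your stack trace; the rest is a guess at your setup):

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public class KafkaURLStreaming {
    public static void main(String[] args) {
        // No setMaster() here: in yarn-cluster mode the driver runs inside
        // the YARN ApplicationMaster, and hard-coding the master in code
        // bypasses that setup, which is what trips the NPE in
        // ApplicationMaster.sparkContextInitialized.
        SparkConf sparkConf = new SparkConf().setAppName("KafkaURLStreaming");
        JavaSparkContext jsc = new JavaSparkContext(sparkConf);

        // ... Kafka streaming setup as before ...

        jsc.stop();
    }
}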

I think you are hitting this bug:
https://issues.apache.org/jira/browse/SPARK-7504. It was fixed in
Spark 1.4.0, so you could try upgrading to 1.4.0.
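
One more thing I noticed in your command: you never pass --master to spark-submit, and "yarn-cluster" shows up as the first application argument, which suggests the code reads it and calls setMaster. If you let spark-submit own the master instead, the submit line would look roughly like this (adapted from your command; the options I left out stay the same, and "yarn-cluster" drops out of the application arguments):

spark-submit --master yarn-cluster \
--class "com.markmonitor.antifraud.ce.KafkaURLStreaming" \
... \
../target/mm-XXX-ce-0.0.1-SNAPSHOT-jar-with-dependencies.jar \
10 "XXX:2181" "XXX:9092" groups kafkaurl 5 ...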

Regards
Venkata krishnan

On Sun, Oct 11, 2015 at 8:49 PM, Rachana Srivastava <Rachana.Srivastava@markmonitor.com> wrote:

> [...]
