Posted to issues@spark.apache.org by "Kevin (Sangwoo) Kim (JIRA)" <ji...@apache.org> on 2014/05/29 11:52:01 UTC

[jira] [Resolved] (SPARK-1963) Job aborted with NullPointerException from DAGScheduler.scala:1020

     [ https://issues.apache.org/jira/browse/SPARK-1963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kevin (Sangwoo) Kim resolved SPARK-1963.
----------------------------------------

    Resolution: Invalid

> Job aborted with NullPointerException from DAGScheduler.scala:1020
> ------------------------------------------------------------------
>
>                 Key: SPARK-1963
>                 URL: https://issues.apache.org/jira/browse/SPARK-1963
>             Project: Spark
>          Issue Type: Bug
>            Reporter: Kevin (Sangwoo) Kim
>
> Hi, I'm testing Spark 0.9.1 on an EC2 r3.8xlarge (32 cores, 240 GiB RAM).
> While counting active users from 70 GB of data, the Spark job aborted with an NPE from the DAGScheduler.
> I estimate the active user count is around 1~2M.
> Here's what I ran:
> {code}
> val logs = sc.textFile("file:///spark/data/*")
> val activeUser = logs.map{x => val a = LogObjectExtractor.getAnonymousAction(x); a.getUserId}.distinct
> activeUser.count
> {code}
> and here's the log:
> {code}
> 14/05/29 05:26:46 INFO scheduler.TaskSetManager: Serialized task 1.0:2235 as 1883 bytes in 1 ms
> 14/05/29 05:26:46 INFO scheduler.TaskSetManager: Finished TID 2207 in 17541 ms on ip-10-169-5-198.ap-northeast-1.compute.internal (progress: 2204/2267)
> 14/05/29 05:26:46 INFO scheduler.DAGScheduler: Completed ShuffleMapTask(1, 2207)
> 14/05/29 05:26:46 INFO scheduler.TaskSetManager: Starting task 1.0:2236 as TID 2236 on executor 0: ip-10-169-5-198.ap-northeast-1.compute.internal (PROCESS_LOCAL)
> 14/05/29 05:26:46 INFO scheduler.TaskSetManager: Serialized task 1.0:2236 as 1883 bytes in 1 ms
> 14/05/29 05:26:46 WARN scheduler.TaskSetManager: Lost TID 2230 (task 1.0:2230)
> 14/05/29 05:26:46 WARN scheduler.TaskSetManager: Loss was due to java.lang.NullPointerException
> java.lang.NullPointerException
> 	at $line16.$read$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$anonfun$1.apply(<console>:17)
> 	at $line16.$read$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$anonfun$1.apply(<console>:17)
> 	at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
> 	at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
> 	at org.apache.spark.Aggregator.combineValuesByKey(Aggregator.scala:58)
> 	at org.apache.spark.rdd.PairRDDFunctions$$anonfun$1.apply(PairRDDFunctions.scala:97)
> 	at org.apache.spark.rdd.PairRDDFunctions$$anonfun$1.apply(PairRDDFunctions.scala:96)
> 	at org.apache.spark.rdd.RDD$$anonfun$3.apply(RDD.scala:477)
> 	at org.apache.spark.rdd.RDD$$anonfun$3.apply(RDD.scala:477)
> 	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:34)
> 	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:241)
> 	at org.apache.spark.rdd.RDD.iterator(RDD.scala:232)
> 	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:161)
> 	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:102)
> 	at org.apache.spark.scheduler.Task.run(Task.scala:53)
> 	at org.apache.spark.executor.Executor$TaskRunner$$anonfun$run$1.apply$mcV$sp(Executor.scala:211)
> 	at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:42)
> 	at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:41)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:415)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
> 	at org.apache.spark.deploy.SparkHadoopUtil.runAsUser(SparkHadoopUtil.scala:41)
> 	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:176)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> 	at java.lang.Thread.run(Thread.java:744)
> 14/05/29 05:26:46 INFO scheduler.TaskSetManager: Starting task 1.0:2230 as TID 2237 on executor 0: ip-10-169-5-198.ap-northeast-1.compute.internal (PROCESS_LOCAL)
> 14/05/29 05:26:46 INFO scheduler.TaskSetManager: Serialized task 1.0:2230 as 1883 bytes in 0 ms
> 14/05/29 05:26:46 WARN scheduler.TaskSetManager: Lost TID 2231 (task 1.0:2231)
> 14/05/29 05:26:46 INFO scheduler.TaskSetManager: Loss was due to java.lang.NullPointerException [duplicate 1]
> 14/05/29 05:26:46 INFO scheduler.TaskSetManager: Starting task 1.0:2231 as TID 2238 on executor 0: ip-10-169-5-198.ap-northeast-1.compute.internal (PROCESS_LOCAL)
> {code}
> ...
> {code}
> 14/05/29 05:26:46 INFO scheduler.TaskSetManager: Loss was due to java.lang.NullPointerException [duplicate 27]
> 14/05/29 05:26:46 INFO scheduler.TaskSetManager: Loss was due to java.lang.NullPointerException [duplicate 28]
> 14/05/29 05:26:46 INFO scheduler.TaskSetManager: Finished TID 2201 in 17959 ms on ip-10-169-5-198.ap-northeast-1.compute.internal (progress: 2210/2267)
> 14/05/29 05:26:46 INFO scheduler.TaskSetManager: Finished TID 2209 in 16588 ms on ip-10-169-5-198.ap-northeast-1.compute.internal (progress: 2211/2267)
> org.apache.spark.SparkException: Job aborted: Task 1.0:2230 failed 4 times (most recent failure: Exception failure: java.lang.NullPointerException)
> {code}
> Thanks!
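Note on the resolution: the stack trace points into the user-defined closure ({{$anonfun$1.apply(<console>:17)}}) rather than into the DAGScheduler itself, which is consistent with the Invalid resolution. The likely cause is that the extractor returned null (or a null userId) for some malformed log lines. A minimal sketch of a null-safe variant, using a hypothetical stand-in for LogObjectExtractor since that class is not available here:

```scala
// Sketch only: getAction stands in for LogObjectExtractor.getAnonymousAction,
// which presumably returns null for lines it cannot parse.
def getAction(line: String): String =
  if (line.startsWith("good")) "user-42" else null

val logs: Seq[String] = Seq("good-line", "bad-line", "good-line")

// Wrapping the result in Option turns null into None, so flatMap
// silently drops unparseable records instead of throwing an NPE
// inside the task. The same pattern works on an RDD's flatMap.
val activeUsers = logs
  .flatMap(line => Option(getAction(line)).map(_ /* e.g. .getUserId */))
  .distinct

println(activeUsers.size)
```

With an RDD, the equivalent change would be replacing the `map { ... a.getUserId }` step with a `flatMap` over `Option(...)` so that null actions never reach `distinct`.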



--
This message was sent by Atlassian JIRA
(v6.2#6252)