Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2019/02/15 13:53:17 UTC

[GitHub] liupc opened a new pull request #23799: [SPARK-26892]Fix saveAsTextFile throws NullPointerException when null row present

URL: https://github.com/apache/spark/pull/23799
 
 
   ## What changes were proposed in this pull request?
   
   Currently, RDD.saveAsTextFile may throw a NullPointerException when a null row is present.
   ```
   scala> sc.parallelize(Seq(1,null),1).saveAsTextFile("/tmp/foobar.dat")
   19/02/15 21:39:17 ERROR Utils: Aborting task
   java.lang.NullPointerException
   at org.apache.spark.rdd.RDD.$anonfun$saveAsTextFile$3(RDD.scala:1510)
   at scala.collection.Iterator$$anon$10.next(Iterator.scala:459)
   at org.apache.spark.internal.io.SparkHadoopWriter$.$anonfun$executeTask$1(SparkHadoopWriter.scala:129)
   at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1352)
   at org.apache.spark.internal.io.SparkHadoopWriter$.executeTask(SparkHadoopWriter.scala:127)
   at org.apache.spark.internal.io.SparkHadoopWriter$.$anonfun$write$1(SparkHadoopWriter.scala:83)
   at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
   at org.apache.spark.scheduler.Task.run(Task.scala:121)
   at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:425)
   at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1318)
   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:428)
   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
   at java.lang.Thread.run(Thread.java:748)
   ```
   
   This PR writes "Null" for a null row, which avoids the NPE.
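   
   A minimal sketch of the idea, not the actual diff: the trace above suggests that `toString` is called on each element, which fails for `null`. The object `NullSafeLines` and the helper `toLine` below are hypothetical names used only for illustration, and the sketch runs without Spark.
   
   ```scala
   object NullSafeLines {
     // Hypothetical helper: render one element as a line of text.
     // Calling x.toString directly would throw a NullPointerException for null.
     def toLine(x: Any): String =
       if (x == null) "Null" else x.toString
   
     def main(args: Array[String]): Unit = {
       // Same data as in the reproduction above.
       val rows: Seq[Any] = Seq(1, null)
       rows.map(toLine).foreach(println)
     }
   }
   ```
   
   With this guard, the null element is written as the literal text "Null" instead of aborting the task.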
   
   ## How was this patch tested?
   
   NA
   
   Please review http://spark.apache.org/contributing.html before opening a pull request.
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services
