Posted to issues@spark.apache.org by "Bogdan Niculescu (JIRA)" <ji...@apache.org> on 2015/04/16 15:01:01 UTC
[jira] [Commented] (SPARK-3284) saveAsParquetFile not working on windows

    [ https://issues.apache.org/jira/browse/SPARK-3284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14498019#comment-14498019 ] 

Bogdan Niculescu commented on SPARK-3284:
-----------------------------------------

I get the same type of exception in Spark 1.3.0 on Windows when trying to save a DataFrame as a Parquet file.
Here is my code:
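import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

case class Person(name: String, age: Int)

object DataFrameTest extends App {
  val conf = new SparkConf().setMaster("local[4]").setAppName("ParquetTest")
  val sc = new SparkContext(conf)
  val sqlContext = new SQLContext(sc)

  // Build a small DataFrame from an RDD of case class instances.
  val persons = List(Person("a", 1), Person("b", 2))
  val rdd = sc.parallelize(persons)
  val dataFrame = sqlContext.createDataFrame(rdd)

  // This is the call that throws the exception on Windows.
  dataFrame.saveAsParquetFile("test.parquet")
}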

The exception I'm seeing is:
Exception in thread "main" java.lang.NullPointerException
	at java.lang.ProcessBuilder.start(ProcessBuilder.java:1010)
	at org.apache.hadoop.util.Shell.runCommand(Shell.java:404)
	at org.apache.hadoop.util.Shell.run(Shell.java:379)
        ...................................
	at org.apache.spark.sql.DataFrame.save(DataFrame.scala:1123)
	at org.apache.spark.sql.DataFrame.saveAsParquetFile(DataFrame.scala:922)
	at sparkTest.DataFrameTest$.delayedEndpoint$sparkTest$DataFrameTest$1(DataFrameTest.scala:19)
	at sparkTest.DataFrameTest$delayedInit$body.apply(DataFrameTest.scala:9)
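
The top frames show the NullPointerException coming out of ProcessBuilder.start, called from Hadoop's Shell.runCommand. ProcessBuilder.start throws a NullPointerException when any element of the command is null. A commonly reported cause on Windows, which I am assuming here and have not confirmed for this case, is that Hadoop cannot locate winutils.exe because neither the hadoop.home.dir system property nor the HADOOP_HOME environment variable is set, so the command it builds contains a null. Below is a minimal diagnostic sketch that uses only the JDK. The WinutilsCheck object and its method names are hypothetical and are not part of Spark or Hadoop.

```scala
import java.io.File
import scala.util.Try

object WinutilsCheck {
  // Hypothetical helper: resolve the Hadoop home directory, assuming the
  // lookup order is the hadoop.home.dir system property first and the
  // HADOOP_HOME environment variable second.
  def hadoopHome(props: Map[String, String], env: Map[String, String]): Option[String] =
    props.get("hadoop.home.dir").orElse(env.get("HADOOP_HOME")).filter(_.nonEmpty)

  // Returns <home>/bin/winutils.exe only if that file actually exists.
  def winutils(home: Option[String]): Option[File] =
    home.map(h => new File(new File(h, "bin"), "winutils.exe")).filter(_.isFile)

  // Starting a process whose command contains a null element fails with a
  // NullPointerException, the same exception type as the top frame of the trace.
  def startWithNullCommand(): Try[Process] =
    Try(new ProcessBuilder(null: String, "ls").start())

  def main(args: Array[String]): Unit = {
    val home = hadoopHome(sys.props.toMap, sys.env)
    val homeText = home.getOrElse("<not set>")
    val found = winutils(home).isDefined
    val npe = startWithNullCommand().failed.toOption
      .exists(_.isInstanceOf[NullPointerException])
    println("Hadoop home: " + homeText)
    println("winutils.exe found: " + found)
    println("Null command element fails with NPE: " + npe)
  }
}
```

If the check reports that winutils.exe is missing, pointing hadoop.home.dir (or HADOOP_HOME) at a directory that contains bin\winutils.exe before creating the SparkContext may be worth trying. I have not verified that it resolves this particular issue.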

> saveAsParquetFile not working on windows
> ----------------------------------------
>
>                 Key: SPARK-3284
>                 URL: https://issues.apache.org/jira/browse/SPARK-3284
>             Project: Spark
>          Issue Type: Bug
>          Components: Windows
>    Affects Versions: 1.0.2
>         Environment: Windows
>            Reporter: Pravesh Jain
>            Priority: Minor
>
> {code}
> object parquet {
>   case class Person(name: String, age: Int)
>   def main(args: Array[String]) {
>     val sparkConf = new SparkConf().setMaster("local").setAppName("HdfsWordCount")
>     val sc = new SparkContext(sparkConf)
>     val sqlContext = new org.apache.spark.sql.SQLContext(sc)
>     // createSchemaRDD is used to implicitly convert an RDD to a SchemaRDD.
>     import sqlContext.createSchemaRDD
>     val people = sc.textFile("C:/Users/pravesh.jain/Desktop/people/people.txt").map(_.split(",")).map(p => Person(p(0), p(1).trim.toInt))
>     people.saveAsParquetFile("C:/Users/pravesh.jain/Desktop/people/people.parquet")
>     val parquetFile = sqlContext.parquetFile("C:/Users/pravesh.jain/Desktop/people/people.parquet")
>   }
> }
> {code}
> This gives the error
>     Exception in thread "main" java.lang.NullPointerException at org.apache.spark.parquet$.main(parquet.scala:16)
> which points to the line that calls saveAsParquetFile.
> This works fine on Linux, but running it from Eclipse on Windows gives the error.


