Posted to issues@spark.apache.org by "Bogdan Niculescu (JIRA)" <ji...@apache.org> on 2015/04/16 15:01:01 UTC
[jira] [Commented] (SPARK-3284) saveAsParquetFile not working on windows
[ https://issues.apache.org/jira/browse/SPARK-3284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14498019#comment-14498019 ]
Bogdan Niculescu commented on SPARK-3284:
-----------------------------------------
I get the same type of exception in Spark 1.3.0 under Windows when trying to save to a parquet file.
Here is my code:
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

case class Person(name: String, age: Int)

object DataFrameTest extends App {
  val conf = new SparkConf().setMaster("local[4]").setAppName("ParquetTest")
  val sc = new SparkContext(conf)
  val sqlContext = new SQLContext(sc)
  val persons = List(Person("a", 1), Person("b", 2))
  val rdd = sc.parallelize(persons)
  val dataFrame = sqlContext.createDataFrame(rdd)
  dataFrame.saveAsParquetFile("test.parquet")
}
The exception that I'm seeing is:
Exception in thread "main" java.lang.NullPointerException
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1010)
at org.apache.hadoop.util.Shell.runCommand(Shell.java:404)
at org.apache.hadoop.util.Shell.run(Shell.java:379)
...................................
at org.apache.spark.sql.DataFrame.save(DataFrame.scala:1123)
at org.apache.spark.sql.DataFrame.saveAsParquetFile(DataFrame.scala:922)
at sparkTest.DataFrameTest$.delayedEndpoint$sparkTest$DataFrameTest$1(DataFrameTest.scala:19)
at sparkTest.DataFrameTest$delayedInit$body.apply(DataFrameTest.scala:9)
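For what it's worth, a NullPointerException thrown from ProcessBuilder.start inside Hadoop's Shell utilities on Windows usually means Hadoop cannot locate winutils.exe: Shell resolves the hadoop.home.dir system property (falling back to the HADOOP_HOME environment variable), and when both are unset it ends up handing ProcessBuilder a null command. A minimal sketch of the commonly suggested workaround, assuming winutils.exe has been downloaded to C:\hadoop\bin (that path is an assumption; adjust to your machine):

```scala
// Commonly suggested workaround sketch, not an official fix.
// Hadoop's Shell helper looks for %HADOOP_HOME%\bin\winutils.exe on Windows;
// pointing hadoop.home.dir at a directory containing bin\winutils.exe
// avoids the null command that triggers the NPE above.
// Hypothetical path -- assumes winutils.exe sits in C:\hadoop\bin.
System.setProperty("hadoop.home.dir", "C:\\hadoop")
// ...then create the SparkContext and call saveAsParquetFile as before.
```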
> saveAsParquetFile not working on windows
> ----------------------------------------
>
> Key: SPARK-3284
> URL: https://issues.apache.org/jira/browse/SPARK-3284
> Project: Spark
> Issue Type: Bug
> Components: Windows
> Affects Versions: 1.0.2
> Environment: Windows
> Reporter: Pravesh Jain
> Priority: Minor
>
> {code}
> import org.apache.spark.{SparkConf, SparkContext}
>
> object parquet {
>   case class Person(name: String, age: Int)
>
>   def main(args: Array[String]) {
>     val sparkConf = new SparkConf().setMaster("local").setAppName("HdfsWordCount")
>     val sc = new SparkContext(sparkConf)
>     val sqlContext = new org.apache.spark.sql.SQLContext(sc)
>     // createSchemaRDD is used to implicitly convert an RDD to a SchemaRDD.
>     import sqlContext.createSchemaRDD
>     val people = sc.textFile("C:/Users/pravesh.jain/Desktop/people/people.txt").map(_.split(",")).map(p => Person(p(0), p(1).trim.toInt))
>     people.saveAsParquetFile("C:/Users/pravesh.jain/Desktop/people/people.parquet")
>     val parquetFile = sqlContext.parquetFile("C:/Users/pravesh.jain/Desktop/people/people.parquet")
>   }
> }
> {code}
> gives the error
> Exception in thread "main" java.lang.NullPointerException at org.apache.spark.parquet$.main(parquet.scala:16)
> where line 16 is the call to saveAsParquetFile.
> This works fine on Linux, but running it from Eclipse on Windows gives the error.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org