Posted to user@spark.apache.org by Irfan Kabli <ir...@gmail.com> on 2016/09/10 15:50:27 UTC

Problems with Reading CSV Files - Java - Eclipse

Dear Spark community members,

I am trying to read a CSV file in Spark using the Java API.

My setup is as follows:
> Windows Machine
> Local deployment
> Spark 2.0.0
> Eclipse Scala IDE 4.0.0

I am trying to read from the local file system with the following code (using the Java perspective in Eclipse):

    SparkSession mySparkSession = SparkSession.builder()
        .master("local")
        .appName("loadingFiles")
        .getOrCreate();

    Dataset<Row> myDataSet = mySparkSession.read()
        .csv("C:/temp/pricepaid/pp-monthly-update-new-version.csv");

I am getting the following error message when running the application via
Eclipse:

Exception in thread "main" java.lang.IllegalArgumentException: Error while instantiating 'org.apache.spark.sql.internal.SessionState':
at org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$reflect(SparkSession.scala:949)
at org.apache.spark.sql.SparkSession.sessionState$lzycompute(SparkSession.scala:111)
at org.apache.spark.sql.SparkSession.sessionState(SparkSession.scala:110)
at org.apache.spark.sql.SparkSession.conf$lzycompute(SparkSession.scala:133)
at org.apache.spark.sql.SparkSession.conf(SparkSession.scala:133)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$getOrCreate$5.apply(SparkSession.scala:838)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$getOrCreate$5.apply(SparkSession.scala:838)
at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:99)
at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:99)
at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:230)
at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:40)
at scala.collection.mutable.HashMap.foreach(HashMap.scala:99)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:838)
at org.packtpub.SparkFunctionsTest.main(SparkFunctionsTest.java:110)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:408)
at org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$reflect(SparkSession.scala:946)
... 13 more
Caused by: java.lang.IllegalArgumentException: Error while instantiating 'org.apache.spark.sql.internal.SharedState':
at org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$reflect(SparkSession.scala:949)
at org.apache.spark.sql.SparkSession$$anonfun$sharedState$1.apply(SparkSession.scala:100)
at org.apache.spark.sql.SparkSession$$anonfun$sharedState$1.apply(SparkSession.scala:100)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.SparkSession.sharedState$lzycompute(SparkSession.scala:99)
at org.apache.spark.sql.SparkSession.sharedState(SparkSession.scala:98)
at org.apache.spark.sql.internal.SessionState.<init>(SessionState.scala:153)
... 18 more
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:408)
at org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$reflect(SparkSession.scala:946)
... 24 more
Caused by: java.lang.NoClassDefFoundError: org/apache/parquet/hadoop/ParquetOutputCommitter
at org.apache.spark.sql.internal.SQLConf$.<init>(SQLConf.scala:235)
at org.apache.spark.sql.internal.SQLConf$.<clinit>(SQLConf.scala)
at org.apache.spark.sql.internal.SQLConf.setConfString(SQLConf.scala:711)
at org.apache.spark.sql.internal.SharedState$$anonfun$1.apply(SharedState.scala:67)
at org.apache.spark.sql.internal.SharedState$$anonfun$1.apply(SharedState.scala:67)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
at org.apache.spark.sql.internal.SharedState.<init>(SharedState.scala:67)
... 29 more
Caused by: java.lang.ClassNotFoundException: org.apache.parquet.hadoop.ParquetOutputCommitter
at java.net.URLClassLoader$1.run(URLClassLoader.java:372)
at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:360)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 37 more
16/09/10 16:48:14 INFO SparkContext: Invoking stop() from shutdown hook



Any ideas would be highly appreciated.

Best Regards,
Irfan

Re: Problems with Reading CSV Files - Java - Eclipse

Posted by ayan guha <gu...@gmail.com>.
It is failing with a NoClassDefFoundError for ParquetOutputCommitter. Maybe
a build issue?
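
The stack trace bottoms out in a ClassNotFoundException for
org.apache.parquet.hadoop.ParquetOutputCommitter. That class ships in the
parquet-hadoop jar, which spark-sql 2.0.0 normally pulls in transitively, so
if you assembled the Eclipse build path by hand that jar is probably missing.
A minimal check you could run from a scratch class to confirm (ClasspathCheck
is just a hypothetical name, not part of Spark):

    public class ClasspathCheck {
        public static void main(String[] args) {
            try {
                // ParquetOutputCommitter lives in the parquet-hadoop jar;
                // if this lookup fails, the SparkSession error is expected.
                Class.forName("org.apache.parquet.hadoop.ParquetOutputCommitter");
                System.out.println("parquet-hadoop is on the classpath");
            } catch (ClassNotFoundException e) {
                System.out.println("parquet-hadoop is missing from the classpath");
            }
        }
    }

If it reports the class as missing, adding the parquet-hadoop jar to the
build path, or letting Maven resolve spark-sql_2.11 2.0.0 together with its
transitive dependencies, should get you past the SessionState error.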
On 11 Sep 2016 01:50, "Irfan Kabli" <ir...@gmail.com> wrote:

> Dear Spark community members,
>
> I am trying to read a CSV file in Spark using the Java API.
> [...]
> Caused by: java.lang.NoClassDefFoundError: org/apache/parquet/hadoop/ParquetOutputCommitter
> [...]
> Any ideas would be highly appreciated.
>
> Best Regards,
> Irfan