Posted to user@spark.apache.org by Trivedi Amit <am...@yahoo.com.INVALID> on 2016/09/15 14:50:42 UTC

Re: Write to Cassandra table from pyspark fails with scala reflect error [RESOLVED]

It turned out to be an environment issue. I removed all of my environment variables, used the 2.11 artifact, and it worked.
Thanks for the help.

      From: Trivedi Amit <am...@yahoo.com.INVALID>
 To: Russell Spitzer <ru...@gmail.com>; "user@spark.apache.org" <us...@spark.apache.org> 
 Sent: Thursday, September 15, 2016 9:47 AM
 Subject: Re: Write to Cassandra table from pyspark fails with scala reflect error
   
Thanks Russell. I didn't build this myself. I tried the Scala 2.11 artifacts, com.datastax.spark:spark-cassandra-connector_2.11:2.0.0-M(1-3), and I am getting:
```
Exception in thread "main" java.lang.RuntimeException: [download failed: org.scala-lang#scala-reflect;2.11.8!scala-reflect.jar]
    at org.apache.spark.deploy.SparkSubmitUtils$.resolveMavenCoordinates(SparkSubmit.scala:1066)
    at org.apache.spark.deploy.SparkSubmit$.prepareSubmitEnvironment(SparkSubmit.scala:294)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:158)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:124)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Traceback (most recent call last):
  File "$SPARK_HOME/python/pyspark/shell.py", line 38, in <module>
    SparkContext._ensure_initialized()
  File "$SPARK_HOME/python/pyspark/context.py", line 243, in _ensure_initialized
    SparkContext._gateway = gateway or launch_gateway()
  File "$SPARK_HOME/python/pyspark/java_gateway.py", line 94, in launch_gateway
    raise Exception("Java gateway process exited before sending the driver its port number")
Exception: Java gateway process exited before sending the driver its port number
```
I deleted my .m2 directory to avoid any conflicts with cached or older versions. Of the environment variables related to Spark and Python, only SPARK_HOME is set.
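
Since the launch environment is in question here, a quick way to see what the shell actually passes to Spark is to print the relevant variables from Python. This is only a sketch: the helper name `relevant_env` and the list of prefixes and names are assumptions chosen for illustration, not an exhaustive list of what Spark reads.

```python
import os

# Prefixes and names chosen for illustration; adjust for your own setup.
PREFIXES = ("SPARK_", "PYSPARK_")
NAMES = {"PYTHONPATH", "JAVA_HOME", "SCALA_HOME", "HADOOP_HOME"}

def relevant_env(environ):
    """Return the subset of environ that commonly influences a Spark launch."""
    return {k: v for k, v in environ.items()
            if k.startswith(PREFIXES) or k in NAMES}

# Print the matching variables in a stable order.
for name, value in sorted(relevant_env(os.environ).items()):
    print("{}={}".format(name, value))
```

Anything unexpected in that output (for example a stale `SCALA_HOME` or `PYSPARK_SUBMIT_ARGS`) is a candidate to unset before retrying.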

      From: Russell Spitzer <ru...@gmail.com>
 To: Trivedi Amit <am...@yahoo.com>; "user@spark.apache.org" <us...@spark.apache.org> 
 Sent: Wednesday, September 14, 2016 11:24 PM
 Subject: Re: Write to Cassandra table from pyspark fails with scala reflect error
  
Spark 2.0 defaults to Scala 2.11, so if you didn't build it yourself you need the 2.11 artifact for the Spark Cassandra Connector.
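
As a minimal sketch of what that means in practice, the connector's Maven coordinate can be built from the Scala binary version and handed to PySpark through the `PYSPARK_SUBMIT_ARGS` environment variable before the SparkContext is created. The helper name `connector_coordinate` is hypothetical, and `2.0.0-M3` is simply the connector version mentioned in this thread.

```python
import os

def connector_coordinate(scala_binary, connector_version):
    """Build the connector's Maven coordinate (hypothetical helper)."""
    return "com.datastax.spark:spark-cassandra-connector_{}:{}".format(
        scala_binary, connector_version)

# A prebuilt Spark 2.0 uses Scala 2.11, so request the _2.11 artifact.
coordinate = connector_coordinate("2.11", "2.0.0-M3")

# PySpark reads PYSPARK_SUBMIT_ARGS when it launches the JVM gateway,
# so this must be set before the SparkContext is created.
os.environ["PYSPARK_SUBMIT_ARGS"] = "--packages {} pyspark-shell".format(coordinate)
print(os.environ["PYSPARK_SUBMIT_ARGS"])
```

Passing the same coordinate to `--packages` on the pyspark command line should be equivalent.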
On Wed, Sep 14, 2016 at 7:44 PM Trivedi Amit <am...@yahoo.com.invalid> wrote:

Hi,



I am testing a pyspark program that reads from a csv file and writes the data into a Cassandra table. I am using pyspark with spark-cassandra-connector_2.10:2.0.0-M3 and Spark v2.0.0.

While executing the command below

```
df.write.format("org.apache.spark.sql.cassandra").mode('append').options(table="test_table", keyspace="test").save()
```

I am getting
```
py4j.protocol.Py4JJavaError: An error occurred while calling o47.save.
: java.lang.NoSuchMethodError: scala.reflect.api.JavaUniverse.runtimeMirror(Ljava/lang/ClassLoader;)Lscala/reflect/api/JavaMirrors$JavaMirror;
        at com.datastax.spark.connector.types.TypeConverter$.<init>(TypeConverter.scala:116)
        at com.datastax.spark.connector.types.TypeConverter$.<clinit>(TypeConverter.scala)
        at com.datastax.spark.connector.types.BigIntType$.converterToCassandra(PrimitiveColumnType.scala:50)
        at com.datastax.spark.connector.types.BigIntType$.converterToCassandra(PrimitiveColumnType.scala:46)
        at com.datastax.spark.connector.writer.SqlRowWriter$$anonfun$3.apply(SqlRowWriter.scala:18)
        at com.datastax.spark.connector.writer.SqlRowWriter$$anonfun$3.apply(SqlRowWriter.scala:18)
        at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
        at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
        at scala.collection.Iterator$class.foreach(Iterator.scala:893)
        at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
        at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
        at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
        at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
        at scala.collection.AbstractTraversable.map(Traversable.scala:104)
        at com.datastax.spark.connector.writer.SqlRowWriter.<init>(SqlRowWriter.scala:18)
        at com.datastax.spark.connector.writer.SqlRowWriter$Factory$.rowWriter(SqlRowWriter.scala:36)
        at com.datastax.spark.connector.writer.SqlRowWriter$Factory$.rowWriter(SqlRowWriter.scala:34)
        at com.datastax.spark.connector.writer.TableWriter$.apply(TableWriter.scala:271)
        at com.datastax.spark.connector.RDDFunctions.saveToCassandra(RDDFunctions.scala:36)
        at org.apache.spark.sql.cassandra.CassandraSourceRelation.insert(CassandraSourceRelation.scala:66)
        at org.apache.spark.sql.cassandra.DefaultSource.createRelation(DefaultSource.scala:85)
        at org.apache.spark.sql.execution.datasources.DataSource.write(DataSource.scala:429)
        at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:211)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:237)
        at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
        at py4j.Gateway.invoke(Gateway.java:280)
        at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:128)
        at py4j.commands.CallCommand.execute(CallCommand.java:79)
        at py4j.GatewayConnection.run(GatewayConnection.java:211)
        at java.lang.Thread.run(Thread.java:745)
```

A Google search for this error led to threads where people recommended using Scala 2.10 instead of 2.11. However, I am not using Scala directly, and I am assuming that the 2.10 in the spark-cassandra-connector artifact name is the Scala version.
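
By the usual convention for Scala libraries, the suffix after the underscore in the artifact name is the Scala binary version the jar was compiled against, and it has to match the Scala version of the Spark build. A small sketch of that check follows; the function names `scala_binary_of` and `matches_spark` are hypothetical, and the Scala version of the Spark build is supplied by hand rather than detected.

```python
import re

def scala_binary_of(coordinate):
    """Return the Scala binary version encoded in a Maven coordinate, or None."""
    group, artifact, version = coordinate.split(":")
    # Look for a trailing "_<major>.<minor>" in the artifact name.
    match = re.search(r"_(\d+\.\d+)$", artifact)
    return match.group(1) if match else None

def matches_spark(coordinate, spark_scala_binary):
    """True when the artifact targets the same Scala binary version as Spark."""
    return scala_binary_of(coordinate) == spark_scala_binary

coordinate = "com.datastax.spark:spark-cassandra-connector_2.10:2.0.0-M3"
# Scala 2.10 artifact against a Scala 2.11 Spark build: a mismatch.
print(scala_binary_of(coordinate), matches_spark(coordinate, "2.11"))
```

A mismatch of this kind is a common cause of `NoSuchMethodError` on `scala.reflect` classes at runtime.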

I don't know how to fix or work around this issue. I would appreciate any help.
Thanks,
AT