Posted to user@spark.apache.org by Benjamin Ross <br...@Lattice-Engines.com> on 2015/07/30 21:45:13 UTC

Failed to load class for data source: org.apache.spark.sql.cassandra

Hey all,
I'm running what should be a very straightforward application of the Cassandra SQL connector, and I'm getting an error:

Exception in thread "main" java.lang.RuntimeException: Failed to load class for data source: org.apache.spark.sql.cassandra
        at scala.sys.package$.error(package.scala:27)
        at org.apache.spark.sql.sources.ResolvedDataSource$.lookupDataSource(ddl.scala:220)
        at org.apache.spark.sql.sources.ResolvedDataSource$.apply(ddl.scala:233)
        at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:114)
        at com.latticeengines.test.CassandraTest$.main(CassandraTest.scala:33)
        at com.latticeengines.test.CassandraTest.main(CassandraTest.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:665)
        at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:170)
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:193)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:112)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
15/07/30 15:34:47 INFO spark.SparkContext: Invoking stop() from shutdown hook

My jar is shaded, so I assumed this shouldn't happen. What am I missing?

Here's the code I'm trying to run:
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.mean

object CassandraTest {
  def main(args: Array[String]) {
    println("Hello, scala!")

    // Point the connector at a local Cassandra node.
    val conf = new SparkConf(true).set("spark.cassandra.connection.host", "127.0.0.1")

    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)

    // Read test.kv through the Cassandra data source.
    val df = sqlContext
      .read
      .format("org.apache.spark.sql.cassandra")
      .options(Map("table" -> "kv", "keyspace" -> "test"))
      .load()

    // Sliding mean over the current row and the two before it;
    // show() forces the query to actually run.
    val w = Window.orderBy("value").rowsBetween(-2, 0)
    df.select(mean("value").over(w)).show()

    sc.stop()
  }
}
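
For reference, a quick way to sanity-check whether the data source class is actually visible to the driver (a sketch; as far as I can tell, Spark's lookup falls back to loading the format name with ".DefaultSource" appended):

// Sanity-check sketch: the lookup ultimately needs a class named
// org.apache.spark.sql.cassandra.DefaultSource on the driver classpath.
// This throws ClassNotFoundException if the connector isn't really there.
Class.forName("org.apache.spark.sql.cassandra.DefaultSource")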


RE: Failed to load class for data source: org.apache.spark.sql.cassandra

Posted by Benjamin Ross <br...@Lattice-Engines.com>.
If anyone's curious, the issue here is that I was using version 1.2.4 of the DataStax Spark Cassandra Connector rather than the 1.4.0-M1 pre-release. 1.2.4 doesn't fully support DataFrames, and DataFrame support is presumably still experimental in 1.4.0-M1.
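
For anyone hitting the same thing, here's a minimal build.sbt sketch of the dependency change (assuming Scala 2.10 and a Spark 1.4.x build; adjust versions to match your cluster):

// build.sbt (sketch): use the 1.4.0-M1 pre-release of the connector.
// Spark artifacts are "provided" because spark-submit supplies them at runtime.
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "1.4.1" % "provided",
  "org.apache.spark" %% "spark-sql" % "1.4.1" % "provided",
  "com.datastax.spark" %% "spark-cassandra-connector" % "1.4.0-M1"
)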

Ben


RE: Failed to load class for data source: org.apache.spark.sql.cassandra

Posted by Benjamin Ross <br...@Lattice-Engines.com>.
I'm submitting the application this way:
spark-submit test-2.0.5-SNAPSHOT-jar-with-dependencies.jar

I've confirmed that the org.apache.spark.sql.cassandra and org.apache.cassandra classes are in the jar.
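
An alternative sketch that avoids shading the connector into the application jar entirely: let spark-submit resolve it from Maven with --packages (the coordinates below assume Scala 2.10 and the 1.4.0-M1 pre-release):

spark-submit --packages com.datastax.spark:spark-cassandra-connector_2.10:1.4.0-M1 test-2.0.5-SNAPSHOT-jar-with-dependencies.jar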

Apologies for the relatively newbie question; I'm still new to both Spark and Scala.
Thanks,
Ben

