Posted to issues@spark.apache.org by "Sean Owen (JIRA)" <ji...@apache.org> on 2017/07/19 07:22:01 UTC

[jira] [Resolved] (SPARK-21466) com.cloudant.spark throws an error in python notebook

     [ https://issues.apache.org/jira/browse/SPARK-21466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Owen resolved SPARK-21466.
-------------------------------
    Resolution: Invalid

That doesn't sound like a Spark problem: the class that can't be found comes from a third-party package, so it is most likely missing from the classpath rather than anything in Spark itself.
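For reference, this kind of ClassNotFoundException usually means the connector jar was never put on the application classpath. Below is a minimal sketch of one way to attach it when you start the session yourself; the package coordinates (cloudant-labs:spark-cloudant:2.0.0-s_2.11, from spark-packages.org) are an assumption and must match your Spark and Scala versions, and a hosted notebook service may require installing the package through the service instead:

    # Sketch only: attach a third-party data source package at startup.
    # spark.jars.packages is honored only if no SparkContext exists yet;
    # the equivalent command-line form is
    #   pyspark --packages cloudant-labs:spark-cloudant:2.0.0-s_2.11
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .appName("cloudant-connect")
        .config("spark.jars.packages",
                "cloudant-labs:spark-cloudant:2.0.0-s_2.11")
        .getOrCreate()
    )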

> com.cloudant.spark throws an error in python notebook
> -----------------------------------------------------
>
>                 Key: SPARK-21466
>                 URL: https://issues.apache.org/jira/browse/SPARK-21466
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>    Affects Versions: 2.0.0, 2.1.1
>         Environment: Data Science Experience- Python Notebook version-2
>            Reporter: Smruthi Rajmohan
>
> When I try to establish a connection with Cloudant, I get this error:
> Py4JJavaErrorTraceback (most recent call last)
> <ipython-input-1-503b9b8606de> in <module>()
>       1 import pyspark
>       2 sqlContext = SQLContext(sc)
> ----> 3 cloudantdata = sqlContext.read.format("com.cloudant.spark").option("cloudant.host","3d1a8ae1-9d67-4859-a3c4-8fed8d7548db-bluemix.cloudant.com").option("cloudant.username", "3d1a8ae1-9d67-4859-a3c4-8fed8d7548db-bluemix").option("cloudant.password","ccf3f8349ffb5964513973f90223f6178b41c2a50345b1120f0322466eb14ba9").load("iotp_4wkoz2_default_2017-07-17")
> /usr/local/src/spark20master/spark/python/pyspark/sql/readwriter.py in load(self, path, format, schema, **options)
>     145         self.options(**options)
>     146         if isinstance(path, basestring):
> --> 147             return self._df(self._jreader.load(path))
>     148         elif path is not None:
>     149             if type(path) != list:
> /usr/local/src/spark20master/spark/python/lib/py4j-0.10.3-src.zip/py4j/java_gateway.py in __call__(self, *args)
>    1131         answer = self.gateway_client.send_command(command)
>    1132         return_value = get_return_value(
> -> 1133             answer, self.gateway_client, self.target_id, self.name)
>    1134 
>    1135         for temp_arg in temp_args:
> /usr/local/src/spark20master/spark/python/pyspark/sql/utils.py in deco(*a, **kw)
>      61     def deco(*a, **kw):
>      62         try:
> ---> 63             return f(*a, **kw)
>      64         except py4j.protocol.Py4JJavaError as e:
>      65             s = e.java_exception.toString()
> /usr/local/src/spark20master/spark/python/lib/py4j-0.10.3-src.zip/py4j/protocol.py in get_return_value(answer, gateway_client, target_id, name)
>     317                 raise Py4JJavaError(
>     318                     "An error occurred while calling {0}{1}{2}.\n".
> --> 319                     format(target_id, ".", name), value)
>     320             else:
>     321                 raise Py4JError(
> Py4JJavaError: An error occurred while calling o86.load.
> : java.lang.ClassNotFoundException: Failed to find data source: com.cloudant.spark. Please find packages at https://cwiki.apache.org/confluence/display/SPARK/Third+Party+Projects
> 	at org.apache.spark.sql.execution.datasources.DataSource.lookupDataSource(DataSource.scala:148)
> 	at org.apache.spark.sql.execution.datasources.DataSource.providingClass$lzycompute(DataSource.scala:79)
> 	at org.apache.spark.sql.execution.datasources.DataSource.providingClass(DataSource.scala:79)
> 	at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:340)
> 	at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:149)
> 	at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:132)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:95)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:55)
> 	at java.lang.reflect.Method.invoke(Method.java:507)
> 	at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:237)
> 	at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
> 	at py4j.Gateway.invoke(Gateway.java:280)
> 	at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
> 	at py4j.commands.CallCommand.execute(CallCommand.java:79)
> 	at py4j.GatewayConnection.run(GatewayConnection.java:214)
> 	at java.lang.Thread.run(Thread.java:785)
> Caused by: java.lang.ClassNotFoundException: com.cloudant.spark.DefaultSource
> 	at java.net.URLClassLoader.findClass(URLClassLoader.java:607)
> 	at java.lang.ClassLoader.loadClassHelper(ClassLoader.java:844)
> 	at java.lang.ClassLoader.loadClass(ClassLoader.java:823)
> 	at java.lang.ClassLoader.loadClass(ClassLoader.java:803)
> 	at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$5$$anonfun$apply$1.apply(DataSource.scala:132)
> 	at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$5$$anonfun$apply$1.apply(DataSource.scala:132)
> 	at scala.util.Try$.apply(Try.scala:192)
> 	at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$5.apply(DataSource.scala:132)
> 	at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$5.apply(DataSource.scala:132)
> 	at scala.util.Try.orElse(Try.scala:84)
> 	at org.apache.spark.sql.execution.datasources.DataSource.lookupDataSource(DataSource.scala:132)
> 	... 16 more
> This is the code that triggers the error:
> from pyspark.sql import SQLContext
>
> sqlContext = SQLContext(sc)
> cloudantdata = sqlContext.read.format("com.cloudant.spark").\
>     option("cloudant.host", "some_hostname").\
>     option("cloudant.username", "some_username").\
>     option("cloudant.password", "some_password").\
>     load("somedb")
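Once the package is on the classpath (see the sketch after the resolution above), the quoted snippet should resolve com.cloudant.spark normally. As a small illustration, a hypothetical helper (not part of the issue) can turn the Py4JJavaError seen in the traceback into a clearer failure:

    # Hypothetical helper: load a Cloudant database and fail fast with a
    # readable message when the data source class is missing.
    from py4j.protocol import Py4JJavaError

    def load_cloudant(sqlContext, db_name, **opts):
        reader = sqlContext.read.format("com.cloudant.spark")
        for key, value in opts.items():
            reader = reader.option(key, value)
        try:
            return reader.load(db_name)
        except Py4JJavaError as e:
            if "ClassNotFoundException" in str(e):
                raise RuntimeError(
                    "com.cloudant.spark is not on the classpath; start the "
                    "session with --packages or spark.jars.packages")
            raise

    # Example use, mirroring the reporter's call:
    #   load_cloudant(sqlContext, "somedb",
    #                 **{"cloudant.host": "some_hostname",
    #                    "cloudant.username": "some_username",
    #                    "cloudant.password": "some_password"})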



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org