Posted to issues@spark.apache.org by "Shixiong Zhu (JIRA)" <ji...@apache.org> on 2016/06/16 20:19:05 UTC
[jira] [Resolved] (SPARK-15981) Fix bug in python DataStreamReader
[ https://issues.apache.org/jira/browse/SPARK-15981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Shixiong Zhu resolved SPARK-15981.
----------------------------------
Resolution: Fixed
Fix Version/s: 2.0.0
> Fix bug in python DataStreamReader
> ----------------------------------
>
> Key: SPARK-15981
> URL: https://issues.apache.org/jira/browse/SPARK-15981
> Project: Spark
> Issue Type: Sub-task
> Components: SQL, Streaming
> Reporter: Tathagata Das
> Assignee: Tathagata Das
> Priority: Blocker
> Fix For: 2.0.0
>
>
> A bug in the Python DataStreamReader API made it unusable. A single path was converted to an array before being passed to the Java DataStreamReader method, which accepts only a string, so the call failed with the following error.
> {code}
> File "/Users/tdas/Projects/Spark/spark/python/pyspark/sql/readwriter.py", line 947, in pyspark.sql.readwriter.DataStreamReader.json
> Failed example:
>     json_sdf = spark.readStream.json(os.path.join(tempfile.mkdtemp(), 'data'), schema = sdf_schema)
> Exception raised:
>     Traceback (most recent call last):
>       File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/doctest.py", line 1253, in __run
>         compileflags, 1) in test.globs
>       File "<doctest pyspark.sql.readwriter.DataStreamReader.json[0]>", line 1, in <module>
>         json_sdf = spark.readStream.json(os.path.join(tempfile.mkdtemp(), 'data'), schema = sdf_schema)
>       File "/Users/tdas/Projects/Spark/spark/python/pyspark/sql/readwriter.py", line 963, in json
>         return self._df(self._jreader.json(path))
>       File "/Users/tdas/Projects/Spark/spark/python/lib/py4j-0.10.1-src.zip/py4j/java_gateway.py", line 933, in __call__
>         answer, self.gateway_client, self.target_id, self.name)
>       File "/Users/tdas/Projects/Spark/spark/python/pyspark/sql/utils.py", line 63, in deco
>         return f(*a, **kw)
>       File "/Users/tdas/Projects/Spark/spark/python/lib/py4j-0.10.1-src.zip/py4j/protocol.py", line 316, in get_return_value
>         format(target_id, ".", name, value))
>     Py4JError: An error occurred while calling o121.json. Trace:
>     py4j.Py4JException: Method json([class java.util.ArrayList]) does not exist
>         at py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:318)
>         at py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:326)
>         at py4j.Gateway.invoke(Gateway.java:272)
>         at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:128)
>         at py4j.commands.CallCommand.execute(CallCommand.java:79)
>         at py4j.GatewayConnection.run(GatewayConnection.java:211)
>         at java.lang.Thread.run(Thread.java:744)
> {code}
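
For illustration only, the failure mode described above can be reproduced without a Spark installation. The sketch below is not the actual patch: FakeJavaStreamReader, json_broken and json_fixed are hypothetical names, and the fake reader only imitates a JVM method that accepts a single string. It assumes the fix amounts to passing a single string path through unchanged.

```python
class FakeJavaStreamReader:
    """Hypothetical stand-in for the JVM-side reader: json() accepts one string path only."""

    def json(self, path):
        if not isinstance(path, str):
            # Imitates the "method does not exist" failure for a non-string argument.
            raise TypeError("Method json([%s]) does not exist" % type(path).__name__)
        return "stream:" + path


def json_broken(jreader, path):
    # Mirrors the reported bug: a single path is wrapped in a list before the call.
    if isinstance(path, str):
        path = [path]
    return jreader.json(path)


def json_fixed(jreader, path):
    # Assumed shape of a fix: forward a single string unchanged, reject anything else.
    if isinstance(path, str):
        return jreader.json(path)
    raise TypeError("path can be only a single string")


if __name__ == "__main__":
    reader = FakeJavaStreamReader()
    print(json_fixed(reader, "/tmp/data"))
    try:
        json_broken(reader, "/tmp/data")
    except TypeError as err:
        print("broken wrapper failed:", err)
```

The broken wrapper fails because the list reaches a method that has no overload for it, which is the same shape as the ArrayList error in the trace above.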
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)