Posted to issues@spark.apache.org by "ABHISHEK CHOUDHARY (JIRA)" <ji...@apache.org> on 2015/06/11 22:14:01 UTC

[jira] [Closed] (SPARK-8296) Not able to load Dataframe using Python throws py4j.protocol.Py4JJavaError

     [ https://issues.apache.org/jira/browse/SPARK-8296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ABHISHEK CHOUDHARY closed SPARK-8296.
-------------------------------------
       Resolution: Done
    Fix Version/s: 1.3.1

While debugging, I found that Spark was receiving the wrong Hadoop URL, a minor mistake in the configuration, but the error stack trace didn't reveal that.

So it's not a bug; it's a configuration issue.
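
For anyone hitting the same symptom: the comment above does not say which setting held the wrong URL. One plausible case, stated here as an assumption, is that a path without a scheme such as "changes.json" gets resolved against Hadoop's default filesystem (fs.defaultFS) rather than the local disk. A minimal sketch of sidestepping that by passing an explicit file:// URI follows; `to_local_uri` is a hypothetical helper, not part of PySpark, and it assumes a POSIX-style path as on Mac OS.

```python
import os


def to_local_uri(path):
    """Return an absolute file:// URI for a local POSIX path.

    Hypothetical helper: an explicit scheme means the location no longer
    depends on whatever default filesystem Hadoop is configured with.
    """
    # os.path.abspath yields e.g. /Users/me/changes.json, so the result
    # has the form file:///Users/me/changes.json
    return "file://" + os.path.abspath(path)


# Usage with the snippet from the report (needs a running SparkContext):
# df = sqlContext.jsonFile(to_local_uri("changes.json"))
```

If the data really lives on HDFS, the equivalent step would be to correct the Hadoop URL in the configuration instead.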

> Not able to load Dataframe using Python throws py4j.protocol.Py4JJavaError
> --------------------------------------------------------------------------
>
>                 Key: SPARK-8296
>                 URL: https://issues.apache.org/jira/browse/SPARK-8296
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark, SQL
>    Affects Versions: 1.3.1
>         Environment: MAC OS
>            Reporter: ABHISHEK CHOUDHARY
>              Labels: test
>             Fix For: 1.3.1
>
>
> Loading a JSON file through SQLContext in the prebuilt spark-1.3.1-bin-hadoop2.4 distribution throws py4j.protocol.Py4JJavaError:
> from pyspark.sql import SQLContext
> from pyspark import SparkContext
> sc = SparkContext()
> sqlContext = SQLContext(sc)
> # Create the DataFrame
> df = sqlContext.jsonFile("changes.json")
> # Show the content of the DataFrame
> df.show()
> Error thrown:
>   File "/Users/abhishekchoudhary/Work/python/evolveML/kaggle/avirto/test.py", line 11, in <module>
>     df = sqlContext.jsonFile("changes.json")
>   File "/Users/abhishekchoudhary/bigdata/cdh5.2.0/spark-1.3.1/python/pyspark/sql/context.py", line 377, in jsonFile
>     df = self._ssql_ctx.jsonFile(path, samplingRatio)
>   File "/Users/abhishekchoudhary/bigdata/cdh5.2.0/spark-1.3.1/python/lib/py4j-0.8.2.1-src.zip/py4j/java_gateway.py", line 538, in __call__
>   File "/Users/abhishekchoudhary/bigdata/cdh5.2.0/spark-1.3.1/python/lib/py4j-0.8.2.1-src.zip/py4j/protocol.py", line 300, in get_return_value
> py4j.protocol.Py4JJavaError
> On checking the source code, I found that 'gateway_client' is not valid.
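
Since the traceback quoted above ends at a bare py4j.protocol.Py4JJavaError with no message, it can help to pull out the underlying Java exception explicitly. A rough sketch, assuming the raised error carries py4j's `java_exception` attribute; `describe_gateway_error` is a hypothetical helper name, and if the gateway connection itself is broken the lookup may fail, so it falls back to the Python-side representation.

```python
def describe_gateway_error(exc):
    """Return the most specific message available for a failed py4j call."""
    java_exc = getattr(exc, "java_exception", None)
    if java_exc is None:
        # Not a Py4JJavaError (or no Java exception attached)
        return str(exc)
    try:
        # java_exception is a proxy for the JVM Throwable; toString()
        # gives its class name and message
        return str(java_exc.toString())
    except Exception:
        # The gateway may be unusable; report what Python knows
        return repr(exc)


# Usage with the snippet from the report (needs a running SparkContext):
# try:
#     df = sqlContext.jsonFile("changes.json")
# except Exception as exc:
#     print(describe_gateway_error(exc))
```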


