Posted to issues@spark.apache.org by "Reynold Xin (JIRA)" <ji...@apache.org> on 2015/07/18 09:39:05 UTC

[jira] [Closed] (SPARK-7981) Improve DataFrame Python exception

     [ https://issues.apache.org/jira/browse/SPARK-7981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Reynold Xin closed SPARK-7981.
------------------------------
       Resolution: Duplicate
         Assignee: Davies Liu
    Fix Version/s: 1.5.0

> Improve DataFrame Python exception
> ----------------------------------
>
>                 Key: SPARK-7981
>                 URL: https://issues.apache.org/jira/browse/SPARK-7981
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>            Reporter: Reynold Xin
>            Assignee: Davies Liu
>             Fix For: 1.5.0
>
>
> It would be great if most exceptions were rethrown as Python exceptions, rather than as some crazy Py4J exception with a long stack trace that is not Python-friendly.
> As an example:
> {code}
> In [61]: df.stat.cov('id', 'uniform')
> ---------------------------------------------------------------------------
> Py4JJavaError                             Traceback (most recent call last)
> <ipython-input-61-30146c89cbd6> in <module>()
> ----> 1 df.stat.cov('id', 'uniform')
> /scratch/rxin/spark/python/pyspark/sql/dataframe.pyc in cov(self, col1, col2)
>    1289 
>    1290     def cov(self, col1, col2):
> -> 1291         return self.df.cov(col1, col2)
>    1292 
>    1293     cov.__doc__ = DataFrame.cov.__doc__
> /scratch/rxin/spark/python/pyspark/sql/dataframe.pyc in cov(self, col1, col2)
>    1139         if not isinstance(col2, str):
>    1140             raise ValueError("col2 should be a string.")
> -> 1141         return self._jdf.stat().cov(col1, col2)
>    1142 
>    1143     @since(1.4)
> /Users/rxin/anaconda/lib/python2.7/site-packages/py4j-0.8.1-py2.7.egg/py4j/java_gateway.pyc in __call__(self, *args)
>     535         answer = self.gateway_client.send_command(command)
>     536         return_value = get_return_value(answer, self.gateway_client,
> --> 537                 self.target_id, self.name)
>     538 
>     539         for temp_arg in temp_args:
> /Users/rxin/anaconda/lib/python2.7/site-packages/py4j-0.8.1-py2.7.egg/py4j/protocol.pyc in get_return_value(answer, gateway_client, target_id, name)
>     298                 raise Py4JJavaError(
>     299                     'An error occurred while calling {0}{1}{2}.\n'.
> --> 300                     format(target_id, '.', name), value)
>     301             else:
>     302                 raise Py4JError(
> Py4JJavaError: An error occurred while calling o87.cov.
> : java.lang.IllegalArgumentException: requirement failed: Couldn't find column with name id
> 	at scala.Predef$.require(Predef.scala:233)
> 	at org.apache.spark.sql.execution.stat.StatFunctions$$anonfun$collectStatisticalData$3.apply(StatFunctions.scala:79)
> 	at org.apache.spark.sql.execution.stat.StatFunctions$$anonfun$collectStatisticalData$3.apply(StatFunctions.scala:78)
> 	at scala.collection.immutable.List.foreach(List.scala:318)
> 	at org.apache.spark.sql.execution.stat.StatFunctions$.collectStatisticalData(StatFunctions.scala:78)
> 	at org.apache.spark.sql.execution.stat.StatFunctions$.calculateCov(StatFunctions.scala:100)
> 	at org.apache.spark.sql.DataFrameStatFunctions.cov(DataFrameStatFunctions.scala:41)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:606)
> 	at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:231)
> 	at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:379)
> 	at py4j.Gateway.invoke(Gateway.java:259)
> 	at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:133)
> 	at py4j.commands.CallCommand.execute(CallCommand.java:79)
> 	at py4j.GatewayConnection.run(GatewayConnection.java:207)
> 	at java.lang.Thread.run(Thread.java:744)
> {code}
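
A minimal sketch of the kind of rethrowing the description asks for, not the change that actually resolved this issue. Every name here is a hypothetical one made up for illustration: `FakePy4JJavaError` stands in for `py4j.protocol.Py4JJavaError` so the example runs without Spark or Py4J, and `rethrow_as_python`, `JAVA_TO_PYTHON`, and the Python-side `IllegalArgumentException` class are assumptions, not PySpark APIs. It is written for Python 3, because `raise ... from None` is not available on the Python 2.7 shown in the traceback.

```python
import functools


class FakePy4JJavaError(Exception):
    """Hypothetical stand-in for py4j.protocol.Py4JJavaError (no Py4J needed)."""

    def __init__(self, java_class, java_message):
        super().__init__("%s: %s" % (java_class, java_message))
        self.java_class = java_class
        self.java_message = java_message


class IllegalArgumentException(Exception):
    """Hypothetical Python counterpart of java.lang.IllegalArgumentException."""


# Assumed mapping from JVM exception class names to Python exception types.
JAVA_TO_PYTHON = {
    "java.lang.IllegalArgumentException": IllegalArgumentException,
}


def rethrow_as_python(func):
    """Convert known JVM-side errors raised by func into short Python exceptions."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except FakePy4JJavaError as err:
            py_type = JAVA_TO_PYTHON.get(err.java_class)
            if py_type is None:
                # Unknown JVM exception: leave it untouched for debugging.
                raise
            # "from None" hides the long gateway/JVM traceback from the user.
            raise py_type(err.java_message) from None

    return wrapper


@rethrow_as_python
def cov(col1, col2):
    # Simulates the JVM call failing the way the traceback above does.
    raise FakePy4JJavaError(
        "java.lang.IllegalArgumentException",
        "requirement failed: Couldn't find column with name " + col1,
    )
```

With a wrapper along these lines, calling `cov('id', 'uniform')` would surface a single `IllegalArgumentException` carrying only the "requirement failed" message, while JVM exceptions missing from the mapping still propagate unchanged.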


