Posted to issues@spark.apache.org by "Reynold Xin (JIRA)" <ji...@apache.org> on 2015/07/18 09:39:05 UTC
[jira] [Closed] (SPARK-7981) Improve DataFrame Python exception
[ https://issues.apache.org/jira/browse/SPARK-7981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Reynold Xin closed SPARK-7981.
------------------------------
Resolution: Duplicate
Assignee: Davies Liu
Fix Version/s: 1.5.0
> Improve DataFrame Python exception
> ----------------------------------
>
> Key: SPARK-7981
> URL: https://issues.apache.org/jira/browse/SPARK-7981
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Reporter: Reynold Xin
> Assignee: Davies Liu
> Fix For: 1.5.0
>
>
> It would be great if most exceptions thrown on the JVM side were rethrown as Python exceptions, rather than surfacing as a Py4J exception with a long stack trace that is not Python friendly.
> As an example:
> {code}
> In [61]: df.stat.cov('id', 'uniform')
> ---------------------------------------------------------------------------
> Py4JJavaError Traceback (most recent call last)
> <ipython-input-61-30146c89cbd6> in <module>()
> ----> 1 df.stat.cov('id', 'uniform')
> /scratch/rxin/spark/python/pyspark/sql/dataframe.pyc in cov(self, col1, col2)
> 1289
> 1290 def cov(self, col1, col2):
> -> 1291 return self.df.cov(col1, col2)
> 1292
> 1293 cov.__doc__ = DataFrame.cov.__doc__
> /scratch/rxin/spark/python/pyspark/sql/dataframe.pyc in cov(self, col1, col2)
> 1139 if not isinstance(col2, str):
> 1140 raise ValueError("col2 should be a string.")
> -> 1141 return self._jdf.stat().cov(col1, col2)
> 1142
> 1143 @since(1.4)
> /Users/rxin/anaconda/lib/python2.7/site-packages/py4j-0.8.1-py2.7.egg/py4j/java_gateway.pyc in __call__(self, *args)
> 535 answer = self.gateway_client.send_command(command)
> 536 return_value = get_return_value(answer, self.gateway_client,
> --> 537 self.target_id, self.name)
> 538
> 539 for temp_arg in temp_args:
> /Users/rxin/anaconda/lib/python2.7/site-packages/py4j-0.8.1-py2.7.egg/py4j/protocol.pyc in get_return_value(answer, gateway_client, target_id, name)
> 298 raise Py4JJavaError(
> 299 'An error occurred while calling {0}{1}{2}.\n'.
> --> 300 format(target_id, '.', name), value)
> 301 else:
> 302 raise Py4JError(
> Py4JJavaError: An error occurred while calling o87.cov.
> : java.lang.IllegalArgumentException: requirement failed: Couldn't find column with name id
> at scala.Predef$.require(Predef.scala:233)
> at org.apache.spark.sql.execution.stat.StatFunctions$$anonfun$collectStatisticalData$3.apply(StatFunctions.scala:79)
> at org.apache.spark.sql.execution.stat.StatFunctions$$anonfun$collectStatisticalData$3.apply(StatFunctions.scala:78)
> at scala.collection.immutable.List.foreach(List.scala:318)
> at org.apache.spark.sql.execution.stat.StatFunctions$.collectStatisticalData(StatFunctions.scala:78)
> at org.apache.spark.sql.execution.stat.StatFunctions$.calculateCov(StatFunctions.scala:100)
> at org.apache.spark.sql.DataFrameStatFunctions.cov(DataFrameStatFunctions.scala:41)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:231)
> at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:379)
> at py4j.Gateway.invoke(Gateway.java:259)
> at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:133)
> at py4j.commands.CallCommand.execute(CallCommand.java:79)
> at py4j.GatewayConnection.run(GatewayConnection.java:207)
> at java.lang.Thread.run(Thread.java:744)
> {code}
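
One way the request above could be approached is a decorator that catches the gateway error, reads the Java exception class and message from the first line of the JVM trace, and raises a short Python exception instead. The following is only a minimal sketch under stated assumptions, not the change that was made in Spark: `JavaCallError` is a hypothetical stand-in for py4j's `Py4JJavaError` (so the example runs without py4j installed), and `IllegalArgumentError`, `rethrow_as_python`, and the mapping table are illustrative names. It assumes Python 3, since it relies on `raise ... from None`.

```python
import functools
import re


class JavaCallError(Exception):
    """Hypothetical stand-in for py4j's Py4JJavaError; holds the JVM trace as text."""

    def __init__(self, java_trace):
        super().__init__("An error occurred while calling into the JVM.")
        self.java_trace = java_trace


class IllegalArgumentError(ValueError):
    """Hypothetical Python-side counterpart of java.lang.IllegalArgumentException."""


# Illustrative mapping from Java exception class names to Python exception types.
_JAVA_TO_PYTHON = {
    "java.lang.IllegalArgumentException": IllegalArgumentError,
}

# First line of a JVM trace looks like "<fully.qualified.Class>: <message>".
_FIRST_LINE = re.compile(r"^\s*(?P<cls>[\w.$]+):\s*(?P<msg>.*)$")


def rethrow_as_python(func):
    """Convert known JVM-side errors raised by func into Python exceptions."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except JavaCallError as err:
            first_line = err.java_trace.splitlines()[0] if err.java_trace else ""
            match = _FIRST_LINE.match(first_line)
            if match:
                py_type = _JAVA_TO_PYTHON.get(match.group("cls"))
                if py_type is not None:
                    # "from None" suppresses the chained gateway traceback.
                    raise py_type(match.group("msg")) from None
            # Unknown Java exception types are passed through unchanged.
            raise

    return wrapper


@rethrow_as_python
def cov(col1, col2):
    # Simulates the failing JVM call from the report above.
    raise JavaCallError(
        "java.lang.IllegalArgumentException: requirement failed: "
        "Couldn't find column with name id\n"
        "\tat scala.Predef$.require(Predef.scala:233)"
    )
```

With this sketch, calling `cov('id', 'uniform')` raises `IllegalArgumentError` carrying only the Java message text, and because the class subclasses `ValueError`, existing `except ValueError` handlers would still catch it. Java exception types missing from the mapping are re-raised as-is, so no information is lost for cases the wrapper does not recognize.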
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)