Posted to issues@spark.apache.org by "Jeff Zhang (JIRA)" <ji...@apache.org> on 2015/12/01 08:00:17 UTC

[jira] [Commented] (SPARK-12070) PySpark implementation of Slicing operator incorrect

    [ https://issues.apache.org/jira/browse/SPARK-12070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15033218#comment-15033218 ] 

Jeff Zhang commented on SPARK-12070:
------------------------------------

The root cause is that with an open-ended slice such as str[1:], the length is set to Python's maximum int. That value is larger than the range of a Java int, so it is passed to Java as a long. As the traceback shows, there is no substr(Integer, Long) method, so the call fails.
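
To make the size mismatch concrete, here is a minimal sketch in plain Python (no Spark needed). It is not the actual Spark fix; the helper name slice_to_substr_args and the choice to clamp at Java's Integer.MAX_VALUE are assumptions for illustration only.

```python
import sys

# java.lang.Integer.MAX_VALUE; anything larger has to travel as a Java long.
JAVA_INT_MAX = 2**31 - 1


def slice_to_substr_args(s):
    """Hypothetical helper: map a Python slice to (startPos, length)
    values that both fit in a Java int.

    Negative indices are not handled in this sketch.
    """
    if s.step is not None:
        raise ValueError("slice with step is not supported")
    start = 0 if s.start is None else s.start
    # An open-ended slice has stop=None (or a huge sentinel value);
    # clamp it so the resulting length stays inside the Java int range.
    stop = JAVA_INT_MAX if s.stop is None else min(s.stop, JAVA_INT_MAX)
    return start, max(stop - start, 0)


if __name__ == "__main__":
    # On a 64-bit build this prints True: the Python maximum does not fit in a Java int.
    print(sys.maxsize > JAVA_INT_MAX)
    print(slice_to_substr_args(slice(2, None)))
```

With a clamp like this, both arguments would stay in the int range, so a substr(int, int) overload could be matched. Whether clamping or a different approach is the right fix for Spark is left open here.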



> PySpark implementation of Slicing operator incorrect
> ----------------------------------------------------
>
>                 Key: SPARK-12070
>                 URL: https://issues.apache.org/jira/browse/SPARK-12070
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>    Affects Versions: 1.5.2
>            Reporter: Jeff Zhang
>
> {code}
> aa=('Ofer', 1), ('Wei', 2)
> a = sqlContext.createDataFrame(aa)
> a.select(a._1[2:]).show()
> {code}
> {code}
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "/Users/jzhang/github/spark/python/pyspark/sql/column.py", line 286, in substr
>     jc = self._jc.substr(startPos, length)
>   File "/Users/jzhang/github/spark/python/lib/py4j-0.9-src.zip/py4j/java_gateway.py", line 813, in __call__
>   File "/Users/jzhang/github/spark/python/pyspark/sql/utils.py", line 45, in deco
>     return f(*a, **kw)
>   File "/Users/jzhang/github/spark/python/lib/py4j-0.9-src.zip/py4j/protocol.py", line 312, in get_return_value
> py4j.protocol.Py4JError: An error occurred while calling o37.substr. Trace:
> py4j.Py4JException: Method substr([class java.lang.Integer, class java.lang.Long]) does not exist
> 	at py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:335)
> 	at py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:344)
> 	at py4j.Gateway.invoke(Gateway.java:252)
> 	at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:133)
> 	at py4j.commands.CallCommand.execute(CallCommand.java:79)
> 	at py4j.GatewayConnection.run(GatewayConnection.java:209)
> 	at java.lang.Thread.run(Thread.java:745)
> {code}


