Posted to issues@spark.apache.org by "Jeff Zhang (JIRA)" <ji...@apache.org> on 2015/12/01 08:00:17 UTC

[jira] [Comment Edited] (SPARK-12070) PySpark implementation of Slicing operator incorrect

    [ https://issues.apache.org/jira/browse/SPARK-12070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15033218#comment-15033218 ] 

Jeff Zhang edited comment on SPARK-12070 at 12/1/15 6:59 AM:
-------------------------------------------------------------

The root cause is that an open-ended slice such as str[1:] sets the length to Python's maximum int, which is outside the range of a Java int and is therefore passed to Java as a long. No substr method accepts an (Integer, Long) argument pair, hence the error.
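
To illustrate the mismatch, here is a minimal sketch in plain Python. It is not the actual patch and it does not call Py4J or Spark. The names fits_java_int, java_type_for and clamp_to_java_int are hypothetical, and OPEN_ENDED_STOP assumes a 64-bit Python 2 build where the implicit slice end is 2**63 - 1.

```python
# Bounds of a 32-bit signed Java int (Integer.MIN_VALUE / Integer.MAX_VALUE).
JAVA_INT_MIN = -2**31
JAVA_INT_MAX = 2**31 - 1

# Assumption: the implicit end passed for an open-ended slice such as s[1:]
# on a 64-bit Python 2 build.
OPEN_ENDED_STOP = 2**63 - 1


def fits_java_int(value):
    """Return True if value can be represented as a Java int."""
    return JAVA_INT_MIN <= value <= JAVA_INT_MAX


def java_type_for(value):
    """Illustrative only: name the Java boxed type an integer of this size needs."""
    return "java.lang.Integer" if fits_java_int(value) else "java.lang.Long"


def clamp_to_java_int(value):
    """Hypothetical workaround: clamp a Python int into the Java int range."""
    return max(JAVA_INT_MIN, min(JAVA_INT_MAX, value))


# The start position fits in an int, but the implicit length does not,
# which matches the substr(Integer, Long) signature in the error message.
print(java_type_for(2), java_type_for(OPEN_ENDED_STOP))
print(clamp_to_java_int(OPEN_ENDED_STOP))
```

Clamping the length is only one possible way to make both arguments fit the same Java type. Handling the slice explicitly on the Python side would be another.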

Will create a PR.





> PySpark implementation of Slicing operator incorrect
> ----------------------------------------------------
>
>                 Key: SPARK-12070
>                 URL: https://issues.apache.org/jira/browse/SPARK-12070
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>    Affects Versions: 1.5.2
>            Reporter: Jeff Zhang
>
> {code}
> aa = ('Ofer', 1), ('Wei', 2)
> a = sqlContext.createDataFrame(aa)
> a.select(a._1[2:]).show()
> {code}
> {code}
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "/Users/jzhang/github/spark/python/pyspark/sql/column.py", line 286, in substr
>     jc = self._jc.substr(startPos, length)
>   File "/Users/jzhang/github/spark/python/lib/py4j-0.9-src.zip/py4j/java_gateway.py", line 813, in __call__
>   File "/Users/jzhang/github/spark/python/pyspark/sql/utils.py", line 45, in deco
>     return f(*a, **kw)
>   File "/Users/jzhang/github/spark/python/lib/py4j-0.9-src.zip/py4j/protocol.py", line 312, in get_return_value
> py4j.protocol.Py4JError: An error occurred while calling o37.substr. Trace:
> py4j.Py4JException: Method substr([class java.lang.Integer, class java.lang.Long]) does not exist
> 	at py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:335)
> 	at py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:344)
> 	at py4j.Gateway.invoke(Gateway.java:252)
> 	at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:133)
> 	at py4j.commands.CallCommand.execute(CallCommand.java:79)
> 	at py4j.GatewayConnection.run(GatewayConnection.java:209)
> 	at java.lang.Thread.run(Thread.java:745)
> {code}


