Posted to issues@spark.apache.org by "Reynold Xin (JIRA)" <ji...@apache.org> on 2016/10/13 20:18:20 UTC

[jira] [Closed] (SPARK-15369) Investigate selectively using Jython for parts of PySpark

     [ https://issues.apache.org/jira/browse/SPARK-15369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Reynold Xin closed SPARK-15369.
-------------------------------
    Resolution: Won't Fix

In the spirit of having more explicit accepts/rejects, and given the discussions so far on both this ticket and the GitHub pull request, I'm going to close this as Won't Fix for now. We can still continue to discuss the merits here, but the rejection is based on the following:

1. Maintenance cost of supporting another runtime.

2. Jython is years behind CPython (or even PyPy) in terms of features.

3. Jython cannot leverage any of the numeric tools available.

(In hindsight maybe PyPy support was also added prematurely.)



> Investigate selectively using Jython for parts of PySpark
> ---------------------------------------------------------
>
>                 Key: SPARK-15369
>                 URL: https://issues.apache.org/jira/browse/SPARK-15369
>             Project: Spark
>          Issue Type: Improvement
>          Components: PySpark
>            Reporter: holdenk
>            Priority: Minor
>
> Transferring data from the JVM to the Python executor can be a substantial bottleneck. While Jython is not suitable for all UDFs or map functions, it may be suitable for some simple ones. We should investigate the option of using Jython to accelerate these small functions.


