Posted to dev@spark.apache.org by Cyanny LIANG <lg...@gmail.com> on 2017/05/31 10:36:27 UTC

When will spark 2.0 support dataset python API?

Hi,
The Dataset API has become a common way to process structured data in
Spark 2.0, and the Scala and Java APIs already support it. When will the
Python Dataset API be released? Are there any plans for it?
Many users in our production environment prefer the Python API because of
its many machine learning tools, so a Python Dataset API would be very
helpful for us. We are really looking forward to it.
I found these JIRA issues on the topic:
https://issues.apache.org/jira/browse/SPARK-12776
https://issues.apache.org/jira/browse/SPARK-9999

And this blog post:
https://databricks.com/blog/2016/01/04/introducing-apache-spark-datasets.html
said that the Python API would be supported in Spark 2.0.

-- 
Best & Regards
Cyanny LIANG
email: lgrcyanny@gmail.com

Re: When will spark 2.0 support dataset python API?

Posted by Wenchen Fan <cl...@gmail.com>.
We tried, but we didn’t get much benefit from a Python Dataset API: Python is dynamically typed, so there is not much we can do to optimize running Python functions.
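
As a rough illustration of the dynamic-typing point, here is a minimal plain-Python sketch. It is not Spark code, and `Person` and `age_next_year` are hypothetical names made up for this example. It shows that type annotations on a record are not enforced ahead of time, so a typed wrapper would probably not give Python the compile-time checking that a Dataset gives Scala or Java.

```python
from dataclasses import dataclass


@dataclass
class Person:
    # Hypothetical record type; the annotations are hints only and
    # Python does not enforce them when an instance is created.
    name: str
    age: int


def age_next_year(p: Person) -> int:
    # Nothing checks p.age before this line runs.
    return p.age + 1


good = Person("Ann", 30)
bad = Person("Bob", "thirty")  # accepted: no type check at construction

print(age_next_year(good))

try:
    age_next_year(bad)
except TypeError as exc:
    # The mismatch only surfaces when the function actually executes.
    print("fails only at runtime:", exc)
```

In Scala, the equivalent mistake on a typed Dataset would be rejected by the compiler before the job is submitted.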
