Posted to dev@flink.apache.org by Hequn Cheng <he...@apache.org> on 2020/02/05 09:24:16 UTC

[DISCUSS] Support Python ML Pipeline API

Hi everyone,

FLIP-39[1] rebuilds the Flink ML pipeline on top of the Table API and
introduces a new set of Java APIs. Since Python is widely used for ML,
providing Python ML Pipeline APIs for Flink would both make it easier for
Python users to write ML jobs and broaden the adoption of Flink ML.

Given this, Jincheng and I discussed offline how to support a Python ML
Pipeline API and drafted a design doc[2]. We'd like to achieve three goals:
- Add a Python pipeline API that mirrors the Java pipeline API (we will
adapt the Python pipeline API if the Java pipeline API changes).
- Support native Python Transformer/Estimator/Model, i.e., users can write
not only Python Transformer/Estimator/Model wrappers that call the Java
ones but also native Python Transformers/Estimators/Models.
- Ease of use: support keyword arguments when defining parameters.
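
To make the goals above a bit more concrete, here is a minimal sketch in
plain Python. It is only an illustration and is not taken from the design
doc: the stage classes Scale, MeanCenter and MeanCenterModel are
hypothetical, a "table" is stood in for by a list of dict rows so that the
sketch runs without Flink, and the method signatures are assumptions rather
than the proposed API.

```python
from abc import ABC, abstractmethod


# Hypothetical stand-ins, NOT the FLIP-39 / FLIP-96 API. A "table" here is
# just a list of dict rows so the sketch runs without Flink installed.
class Transformer(ABC):
    @abstractmethod
    def transform(self, table):
        """Return a new table derived from `table`."""


class Model(Transformer):
    """A Transformer produced by fitting an Estimator."""


class Estimator(ABC):
    @abstractmethod
    def fit(self, table):
        """Train on `table` and return a Model."""


class Scale(Transformer):
    """Native Python Transformer (hypothetical): multiplies a column."""

    # Parameters are keyword-only, in the spirit of the "ease of use" goal.
    def __init__(self, *, input_col, output_col, factor=1.0):
        self.input_col = input_col
        self.output_col = output_col
        self.factor = factor

    def transform(self, table):
        return [{**row, self.output_col: row[self.input_col] * self.factor}
                for row in table]


class MeanCenterModel(Model):
    """Model (hypothetical) that subtracts a mean learned during fit()."""

    def __init__(self, *, input_col, output_col, mean):
        self.input_col = input_col
        self.output_col = output_col
        self.mean = mean

    def transform(self, table):
        return [{**row, self.output_col: row[self.input_col] - self.mean}
                for row in table]


class MeanCenter(Estimator):
    """Native Python Estimator (hypothetical): learns the column mean."""

    def __init__(self, *, input_col, output_col):
        self.input_col = input_col
        self.output_col = output_col

    def fit(self, table):
        values = [row[self.input_col] for row in table]
        return MeanCenterModel(input_col=self.input_col,
                               output_col=self.output_col,
                               mean=sum(values) / len(values))


class PipelineModel(Model):
    """Fitted pipeline: applies every fitted stage in order."""

    def __init__(self, stages):
        self.stages = list(stages)

    def transform(self, table):
        for stage in self.stages:
            table = stage.transform(table)
        return table


class Pipeline(Estimator):
    """Chains stages; Estimators are fitted, Transformers are used as-is."""

    def __init__(self, stages):
        self.stages = list(stages)

    def fit(self, table):
        fitted = []
        for stage in self.stages:
            if isinstance(stage, Estimator):
                stage = stage.fit(table)  # replace the Estimator by its Model
            table = stage.transform(table)  # feed the output to the next stage
            fitted.append(stage)
        return PipelineModel(fitted)


rows = [{"x": 1.0}, {"x": 3.0}]
pipeline = Pipeline([
    Scale(input_col="x", output_col="x2", factor=2.0),
    MeanCenter(input_col="x2", output_col="centered"),
])
model = pipeline.fit(rows)
result = model.transform(rows)
```

In this sketch a wrapper around a Java stage would simply be another
Transformer or Estimator subclass whose methods delegate to the Java object,
which is why wrappers and native Python stages could share one pipeline.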

More details can be found in the design doc and we are looking forward to
your feedback.

Best,
Hequn

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-39+Flink+ML+pipeline+and+ML+libs
[2]
https://docs.google.com/document/d/1fwSO5sRNWMoYuvNgfQJUV6N2n2q5UEVA4sezCljKcVQ/edit?usp=sharing

Re: [DISCUSS] Support Python ML Pipeline API

Posted by Hequn Cheng <he...@apache.org>.
Hi Becket,

Thanks a lot for your advice. Definitely agree with you that we should
follow the FLIP process.
Will pay more attention to this next time.

Best, Hequn


On Fri, Feb 14, 2020 at 2:19 PM Becket Qin <be...@gmail.com> wrote:

> I just had an offline chat with Hequn and realized that FLIP-96 has already
> been opened for this discussion. I missed that because the FLIP was not
> mentioned in the thread.
>
> I am fine with proceeding to a vote.
>
> Thanks,
>
> Jiangjie (Becket) Qin

Re: [DISCUSS] Support Python ML Pipeline API

Posted by Becket Qin <be...@gmail.com>.
I just had an offline chat with Hequn and realized that FLIP-96 has already
been opened for this discussion. I missed that because the FLIP was not
mentioned in the thread.

I am fine with proceeding to a vote.

Thanks,

Jiangjie (Becket) Qin

On Fri, Feb 14, 2020 at 12:52 PM Becket Qin <be...@gmail.com> wrote:

> Hi Hequn,
>
> Given this is an addition to the public API, we probably should follow the
> FLIP process. It would be a quick one though, I think.
>
> Thanks,
>
> Jiangjie (Becket) Qin

Re: [DISCUSS] Support Python ML Pipeline API

Posted by Becket Qin <be...@gmail.com>.
Hi Hequn,

Given this is an addition to the public API, we probably should follow the
FLIP process. It would be a quick one though, I think.

Thanks,

Jiangjie (Becket) Qin

On Fri, Feb 14, 2020 at 10:03 AM Hequn Cheng <he...@apache.org> wrote:

> Hi all,
>
> Thanks a lot for your valuable feedback!
> As it seems we have reached a consensus on the discussion, I have
> started a VOTE thread[1]. Looking forward to your votes.
>
> Best,
> Hequn
>
> [1]
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/VOTE-Support-Python-ML-Pipeline-API-td37637.html

Re: [DISCUSS] Support Python ML Pipeline API

Posted by Hequn Cheng <he...@apache.org>.
Hi all,

Thanks a lot for your valuable feedback!
As it seems we have reached a consensus on the discussion, I have
started a VOTE thread[1]. Looking forward to your votes.

Best,
Hequn

[1]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/VOTE-Support-Python-ML-Pipeline-API-td37637.html

On Thu, Feb 13, 2020 at 10:40 AM Becket Qin <be...@gmail.com> wrote:

> +1. I'd say this is almost a must-have for machine learning.
>
> Thanks,
>
> Jiangjie (Becket) Qin

Re: [DISCUSS] Support Python ML Pipeline API

Posted by Becket Qin <be...@gmail.com>.
+1. I'd say this is almost a must-have for machine learning.

Thanks,

Jiangjie (Becket) Qin

On Thu, Feb 13, 2020 at 10:03 AM Rong Rong <wa...@gmail.com> wrote:

> Thanks for driving this initiative @Hequn Cheng <he...@apache.org>.
>
> Moving towards Python-based ML is definitely a huge win considering how large
> the Python ML community is. A big +1 on my side!
> Regarding the doc, I only left a few comments on the specific APIs. Overall
> the architecture looks very good!
>
> Looking forward to it!
> --
> Rong

Re: [DISCUSS] Support Python ML Pipeline API

Posted by Rong Rong <wa...@gmail.com>.
Thanks for driving this initiative @Hequn Cheng <he...@apache.org>.

Moving towards Python-based ML is definitely a huge win considering how large
the Python ML community is. A big +1 on my side!
Regarding the doc, I only left a few comments on the specific APIs. Overall
the architecture looks very good!

Looking forward to it!
--
Rong

On Sun, Feb 9, 2020 at 10:28 PM Hequn Cheng <he...@apache.org> wrote:

> Hi everyone,
>
> Thanks a lot for your feedback. I have created the FLIP[1].
>
> Best,
> Hequn
>
> [1]
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP+96%3A+Support+Python+ML+Pipeline+API

Re: [DISCUSS] Support Python ML Pipeline API

Posted by Hequn Cheng <he...@apache.org>.
Hi everyone,

Thanks a lot for your feedback. I have created the FLIP[1].

Best,
Hequn

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP+96%3A+Support+Python+ML+Pipeline+API

On Mon, Feb 10, 2020 at 12:29 PM Dian Fu <di...@gmail.com> wrote:

> Hi Hequn,
>
> Thanks for bringing up the discussion. +1 to this feature. The design LGTM.
> It's great that Python ML users can use both the Java Pipeline
> Transformer/Estimator/Model classes and the Python Pipeline
> Transformer/Estimator/Model classes in the same job.
>
> Regards,
> Dian

Re: [DISCUSS] Support Python ML Pipeline API

Posted by Dian Fu <di...@gmail.com>.
Hi Hequn,

Thanks for bringing up the discussion. +1 to this feature. The design LGTM.
It's great that Python ML users can use both the Java Pipeline
Transformer/Estimator/Model classes and the Python Pipeline
Transformer/Estimator/Model classes in the same job.

Regards,
Dian

On Mon, Feb 10, 2020 at 11:08 AM jincheng sun <su...@gmail.com>
wrote:

> Hi Hequn,
>
> Thanks for bringing up this discussion.
>
> +1 for adding the Python ML Pipeline API, even though the Java pipeline API
> may change.
>
> I would like to suggest creating a FLIP for these API changes. :)
>
> Best,
> Jincheng

Re: [DISCUSS] Support Python ML Pipeline API

Posted by jincheng sun <su...@gmail.com>.
Hi Hequn,

Thanks for bringing up this discussion.

+1 for adding the Python ML Pipeline API, even though the Java pipeline API
may change.

I would like to suggest creating a FLIP for these API changes. :)

Best,
Jincheng


On Wed, Feb 5, 2020 at 5:24 PM Hequn Cheng <he...@apache.org> wrote:

> Hi everyone,
>
> FLIP-39[1] rebuilds the Flink ML pipeline on top of the Table API and
> introduces a new set of Java APIs. Since Python is widely used for ML,
> providing Python ML Pipeline APIs for Flink would both make it easier for
> Python users to write ML jobs and broaden the adoption of Flink ML.
>
> Given this, Jincheng and I discussed offline how to support a Python ML
> Pipeline API and drafted a design doc[2]. We'd like to achieve three goals:
> - Add a Python pipeline API that mirrors the Java pipeline API (we will
> adapt the Python pipeline API if the Java pipeline API changes).
> - Support native Python Transformer/Estimator/Model, i.e., users can write
> not only Python Transformer/Estimator/Model wrappers that call the Java
> ones but also native Python Transformers/Estimators/Models.
> - Ease of use: support keyword arguments when defining parameters.
>
> More details can be found in the design doc and we are looking forward to
> your feedback.
>
> Best,
> Hequn
>
> [1]
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-39+Flink+ML+pipeline+and+ML+libs
> [2]
>
> https://docs.google.com/document/d/1fwSO5sRNWMoYuvNgfQJUV6N2n2q5UEVA4sezCljKcVQ/edit?usp=sharing
>