Posted to dev@flink.apache.org by Xingbo Huang <hx...@gmail.com> on 2020/02/03 07:01:43 UTC

[DISCUSS] Support User-Defined Table Function in PyFlink

Hi all,

Scalar Python UDFs will be supported in the upcoming 1.10 release, and
we would like to introduce Python UDTFs (user-defined table functions) next.
FLIP-58[1] already covers Python UDTFs at a high level, but the
implementation details have not been worked out yet. I have drafted a
design doc[2] that covers the following items:

- How to define Python UDTF.

- The introduced rules for Python UDTF.

- How to execute Python UDTF.
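
For readers unfamiliar with table functions, the first item can be pictured
with a rough sketch in plain Python. This is only an illustration of the
semantics (one input row produces zero or more output rows, which are then
joined back to the input); it does not use the PyFlink API or reflect the
design doc, and the names split_words and lateral_join are hypothetical.

```python
from typing import Callable, Iterable, Iterator, List, Tuple


def split_words(line: str) -> Iterator[Tuple[str, int]]:
    """Hypothetical table function: emits one (word, length) row per word."""
    for word in line.split():
        yield word, len(word)


def lateral_join(
    rows: Iterable[Tuple],
    table_func: Callable[..., Iterable[Tuple]],
) -> Iterator[Tuple]:
    """Pair each input row with every row the table function emits for it.

    Input rows for which the function emits nothing are dropped, which
    corresponds to an inner (not left outer) lateral join.
    """
    for row in rows:
        for out in table_func(*row):
            yield row + out


# Three single-column input rows; the empty line yields no output rows.
input_rows: List[Tuple[str]] = [("hello flink",), ("",), ("udtf",)]
result = list(lateral_join(input_rows, split_words))
```

Unlike a scalar UDF, which returns exactly one value per input row, the
generator here may yield any number of rows, including none.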

Because the implementation relies on Beam's portability framework to
execute Python user-defined table functions, and not all contributors
are familiar with it, I have built a prototype[3].

Any feedback is welcome.

Best,

Xingbo

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-58%3A+Flink+Python+User-Defined+Stateless+Function+for+Table

[2]
https://docs.google.com/document/d/1Pkv5S0geoYQ2ySS5YTTBivJ3hoi-uzLXVQkDVIaR0cE/edit#heading=h.pzeztvig3kg1
[3] https://github.com/HuangXingBo/flink/commits/FLINK-UDTF

Re: [DISCUSS] Support User-Defined Table Function in PyFlink

Posted by Xingbo Huang <hx...@gmail.com>.
Hi Jincheng,

Thanks for your feedback. We can discuss further details in the JIRA
issue and the PR. :)

Best,
Xingbo

On Tue, Feb 4, 2020 at 9:09 PM jincheng sun <su...@gmail.com> wrote:

> Thanks for bringing up this discussion, Xingbo!
>
> The design looks pretty nice to me! This feature is really needed and is
> mentioned in FLIP-58. So I think it is better to create the JIRA issue and
> open the PR, so that more details can be reviewed. :)
>
> Best,
> Jincheng

Re: [DISCUSS] Support User-Defined Table Function in PyFlink

Posted by Xingbo Huang <hx...@gmail.com>.
Hi Hequn,

Thanks for your feedback. Good suggestion. I will avoid Scala code in the
flink-table module.

Best,
Xingbo

On Tue, Feb 4, 2020 at 10:14 PM Hequn Cheng <he...@apache.org> wrote:

> Hi Xingbo,
>
> Thanks a lot for bringing up the discussion. Looks good from my side.
> One suggestion beyond the document: it would be nice to avoid Scala code in
> the flink-table module since we would like to get rid of Scala in the
> long-term[1][2].
>
> Best, Hequn
>
> [1]
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-28%3A+Long-term+goal+of+making+flink-table+Scala-free
> [2]
> https://flink.apache.org/contributing/code-style-and-quality-scala.html

Re: [DISCUSS] Support User-Defined Table Function in PyFlink

Posted by Hequn Cheng <he...@apache.org>.
Hi Xingbo,

Thanks a lot for bringing up the discussion. Looks good from my side.
One suggestion beyond the document: it would be nice to avoid Scala code in
the flink-table module since we would like to get rid of Scala in the
long-term[1][2].

Best, Hequn

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-28%3A+Long-term+goal+of+making+flink-table+Scala-free
[2] https://flink.apache.org/contributing/code-style-and-quality-scala.html


On Tue, Feb 4, 2020 at 9:09 PM jincheng sun <su...@gmail.com>
wrote:

> Thanks for bringing up this discussion, Xingbo!
>
> The design looks pretty nice to me! This feature is really needed and is
> mentioned in FLIP-58. So I think it is better to create the JIRA issue and
> open the PR, so that more details can be reviewed. :)
>
> Best,
> Jincheng

Re: [DISCUSS] Support User-Defined Table Function in PyFlink

Posted by jincheng sun <su...@gmail.com>.
Thanks for bringing up this discussion, Xingbo!

The design looks pretty nice to me! This feature is really needed and is
mentioned in FLIP-58. So I think it is better to create the JIRA issue and
open the PR, so that more details can be reviewed. :)

Best,
Jincheng



On Mon, Feb 3, 2020 at 3:02 PM Xingbo Huang <hx...@gmail.com> wrote:

> Hi all,
>
> Scalar Python UDFs will be supported in the upcoming 1.10 release, and
> we would like to introduce Python UDTFs (user-defined table functions) next.
> FLIP-58[1] already covers Python UDTFs at a high level, but the
> implementation details have not been worked out yet. I have drafted a
> design doc[2] that covers the following items:
>
> - How to define Python UDTF.
>
> - The introduced rules for Python UDTF.
>
> - How to execute Python UDTF.
>
> Because the implementation relies on Beam's portability framework to
> execute Python user-defined table functions, and not all contributors
> are familiar with it, I have built a prototype[3].
>
> Any feedback is welcome.
>
> Best,
>
> Xingbo
>
> [1]
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-58%3A+Flink+Python+User-Defined+Stateless+Function+for+Table
>
> [2]
>
> https://docs.google.com/document/d/1Pkv5S0geoYQ2ySS5YTTBivJ3hoi-uzLXVQkDVIaR0cE/edit#heading=h.pzeztvig3kg1
> [3] https://github.com/HuangXingBo/flink/commits/FLINK-UDTF
>