Posted to user@flink.apache.org by Kai Fu <zz...@gmail.com> on 2021/09/17 23:35:19 UTC

Built-in functions to manipulate MULTISET type

Hi team,

We want to know whether there is any built-in function to extract the keys
of a MULTISET
<https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/dev/table/types/>
as an ARRAY. As far as we can find, there is no such function, and the only
option is to define a simple wrapper UDF for it. Please advise.

-- 
*Best wishes,*
*- Kai*

Re: Built-in functions to manipulate MULTISET type

Posted by Yuval Itzchakov <yu...@gmail.com>.
Hi Seth,

You're right, but you still have to roll your own type inference for every
type you want to support, which isn't terrible but isn't ideal either.

On Sun, Sep 19, 2021, 18:06 Seth Wiesman <sj...@gmail.com> wrote:

> Hi,
>
> I agree it would be great to see these functions built-in, but you do not
> need to write a UDF for each type. You can override a UDF's type inference
> and get the same capabilities as built-in functions, including support
> for generics.
>
>
> https://github.com/apache/flink/blob/master/flink-examples/flink-examples-table/src/main/java/org/apache/flink/table/examples/java/functions/LastDatedValueFunction.java

Re: Built-in functions to manipulate MULTISET type

Posted by Francesco Guardiani <fr...@ververica.com>.
Hi, for type strategies you can check out
org.apache.flink.table.types.inference.InputTypeStrategies. They are pretty
extensive and cover most use cases. In your case, this function probably
requires the COMMON type strategy. If you want to roll your own type
inference, look at the
org.apache.flink.table.types.inference.strategies package, which contains
all the type strategies we have.

On Mon, Sep 20, 2021 at 4:39 PM Seth Wiesman <sj...@gmail.com> wrote:

> The type strategy can be generic over the input and output types, so you
> can write something generic that says: given a multiset of some type T,
> this function returns an array of that same type T. This is exactly the
> logic built-in functions use, and it is just as expressive as anything
> Flink could provide.
>
> Seth

Re: Built-in functions to manipulate MULTISET type

Posted by Seth Wiesman <sj...@gmail.com>.
The type strategy can be generic over the input and output types, so you
can write something generic that says: given a multiset of some type T,
this function returns an array of that same type T. This is exactly the
logic built-in functions use, and it is just as expressive as anything
Flink could provide.
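
As a rough illustration of the idea that the output type can be derived
from the input type, here is a toy model in plain Java. It is only a sketch:
ToyType, Atomic, Multiset, Array and inferOutput are hypothetical names made
up for this example and are not Flink's DataType or TypeStrategy classes. A
real function would express the same rule through Flink's type inference API.

```java
import java.util.Optional;

// Toy model of "output type derived from input type".
// All classes here are hypothetical stand-ins, NOT Flink classes.
final class ToyTypeInference {
    interface ToyType {}

    // A non-parameterized type such as STRING or INT.
    static final class Atomic implements ToyType {
        final String name;
        Atomic(String name) { this.name = name; }
        @Override public String toString() { return name; }
    }

    // MULTISET<element>
    static final class Multiset implements ToyType {
        final ToyType element;
        Multiset(ToyType element) { this.element = element; }
        @Override public String toString() { return "MULTISET<" + element + ">"; }
    }

    // ARRAY<element>
    static final class Array implements ToyType {
        final ToyType element;
        Array(ToyType element) { this.element = element; }
        @Override public String toString() { return "ARRAY<" + element + ">"; }
    }

    // One rule for every T: MULTISET<T> -> ARRAY<T>.
    // Any other input type is rejected with an empty result.
    static Optional<ToyType> inferOutput(ToyType input) {
        if (input instanceof Multiset) {
            return Optional.of(new Array(((Multiset) input).element));
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        ToyType in = new Multiset(new Atomic("STRING"));
        System.out.println(in + " -> " + inferOutput(in).get());
        System.out.println(inferOutput(new Atomic("INT")).isPresent());
    }
}
```

The rule is written once and works for any element type, including nested
ones, which is the point being made about generic type strategies.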

Seth

On Mon, Sep 20, 2021 at 1:26 AM Kai Fu <zz...@gmail.com> wrote:

> Hi Seth,
>
> This is really helpful and inspiring, thank you for the information.

Re: Built-in functions to manipulate MULTISET type

Posted by Kai Fu <zz...@gmail.com>.
Hi Seth,

This is really helpful and inspiring, thank you for the information.

On Sun, Sep 19, 2021 at 11:06 PM Seth Wiesman <sj...@gmail.com> wrote:

> Hi,
>
> I agree it would be great to see these functions built-in, but you do not
> need to write a UDF for each type. You can override a UDF's type inference
> and get the same capabilities as built-in functions, including support
> for generics.
>
>
> https://github.com/apache/flink/blob/master/flink-examples/flink-examples-table/src/main/java/org/apache/flink/table/examples/java/functions/LastDatedValueFunction.java

-- 
*Best wishes,*
*- Kai*

Re: Built-in functions to manipulate MULTISET type

Posted by Seth Wiesman <sj...@gmail.com>.
Hi,

I agree it would be great to see these functions built-in, but you do not
need to write a UDF for each type. You can override a UDF's type inference
and get the same capabilities as built-in functions, including support
for generics.

https://github.com/apache/flink/blob/master/flink-examples/flink-examples-table/src/main/java/org/apache/flink/table/examples/java/functions/LastDatedValueFunction.java
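
To make the "one function instead of one overload per key type" point
concrete, here is a minimal sketch of the evaluation logic in plain Java,
without any Flink classes. It assumes the multiset arrives as a
java.util.Map from element to multiplicity, which is how the Flink data
types page linked earlier in this thread describes MULTISET in Java. The
class name MultisetKeys and the method keys are hypothetical. Wiring this
into a ScalarFunction with custom type inference is not shown.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Flink: a single generic method
// replaces one overload per key type.
final class MultisetKeys {
    private MultisetKeys() {}

    // Returns the distinct elements of a multiset that is represented
    // as a map from element to multiplicity. Null input yields null.
    static <T> List<T> keys(Map<T, Integer> multiset) {
        if (multiset == null) {
            return null;
        }
        return new ArrayList<>(multiset.keySet());
    }

    public static void main(String[] args) {
        Map<String, Integer> words = new LinkedHashMap<>();
        words.put("a", 2);
        words.put("b", 1);
        System.out.println(keys(words)); // prints [a, b]

        Map<Long, Integer> ids = new LinkedHashMap<>();
        ids.put(7L, 3);
        System.out.println(keys(ids)); // prints [7]
    }
}
```

The same method serves String keys and Long keys. The multiplicities are
dropped, so each element appears once in the result.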

On Sat, Sep 18, 2021 at 7:42 AM Yuval Itzchakov <yu...@gmail.com> wrote:

> Hi Jing,
>
> I recall there is already an open ticket for built-in aggregate functions.

Re: Built-in functions to manipulate MULTISET type

Posted by Yuval Itzchakov <yu...@gmail.com>.
Hi Jing,

I recall there is already an open ticket for built-in aggregate functions.

On Sat, Sep 18, 2021, 15:08 JING ZHANG <be...@gmail.com> wrote:

> Hi Yuval,
> You could open a JIRA to track this if you think some functions should be
> added as built-in functions in Flink.
>
> Best,
> JING ZHANG

Re: Built-in functions to manipulate MULTISET type

Posted by JING ZHANG <be...@gmail.com>.
Hi Yuval,
You could open a JIRA to track this if you think some functions should be
added as built-in functions in Flink.

Best,
JING ZHANG

On Sat, Sep 18, 2021 at 3:33 PM Yuval Itzchakov <yu...@gmail.com> wrote:

> The problem with defining a UDF is that you have to create one overload
> per key type in the MULTISET. It would be very convenient to have functions
> like Snowflake's ARRAY_AGG.

Re: Built-in functions to manipulate MULTISET type

Posted by Yuval Itzchakov <yu...@gmail.com>.
The problem with defining a UDF is that you have to create one overload per
key type in the MULTISET. It would be very convenient to have functions
like Snowflake's ARRAY_AGG.

On Sat, Sep 18, 2021, 05:43 JING ZHANG <be...@gmail.com> wrote:

> Hi Kai,
> AFAIK, there is no built-in function to extract the keys of a MULTISET
> <https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/dev/table/types/>
> as an ARRAY. Defining a UDF is a good solution.
>
> Best,
> JING ZHANG

Re: Built-in functions to manipulate MULTISET type

Posted by JING ZHANG <be...@gmail.com>.
Hi Kai,
AFAIK, there is no built-in function to extract the keys of a MULTISET
<https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/dev/table/types/>
as an ARRAY. Defining a UDF is a good solution.

Best,
JING ZHANG

On Sat, Sep 18, 2021 at 7:35 AM Kai Fu <zz...@gmail.com> wrote:

> Hi team,
>
> We want to know whether there is any built-in function to extract the keys
> of a MULTISET
> <https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/dev/table/types/>
> as an ARRAY. As far as we can find, there is no such function, and the only
> option is to define a simple wrapper UDF for it. Please advise.
>
> --
> *Best wishes,*
> *- Kai*
>