Posted to user@flink.apache.org by Martijn Visser <ma...@ververica.com> on 2022/01/03 13:20:07 UTC

[DISCUSS] Drop Gelly

Hi everyone,

Flink is bundled with Gelly, a Graph API library [1]. This has been marked
as approaching end-of-life for quite some time [2].

Gelly is built on top of Flink's DataSet API, which is deprecated and
slowly being phased out [3]. It only works on batch jobs. Based on the
activity in the Dev and User mailing lists, I don't see a lot of questions
popping up regarding the usage of Gelly. Removing Gelly would reduce CI
time and resources because we won't need to run tests for this anymore.

I'm cross-posting this to the User mailing list to see if there are any
users of Gelly at the moment.

Let me know your thoughts.

Martijn Visser | Product Manager

martijn@ververica.com


[1]
https://nightlies.apache.org/flink/flink-docs-stable/docs/libs/gelly/overview/

[2] https://flink.apache.org/roadmap.html

[3] https://lists.apache.org/thread/b2y3xx3thbcbtzdphoct5wvzwogs9sqz

<https://www.ververica.com/>


Follow us @VervericaData

--

Join Flink Forward <https://flink-forward.org/> - The Apache Flink
Conference

Stream Processing | Event Driven | Real Time

Re: [DISCUSS] Drop Gelly

Posted by Jing Ge <ji...@ververica.com>.
Hi,

thanks Martijn for bringing it up for discussion. I think we could make the
discussion a little bit clearer by splitting it into two questions:

1. Should Flink drop Gelly?
2. Should Flink drop graph computing?

The answer to the first question could be yes, since there have been no
changes for years. +1 for dropping Gelly.

For the second question, I would suggest answering either "no", or a
strategic "yes, but we will definitely support it again in the near future",
because there are many use cases that streaming/batch + graph could solve
more elegantly. AFAIK, fintech companies have a lot of those use cases [1].
It would be great if we could find a way to drop Gelly but keep the graph
computing ability within Flink's ecosystem.

Best regards
Jing

[1]
https://california18.com/the-ant-graph-calculation-is-upgraded-to-tugraph-and-it-won-the-2021-world-internet-leading-scientific-and-technological-achievement-award/2117052021/


On Tue, Jan 4, 2022 at 2:02 PM Zhipeng Zhang <zh...@gmail.com>
wrote:

> Hi Martijn,
>
> Thanks for the feedback. I am not proposing to bundle the new graph
> library with Alink. I am +1 for dropping the DataSet-based Gelly library,
> but we probably need a new graph library in Flink for the possible
> migration.
>
> We haven't decided what to do yet and probably need more discussion. There
> are some possible solutions:
> 1. We include a new DataStream-based graph library in FlinkML[1], given
> that graphs and machine learning algorithms are more often used together
> [2][3][4]. To achieve this, we could reuse the `AlgoOperator` interface in
> FlinkML.
> 2. We include a new DataStream-based graph library as a separate
> module/repo. This is consistent with existing libraries like Spark [5].
>
> What do you think?
>
>
> [1] https://github.com/apache/flink-ml
> [2] https://arxiv.org/abs/1403.6652
> [3] https://arxiv.org/abs/1503.03578
> [4] https://github.com/apache/spark
>
> Best,
> Zhipeng
>
> Martijn Visser <ma...@ververica.com> wrote on Tue, Jan 4, 2022 at 15:27:
>
>> Hi Zhipeng,
>>
>> Good that you've reached out, I wasn't aware that Gelly is being used in
>> Alink. Are you proposing to write a new graph library as a successor of
>> Gelly and bundle that with Alink?
>>
>> Best regards,
>>
>> Martijn
>>
>> On Tue, 4 Jan 2022 at 02:57, Zhipeng Zhang <zh...@gmail.com>
>> wrote:
>>
>>> Hi everyone,
>>>
>>> Thanks for starting the discussion :)
>>>
>>> We (the Alink team [1]) are actually using part of the Gelly library to
>>> support graph algorithms (connected components, single-source shortest path,
>>> etc.) for users at Alibaba Inc.
>>>
>>> As the DataSet API is going to be dropped, shall we also provide a new graph
>>> library based on the DataStream runtime (similar to what we did for machine
>>> learning)?
>>>
>>> [1] https://github.com/Alibaba/alink
>>>
>>> David Anderson <da...@apache.org> wrote on Tue, Jan 4, 2022 at 00:01:
>>>
>>>> Most of the inquiries I've had about Gelly in recent memory have been
>>>> from folks looking for a streaming solution, and it's only been a handful.
>>>>
>>>> +1 for dropping Gelly
>>>>
>>>> David
>>>>
>>>> On Mon, Jan 3, 2022 at 2:41 PM Till Rohrmann <tr...@apache.org>
>>>> wrote:
>>>>
>>>>> I haven't seen any changes or requests to/for Gelly in ages. Hence, I
>>>>> would assume that it is not really used and can be removed.
>>>>>
>>>>> +1 for dropping Gelly.
>>>>>
>>>>> Cheers,
>>>>> Till
>>>>>
>>>
>>> --
>>> best,
>>> Zhipeng
>>>
>>>
>
> --
> best,
> Zhipeng
>
>

Re: [DISCUSS] Drop Gelly

Posted by Martijn Visser <ma...@apache.org>.
Hi everyone,

I'm reviving this really old discussion thread: I just stumbled across Gelly again and realized that this discussion was never finished.

I'll open up a vote thread for dropping the current DataSet-based Gelly library.

Best regards,

Martijn

On 2022/01/05 03:37:18 Yun Gao wrote:
> Hi,
> 
> Many thanks for initiating the discussion!
> 
> Also +1 to drop the current DataSet-based Gelly library so that we can finally drop the
> legacy DataSet API.
> 
> As for whether to keep the graph computing ability: from my side, graph query / graph computing and
> chaining them with a preprocessing pipeline are real requirements.
> We also already have the basis for a graph computing library on the DataStream API
> with the new iteration library [1], so it would already be feasible to build a stream / batch
> unified graph computing library on top of the DataStream API. It would indeed be most suitable as
> a separate ecosystem project.
> 
> Best,
> Yun
> 
> [1] https://cwiki.apache.org/confluence/x/hAEBCw
> 
> 

Re: Re: [DISCUSS] Drop Gelly

Posted by Yun Gao <yu...@aliyun.com.INVALID>.
Hi,

Many thanks for initiating the discussion!

Also +1 to drop the current DataSet-based Gelly library so that we can finally drop the
legacy DataSet API.

As for whether to keep the graph computing ability: from my side, graph query / graph computing and
chaining them with a preprocessing pipeline are real requirements.
We also already have the basis for a graph computing library on the DataStream API
with the new iteration library [1], so it would already be feasible to build a stream / batch
unified graph computing library on top of the DataStream API. It would indeed be most suitable as
a separate ecosystem project.

Best,
Yun

[1] https://cwiki.apache.org/confluence/x/hAEBCw


Re: [DISCUSS] Drop Gelly

Posted by Martijn Visser <ma...@ververica.com>.
Hi Zhipeng,

I think that, since we're seeing more code being externalised, for example
with the Flink Remote Shuffle service [1] and the ongoing discussion on the
external connector repository [2], it makes sense to go for your second
option. Maybe it fits under Flink Extended [3].

The main question becomes who can contribute and maintain this library.
Another (intermediate) solution might also be to find someone who can
migrate/move the current Gelly codebase to use Flink's DataStream API in
batch mode, so it wouldn't be using the DataSet API anymore. This has
recently also happened with the State Processor API [4].

Best regards,

Martijn

[1] https://github.com/flink-extended/flink-remote-shuffle
[2] https://lists.apache.org/thread/bywh947r2f5hfocxq598zhyh06zhksrm
[3] https://github.com/flink-extended/
[4] https://issues.apache.org/jira/browse/FLINK-24912

Re: [DISCUSS] Drop Gelly

Posted by Jing Ge <ji...@ververica.com>.
Hi,

thanks Martijn for bringing it up for discussion. I think we could make the
discussion a little bit clearer by splitting it into two questions:

1. should Flink drop Gelly?
2. should Flink drop the graph computing?

The answer of the first question could be yes, since there have been no
changes for years. +1 for dropping Gelly.

But for the second question, I would suggest answering it with no or with
strategic yes and will definitely support it again in the near future,
because there are many use cases that could be solved by streaming/batch +
graph in a more elegant way. Afaik fintech companies have a lot of those
use cases[1]. It would be great if we could find a way to drop Gelly but
keep the graph computing ability within Flink's ecosystem.

Best regards
Jing

[1]
https://california18.com/the-ant-graph-calculation-is-upgraded-to-tugraph-and-it-won-the-2021-world-internet-leading-scientific-and-technological-achievement-award/2117052021/


On Tue, Jan 4, 2022 at 2:02 PM Zhipeng Zhang <zh...@gmail.com>
wrote:

> Hi Martijin,
>
> Thanks for the feedback. I am not proposing  to bundle the new graph
> library with Alink. I am +1 for dropping the DataSet-based Gelly library,
> but we probably need a new graph library in Flink for the possible
> migration.
>
> We haven't decided what to do yet and probably need more discussion. There
> are some possible solutions:
> 1. We include a new DataStream-based graph library in FlinkML[1], given
> that graphs and machine learning algorithms are more often used together
> [2][3][4]. To achieve this, we could reuse the `AlgoOperator` interface in
> FlinkML.
> 2. We include a new DataStream-based graph library as a separate
> module/repo. This is consistent with existing libraries like Spark [5].
>
> What do you think?
>
>
> [1] https://github.com/apache/flink-ml
> [2] https://arxiv.org/abs/1403.6652
> [3] https://arxiv.org/abs/1503.03578
> [4] https://github.com/apache/spark
>
> Best,
> Zhipeng
>
> Martijn Visser <ma...@ververica.com> 于2022年1月4日周二 15:27写道:
>
>> Hi Zhipeng,
>>
>> Good that you've reached out, I wasn't aware that Gelly is being used in
>> Alink. Are you proposing to write a new graph library as a successor of
>> Gelly and bundle that with Alink?
>>
>> Best regards,
>>
>> Martijn
>>
>> On Tue, 4 Jan 2022 at 02:57, Zhipeng Zhang <zh...@gmail.com>
>> wrote:
>>
>>> Hi everyone,
>>>
>>> Thanks for starting the discussion :)
>>>
>>> We (Alink team [1]) are actually using part of the Gelly library to
>>> support graph algorithms (connected component, single source shortest path,
>>> etc.) for users in Alibaba Inc.
>>>
>>> As DataSet API is going to be dropped, shall we also provide a new graph
>>> library based on DataStream runtime (similar as we did for machine
>>> learning)?
>>>
>>> [1] https://github.com/Alibaba/alink
>>>
>>> David Anderson <da...@apache.org> 于2022年1月4日周二 00:01写道:
>>>
>>>> Most of the inquiries I've had about Gelly in recent memory have been
>>>> from folks looking for a streaming solution, and it's only been a handful.
>>>>
>>>> +1 for dropping Gelly
>>>>
>>>> David
>>>>
>>>> On Mon, Jan 3, 2022 at 2:41 PM Till Rohrmann <tr...@apache.org>
>>>> wrote:
>>>>
>>>>> I haven't seen any changes or requests to/for Gelly in ages. Hence, I
>>>>> would assume that it is not really used and can be removed.
>>>>>
>>>>> +1 for dropping Gelly.
>>>>>
>>>>> Cheers,
>>>>> Till
>>>>>
>>>>> On Mon, Jan 3, 2022 at 2:20 PM Martijn Visser <ma...@ververica.com>
>>>>> wrote:
>>>>>
>>>>>> Hi everyone,
>>>>>>
>>>>>> Flink is bundled with Gelly, a Graph API library [1]. This has been
>>>>>> marked as approaching end-of-life for quite some time [2].
>>>>>>
>>>>>> Gelly is built on top of Flink's DataSet API, which is deprecated and
>>>>>> slowly being phased out [3]. It only works on batch jobs. Based on the
>>>>>> activity in the Dev and User mailing lists, I don't see a lot of questions
>>>>>> popping up regarding the usage of Gelly. Removing Gelly would reduce CI
>>>>>> time and resources because we won't need to run tests for this anymore.
>>>>>>
>>>>>> I'm cross-posting this to the User mailing list to see if there are
>>>>>> any users of Gelly at the moment.
>>>>>>
>>>>>> Let me know your thoughts.
>>>>>>
>>>>>> Martijn Visser | Product Manager
>>>>>>
>>>>>> martijn@ververica.com
>>>>>>
>>>>>>
>>>>>> [1]
>>>>>> https://nightlies.apache.org/flink/flink-docs-stable/docs/libs/gelly/overview/
>>>>>>
>>>>>> [2] https://flink.apache.org/roadmap.html
>>>>>>
>>>>>> [3] https://lists.apache.org/thread/b2y3xx3thbcbtzdphoct5wvzwogs9sqz
>>>>>>
>>>>>>
>>>
>>> --
>>> best,
>>> Zhipeng
>>>
>>>
>
> --
> best,
> Zhipeng
>
>

Re: [DISCUSS] Drop Gelly

Posted by Martijn Visser <ma...@ververica.com>.
Hi Zhipeng,

Since we're seeing more code being externalised, for example with the Flink
Remote Shuffle service [1] and the ongoing discussion on the external
connector repository [2], I think it makes sense to go for your second
option. Maybe it fits under Flink Extended [3].

The main question becomes who can contribute to and maintain this library.
Another (intermediate) solution might also be to find someone who can
migrate/move the current Gelly codebase to use Flink's DataStream API in
batch mode, so it wouldn't be using the DataSet API anymore. This has
recently also happened with the State Processor API [4].

Best regards,

Martijn

[1] https://github.com/flink-extended/flink-remote-shuffle
[2] https://lists.apache.org/thread/bywh947r2f5hfocxq598zhyh06zhksrm
[3] https://github.com/flink-extended/
[4] https://issues.apache.org/jira/browse/FLINK-24912

On Tue, 4 Jan 2022 at 14:01, Zhipeng Zhang <zh...@gmail.com> wrote:

> Hi Martijn,
>
> Thanks for the feedback. I am not proposing to bundle the new graph
> library with Alink. I am +1 for dropping the DataSet-based Gelly library,
> but we probably need a new graph library in Flink for the possible
> migration.
>
> We haven't decided what to do yet and probably need more discussion. There
> are some possible solutions:
> 1. We include a new DataStream-based graph library in FlinkML [1], given
> that graph and machine learning algorithms are often used together
> [2][3]. To achieve this, we could reuse the `AlgoOperator` interface in
> FlinkML.
> 2. We include a new DataStream-based graph library as a separate
> module/repo. This is consistent with existing libraries like Spark [4].
>
> What do you think?
>
>
> [1] https://github.com/apache/flink-ml
> [2] https://arxiv.org/abs/1403.6652
> [3] https://arxiv.org/abs/1503.03578
> [4] https://github.com/apache/spark
>
> Best,
> Zhipeng
>
> Martijn Visser <ma...@ververica.com> wrote on Tue, Jan 4, 2022 at 15:27:
>
>> Hi Zhipeng,
>>
>> Good that you've reached out, I wasn't aware that Gelly is being used in
>> Alink. Are you proposing to write a new graph library as a successor of
>> Gelly and bundle that with Alink?
>>
>> Best regards,
>>
>> Martijn
>>
>> On Tue, 4 Jan 2022 at 02:57, Zhipeng Zhang <zh...@gmail.com>
>> wrote:
>>
>>> Hi everyone,
>>>
>>> Thanks for starting the discussion :)
>>>
>>> We (Alink team [1]) are actually using part of the Gelly library to
>>> support graph algorithms (connected component, single source shortest path,
>>> etc.) for users in Alibaba Inc.
>>>
>>> As the DataSet API is going to be dropped, shall we also provide a new graph
>>> library based on the DataStream runtime (similar to what we did for machine
>>> learning)?
>>>
>>> [1] https://github.com/Alibaba/alink
>>>
>>> David Anderson <da...@apache.org> wrote on Tue, Jan 4, 2022 at 00:01:
>>>
>>>> Most of the inquiries I've had about Gelly in recent memory have been
>>>> from folks looking for a streaming solution, and it's only been a handful.
>>>>
>>>> +1 for dropping Gelly
>>>>
>>>> David
>>>>
>>>> On Mon, Jan 3, 2022 at 2:41 PM Till Rohrmann <tr...@apache.org>
>>>> wrote:
>>>>
>>>>> I haven't seen any changes or requests to/for Gelly in ages. Hence, I
>>>>> would assume that it is not really used and can be removed.
>>>>>
>>>>> +1 for dropping Gelly.
>>>>>
>>>>> Cheers,
>>>>> Till
>>>>>
>>>>> On Mon, Jan 3, 2022 at 2:20 PM Martijn Visser <ma...@ververica.com>
>>>>> wrote:
>>>>>
>>>>>> Hi everyone,
>>>>>>
>>>>>> Flink is bundled with Gelly, a Graph API library [1]. This has been
>>>>>> marked as approaching end-of-life for quite some time [2].
>>>>>>
>>>>>> Gelly is built on top of Flink's DataSet API, which is deprecated and
>>>>>> slowly being phased out [3]. It only works on batch jobs. Based on the
>>>>>> activity in the Dev and User mailing lists, I don't see a lot of questions
>>>>>> popping up regarding the usage of Gelly. Removing Gelly would reduce CI
>>>>>> time and resources because we won't need to run tests for this anymore.
>>>>>>
>>>>>> I'm cross-posting this to the User mailing list to see if there are
>>>>>> any users of Gelly at the moment.
>>>>>>
>>>>>> Let me know your thoughts.
>>>>>>
>>>>>> Martijn Visser | Product Manager
>>>>>>
>>>>>> martijn@ververica.com
>>>>>>
>>>>>>
>>>>>> [1]
>>>>>> https://nightlies.apache.org/flink/flink-docs-stable/docs/libs/gelly/overview/
>>>>>>
>>>>>> [2] https://flink.apache.org/roadmap.html
>>>>>>
>>>>>> [3] https://lists.apache.org/thread/b2y3xx3thbcbtzdphoct5wvzwogs9sqz
>>>>>>
>>>>>>
>>>
>>> --
>>> best,
>>> Zhipeng
>>>
>>>
>
> --
> best,
> Zhipeng
>
>

Re: [DISCUSS] Drop Gelly

Posted by Zhipeng Zhang <zh...@gmail.com>.
Hi Martijn,

Thanks for the feedback. I am not proposing to bundle the new graph
library with Alink. I am +1 for dropping the DataSet-based Gelly library,
but we probably need a new graph library in Flink for the possible
migration.

We haven't decided what to do yet and probably need more discussion. There
are some possible solutions:
1. We include a new DataStream-based graph library in FlinkML [1], given
that graph and machine learning algorithms are often used together
[2][3]. To achieve this, we could reuse the `AlgoOperator` interface in
FlinkML.
2. We include a new DataStream-based graph library as a separate
module/repo. This is consistent with existing libraries like Spark [4].

What do you think?


[1] https://github.com/apache/flink-ml
[2] https://arxiv.org/abs/1403.6652
[3] https://arxiv.org/abs/1503.03578
[4] https://github.com/apache/spark

Best,
Zhipeng

Martijn Visser <ma...@ververica.com> wrote on Tue, Jan 4, 2022 at 15:27:

> Hi Zhipeng,
>
> Good that you've reached out, I wasn't aware that Gelly is being used in
> Alink. Are you proposing to write a new graph library as a successor of
> Gelly and bundle that with Alink?
>
> Best regards,
>
> Martijn
>
> On Tue, 4 Jan 2022 at 02:57, Zhipeng Zhang <zh...@gmail.com>
> wrote:
>
>> Hi everyone,
>>
>> Thanks for starting the discussion :)
>>
>> We (Alink team [1]) are actually using part of the Gelly library to
>> support graph algorithms (connected component, single source shortest path,
>> etc.) for users in Alibaba Inc.
>>
>> As the DataSet API is going to be dropped, shall we also provide a new graph
>> library based on the DataStream runtime (similar to what we did for machine
>> learning)?
>>
>> [1] https://github.com/Alibaba/alink
>>
>> David Anderson <da...@apache.org> wrote on Tue, Jan 4, 2022 at 00:01:
>>
>>> Most of the inquiries I've had about Gelly in recent memory have been
>>> from folks looking for a streaming solution, and it's only been a handful.
>>>
>>> +1 for dropping Gelly
>>>
>>> David
>>>
>>> On Mon, Jan 3, 2022 at 2:41 PM Till Rohrmann <tr...@apache.org>
>>> wrote:
>>>
>>>> I haven't seen any changes or requests to/for Gelly in ages. Hence, I
>>>> would assume that it is not really used and can be removed.
>>>>
>>>> +1 for dropping Gelly.
>>>>
>>>> Cheers,
>>>> Till
>>>>
>>>> On Mon, Jan 3, 2022 at 2:20 PM Martijn Visser <ma...@ververica.com>
>>>> wrote:
>>>>
>>>>> Hi everyone,
>>>>>
>>>>> Flink is bundled with Gelly, a Graph API library [1]. This has been
>>>>> marked as approaching end-of-life for quite some time [2].
>>>>>
>>>>> Gelly is built on top of Flink's DataSet API, which is deprecated and
>>>>> slowly being phased out [3]. It only works on batch jobs. Based on the
>>>>> activity in the Dev and User mailing lists, I don't see a lot of questions
>>>>> popping up regarding the usage of Gelly. Removing Gelly would reduce CI
>>>>> time and resources because we won't need to run tests for this anymore.
>>>>>
>>>>> I'm cross-posting this to the User mailing list to see if there are
>>>>> any users of Gelly at the moment.
>>>>>
>>>>> Let me know your thoughts.
>>>>>
>>>>> Martijn Visser | Product Manager
>>>>>
>>>>> martijn@ververica.com
>>>>>
>>>>>
>>>>> [1]
>>>>> https://nightlies.apache.org/flink/flink-docs-stable/docs/libs/gelly/overview/
>>>>>
>>>>> [2] https://flink.apache.org/roadmap.html
>>>>>
>>>>> [3] https://lists.apache.org/thread/b2y3xx3thbcbtzdphoct5wvzwogs9sqz
>>>>>
>>>>>
>>
>> --
>> best,
>> Zhipeng
>>
>>

-- 
best,
Zhipeng

Re: [DISCUSS] Drop Gelly

Posted by Martijn Visser <ma...@ververica.com>.
Hi Zhipeng,

Good that you've reached out, I wasn't aware that Gelly is being used in
Alink. Are you proposing to write a new graph library as a successor of
Gelly and bundle that with Alink?

Best regards,

Martijn

On Tue, 4 Jan 2022 at 02:57, Zhipeng Zhang <zh...@gmail.com> wrote:

> Hi everyone,
>
> Thanks for starting the discussion :)
>
> We (Alink team [1]) are actually using part of the Gelly library to
> support graph algorithms (connected component, single source shortest path,
> etc.) for users in Alibaba Inc.
>
> As the DataSet API is going to be dropped, shall we also provide a new graph
> library based on the DataStream runtime (similar to what we did for machine
> learning)?
>
> [1] https://github.com/Alibaba/alink
>
> David Anderson <da...@apache.org> wrote on Tue, Jan 4, 2022 at 00:01:
>
>> Most of the inquiries I've had about Gelly in recent memory have been
>> from folks looking for a streaming solution, and it's only been a handful.
>>
>> +1 for dropping Gelly
>>
>> David
>>
>> On Mon, Jan 3, 2022 at 2:41 PM Till Rohrmann <tr...@apache.org>
>> wrote:
>>
>>> I haven't seen any changes or requests to/for Gelly in ages. Hence, I
>>> would assume that it is not really used and can be removed.
>>>
>>> +1 for dropping Gelly.
>>>
>>> Cheers,
>>> Till
>>>
>>> On Mon, Jan 3, 2022 at 2:20 PM Martijn Visser <ma...@ververica.com>
>>> wrote:
>>>
>>>> Hi everyone,
>>>>
>>>> Flink is bundled with Gelly, a Graph API library [1]. This has been
>>>> marked as approaching end-of-life for quite some time [2].
>>>>
>>>> Gelly is built on top of Flink's DataSet API, which is deprecated and
>>>> slowly being phased out [3]. It only works on batch jobs. Based on the
>>>> activity in the Dev and User mailing lists, I don't see a lot of questions
>>>> popping up regarding the usage of Gelly. Removing Gelly would reduce CI
>>>> time and resources because we won't need to run tests for this anymore.
>>>>
>>>> I'm cross-posting this to the User mailing list to see if there are any
>>>> users of Gelly at the moment.
>>>>
>>>> Let me know your thoughts.
>>>>
>>>> Martijn Visser | Product Manager
>>>>
>>>> martijn@ververica.com
>>>>
>>>>
>>>> [1]
>>>> https://nightlies.apache.org/flink/flink-docs-stable/docs/libs/gelly/overview/
>>>>
>>>> [2] https://flink.apache.org/roadmap.html
>>>>
>>>> [3] https://lists.apache.org/thread/b2y3xx3thbcbtzdphoct5wvzwogs9sqz
>>>>
>>>>
>
> --
> best,
> Zhipeng
>
>

Re: [DISCUSS] Drop Gelly

Posted by Zhipeng Zhang <zh...@gmail.com>.
Hi everyone,

Thanks for starting the discussion :)

We (Alink team [1]) are actually using part of the Gelly library to support
graph algorithms (connected component, single source shortest path, etc.)
for users in Alibaba Inc.

As the DataSet API is going to be dropped, shall we also provide a new graph
library based on the DataStream runtime (similar to what we did for machine
learning)?

[1] https://github.com/Alibaba/alink

David Anderson <da...@apache.org> wrote on Tue, Jan 4, 2022 at 00:01:

> Most of the inquiries I've had about Gelly in recent memory have been from
> folks looking for a streaming solution, and it's only been a handful.
>
> +1 for dropping Gelly
>
> David
>
> On Mon, Jan 3, 2022 at 2:41 PM Till Rohrmann <tr...@apache.org> wrote:
>
>> I haven't seen any changes or requests to/for Gelly in ages. Hence, I
>> would assume that it is not really used and can be removed.
>>
>> +1 for dropping Gelly.
>>
>> Cheers,
>> Till
>>
>> On Mon, Jan 3, 2022 at 2:20 PM Martijn Visser <ma...@ververica.com>
>> wrote:
>>
>>> Hi everyone,
>>>
>>> Flink is bundled with Gelly, a Graph API library [1]. This has been
>>> marked as approaching end-of-life for quite some time [2].
>>>
>>> Gelly is built on top of Flink's DataSet API, which is deprecated and
>>> slowly being phased out [3]. It only works on batch jobs. Based on the
>>> activity in the Dev and User mailing lists, I don't see a lot of questions
>>> popping up regarding the usage of Gelly. Removing Gelly would reduce CI
>>> time and resources because we won't need to run tests for this anymore.
>>>
>>> I'm cross-posting this to the User mailing list to see if there are any
>>> users of Gelly at the moment.
>>>
>>> Let me know your thoughts.
>>>
>>> Martijn Visser | Product Manager
>>>
>>> martijn@ververica.com
>>>
>>>
>>> [1]
>>> https://nightlies.apache.org/flink/flink-docs-stable/docs/libs/gelly/overview/
>>>
>>> [2] https://flink.apache.org/roadmap.html
>>>
>>> [3] https://lists.apache.org/thread/b2y3xx3thbcbtzdphoct5wvzwogs9sqz
>>>
>>>

-- 
best,
Zhipeng

Re: [DISCUSS] Drop Gelly

Posted by David Anderson <da...@apache.org>.
Most of the inquiries I've had about Gelly in recent memory have been from
folks looking for a streaming solution, and it's only been a handful.

+1 for dropping Gelly

David

On Mon, Jan 3, 2022 at 2:41 PM Till Rohrmann <tr...@apache.org> wrote:

> I haven't seen any changes or requests to/for Gelly in ages. Hence, I
> would assume that it is not really used and can be removed.
>
> +1 for dropping Gelly.
>
> Cheers,
> Till
>
> On Mon, Jan 3, 2022 at 2:20 PM Martijn Visser <ma...@ververica.com>
> wrote:
>
>> Hi everyone,
>>
>> Flink is bundled with Gelly, a Graph API library [1]. This has been
>> marked as approaching end-of-life for quite some time [2].
>>
>> Gelly is built on top of Flink's DataSet API, which is deprecated and
>> slowly being phased out [3]. It only works on batch jobs. Based on the
>> activity in the Dev and User mailing lists, I don't see a lot of questions
>> popping up regarding the usage of Gelly. Removing Gelly would reduce CI
>> time and resources because we won't need to run tests for this anymore.
>>
>> I'm cross-posting this to the User mailing list to see if there are any
>> users of Gelly at the moment.
>>
>> Let me know your thoughts.
>>
>> Martijn Visser | Product Manager
>>
>> martijn@ververica.com
>>
>>
>> [1]
>> https://nightlies.apache.org/flink/flink-docs-stable/docs/libs/gelly/overview/
>>
>> [2] https://flink.apache.org/roadmap.html
>>
>> [3] https://lists.apache.org/thread/b2y3xx3thbcbtzdphoct5wvzwogs9sqz
>>
>>

Re: [DISCUSS] Drop Gelly

Posted by Till Rohrmann <tr...@apache.org>.
I haven't seen any changes or requests to/for Gelly in ages. Hence, I would
assume that it is not really used and can be removed.

+1 for dropping Gelly.

Cheers,
Till

On Mon, Jan 3, 2022 at 2:20 PM Martijn Visser <ma...@ververica.com> wrote:

> Hi everyone,
>
> Flink is bundled with Gelly, a Graph API library [1]. This has been marked
> as approaching end-of-life for quite some time [2].
>
> Gelly is built on top of Flink's DataSet API, which is deprecated and
> slowly being phased out [3]. It only works on batch jobs. Based on the
> activity in the Dev and User mailing lists, I don't see a lot of questions
> popping up regarding the usage of Gelly. Removing Gelly would reduce CI
> time and resources because we won't need to run tests for this anymore.
>
> I'm cross-posting this to the User mailing list to see if there are any
> users of Gelly at the moment.
>
> Let me know your thoughts.
>
> Martijn Visser | Product Manager
>
> martijn@ververica.com
>
>
> [1]
> https://nightlies.apache.org/flink/flink-docs-stable/docs/libs/gelly/overview/
>
> [2] https://flink.apache.org/roadmap.html
>
> [3] https://lists.apache.org/thread/b2y3xx3thbcbtzdphoct5wvzwogs9sqz
>
>

Re: [DISCUSS] Drop Gelly

Posted by Lukáš Drbal <lu...@gmail.com>.
Hi everyone,

We (a team at seznam.cz) are actually using the Gelly library for batch
anomaly detection in our graphs. It would be very nice to somehow keep this
functionality, maybe in a separate repository. Is there any replacement?


Best,
Lukas

On Mon, Jan 3, 2022 at 2:20 PM Martijn Visser <ma...@ververica.com> wrote:

> Hi everyone,
>
> Flink is bundled with Gelly, a Graph API library [1]. This has been marked
> as approaching end-of-life for quite some time [2].
>
> Gelly is built on top of Flink's DataSet API, which is deprecated and
> slowly being phased out [3]. It only works on batch jobs. Based on the
> activity in the Dev and User mailing lists, I don't see a lot of questions
> popping up regarding the usage of Gelly. Removing Gelly would reduce CI
> time and resources because we won't need to run tests for this anymore.
>
> I'm cross-posting this to the User mailing list to see if there are any
> users of Gelly at the moment.
>
> Let me know your thoughts.
>
> Martijn Visser | Product Manager
>
> martijn@ververica.com
>
>
> [1]
> https://nightlies.apache.org/flink/flink-docs-stable/docs/libs/gelly/overview/
>
> [2] https://flink.apache.org/roadmap.html
>
> [3] https://lists.apache.org/thread/b2y3xx3thbcbtzdphoct5wvzwogs9sqz
>
>
