Posted to user@flink.apache.org by Antonio Martínez Carratalá <am...@alto-analytics.com> on 2020/07/13 13:49:20 UTC

Flink DataSet Iterate updating additional variable

Hello

I'm trying to implement the ForceAtlas2 (graph layout) algorithm in Flink
using DataSets. It is an iterative algorithm and I have most of it ready,
but there is something I don't know how to do. Apart from the dataset with
the coordinates (x,y) of each node, I need an additional variable to
regulate the speed. Very simplified, it would be something like this:

DataSet<Node> nodes = env.fromCollection(nodesList);
Double speed = 1.0;
nodes.iterate(100) {
   nodes = nodes with forces to apply calculated;
   speed = speed regulated based on previous value and forces calculated;
   nodes = nodes coordinates updated with forces * speed;
} closeWith(nodes)

In my case, the nodes are the dataset I'm iterating over, and it works
perfectly if I leave speed out, but I don't know how to keep the speed
variable updated in every iteration so that I can use it in the next one.
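
To make the data dependency concrete, here is a minimal plain-Java model of
the loop described above. It does not use Flink at all, nodes are reduced to
a single coordinate each, and the force and speed rules are placeholders
chosen for illustration (they are assumptions, not ForceAtlas2). The point is
only that each iteration needs both the previous coordinates and the previous
speed.

```java
/** Plain-Java model of the loop in the question. No Flink involved. */
final class SpeedLoopModel {

    /** Loop state: one coordinate per node plus the shared speed. */
    static final class State {
        final double[] xs;
        final double speed;

        State(double[] xs, double speed) {
            this.xs = xs;
            this.speed = speed;
        }
    }

    /** Hypothetical force rule (assumption): pull every node toward the origin. */
    static double[] forces(double[] xs) {
        double[] f = new double[xs.length];
        for (int i = 0; i < xs.length; i++) {
            f[i] = -0.1 * xs[i];
        }
        return f;
    }

    /** Hypothetical speed rule (assumption): damp the previous speed by the total force. */
    static double nextSpeed(double previousSpeed, double[] f) {
        double total = 0.0;
        for (double v : f) {
            total += Math.abs(v);
        }
        return previousSpeed / (1.0 + total);
    }

    /** One iteration: forces, then speed, then coordinates, in the order of the pseudocode. */
    static State step(State s) {
        double[] f = forces(s.xs);
        double speed = nextSpeed(s.speed, f);
        double[] moved = new double[s.xs.length];
        for (int i = 0; i < moved.length; i++) {
            moved[i] = s.xs[i] + f[i] * speed;
        }
        return new State(moved, speed);
    }

    /** Feeds the output of each step into the next one, like the iteration would. */
    static State run(State initial, int iterations) {
        State s = initial;
        for (int i = 0; i < iterations; i++) {
            s = step(s);
        }
        return s;
    }

    public static void main(String[] args) {
        State result = run(new State(new double[] {10.0, -10.0}, 1.0), 100);
        System.out.println("final speed = " + result.speed);
    }
}
```

The whole State object is what has to travel from one iteration to the next,
which is exactly the part that a single iterated DataSet<Node> does not cover
on its own.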

Any suggestions? Thanks

Re: Flink DataSet Iterate updating additional variable

Posted by Khachatryan Roman <kh...@gmail.com>.
Hi Antonio,

Yes, you are right. Revisiting your question, I'm wondering whether it's
possible to partition speeds and nodes in the same way (stably across
iterations). (I'm assuming a distributed setup.)
If not, each iteration would have to wait for *all* subtasks of the
previous iteration to finish, right?
That would likely negate the benefits of the iterative approach.
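
The waiting described above can be illustrated with a small plain-Java model
(no Flink, partitions are just arrays, and the force and speed rules are
made-up placeholders, so treat every name here as an assumption). If speed is
a single global value, it can only be computed once the partial result of
every partition is available, and no partition can move its nodes before that.

```java
/** Plain-Java model of a global speed computed from partitioned nodes. No Flink involved. */
final class PartitionedSpeedModel {

    /** Per-partition work: sum of hypothetical force magnitudes (assumption: force = -0.1 * x). */
    static double partialForce(double[] partition) {
        double sum = 0.0;
        for (double x : partition) {
            sum += Math.abs(-0.1 * x);
        }
        return sum;
    }

    /** Global step: needs the partial sums of all partitions before it can produce a speed. */
    static double combine(double previousSpeed, double[] partials) {
        double total = 0.0;
        for (double p : partials) {
            total += p;
        }
        return previousSpeed / (1.0 + total);
    }

    /** Computes the next global speed: one partial per partition, then a single combine. */
    static double nextSpeed(double[][] partitions, double previousSpeed) {
        double[] partials = new double[partitions.length];
        for (int i = 0; i < partitions.length; i++) {
            partials[i] = partialForce(partitions[i]);
        }
        return combine(previousSpeed, partials);
    }

    /** Per-partition work that can only start once the global speed is known. */
    static double[] move(double[] partition, double speed) {
        double[] moved = new double[partition.length];
        for (int i = 0; i < partition.length; i++) {
            moved[i] = partition[i] + (-0.1 * partition[i]) * speed;
        }
        return moved;
    }

    public static void main(String[] args) {
        double[][] partitions = {{10.0}, {-10.0}};
        double speed = nextSpeed(partitions, 1.0);
        for (double[] partition : partitions) {
            System.out.println(java.util.Arrays.toString(move(partition, speed)));
        }
    }
}
```

Combining the partials of only some partitions gives a different speed than
combining all of them, so in this model the combine step acts as a barrier
between the two per-partition phases.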

Regards,
Roman


On Tue, Jul 14, 2020 at 9:36 AM Antonio Martínez Carratalá <
amartinez@alto-analytics.com> wrote:

> Hi Roman,
>
> Thank you for your quick reply, but as far as I know broadcast variables
> cannot be written to. My problem is that I need to update the value of the
> speed variable so that I can use it in the next iteration.
>
> Iterate only has one input dataset and propagates it to the next iteration
> using closeWith(), but I need another variable (maybe a dataset) to be
> propagated too. Is this possible in some way?
>
> Thanks

Re: Flink DataSet Iterate updating additional variable

Posted by Antonio Martínez Carratalá <am...@alto-analytics.com>.
Hi Roman,

Thank you for your quick reply, but as far as I know broadcast variables
cannot be written to. My problem is that I need to update the value of the
speed variable so that I can use it in the next iteration.

Iterate only has one input dataset and propagates it to the next iteration
using closeWith(), but I need another variable (maybe a dataset) to be
propagated too. Is this possible in some way?
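
One possible workaround, sketched here in plain Java rather than Flink, is to
store the current speed inside every element of the iterated collection, so
that the single collection handed to the next iteration carries both pieces
of state. The Node class, the force rule and the speed rule below are
hypothetical. In a Flink job the middle step would presumably be a reduce
whose result is attached to the final map with withBroadcastSet() and read
through getRuntimeContext().getBroadcastVariable(), but that mapping is an
assumption and has not been tested against this use case.

```java
import java.util.ArrayList;
import java.util.List;

/** Plain-Java model: the speed travels inside each record of the iterated collection. */
final class PackedSpeedModel {

    /** Hypothetical record: one coordinate plus a copy of the shared speed. */
    static final class Node {
        final double x;
        final double speed;

        Node(double x, double speed) {
            this.x = x;
            this.speed = speed;
        }
    }

    /** One iteration as a function from the collection to the next collection. */
    static List<Node> step(List<Node> nodes) {
        if (nodes.isEmpty()) {
            return nodes;
        }
        // Per-node map: hypothetical force (assumption: pull toward the origin).
        double[] force = new double[nodes.size()];
        double total = 0.0;
        for (int i = 0; i < force.length; i++) {
            force[i] = -0.1 * nodes.get(i).x;
            total += Math.abs(force[i]); // global reduce over all nodes
        }
        // Every record holds the same speed, so any of them gives the previous value.
        double previous = nodes.get(0).speed;
        double speed = previous / (1.0 + total); // hypothetical speed rule
        // Per-node map with the reduced value as read-only side input.
        List<Node> next = new ArrayList<>(nodes.size());
        for (int i = 0; i < force.length; i++) {
            next.add(new Node(nodes.get(i).x + force[i] * speed, speed));
        }
        return next;
    }

    public static void main(String[] args) {
        List<Node> nodes = new ArrayList<>();
        nodes.add(new Node(10.0, 1.0));
        nodes.add(new Node(-10.0, 1.0));
        for (int i = 0; i < 100; i++) {
            nodes = step(nodes);
        }
        System.out.println("final speed = " + nodes.get(0).speed);
    }
}
```

The cost of this layout is one extra double per record. The side input in the
last map is only ever read, so the "cannot be written to" restriction does
not get in the way in this model.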

Thanks

On Mon, Jul 13, 2020 at 8:47 PM Khachatryan Roman <
khachatryan.roman@gmail.com> wrote:

> Hi Antonio,
>
> Please take a look at broadcast variables:
> https://ci.apache.org/projects/flink/flink-docs-stable/dev/batch/#broadcast-variables
>
> Regards,
> Roman

Re: Flink DataSet Iterate updating additional variable

Posted by Khachatryan Roman <kh...@gmail.com>.
Hi Antonio,

Please take a look at broadcast variables:
https://ci.apache.org/projects/flink/flink-docs-stable/dev/batch/#broadcast-variables

Regards,
Roman

