Posted to user@flink.apache.org by Ventura Del Monte <ve...@gmail.com> on 2015/05/11 18:36:52 UTC

Broadcast Variable not updated in an Iterative Data Set

Hello,

I am writing a Flink application whose purpose is to train neural networks.
I have implemented a version of mini-batch SGD using the IterativeDataSet
class. In a nutshell, my code groups the input features into mini-batches (I
have as many mini-batches as the degree of parallelism), then it computes a
gradient over each mini-batch in a map task, and those gradients are merged
in a reduce task.
Unfortunately, the trained parameter does not seem to be re-broadcast at the
end of each epoch, and it looks like the next epoch starts with the initial
parameter again. It is probably a programming error that I cannot see at the
moment, but I was wondering whether there are any guidelines for broadcasting
an object in the correct way.
Would it be asking too much for you to have a look at my code? Thank you in
advance.
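
To give an idea of the structure, here is a heavily simplified, self-contained
sketch of what I am doing (the class names, the toy data and the update rule
are made up for illustration; this is not my actual code):

import org.apache.flink.api.common.functions.ReduceFunction;
import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.operators.IterativeDataSet;
import org.apache.flink.configuration.Configuration;

public class MiniBatchSgdSketch {

    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        // Toy input: each element stands for one mini-batch of features.
        DataSet<double[]> miniBatches = env.fromElements(
                new double[]{1.0, 2.0, 3.0},
                new double[]{4.0, 5.0, 6.0});

        // Initial model parameter (a single weight, to keep the sketch small).
        DataSet<Double> initialParam = env.fromElements(0.0);

        // One superstep per epoch.
        IterativeDataSet<Double> loop = initialParam.iterate(50);

        // Each map task updates the parameter against its own mini-batch, reading
        // the current parameter as a broadcast variable; the reduce task merges
        // the per-batch results into the parameter for the next epoch.
        DataSet<Double> updatedParam = miniBatches
                .map(new GradientMapper())
                .withBroadcastSet(loop, "param")
                .reduce(new ReduceFunction<Double>() {
                    @Override
                    public Double reduce(Double a, Double b) {
                        return (a + b) / 2.0; // average of the two per-batch updates
                    }
                });

        DataSet<Double> trainedParam = loop.closeWith(updatedParam);
        trainedParam.print();
    }

    /** One mini-batch SGD step against a toy squared-error objective. */
    public static class GradientMapper extends RichMapFunction<double[], Double> {

        private double param;

        @Override
        public void open(Configuration parameters) {
            // Read the parameter broadcast for the current superstep.
            param = getRuntimeContext().<Double>getBroadcastVariable("param").get(0);
        }

        @Override
        public Double map(double[] batch) {
            double gradient = 0.0;
            for (double x : batch) {
                gradient += param - x; // d/dparam of 0.5 * (param - x)^2
            }
            gradient /= batch.length;
            return param - 0.1 * gradient; // locally updated parameter
        }
    }
}

My real job uses the actual network weights and gradient computation instead
of the toy update above, but the wiring (iterate / withBroadcastSet /
closeWith) is the same.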

Regards,
Ventura

Re: Broadcast Variable not updated in an Iterative Data Set

Posted by Stephan Ewen <se...@apache.org>.
Happy to hear that it works!

Re: Broadcast Variable not updated in an Iterative Data Set

Posted by Ventura Del Monte <ve...@gmail.com>.
Thank you for your clarification. In the end I figured it out: it was a silly
programming error, and now it works fine. Thank you again for your
explanation.


Re: Broadcast Variable not updated in an Iterative Data Set

Posted by Stephan Ewen <se...@apache.org>.
Can you share the relevant part of the code, so we can have a look?

In general, a DataSet is re-broadcast in each iteration if it depends on
the IterativeDataSet.
You have to re-grab it from the RuntimeContext in each superstep. If you
call getBroadcastVariable() in the open() method, it will be executed in
each superstep (open() is called at the beginning of each superstep).
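
For example, the pattern looks roughly like this (the names are just
illustrative):

import java.util.List;

import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.configuration.Configuration;

// Sketch only: the function that needs the current model must be a Rich*
// function and must re-read the broadcast variable in open().
public class ParamAwareMapper extends RichMapFunction<double[], Double> {

    private double currentParam;

    @Override
    public void open(Configuration parameters) {
        // open() is invoked at the beginning of every superstep, so this picks up
        // the value that the previous superstep fed back via closeWith(...).
        List<Double> broadcast = getRuntimeContext().getBroadcastVariable("param");
        currentParam = broadcast.get(0);
    }

    @Override
    public Double map(double[] miniBatch) {
        // ... compute the gradient / update using currentParam ...
        return currentParam;
    }
}

On the plan side, you attach it with something like
miniBatches.map(new ParamAwareMapper()).withBroadcastSet(loop, "param"),
where loop is the IterativeDataSet, so the broadcast input depends on the
iteration and gets re-sent in every superstep.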

Greetings,
Stephan

