Posted to user@flink.apache.org by Maximilian Alber <al...@gmail.com> on 2014/12/11 12:15:07 UTC
Understanding the behavior
Hi Flinksters,
after mapping a data set, its only value seems to disappear. I cannot
explain this behavior. Maybe someone can help me?
In this code I have 4 versions. The first two do basically nothing, but
confirm that there is actually a value inside the data set.
Version 3 maps the vector to a new vector, but the result set is empty.
Version 4 behaves the same.
What I would like to achieve is version 3, i.e. changing the id value of the
vector. But somehow the vector disappears and the result is always an empty
set.
val startWidth = env
  .fromCollection[Vector](Seq(Vector.ones(config.dimensions) * config.startWidth))
  .map { x => new Vector(0, x.values) }
val startUpdate = env
  .fromCollection[Vector](Seq(Vector.ones(config.dimensions) * 0.01F))
  .map { x => new Vector(1, x.values) }
val startLastGradient = env
  .fromCollection[Vector](Seq(Vector.zeros(config.dimensions)))
  .map { x => new Vector(2, x.values) }

var stepSet = startWidth union startUpdate union startLastGradient
stepSet = stepSet.iterate(1) {
  stepSet =>
    // version 1
    val width = stepSet filter { _.id == 0 } // works
    // version 2
    val width = stepSet filter { _.id == 0 } map { x => x } // works
    // version 3
    val width = stepSet filter { _.id == 0 } map { x => new Vector(-1, x.values) } // does not work
    // version 4
    val width = stepSet filter { _.id == 0 } map { x: Vector => new Vector(23, Array(1.0F, 2.0F)) } // does not work
    width
}
I have attached the jar, source code, and input files.
The program writes the width data set to out_file.
You can change the code "versions" starting at line 353.
You can call the program as follows (update the jar, in_file, and
random_file paths, and set out_file as you like):
flink run the_jar_file '-c', 'bumpboost.Job', 'in_file=/tmp/tmpdW3O98',
'out_file=/tmp/tmp2RISRF', 'random_file=/tmp/tmpEN9XU7', 'dimensions=1',
'N=100', 'iterations=30', 'multi_bump_boost=1',
'gradient_descent_iterations=30', 'cache=False', 'start_width=1.0',
'min_width=-4', 'max_width=6', 'min_width_update=1e-08',
'max_width_update=10'
Thank you!
Cheers,
Max
Re: Understanding the behavior
Posted by Maximilian Alber <al...@gmail.com>.
Hi!
Damn, I thought I had set the iteration count to 1, but I did it in the wrong
place. My bad.
Thanks!!
Cheers,
max
On Thu, Dec 11, 2014 at 12:41 PM, Aljoscha Krettek <al...@apache.org>
wrote:
> Hi,
> the reason is that you filter out the items. In your first iteration,
> a new element is created whose first field is -1 (version 3) or 23
> (version 4). In the second iteration, you filter out all elements
> that do not have a "0" as the first field.
> Thus you arrive at an empty set.
>
> Cheers,
> Aljoscha
Re: Understanding the behavior
Posted by Aljoscha Krettek <al...@apache.org>.
Hi,
the reason is that you filter out the items. In your first iteration,
a new element is created whose first field is -1 (version 3) or 23
(version 4). In the second iteration, you filter out all elements
that do not have a "0" as the first field.
Thus you arrive at an empty set.
Cheers,
Aljoscha
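
The effect described in this reply can be reproduced without a cluster. The following is only a rough sketch in plain Scala, not Flink code: a Seq stands in for the DataSet, a fold stands in for the bulk iteration, and the names Vec, IterationDemo, iterate, keepId, and changeId are hypothetical and do not come from the original job. It assumes each pass receives the output of the previous pass, which is the behavior the reply describes.

```scala
// Hypothetical stand-in for the thread's Vector class (not the original type).
final case class Vec(id: Int, values: Seq[Float])

object IterationDemo {
  // Apply `step` n times, feeding each result into the next pass.
  // This imitates how a bulk iteration chains its step function.
  def iterate(initial: Seq[Vec], n: Int)(step: Seq[Vec] => Seq[Vec]): Seq[Vec] =
    (1 to n).foldLeft(initial)((current, _) => step(current))

  // Three elements tagged 0, 1, 2, like the union of the three start sets.
  val start: Seq[Vec] = Seq(
    Vec(0, Seq(1.0f)),  // width
    Vec(1, Seq(0.01f)), // update
    Vec(2, Seq(0.0f))   // last gradient
  )

  // Like version 2: filter on id 0 and keep the element unchanged.
  val keepId: Seq[Vec] => Seq[Vec] =
    _.filter(_.id == 0).map(x => x)

  // Like version 3: filter on id 0, then rewrite the id to -1.
  val changeId: Seq[Vec] => Seq[Vec] =
    _.filter(_.id == 0).map(x => Vec(-1, x.values))

  def main(args: Array[String]): Unit = {
    println(iterate(start, 1)(keepId).size)   // id stays 0, element survives
    println(iterate(start, 2)(keepId).size)   // still matches the filter
    println(iterate(start, 1)(changeId).size) // one pass: element survives with id -1
    println(iterate(start, 2)(changeId).size) // second pass filters out id -1
  }
}
```

With a single pass the rewritten element is still present. From the second pass on, the filter on id 0 no longer matches it, so the set becomes empty and stays empty for every later pass.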