You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@flink.apache.org by Lydia Ickler <ic...@googlemail.com> on 2016/03/17 14:06:25 UTC

Help with DeltaIteration

Hi,
I have a question regarding the Delta Iteration. 
I basically want to iterate as long as the former and the new calculated set are different. Stop if they are the same.

Right now I get a result set that has entries with duplicate „row“ indices which should not be the case.
I guess I am doing something wrong in the iteration.closeWith(intermediate, diffs); Maybe I am sending only parts of the set but for the Multiplication (ProjectJoinResultMapper()) I need the whole DataSet.
Could somebody please hint me in the right direction?

Thanks in advance!

This is what I have right now:
DataSet<Tuple3<Integer, Integer, Double>> initial = matrixA.groupBy(0).sum();

//normalize by maximum value
initial = initial.cross(initial.sum(2)).map(new normalizeByMax());
DeltaIteration<Tuple3<Integer, Integer, Double>,Tuple3<Integer, Integer, Double>> iteration = initial.iterateDelta(initial, 1, 0,1);

DataSet<Tuple3<Integer, Integer, Double>> intermediate = matrixA.join(iteration.getWorkset()).where(1).equalTo(0)
        .map(new ProjectJoinResultMapper()).groupBy(0, 1).sum(2).groupBy(0).sum(2).cross(matrixA.join(iteration.getWorkset()).where(1).equalTo(0)
                .map(new ProjectJoinResultMapper()).groupBy(0, 1).sum(2).groupBy(0).sum(2).max(2)).map(new normalizeByMax());
DataSet<Tuple3<Integer, Integer, Double>> diffs = intermediate.join(iteration.getSolutionSet()).where(0,1).equalTo(0,1).with(new ComponentIdFilter());
DataSet<Tuple3<Integer, Integer, Double>> result = iteration.closeWith(intermediate, diffs);
public static final class ComponentIdFilter implements FlatJoinFunction<Tuple3<Integer, Integer, Double>,Tuple3<Integer, Integer, Double>,Tuple3<Integer, Integer, Double>> {

    public void join(Tuple3<Integer, Integer, Double> candidate, Tuple3<Integer, Integer, Double> old, Collector<Tuple3<Integer, Integer, Double>> out) {

        if(!candidate.f2.equals(old.f2)){
            out.collect(candidate);
        }
    }
}


Re: Help with DeltaIteration

Posted by Balaji Rajagopalan <ba...@olacabs.com>.
The easier way to debug this would be have prints in the
projectjoinresultmapper and see what data you are getting. It is possible
your original dataset has duplicate rows ?

On Thu, Mar 17, 2016 at 6:36 PM, Lydia Ickler <ic...@googlemail.com>
wrote:

> Hi,
> I have a question regarding the Delta Iteration.
> I basically want to iterate as long as the former and the new calculated
> set are different. Stop if they are the same.
>
> Right now I get a result set that has entries with duplicate „row“ indices
> which should not be the case.
> I guess I am doing something wrong in the
> iteration.closeWith(intermediate, diffs); Maybe I am sending only parts
> of the set but for the Multiplication (ProjectJoinResultMapper()) I need
> the whole DataSet.
> Could somebody please hint me in the right direction?
>
> Thanks in advance!
>
> This is what I have right now:
>
> DataSet<Tuple3<Integer, Integer, Double>> initial = matrixA.groupBy(0).sum();
>
> //normalize by maximum value
> initial = initial.cross(initial.sum(2)).map(new normalizeByMax());
>
> DeltaIteration<Tuple3<Integer, Integer, Double>,Tuple3<Integer, Integer, Double>> iteration = initial.iterateDelta(initial, 1, 0,1);
>
> DataSet<Tuple3<Integer, Integer, Double>> intermediate = matrixA.join(iteration.getWorkset()).where(1).equalTo(0)
>         .map(new ProjectJoinResultMapper()).groupBy(0, 1).sum(2).groupBy(0).sum(2).cross(matrixA.join(iteration.getWorkset()).where(1).equalTo(0)
>                 .map(new ProjectJoinResultMapper()).groupBy(0, 1).sum(2).groupBy(0).sum(2).max(2)).map(new normalizeByMax());
> DataSet<Tuple3<Integer, Integer, Double>> diffs = intermediate.join(iteration.getSolutionSet()).where(0,1).equalTo(0,1).with(new ComponentIdFilter());
> DataSet<Tuple3<Integer, Integer, Double>> result = iteration.closeWith(intermediate, diffs);
>
> public static final class ComponentIdFilter implements FlatJoinFunction<Tuple3<Integer, Integer, Double>,Tuple3<Integer, Integer, Double>,Tuple3<Integer, Integer, Double>> {
>
>     public void join(Tuple3<Integer, Integer, Double> candidate, Tuple3<Integer, Integer, Double> old, Collector<Tuple3<Integer, Integer, Double>> out) {
>
>         if(!candidate.f2.equals(old.f2)){
>             out.collect(candidate);
>         }
>     }
> }
>
>
>