Posted to user@flink.apache.org by Riccardo Diomedi <ri...@gmail.com> on 2016/04/28 16:52:28 UTC

aggregation problem

Hi everybody

In a DeltaIteration I have a DataSet<Tuple3<K, V, HashSet<K>>> where, at a certain point of the iteration, I need to count the total number of tuples and the total number of distinct elements across the HashSets of all tuples, and then send both values to the ConvergenceCriterion function.

Example:

this is the content of my DataSet:
(1,2,[2,3])
(2,1,[3,4])
(3,2,[4,5])

I should get:
first count: 3 (1,2,3)
second count: 4 (2,3,4,5)
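
To make the two counts concrete, here is a minimal plain-Java sketch on the example data above. It deliberately does not use the Flink API and says nothing about how to wire this into a DeltaIteration or an aggregator. The class CountSketch, the Row type (standing in for Tuple3<K, V, HashSet<K>> with integer keys) and the method names are all hypothetical, chosen only for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class CountSketch {

    // Hypothetical stand-in for Tuple3<K, V, HashSet<K>> with K = V = Integer.
    static final class Row {
        final int key;
        final int value;
        final Set<Integer> elements;

        Row(int key, int value, Integer... elements) {
            this.key = key;
            this.value = value;
            this.elements = new HashSet<>(Arrays.asList(elements));
        }
    }

    // The three tuples from the example: (1,2,[2,3]), (2,1,[3,4]), (3,2,[4,5]).
    static List<Row> exampleRows() {
        return Arrays.asList(
                new Row(1, 2, 2, 3),
                new Row(2, 1, 3, 4),
                new Row(3, 2, 4, 5));
    }

    // First count: the number of tuples.
    static long countTuples(List<Row> rows) {
        return rows.size();
    }

    // Second count: the number of distinct elements over all HashSets,
    // i.e. the size of the union of the sets.
    static long countDistinctSetElements(List<Row> rows) {
        Set<Integer> union = new HashSet<>();
        for (Row row : rows) {
            union.addAll(row.elements);
        }
        return union.size();
    }

    public static void main(String[] args) {
        List<Row> rows = exampleRows();
        System.out.println("first count: " + countTuples(rows));               // first count: 3
        System.out.println("second count: " + countDistinctSetElements(rows)); // second count: 4
    }
}
```

Note that the second count is a distinct count (a set union), not a sum of set sizes: summing the sizes would give 6 here, not 4.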

I tried running the DataSet through a flatMap and using an aggregator there, putting a HashSet into the aggregator, but it didn't work!

Do you have any suggestions?

thanks 

Riccardo

Re: aggregation problem

Posted by Vasiliki Kalavri <va...@gmail.com>.
Hi Riccardo,

can you please be a bit more specific? What do you mean by "it didn't
work"? Did it crash? Did it give you a wrong value? Something else?

-Vasia.

On 28 April 2016 at 16:52, Riccardo Diomedi <ri...@gmail.com>
wrote:

> Hi everybody
>
> In a DeltaIteration I have a DataSet<Tuple3<K, V, HashSet<K>>> where, at a
> certain point of the iteration, i need to count the total number of tuples
> and the total number of elements in the HashSet of each tuple, and then
> send both value to the ConvergenceCriterion function.
>
> Example:
>
> this is the content of my DataSet:
> (*1*,2,*[2,3]*)
> (*2*,1,*[3,4]*)
> (*3*,2,*[4,5]*)
>
> i should have:
> first count: *3* (1,2,3)
> second count: *4* (2,3,4,5)
>
> i tried to iterate the dataset through a flatMap and exploit so an
> aggregator, putting an HashSet into it(Aggregator), but it didn’t work!
>
> Do you have any suggestion??
>
> thanks
>
> Riccardo
>