You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@flink.apache.org by subash basnet <ya...@gmail.com> on 2016/05/03 12:35:59 UTC
How to perform Broadcast and groupBy in DataStream like DataSet
Hello all,
How could we perform *withBroadcastSet* and *groupBy* in DataStream like
that of DataSet in the below KMeans code:
DataSet<Centroid> newCentroids = points
.map(new SelectNearestCenter()).*withBroadcastSet*(loop, "centroids")
.map(new CountAppender()).*groupBy*(0).reduce(new CentroidAccumulator())
.map(new CentroidAverager());
DataStream<Centroid> newCentroids = points.map(new
SelectNearestCenter()).???
Best Regards,
Subash Basnet
Re: How to perform Broadcast and groupBy in DataStream like DataSet
Posted by subash basnet <ya...@gmail.com>.
Hello Stefano,
Thank you, I found out that just sometime ago that I could use keyBy, but I
couldn't find how to set and getBroadcastVariable in datastream like that
of dataset.
For example in below code we get collection of *centroids* via broadcast.
Eg: In KMeans.java
class X extends MapFunctions<>{
private Collection<Centroid> *centroids*;
public void open(Configuration parameters) throws Exception {
this.*centroids* = getRuntimeContext().getBroadcastVariable("centroids");
}
for (Centroid cent : *centroids*) {
}
}
Best Regards,
Subash Basnet
On Tue, May 3, 2016 at 4:04 PM, Stefano Baghino <
stefano.baghino@radicalbit.io> wrote:
> I'm not sure in regards of "withBroadcastSet", but in the DataStream you
> "keyBy" instead of "groupBy".
>
> On Tue, May 3, 2016 at 12:35 PM, subash basnet <ya...@gmail.com> wrote:
>
>> Hello all,
>>
>> How could we perform *withBroadcastSet* and *groupBy* in DataStream like
>> that of DataSet in the below KMeans code:
>>
>> DataSet<Centroid> newCentroids = points
>> .map(new SelectNearestCenter()).*withBroadcastSet*(loop, "centroids")
>> .map(new CountAppender()).*groupBy*(0).reduce(new CentroidAccumulator())
>> .map(new CentroidAverager());
>>
>>
>> DataStream<Centroid> newCentroids = points.map(new
>> SelectNearestCenter()).???
>>
>>
>> Best Regards,
>> Subash Basnet
>>
>
>
>
> --
> BR,
> Stefano Baghino
>
> Software Engineer @ Radicalbit
>
Re: How to perform Broadcast and groupBy in DataStream like DataSet
Posted by Stefano Baghino <st...@radicalbit.io>.
I'm not sure in regards of "withBroadcastSet", but in the DataStream you
"keyBy" instead of "groupBy".
On Tue, May 3, 2016 at 12:35 PM, subash basnet <ya...@gmail.com> wrote:
> Hello all,
>
> How could we perform *withBroadcastSet* and *groupBy* in DataStream like
> that of DataSet in the below KMeans code:
>
> DataSet<Centroid> newCentroids = points
> .map(new SelectNearestCenter()).*withBroadcastSet*(loop, "centroids")
> .map(new CountAppender()).*groupBy*(0).reduce(new CentroidAccumulator())
> .map(new CentroidAverager());
>
>
> DataStream<Centroid> newCentroids = points.map(new
> SelectNearestCenter()).???
>
>
> Best Regards,
> Subash Basnet
>
--
BR,
Stefano Baghino
Software Engineer @ Radicalbit