Posted to user@hama.apache.org by Apurv Verma <da...@gmail.com> on 2012/07/04 22:13:42 UTC

Expressing MapReduce with BSP

Hello,
 Here is a simplistic WordCount example I wrote with Hama. A few TODOs
remain, but it works fine, and it will be fully scalable once all the TODOs
are complete.

http://code.google.com/p/anahad/source/browse/trunk/src/main/java/org/anahata/bsp/WordCount.java

Comments welcome :)

--
thanks and regards,

Apurv Verma
India

Re: Expressing MapReduce with BSP

Posted by Thomas Jungblut <th...@gmail.com>.
Hi Apurv,

Cool implementation. It also solves a problem with the usual WordCount
example, which emits every word with frequency 1 and so incurs a large
communication overhead between the map and reduce stages.
I would use a Guava Multiset instead of the Java HashMap, because it has
the handy count and auto-increment features.
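A minimal sketch of the local counting idea, using only the JDK so that it
runs without extra dependencies. The class and method names are made up for
illustration and are not taken from the linked WordCount.java. Map.merge
gives the same count-and-increment behaviour that Guava's Multiset offers
through add and count.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: aggregates words locally so that each distinct word
// is sent once with its count, instead of once per occurrence with count 1.
final class LocalWordCount {

    /** Returns one (word, count) entry per distinct word in the given lines. */
    static Map<String, Integer> countWords(Iterable<String> lines) {
        Map<String, Integer> counts = new HashMap<>();
        for (String line : lines) {
            for (String token : line.trim().split("\\s+")) {
                if (token.isEmpty()) {
                    continue; // blank line
                }
                // Insert 1 for a new word, otherwise add 1 to the old count.
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(countWords(Arrays.asList("a b a", "b a")));
    }
}
```

With this kind of pre-aggregation, the number of messages a peer sends is
bounded by the number of distinct words in its split rather than by the
total number of words.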

Why take on the overhead of merging and sorting yourself? You could use the
sorted message queue in Hama 0.5.0. It isn't disk-based, so you won't get
the scalability you are aiming for, but it would drastically reduce the
complexity of your code.
If you are working on it anyway, you could create a disk-based sorted
queue that merges the messages implicitly.
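To make the "merge implicitly" idea concrete, here is a rough sketch of a
k-way merge over sorted runs of (word, count) messages that sums the counts
of equal words while merging. It is not Hama's sorted message queue and does
not use its API. The names SortedRunMerger, msg and merge are hypothetical,
and in-memory lists stand in for the sorted runs that a disk-based queue
would read from spill files.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

final class SortedRunMerger {

    /** The current head of one sorted run plus an iterator over its rest. */
    private static final class Cursor {
        Map.Entry<String, Integer> head;
        final Iterator<Map.Entry<String, Integer>> rest;

        Cursor(Iterator<Map.Entry<String, Integer>> it) {
            this.rest = it;
            this.head = it.next();
        }
    }

    /** Convenience factory for a (word, count) message. */
    static Map.Entry<String, Integer> msg(String word, int count) {
        return new SimpleEntry<String, Integer>(word, count);
    }

    /** Merges runs that are each sorted by word; equal words are summed. */
    static List<Map.Entry<String, Integer>> merge(
            List<List<Map.Entry<String, Integer>>> runs) {
        // The heap always exposes the run whose head word is smallest.
        PriorityQueue<Cursor> heap = new PriorityQueue<>(
                (a, b) -> a.head.getKey().compareTo(b.head.getKey()));
        for (List<Map.Entry<String, Integer>> run : runs) {
            if (!run.isEmpty()) {
                heap.add(new Cursor(run.iterator()));
            }
        }
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        while (!heap.isEmpty()) {
            Cursor c = heap.poll();
            int last = out.size() - 1;
            if (last >= 0 && out.get(last).getKey().equals(c.head.getKey())) {
                // Same word as the previous output entry: fold the count in.
                Map.Entry<String, Integer> prev = out.get(last);
                prev.setValue(prev.getValue() + c.head.getValue());
            } else {
                out.add(msg(c.head.getKey(), c.head.getValue()));
            }
            if (c.rest.hasNext()) {
                c.head = c.rest.next();
                heap.add(c); // re-insert the run with its new head
            }
        }
        return out;
    }
}
```

Because every run is already sorted, the merge only needs one head entry per
run in memory at a time, which is what would make a disk-backed variant
scale beyond main memory.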

2012/7/5 Praveen Sripati <pr...@gmail.com>

> Apurv,
>
> Not sure if you have seen this paper, but it concludes that effectively
> all MR jobs can be expressed as BSP jobs and vice versa. It also
> discusses when to choose BSP over MR.
>
> http://arxiv.org/abs/1203.2081
>
> Thanks,
> Praveen

Re: Expressing MapReduce with BSP

Posted by Praveen Sripati <pr...@gmail.com>.
Apurv,

Not sure if you have seen this paper, but it concludes that effectively
all MR jobs can be expressed as BSP jobs and vice versa. It also
discusses when to choose BSP over MR.

http://arxiv.org/abs/1203.2081

Thanks,
Praveen
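As a rough illustration of how a MapReduce-style word count can be laid out
as BSP supersteps, here is a single-threaded simulation in plain Java. It is
only a sketch: it does not use Hama's API, it is not the construction from
the paper, and the names (BspWordCountSketch, run, the hash-based choice of
destination peer) are assumptions made for this example. Superstep 1 plays
the role of map plus combine, the barrier plays the role of the shuffle, and
superstep 2 plays the role of reduce.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class BspWordCountSketch {

    /** splits.get(p) is the input of peer p; returns each peer's totals. */
    static List<Map<String, Integer>> run(List<List<String>> splits) {
        int numPeers = splits.size();
        // inboxes.get(p) collects the messages addressed to peer p.
        List<List<Map.Entry<String, Integer>>> inboxes = new ArrayList<>();
        for (int p = 0; p < numPeers; p++) {
            inboxes.add(new ArrayList<Map.Entry<String, Integer>>());
        }

        // Superstep 1: count locally, then send each (word, count) pair to
        // the peer that owns the word.
        for (int p = 0; p < numPeers; p++) {
            Map<String, Integer> local = new HashMap<>();
            for (String line : splits.get(p)) {
                for (String token : line.trim().split("\\s+")) {
                    if (!token.isEmpty()) {
                        local.merge(token, 1, Integer::sum);
                    }
                }
            }
            for (Map.Entry<String, Integer> e : local.entrySet()) {
                int owner = Math.floorMod(e.getKey().hashCode(), numPeers);
                inboxes.get(owner).add(
                        new SimpleImmutableEntry<String, Integer>(
                                e.getKey(), e.getValue()));
            }
        }

        // Barrier: in this simulation, all messages have been delivered once
        // the loop above has finished for every peer.

        // Superstep 2: every peer sums the partial counts it received.
        List<Map<String, Integer>> results = new ArrayList<>();
        for (int p = 0; p < numPeers; p++) {
            Map<String, Integer> totals = new TreeMap<>();
            for (Map.Entry<String, Integer> m : inboxes.get(p)) {
                totals.merge(m.getKey(), m.getValue(), Integer::sum);
            }
            results.add(totals);
        }
        return results;
    }
}
```

Each word ends up on exactly one peer, so the union of the per-peer maps is
the global word count.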


Re: Expressing MapReduce with BSP

Posted by Tommaso Teofili <to...@gmail.com>.
Hi Apurv,

I just had a quick look. It looks like a basic algorithm, but (IMHO) it
would still make sense to add it to our examples, as it would probably be
the only one using IO.
My 2 cents,

Tommaso
