Posted to mapreduce-user@hadoop.apache.org by Keren Ouaknine <ke...@gmail.com> on 2014/08/30 03:52:00 UTC

map-side and reduce-side join implementations

Hi,

I am looking for implementations of a simple two-way join based on the
new API (mapreduce), both map-side and reduce-side. One of my joins has
two large datasets and the other has a large and a small dataset, so they
would map to a reduce-side join and a map-side join respectively.

I started with the Definitive Guide book, but it doesn't have a map-side
example (just an explanation of the algorithm). I looked on the web and
found examples of three-way joins only.

Any pointers?

Thanks,
Keren

-- 
Keren Ouaknine
www.kereno.com

Re: map-side and reduce-side join implementations

Posted by Pedro Magalhaes <pe...@gmail.com>.
Keren,

The map-side join can be implemented using CompositeInputFormat or
DistributedCache.
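
To make the DistributedCache variant concrete, here is a minimal, Hadoop-free sketch of the idea, assuming the small dataset fits in memory and has unique keys. Plain lists stand in for the job input, and the class and method names (MapSideJoinSketch, setup, map) are hypothetical. In a real job the lookup table would typically be filled in the mapper's setup step from a file shipped with the job, and the join would then need no reduce phase.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hadoop-free sketch of a map-side (replicated) join. All names are hypothetical. */
public class MapSideJoinSketch {

    // Small dataset, held fully in memory. Assumption: it fits in the
    // task's heap and each key appears at most once.
    private final Map<String, String> smallTable = new HashMap<>();

    /** Loads "key,value" lines of the small dataset into the lookup table. */
    public void setup(List<String> smallLines) {
        for (String line : smallLines) {
            String[] parts = line.split(",", 2);
            if (parts.length == 2) {
                smallTable.put(parts[0], parts[1]);
            }
        }
    }

    /**
     * Stands in for map(): joins one "key,value" record of the large dataset
     * against the in-memory table. Returns null when there is no match
     * (inner-join semantics).
     */
    public String map(String largeLine) {
        String[] parts = largeLine.split(",", 2);
        if (parts.length < 2) {
            return null;
        }
        String match = smallTable.get(parts[0]);
        return match == null ? null : parts[0] + "," + parts[1] + "," + match;
    }

    public static void main(String[] args) {
        MapSideJoinSketch mapper = new MapSideJoinSketch();
        mapper.setup(Arrays.asList("1,US", "2,FR"));
        for (String line : Arrays.asList("1,alice", "2,bob", "3,carol")) {
            String joined = mapper.map(line);
            if (joined != null) {
                System.out.println(joined);
            }
        }
    }
}
```

The CompositeInputFormat route works differently: it merges inputs that are already sorted and partitioned the same way, so it suits cases where neither side fits in memory but both can be prepared in advance.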

If you search for these two terms, you will find some implementations.
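
For the other case in the question (two large datasets), a reduce-side join is the usual fit. The following is a rough, Hadoop-free sketch of that technique, not a definitive implementation: each record is tagged with its source, records are grouped by key (which the shuffle would do in a real job), and each group is split by tag and cross-joined. All names (ReduceSideJoinSketch, mapAndShuffle, reduce, join) and the "L"/"R" tags are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hadoop-free sketch of a reduce-side join. All names are hypothetical. */
public class ReduceSideJoinSketch {

    /** Stands in for the two mappers plus the shuffle: tag by source, group by key. */
    static Map<String, List<String>> mapAndShuffle(List<String> left, List<String> right) {
        Map<String, List<String>> grouped = new TreeMap<>();
        tagInto(grouped, left, "L");
        tagInto(grouped, right, "R");
        return grouped;
    }

    private static void tagInto(Map<String, List<String>> grouped, List<String> lines, String tag) {
        for (String line : lines) {
            String[] parts = line.split(",", 2);
            if (parts.length == 2) {
                grouped.computeIfAbsent(parts[0], k -> new ArrayList<>()).add(tag + ":" + parts[1]);
            }
        }
    }

    /** Stands in for reduce(): split one key's values by tag, emit the cross product. */
    static List<String> reduce(String key, List<String> taggedValues) {
        List<String> leftVals = new ArrayList<>();
        List<String> rightVals = new ArrayList<>();
        for (String v : taggedValues) {
            if (v.startsWith("L:")) {
                leftVals.add(v.substring(2));
            } else {
                rightVals.add(v.substring(2));
            }
        }
        List<String> out = new ArrayList<>();
        for (String l : leftVals) {
            for (String r : rightVals) {
                out.add(key + "," + l + "," + r);
            }
        }
        return out;
    }

    /** Runs the whole simulated job and returns the joined records in key order. */
    static List<String> join(List<String> left, List<String> right) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, List<String>> e : mapAndShuffle(left, right).entrySet()) {
            out.addAll(reduce(e.getKey(), e.getValue()));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> users = Arrays.asList("1,alice", "2,bob", "3,carol");
        List<String> orders = Arrays.asList("1,order-a", "1,order-b", "3,order-c");
        for (String row : join(users, orders)) {
            System.out.println(row);
        }
    }
}
```

This sketch buffers both sides of each key in memory inside reduce(). With heavily skewed keys, a real job would more likely use a composite key and a secondary sort so that one side arrives first and only that side is buffered.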

Hope that helps.

