Posted to common-user@hadoop.apache.org by Fatih Haltas <fa...@nyu.edu> on 2013/03/20 13:40:05 UTC

Combiner in Secondary Sort

Hi Everyone,

I am trying to implement the secondary sort algorithm on my data, but I am
having trouble with my Combiner.

When I do not use a Combiner, grouping works well: one reduce call runs for
each group of pairs sharing the same first element.

However, when I set the Reducer class itself as the Combiner, grouping in
the Combiner does not follow my custom GroupingComparator class.

How can I override the Combiner's grouping?
Or is there a diagram showing the workflow in more detail (the sequence of
mapper calls and combiner calls in the map phase, and so on)?

Thanks very much.

Re: Combiner in Secondary Sort

Posted by Fatih Haltas <fa...@nyu.edu>.
Thanks very much, Harsh.


On Thu, Mar 21, 2013 at 4:50 AM, Harsh J <ha...@cloudera.com> wrote:

> You're probably running into
> https://issues.apache.org/jira/browse/MAPREDUCE-3310. I recall there was
> also a discussion on this, but I cannot find the archive now. Chris
> mentions it in passing at http://search-hadoop.com/m/RH5AP11ob2o1.
>
> On Wed, Mar 20, 2013 at 6:10 PM, Fatih Haltas <fa...@nyu.edu> wrote:
> > Hi Everyone,
> >
> > I am trying to implement the secondary sort algorithm on my data, but I
> > am having trouble with my Combiner.
> >
> > When I do not use a Combiner, grouping works well: one reduce call runs
> > for each group of pairs sharing the same first element.
> >
> > However, when I set the Reducer class itself as the Combiner, grouping
> > in the Combiner does not follow my custom GroupingComparator class.
> >
> > How can I override the Combiner's grouping?
> > Or is there a diagram showing the workflow in more detail (the sequence
> > of mapper calls and combiner calls in the map phase, and so on)?
> >
> > Thanks very much.
>
>
>
> --
> Harsh J
>

Re: Combiner in Secondary Sort

Posted by Harsh J <ha...@cloudera.com>.
You're probably running into
https://issues.apache.org/jira/browse/MAPREDUCE-3310. I recall there was
also a discussion on this, but I cannot find the archive now. Chris
mentions it in passing at http://search-hadoop.com/m/RH5AP11ob2o1.
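
To illustrate the symptom in the question, here is a minimal plain-Java
sketch with no Hadoop dependency. It assumes the Combiner groups keys with
the sort comparator rather than with the job's grouping comparator, and it
only simulates that grouping step. The names GroupingDemo, Pair, SORT,
GROUP_BY_FIRST and group are hypothetical, made up for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class GroupingDemo {
    /** Composite key: the natural key (first) plus the secondary-sort field (second). */
    static final class Pair {
        final String first;
        final int second;

        Pair(String first, int second) {
            this.first = first;
            this.second = second;
        }
    }

    // Sort order of the composite key: by first, then by second.
    static final Comparator<Pair> SORT =
            Comparator.comparing((Pair p) -> p.first).thenComparingInt(p -> p.second);

    // Grouping order a secondary sort wants on the reduce side: by first only.
    static final Comparator<Pair> GROUP_BY_FIRST = Comparator.comparing((Pair p) -> p.first);

    /** Splits already-sorted keys into runs of keys that compare equal under grouping. */
    static List<List<Pair>> group(List<Pair> sortedKeys, Comparator<Pair> grouping) {
        List<List<Pair>> groups = new ArrayList<>();
        for (Pair key : sortedKeys) {
            // Start a new group when the key differs from the current group's first key.
            if (groups.isEmpty()
                    || grouping.compare(groups.get(groups.size() - 1).get(0), key) != 0) {
                groups.add(new ArrayList<>());
            }
            groups.get(groups.size() - 1).add(key);
        }
        return groups;
    }

    public static void main(String[] args) {
        List<Pair> keys = new ArrayList<>(Arrays.asList(
                new Pair("a", 2), new Pair("b", 1), new Pair("a", 1), new Pair("b", 3)));
        keys.sort(SORT);
        // Simulated combiner view (assumption: grouping falls back to the sort comparator).
        System.out.println("groups by sort comparator: " + group(keys, SORT).size());
        // Simulated reducer view with a custom grouping comparator.
        System.out.println("groups by first element:   " + group(keys, GROUP_BY_FIRST).size());
    }
}
```

With the sort comparator, every distinct (first, second) pair becomes its
own group, which matches what you describe seeing in the Combiner. Grouping
on the first element alone gives one group per first element, which is what
your GroupingComparator does on the reduce side. As far as I know, later
Hadoop releases expose Job.setCombinerKeyGroupingComparatorClass to set the
Combiner's grouping separately; please check the Javadoc for your version
before relying on it.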

On Wed, Mar 20, 2013 at 6:10 PM, Fatih Haltas <fa...@nyu.edu> wrote:
> Hi Everyone,
>
> I am trying to implement the secondary sort algorithm on my data, but I am
> having trouble with my Combiner.
>
> When I do not use a Combiner, grouping works well: one reduce call runs for
> each group of pairs sharing the same first element.
>
> However, when I set the Reducer class itself as the Combiner, grouping in
> the Combiner does not follow my custom GroupingComparator class.
>
> How can I override the Combiner's grouping?
> Or is there a diagram showing the workflow in more detail (the sequence of
> mapper calls and combiner calls in the map phase, and so on)?
>
> Thanks very much.



--
Harsh J
