Posted to common-user@hadoop.apache.org by Feng Jiang <fe...@gmail.com> on 2006/12/20 07:07:10 UTC

Re: Some new requests about mapreduce

I saw that this bug has been fixed in 0.90, but I still think it can be improved.
Can I just attach a new patch under the same issue?

thanks,

Feng

On 11/8/06, Feng Jiang <fe...@gmail.com> wrote:
>
> Thanks. I have attached the patch:
> http://issues.apache.org/jira/secure/ManageAttachments.jspa?id=12354993
>
> Best regards,
> Feng
>
> On 11/8/06, Doug Cutting <cu...@apache.org> wrote:
> >
> > Feng Jiang wrote:
> > > I think what I am concerned with is different from the request485. I mean,
> > > if the input of the Reduce phase is:
> > >
> > > K2, V3
> > > K2, V2
> > > K1, V5
> > > K1, V3
> > > K1, V4
> > >
> > > in the current hadoop, the reduce output could be:
> > > K1, (V5, V3, V4)
> > > K2, (V3, V2)
> > >
> > > But I hope hadoop supports job.setOutputValueComparatorClass(theClass),
> > > so that I can have the values in order, and the output could be:
> > > K1, (V3, V4, V5)
> > > K2, (V2, V3)
> >
> > Yes, that is different.  One can currently achieve what you're after by
> > including values in keys.  The only real difference between keys and
> > values is that values are not used for sorting, and some optimizations
> > are made because of that.  But if you need to sort by value as well as
> > key, then you can use a compound key that includes both, and a null value.
> > Note that with block compression, repeated keys should not use too
> > much space.  Does that suffice?
> >
> > Another related issue: http://issues.apache.org/jira/browse/HADOOP-475
> >
> > > but I have written the GenericWritable, which is an abstract class to
> > > help users wrap different Writable instances at a cost of only one byte.
> > > The GenericObject is a demo showing how to use GenericWritable. Both of
> > > them are attached to this email.
> >
> > The attachment did not make it.  Can you please attach these to a Jira
> > issue, as a patch file?
> >
> > http://wiki.apache.org/lucene-hadoop/HowToContribute
> >
> > Thanks!
> >
> > Doug
> >
>
>
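
The compound-key approach Doug describes in the quoted reply above can be sketched in plain Java, without any Hadoop classes. This is only an illustration under stated assumptions: the class CompoundKeySort, the nested CompoundKey type, and the sortAndGroup method are hypothetical names invented here, and the in-memory sort stands in for the framework's sort by key. The idea shown is that once the value is folded into the key, sorting by the compound key alone yields values in order within each original key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; none of these names come from Hadoop.
public class CompoundKeySort {

    // Compound key: the original key plus the value that should also drive sorting.
    static final class CompoundKey {
        final String key;
        final String value;

        CompoundKey(String key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    // Order by the original key first, then by the embedded value.
    static final Comparator<CompoundKey> ORDER =
        Comparator.comparing((CompoundKey ck) -> ck.key)
                  .thenComparing(ck -> ck.value);

    // Sort by compound key only, then regroup by the original key.
    static Map<String, List<String>> sortAndGroup(List<CompoundKey> input) {
        List<CompoundKey> sorted = new ArrayList<>(input);
        sorted.sort(ORDER);
        Map<String, List<String>> grouped = new LinkedHashMap<>();
        for (CompoundKey ck : sorted) {
            grouped.computeIfAbsent(ck.key, k -> new ArrayList<>()).add(ck.value);
        }
        return grouped;
    }

    public static void main(String[] args) {
        // Same input as the example in the thread.
        List<CompoundKey> input = Arrays.asList(
            new CompoundKey("K2", "V3"),
            new CompoundKey("K2", "V2"),
            new CompoundKey("K1", "V5"),
            new CompoundKey("K1", "V3"),
            new CompoundKey("K1", "V4"));
        System.out.println(sortAndGroup(input));
        // prints {K1=[V3, V4, V5], K2=[V2, V3]}
    }
}
```

In a real job the regrouping step would fall to the reducer, which sees one compound key per call and has to detect when the original key changes.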

Re: Some new requests about mapreduce

Posted by Doug Cutting <cu...@apache.org>.

Feng Jiang wrote:
> I saw that this bug has been fixed in 0.90, but I still think it can be
> improved.
> Can I just attach a new patch under the same issue?

Please create a new issue, linked to the old issue.

Thanks,

Doug
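
For readers without the patch at hand, the "wrap different Writable instances with only one byte cost" idea quoted earlier in the thread can be sketched roughly as follows. This is not the GenericWritable from the patch: the Payload interface is a stand-in for Hadoop's Writable so the example runs without Hadoop, and every name here (TaggedWrapperSketch, IntPayload, TextPayload, writeTagged, readTagged) is hypothetical. The assumption is that each wrapped type gets a one-byte tag written ahead of its own serialized form.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical illustration only; this is not the class from the patch.
public class TaggedWrapperSketch {

    // Stand-in for the Writable contract, so the sketch runs without Hadoop.
    interface Payload {
        void write(DataOutput out) throws IOException;
        void readFields(DataInput in) throws IOException;
    }

    static final class IntPayload implements Payload {
        int value;
        public void write(DataOutput out) throws IOException { out.writeInt(value); }
        public void readFields(DataInput in) throws IOException { value = in.readInt(); }
    }

    static final class TextPayload implements Payload {
        String value = "";
        public void write(DataOutput out) throws IOException { out.writeUTF(value); }
        public void readFields(DataInput in) throws IOException { value = in.readUTF(); }
    }

    // One-byte tags identifying the wrapped type (assumed scheme).
    static final byte INT_TAG = 0;
    static final byte TEXT_TAG = 1;

    // Write the one-byte tag, then let the payload serialize itself.
    static void writeTagged(Payload p, DataOutput out) throws IOException {
        if (p instanceof IntPayload) {
            out.writeByte(INT_TAG);
        } else if (p instanceof TextPayload) {
            out.writeByte(TEXT_TAG);
        } else {
            throw new IllegalArgumentException("unregistered type: " + p.getClass());
        }
        p.write(out);
    }

    // Read the tag, create the matching payload, then let it read its fields.
    static Payload readTagged(DataInput in) throws IOException {
        byte tag = in.readByte();
        Payload p;
        if (tag == INT_TAG) {
            p = new IntPayload();
        } else if (tag == TEXT_TAG) {
            p = new TextPayload();
        } else {
            throw new IOException("unknown type tag: " + tag);
        }
        p.readFields(in);
        return p;
    }

    static byte[] serialize(Payload p) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            writeTagged(p, new DataOutputStream(bytes));
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static Payload deserialize(byte[] data) {
        try {
            return readTagged(new DataInputStream(new ByteArrayInputStream(data)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        IntPayload five = new IntPayload();
        five.value = 5;
        byte[] data = serialize(five);
        // 1 tag byte + 4 bytes for the int
        System.out.println(data.length);
        System.out.println(((IntPayload) deserialize(data)).value);
    }
}
```

The alternative of writing the full class name ahead of each record costs far more than one byte per record, which is presumably the overhead the one-byte tag is meant to avoid.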