Posted to common-user@hadoop.apache.org by Feng Jiang <fe...@gmail.com> on 2006/12/20 07:07:10 UTC
Re: Some new requests about mapreduce
I see this bug has been fixed in 0.90, but I still think it can be improved.
Can I just attach a new patch under the same issue?
Thanks,
Feng
On 11/8/06, Feng Jiang <fe...@gmail.com> wrote:
>
> Thanks. I have attached the patch:
> http://issues.apache.org/jira/secure/ManageAttachments.jspa?id=12354993
>
> Best regards,
> Feng
>
> On 11/8/06, Doug Cutting <cu...@apache.org> wrote:
> >
> > Feng Jiang wrote:
> > > I think my concern is different from request485. I mean,
> > > if the input of the Reduce phase is:
> > >
> > > K2, V3
> > > K2, V2
> > > K1, V5
> > > K1, V3
> > > K1, V4
> > >
> > > in current Hadoop, the reduce output could be:
> > > K1, (V5, V3, V4)
> > > K2, (V3, V2)
> > >
> > > But I hope Hadoop will support job.setOutputValueComparatorClass(theClass),
> > > so that I can keep the values in order, and the output could be:
> > > K1, (V3, V4, V5)
> > > K2, (V2, V3)
> >
> > Yes, that is different. One can currently achieve what you're after by
> > including values in keys. The only real difference between keys and
> > values is that values are not used for sorting, and some optimizations
> > are made because of that. But if you need to sort by value as well as
> > key, then you can use a compound key that includes both, and a null value.
> > Note that with block compression, repeated keys should not use too
> > much space. Does that suffice?
> >
> > Another related issue: http://issues.apache.org/jira/browse/HADOOP-475
> >
> > > but I have written the GenericWritable, which is an abstract class to
> > > help users wrap different Writable instances at a cost of only one byte.
> > > The GenericObject is a demo showing how to use GenericWritable. Both of
> > > them are attached to this email.
> >
> > The attachment did not make it. Can you please attach these to a Jira
> > issue, as a patch file?
> >
> > http://wiki.apache.org/lucene-hadoop/HowToContribute
> >
> > Thanks!
> >
> > Doug
> >
>
>
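The compound-key approach Doug suggests in the quoted reply can be sketched in plain Java. This is only an illustration of the idea, not Hadoop code: CompoundKey, CompoundKeyDemo and sortAndGroup are hypothetical names, the values V2..V5 are modelled as the integers 2..5, and an in-memory sort stands in for the framework's sort phase.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical compound key: the original value is folded into the key
// so that it takes part in sorting. Not part of the Hadoop API.
final class CompoundKey implements Comparable<CompoundKey> {
    final String key;  // original key, e.g. "K1"
    final int value;   // original value, e.g. 3 for V3

    CompoundKey(String key, int value) {
        this.key = key;
        this.value = value;
    }

    @Override
    public int compareTo(CompoundKey other) {
        // Order by the original key first, then by the value.
        int byKey = key.compareTo(other.key);
        return byKey != 0 ? byKey : Integer.compare(value, other.value);
    }
}

final class CompoundKeyDemo {
    // Sorts the compound keys, then groups the values under their original key.
    static Map<String, List<Integer>> sortAndGroup(List<CompoundKey> input) {
        List<CompoundKey> sorted = new ArrayList<>(input);
        Collections.sort(sorted);
        Map<String, List<Integer>> grouped = new LinkedHashMap<>();
        for (CompoundKey ck : sorted) {
            grouped.computeIfAbsent(ck.key, k -> new ArrayList<>()).add(ck.value);
        }
        return grouped;
    }

    public static void main(String[] args) {
        // Same input order as in Feng's example.
        List<CompoundKey> input = Arrays.asList(
            new CompoundKey("K2", 3), new CompoundKey("K2", 2),
            new CompoundKey("K1", 5), new CompoundKey("K1", 3),
            new CompoundKey("K1", 4));
        System.out.println(sortAndGroup(input)); // prints {K1=[3, 4, 5], K2=[2, 3]}
    }
}
```

In a real job the value slot would then be left null, as Doug describes, since everything needed is carried in the key.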
Re: Some new requests about mapreduce
Posted by Doug Cutting <cu...@apache.org>.
Feng Jiang wrote:
> I see this bug has been fixed in 0.90, but I still think it can be
> improved.
> Can I just attach a new patch under the same issue?
Please create a new issue, linked to the old issue.
Thanks,
Doug
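
For readers wondering what "wrap different Writable instances with only one byte cost" could look like, here is a minimal self-contained sketch. It is not the patch attached to the issue: SimpleWritable is a hypothetical stand-in for Hadoop's Writable interface, and TaggedWrapper, IntValue, TextValue, DemoWrapper and TaggedWrapperDemo are made-up names. The assumed idea is that the wrapper writes a one-byte index into a fixed list of types, followed by the wrapped object's own bytes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical stand-in for Hadoop's Writable, so the sketch compiles on its own.
interface SimpleWritable {
    void write(DataOutput out) throws IOException;
    void readFields(DataInput in) throws IOException;
}

final class IntValue implements SimpleWritable {
    int v;
    IntValue() {}
    IntValue(int v) { this.v = v; }
    public void write(DataOutput out) throws IOException { out.writeInt(v); }
    public void readFields(DataInput in) throws IOException { v = in.readInt(); }
}

final class TextValue implements SimpleWritable {
    String s = "";
    TextValue() {}
    TextValue(String s) { this.s = s; }
    public void write(DataOutput out) throws IOException { out.writeUTF(s); }
    public void readFields(DataInput in) throws IOException { s = in.readUTF(); }
}

// Wraps one of a fixed set of types and prefixes the payload with a one-byte type index.
abstract class TaggedWrapper implements SimpleWritable {
    private SimpleWritable instance;

    // Subclasses list the wrappable types; the array index is the tag byte.
    protected abstract Class<?>[] types();

    void set(SimpleWritable w) {
        if (indexOf(w.getClass()) < 0) {
            throw new IllegalArgumentException("unregistered type: " + w.getClass());
        }
        instance = w;
    }

    SimpleWritable get() { return instance; }

    private int indexOf(Class<?> c) {
        Class<?>[] t = types();
        for (int i = 0; i < t.length; i++) {
            if (t[i] == c) return i;
        }
        return -1;
    }

    public void write(DataOutput out) throws IOException {
        out.writeByte(indexOf(instance.getClass())); // the one-byte overhead
        instance.write(out);
    }

    public void readFields(DataInput in) throws IOException {
        int tag = in.readByte();
        try {
            // Create an empty instance of the tagged type, then let it read its own fields.
            instance = (SimpleWritable) types()[tag].getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IOException("cannot instantiate type for tag " + tag, e);
        }
        instance.readFields(in);
    }
}

final class DemoWrapper extends TaggedWrapper {
    protected Class<?>[] types() {
        return new Class<?>[] {IntValue.class, TextValue.class};
    }
}

final class TaggedWrapperDemo {
    static byte[] serialize(SimpleWritable w) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        w.write(new DataOutputStream(buf));
        return buf.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        DemoWrapper w = new DemoWrapper();
        w.set(new IntValue(42));
        byte[] bytes = serialize(w); // 1 tag byte + 4 bytes for the int
        DemoWrapper back = new DemoWrapper();
        back.readFields(new DataInputStream(new ByteArrayInputStream(bytes)));
        System.out.println(bytes.length + " " + ((IntValue) back.get()).v); // prints 5 42
    }
}
```

The alternative of writing the full class name before each record would cost far more than one byte per value, which is presumably the overhead the one-byte tag is meant to avoid.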