Posted to common-user@hadoop.apache.org by leibnitz <se...@gmail.com> on 2011/04/11 11:26:54 UTC

Re: how to sort the output by value in reduce instead of by key?

Can anyone give me some tips?

--
View this message in context: http://lucene.472066.n3.nabble.com/how-to-sort-the-output-by-value-in-reduce-instead-of-by-key-tp2805541p2805922.html
Sent from the Hadoop lucene-users mailing list archive at Nabble.com.

Re: how to sort the output by value in reduce instead of by key?

Posted by leibnitz <se...@gmail.com>.
Thanks, all.
To Josh: I think you are right. I previously tried to use a group key of
field1+ip at the reducer, but it failed (the output was not sorted).
I will check your point :)


Re: how to sort the output by value in reduce instead of by key?

Posted by Josh Patterson <jo...@cloudera.com>.
Leibnitz,
I think you are looking for "secondary sort" in this case, where the
values arrive at the reducer in a defined order rather than simply
grouped by key. Is that the case?

For a look at secondary sort I've got a few blog articles:

http://www.cloudera.com/blog/2011/03/simple-moving-average-secondary-sort-and-mapreduce-part-1/
http://www.cloudera.com/blog/2011/03/simple-moving-average-secondary-sort-and-mapreduce-part-2/
http://www.cloudera.com/blog/2011/04/simple-moving-average-secondary-sort-and-mapreduce-part-3/

and part 3 includes source code on github.com:

https://github.com/jpatanooga/Caduceus
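
As a rough illustration of the idea (not the code from the articles or
the repository above), here is a minimal sketch in plain Python that
imitates what secondary sort does. It assumes each record is a
(natural key, value) pair. The mapper copies the value into a composite
key, the sort orders by the whole composite key, and grouping looks only
at the natural key, so each group's values reach the reducer already
sorted. The function names and sample data are hypothetical, and the
sketch runs in a single process. In a real Hadoop job the same roles are
played by a sort comparator, a grouping comparator, and a partitioner on
the natural key (the partitioner keeps all composite keys for one natural
key on the same reducer).

```python
from itertools import groupby


def map_phase(records):
    """Emit (composite_key, value), copying the value into the key."""
    for natural_key, value in records:
        yield (natural_key, value), value


def shuffle(pairs, descending=False):
    """Stand-in for the framework sort: natural key first, then value."""
    # Two stable sorts: secondary field (the value) first, then natural key.
    by_value = sorted(pairs, key=lambda kv: kv[0][1], reverse=descending)
    return sorted(by_value, key=lambda kv: kv[0][0])


def reduce_phase(sorted_pairs):
    """Group on the natural key only; values keep their sorted order."""
    for natural_key, group in groupby(sorted_pairs, key=lambda kv: kv[0][0]):
        yield natural_key, [value for _, value in group]


def secondary_sort(records, descending=False):
    """Run the three simulated phases and collect the reducer output."""
    return list(reduce_phase(shuffle(map_phase(records), descending)))


if __name__ == "__main__":
    # Hypothetical sample data: (natural key, value) pairs in arbitrary order.
    sample = [("b", 3), ("a", 9), ("b", 1), ("a", 2), ("a", 5)]
    for key, values in secondary_sort(sample):
        print(key, values)
```

Passing descending=True reverses the order of the values inside each
group while the groups themselves stay in ascending key order.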

Hope that helps,

Josh



On Mon, Apr 11, 2011 at 5:26 AM, leibnitz <se...@gmail.com> wrote:
> Can anyone give me some tips?



-- 
Twitter: @jpatanooga
Solution Architect @ Cloudera
hadoop: http://www.cloudera.com
blog: http://jpatterson.floe.tv