Posted to mapreduce-issues@hadoop.apache.org by "Arnab Guin (JIRA)" <ji...@apache.org> on 2013/10/22 01:34:42 UTC

[jira] [Created] (MAPREDUCE-5591) K-ranker

Arnab Guin created MAPREDUCE-5591:
-------------------------------------

             Summary: K-ranker 
                 Key: MAPREDUCE-5591
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5591
             Project: Hadoop Map/Reduce
          Issue Type: Task
          Components: examples
    Affects Versions: 2.2.0
            Reporter: Arnab Guin
         Attachments: k-ranking.tgz

Hi,

I recently wrote some code to find the K largest integers for each group (key).

Given one or more input files containing lines of the following form:

"key",value

where key is a string
      value is any integer

the program prints the top K elements corresponding to each key.

e.g.

"a",1
"b",1
"a",2
"a",5
"b",17
"c",5
"b",6

If K = 2, the program prints

"a" [2,5]
"b" [6,17]
"c" [5]
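
For readers who want to see the idea without a cluster, here is a minimal single-process sketch of the same top-K-per-key logic. It is not the code in the attached k-ranking.tgz and uses no Hadoop APIs; the class name TopKSketch and the method topK are made up for illustration. It assumes each line has the form "key",value and keeps one bounded min-heap per key, which is roughly what a reducer could do for the values of a single key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;
import java.util.TreeMap;

// Hypothetical illustration only; not the MaxKRanker class from the attachment.
class TopKSketch {

    // Parses lines of the form "key",value and returns, for each key,
    // its k largest values in ascending order.
    static Map<String, List<Integer>> topK(List<String> lines, int k) {
        // One min-heap per key; TreeMap keeps the keys sorted for printing.
        Map<String, PriorityQueue<Integer>> heaps = new TreeMap<>();
        for (String line : lines) {
            // Split on the last comma so the quoted key is kept as-is.
            int comma = line.lastIndexOf(',');
            String key = line.substring(0, comma);
            int value = Integer.parseInt(line.substring(comma + 1).trim());

            PriorityQueue<Integer> heap =
                heaps.computeIfAbsent(key, unused -> new PriorityQueue<>());
            heap.add(value);
            if (heap.size() > k) {
                heap.poll(); // drop the smallest so only the k largest remain
            }
        }

        Map<String, List<Integer>> result = new TreeMap<>();
        for (Map.Entry<String, PriorityQueue<Integer>> entry : heaps.entrySet()) {
            List<Integer> values = new ArrayList<>(entry.getValue());
            Collections.sort(values); // heap iteration order is not sorted
            result.put(entry.getKey(), values);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList(
            "\"a\",1", "\"b\",1", "\"a\",2", "\"a\",5",
            "\"b\",17", "\"c\",5", "\"b\",6");
        // Uses K = 2, as in the example above.
        topK(input, 2).forEach(
            (key, values) -> System.out.println(key + " " + values));
    }
}
```

Bounding each heap at K entries keeps memory per key at O(K) regardless of how many values the key has, which is the main reason to prefer it over sorting all values of a key.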

Compile steps:
mvn clean
mvn package javadoc:javadoc

Run steps:

hadoop jar <ranking jar file>  <main class> <K> <input directory> <output directory>
e.g. hadoop jar target/ranking-1.0-SNAPSHOT.jar org.ml.MaxKRanker 5 data/input data/output

I wanted to know if there is a component (examples, maybe) to which this code can be contributed. I am also open to any suggestions for improvements.

Thanks,
Arnab



--
This message was sent by Atlassian JIRA
(v6.1#6144)