Posted to common-dev@hadoop.apache.org by Sandeep Dey <de...@gmail.com> on 2007/12/17 23:19:35 UTC

hadoop pipes with java mapper and c++ reducer

Hi,

Can Hadoop 0.15.1 Pipes use a Java mapper and a C++ reducer?
The API page http://lucene.apache.org/hadoop/docs/r0.15.1/api/org/apache/hadoop/mapred/pipes/package-summary.html
suggests that it can:
"The job may consist of any combination of Java and C++ RecordReaders,
Mappers, Paritioner, Combiner, Reducer, and RecordWriter."


I wrote a small test with a simple map class (for wordcount):

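import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.io.WritableComparable;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

// Word-count mapper: emits (token, 1) for every whitespace-separated token.
public class mapclass extends MapReduceBase implements Mapper {

    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(WritableComparable key, Writable value,
                    OutputCollector output,
                    Reporter reporter) throws IOException {
        String line = ((Text) value).toString();
        StringTokenizer itr = new StringTokenizer(line);
        while (itr.hasMoreTokens()) {
            word.set(itr.nextToken());
            output.collect(word, one);
        }
    }
}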

I compiled it, packed it into a jar file, and used it as the mapper, e.g.:

hadoop pipes -input testdata -output output-dir -jar mapredclasses.jar
-map mapclass -reduce reduceclass

(reduceclass is another Java class, to be used as the reducer.)

Hadoop threw the following exception:
Exception in thread "main" java.lang.ClassNotFoundException: mapclass
        at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:268)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:251)
        at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:319)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:242)
        at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:524)
        at org.apache.hadoop.mapred.pipes.Submitter.getClass(Submitter.java:309)
        at org.apache.hadoop.mapred.pipes.Submitter.main(Submitter.java:365)
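
Reading the trace from the bottom up, the failure happens in Submitter.main, via Configuration.getClassByName and Class.forName, so the name "mapclass" is being resolved in the JVM that submits the job, and that JVM's class loader cannot find it. The following is only a minimal sketch of that kind of lookup in plain Java, not Hadoop code; ClassLookupSketch and canLoad are made-up names for illustration, and it assumes no class called mapclass is on the classpath when it runs.

```java
public class ClassLookupSketch {

    // Returns true if className can be resolved through the given loader,
    // false if the lookup ends in ClassNotFoundException (as in the trace).
    static boolean canLoad(String className, ClassLoader loader) {
        try {
            // 'false' = do not run static initializers; we only test visibility.
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        ClassLoader loader = ClassLookupSketch.class.getClassLoader();
        // A JDK class is always visible.
        System.out.println("java.lang.String -> " + canLoad("java.lang.String", loader));
        // A class that lives only in a jar which is not on the classpath is not.
        System.out.println("mapclass -> " + canLoad("mapclass", loader));
    }
}
```

Whether passing -jar to hadoop pipes is supposed to make the jar's classes visible at this point is exactly what I cannot tell from the documentation.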


I searched a lot, but in vain :( . It seems I am missing
something obvious :) . Can you please tell me whether Hadoop Pipes supports
map and reduce classes written in different languages (as the API page
suggests) and, if so, how to go about using it?

Thanks,
Sandeep