Posted to mapreduce-user@hadoop.apache.org by Kadu canGica Eduardo <ka...@gmail.com> on 2012/01/04 18:07:01 UTC

Error using Streaming and old API

Hi,
I'm using streaming with Python and I made my own input format using the new
API, but when I run my job I get the error message:
"org.fasta.InputFormat.FastaInputFormat not
org.apache.hadoop.mapred.InputFormat".

Well, as far as I know the new API was introduced in 0.20 (I'm using
0.20.203.0), but it seems that streaming doesn't work with it.

Is there any way to fix this without rewriting all of my input format code
against the old API?

Thanks in advance.
Carlos.


package org.fasta.InputFormat;

import java.io.IOException;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.CompressionCodecFactory;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.JobContext;
import org.apache.hadoop.mapreduce.RecordReader;
import org.apache.hadoop.mapreduce.TaskAttemptContext;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;

public class FastaInputFormat extends FileInputFormat<LongWritable, Text> {

    public static String addFistQualityValueProperty =
            "fastainputformat.addFistQualityValue";

    @Override
    protected boolean isSplitable(JobContext context, Path file) {
        // Compressed files cannot be split at arbitrary byte offsets.
        CompressionCodec codec = new CompressionCodecFactory(
                context.getConfiguration()).getCodec(file);
        return codec == null;
    }

    @Override
    public RecordReader<LongWritable, Text> createRecordReader(InputSplit split,
            TaskAttemptContext context) throws IOException, InterruptedException {
        return new FastaRecordReader();
    }
}
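For context, a streaming job with a custom input format is typically launched like this in 0.20.x (the jar file names, HDFS paths, and mapper/reducer script names below are illustrative assumptions, not taken from the post):

```shell
hadoop jar $HADOOP_HOME/contrib/streaming/hadoop-streaming-0.20.203.0.jar \
    -libjars fasta-inputformat.jar \
    -inputformat org.fasta.InputFormat.FastaInputFormat \
    -input /user/carlos/reads.fa \
    -output /user/carlos/out \
    -mapper mapper.py \
    -reducer reducer.py \
    -file mapper.py \
    -file reducer.py
```

This is the command shape that produces the "not org.apache.hadoop.mapred.InputFormat" error above when the class passed to -inputformat implements the new (mapreduce) API instead of the old (mapred) one.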

Re: Error using Streaming and old API

Posted by Robert Evans <ev...@yahoo-inc.com>.
I believe that streaming still only uses the older mapred API and does not support the API in the mapreduce java package.

--Bobby Evans
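Since streaming in 0.20.x only accepts input formats written against the old org.apache.hadoop.mapred API, one option is to port the class. A rough, untested sketch of what that port might look like (FastaRecordReader would also have to be rewritten against the old RecordReader interface; that part is the user's own code and is only referenced here):

```java
package org.fasta.InputFormat;

import java.io.IOException;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.CompressionCodecFactory;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.InputSplit;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.RecordReader;
import org.apache.hadoop.mapred.Reporter;

public class FastaInputFormat extends FileInputFormat<LongWritable, Text> {

    public static String addFistQualityValueProperty =
            "fastainputformat.addFistQualityValue";

    @Override
    protected boolean isSplitable(FileSystem fs, Path file) {
        // Same logic as the new-API version: compressed files are not splittable.
        CompressionCodec codec =
                new CompressionCodecFactory(fs.getConf()).getCodec(file);
        return codec == null;
    }

    @Override
    public RecordReader<LongWritable, Text> getRecordReader(InputSplit split,
            JobConf job, Reporter reporter) throws IOException {
        // FastaRecordReader must implement org.apache.hadoop.mapred.RecordReader
        // here, not the new-API org.apache.hadoop.mapreduce.RecordReader.
        return new FastaRecordReader();
    }
}
```

The mechanical changes are: extend the mapred FileInputFormat, replace createRecordReader(InputSplit, TaskAttemptContext) with getRecordReader(InputSplit, JobConf, Reporter), and take the Configuration from the FileSystem/JobConf instead of a JobContext.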
