Posted to by Keith Wiley <> on 2014/01/17 19:56:12 UTC

EmptyInputFormat for Hadoop 2?

The version of EmptyInputFormat available in the tarball (I downloaded CDH4, if that matters) uses the mapred API, not mapreduce, and is therefore not compatible with calls to setInputFormatClass(). I tried to port the pattern of the old code to the new API. The class I created can be passed to setInputFormatClass() without a compile error, and the Hadoop job runs, but the job uses 0 mappers! The map class isn't called at all, and a map slot isn't even allocated to the job. Clearly, this was not my intent. Any help? Here's what I put together:

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.mapreduce.InputFormat;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.JobContext;
import org.apache.hadoop.mapreduce.RecordReader;
import org.apache.hadoop.mapreduce.TaskAttemptContext;

/**
 * InputFormat which simulates the absence of input data by returning zero splits.
 *
 * @param <V>
 * @param <K>
 */
public class EmptyInputFormat<V, K> extends InputFormat<K, V> {
	public List<InputSplit> getSplits(JobContext arg0) throws IOException,
			InterruptedException {
		return new ArrayList<InputSplit>();
	}

	public RecordReader<K, V> createRecordReader(InputSplit arg0,
			TaskAttemptContext arg1) throws IOException, InterruptedException {
		return new RecordReader<K, V>() {

			public void close() throws IOException { }

			public K getCurrentKey() throws IOException, InterruptedException {
				return null;
			}

			public V getCurrentValue() throws IOException, InterruptedException {
				return null;
			}

			public float getProgress() throws IOException, InterruptedException {
				return 0;
			}

			public void initialize(InputSplit arg0, TaskAttemptContext arg1)
					throws IOException, InterruptedException {
			}

			public boolean nextKeyValue() throws IOException,
					InterruptedException {
				return false;
			}
		};
	}
}

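For what it's worth, the zero-mapper behaviour is consistent with the framework starting one map task per split returned by getSplits(): an empty list leaves nothing to schedule. The sketch below is only a toy model of that relationship and does not use any Hadoop classes. SplitCountSketch, EmptySplit, SplitSource, ZeroSplits, OneEmptySplit and countMapTasks are all hypothetical names made up for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Toy model only: every type here is a hypothetical stand-in, not a Hadoop class.
final class SplitCountSketch {

    /** Stand-in for an input split that carries no data. */
    static final class EmptySplit { }

    /** Stand-in for the getSplits() half of an input format. */
    interface SplitSource {
        List<EmptySplit> getSplits();
    }

    /** Mirrors the class posted above: it returns no splits at all. */
    static final class ZeroSplits implements SplitSource {
        public List<EmptySplit> getSplits() {
            return new ArrayList<EmptySplit>();
        }
    }

    /** Variant: exactly one split, which would yield no records. */
    static final class OneEmptySplit implements SplitSource {
        public List<EmptySplit> getSplits() {
            return Collections.singletonList(new EmptySplit());
        }
    }

    /** Models the scheduler: one map task per split; returns how many were started. */
    static int countMapTasks(SplitSource source) {
        int started = 0;
        for (EmptySplit split : source.getSplits()) {
            started++; // a real framework would launch a mapper for this split
        }
        return started;
    }

    public static void main(String[] args) {
        System.out.println("zero splits     -> " + countMapTasks(new ZeroSplits()) + " map task(s)");
        System.out.println("one empty split -> " + countMapTasks(new OneEmptySplit()) + " map task(s)");
    }
}
```

If that model holds, the analogous change in the real class would presumably be to return a single split whose record reader reports no records, rather than an empty list. Such a split class would likely also need to be serializable by Hadoop; that is an assumption worth checking against the version in use.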
Keith Wiley

"You can scratch an itch, but you can't itch a scratch. Furthermore, an itch can
itch but a scratch can't scratch. Finally, a scratch can itch, but an itch can't
scratch. All together this implies: He scratched the itch from the scratch that
itched but would never itch the scratch from the itch that scratched."
                                           --  Keith Wiley