Posted to mapreduce-user@hadoop.apache.org by Keith Wiley <kw...@keithwiley.com> on 2014/01/17 19:56:12 UTC

EmptyInputFormat for Hadoop 2?

The version of EmptyInputFormat available in the tarball (I downloaded CDH4, if that matters) uses the old mapred API, not mapreduce, and is therefore not compatible with setInputFormatClass(), so I tried to port the pattern of the old code to the new API.  The class I created can be passed to setInputFormatClass() without a compile error, and the Hadoop job runs... but the job uses 0 mappers!  The map class isn't called at all; a map slot isn't even allocated to the job.  Clearly, this was not my intent.  Any help?  Here's what I put together:

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.mapreduce.InputFormat;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.JobContext;
import org.apache.hadoop.mapreduce.RecordReader;
import org.apache.hadoop.mapreduce.TaskAttemptContext;

/**
 * InputFormat which simulates the absence of input data by returning zero splits.
 *
 * @param <V> value type
 * @param <K> key type
 */
public class EmptyInputFormat<V, K> extends InputFormat<K, V> {
	@Override
	public List<InputSplit> getSplits(JobContext arg0) throws IOException,
			InterruptedException {
		return new ArrayList<InputSplit>();
	}
	
	@Override
	public RecordReader<K, V> createRecordReader(InputSplit arg0,
			TaskAttemptContext arg1) throws IOException, InterruptedException {
		return new RecordReader<K, V>() {

			@Override
			public void close() throws IOException { }

			@Override
			public K getCurrentKey() throws IOException, InterruptedException {
				return null;
			}

			@Override
			public V getCurrentValue() throws IOException, InterruptedException {
				return null;
			}

			@Override
			public float getProgress() throws IOException, InterruptedException {
				return 0;
			}

			@Override
			public void initialize(InputSplit arg0, TaskAttemptContext arg1)
					throws IOException, InterruptedException {
				
			}

			@Override
			public boolean nextKeyValue() throws IOException,
					InterruptedException {
				return false;
			}

		};
	}
}
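
One possible explanation, offered as an assumption rather than a confirmed diagnosis: MapReduce launches one map task per InputSplit returned by getSplits(), so an empty list would leave the job with nothing to schedule. The sketch below is a toy model of that behavior in plain Java. It uses no Hadoop classes, and SplitModel, Split, and runJob are hypothetical names made up purely for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/**
 * Toy model (NOT Hadoop code): one map task is started per split, and
 * each task runs setup, then map() once per record, then cleanup.
 */
public class SplitModel {

    /** Hypothetical stand-in for an input split: just a list of records. */
    static final class Split {
        final List<String> records;

        Split(List<String> records) {
            this.records = records;
        }
    }

    /** Runs one simulated map task per split and returns a log of the calls made. */
    static List<String> runJob(List<Split> splits) {
        List<String> log = new ArrayList<>();
        for (int i = 0; i < splits.size(); i++) {
            log.add("task " + i + ": setup");
            for (String record : splits.get(i).records) {
                log.add("task " + i + ": map(" + record + ")");
            }
            log.add("task " + i + ": cleanup");
        }
        return log;
    }

    public static void main(String[] args) {
        // Zero splits, as returned by the getSplits() above: no task ever starts.
        System.out.println(runJob(Collections.<Split>emptyList()));

        // One split with no records: a single task starts, sees no input, and finishes.
        List<Split> oneEmptySplit =
                Collections.singletonList(new Split(Collections.<String>emptyList()));
        System.out.println(runJob(oneEmptySplit));
    }
}
```

If that model holds for the real framework, having getSplits() return a single split that yields no records (which would need a custom InputSplit subclass, not written or tested here) might be enough to get exactly one mapper started.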

________________________________________________________________________________
Keith Wiley     kwiley@keithwiley.com     keithwiley.com    music.keithwiley.com

"You can scratch an itch, but you can't itch a scratch. Furthermore, an itch can
itch but a scratch can't scratch. Finally, a scratch can itch, but an itch can't
scratch. All together this implies: He scratched the itch from the scratch that
itched but would never itch the scratch from the itch that scratched."
                                           --  Keith Wiley
________________________________________________________________________________