Posted to common-user@hadoop.apache.org by Xavier Stevens <Xa...@fox.com> on 2008/03/13 21:43:10 UTC

Problem retrieving entry from compressed MapFile

I can currently retrieve entries when the job uses MapFileOutputFormat (set via conf.setOutputFormat) with no compression specified. What I was trying to do instead is create the writer myself:

public void configure(JobConf jobConf) {
    ...
    this.writer = new MapFile.Writer(jobConf, fileSys, dirName, Text.class, Text.class,
                                     SequenceFile.CompressionType.BLOCK);
    ...
}

public void map(WritableComparable key, Writable value,
                OutputCollector output, Reporter reporter) throws IOException {
    ...
    writer.append(newkey, newvalue);
    ...
}

The intent is to use SequenceFile block compression. Later, I try to retrieve the output values in a separate class:

public static void main(String[] args) throws Exception {
    ...
    conf.setInputFormat(org.apache.hadoop.mapred.SequenceFileInputFormat.class);
    ...
    MapFile.Reader[] readers = MapFileOutputFormat.getReaders(fileSys, inDataPath, defaults);
    Partitioner part = (Partitioner) ReflectionUtils.newInstance(conf.getPartitionerClass(), conf);
    Text entryValue = (Text) MapFileOutputFormat.getEntry(readers, part, new Text("mykey"), new Text());
    if (entryValue != null) {
        System.out.println("My Entry's Value: ");
        System.out.println(entryValue.toString());
    }
    for (MapFile.Reader reader : readers) {
        if (reader != null) {
            reader.close();
        }
    }
}

But when I use block compression, I no longer get a result from MapFileOutputFormat.getEntry. What am I doing wrong? Alternatively, is there a way to make this work using conf.setOutputFormat(MapFileOutputFormat.class) and conf.setMapOutputCompressionType(SequenceFile.CompressionType.BLOCK)?
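
For context, my understanding is that MapFileOutputFormat.getEntry consults only one reader: the one at the index the partitioner returns for the key and readers.length. The sketch below models that with plain Java maps instead of MapFile readers. The class name PartitionLookupSketch, the helpers partitionFor, getEntry and emptyReaders, and the use of String.hashCode in place of Text's hash are all stand-ins made up for illustration, not Hadoop APIs. It only shows that a key stored in a file other than the one the partitioner selects comes back as null; whether that is what happens with the hand-built writer above is an assumption I have not confirmed.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class PartitionLookupSketch {

    // Hash-partitioner style arithmetic: non-negative hash modulo the partition count.
    // Assumption: String.hashCode stands in for the real key type's hash.
    static int partitionFor(String key, int numPartitions) {
        return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }

    // Looks in exactly one stand-in "reader", chosen by partition; the others are never scanned.
    static String getEntry(List<Map<String, String>> readers, String key) {
        return readers.get(partitionFor(key, readers.size())).get(key);
    }

    // Builds n empty stand-in readers (hypothetical helper).
    static List<Map<String, String>> emptyReaders(int n) {
        List<Map<String, String>> readers = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            readers.add(new HashMap<>());
        }
        return readers;
    }

    public static void main(String[] args) {
        String key = "mykey";
        int n = 4;
        int home = partitionFor(key, n);

        // Layout 1: the key lives in the file its partition points at.
        List<Map<String, String>> partitioned = emptyReaders(n);
        partitioned.get(home).put(key, "myvalue");

        // Layout 2: the key was written to some other file.
        List<Map<String, String>> misplaced = emptyReaders(n);
        misplaced.get((home + 1) % n).put(key, "myvalue");

        System.out.println("partitioned layout: " + getEntry(partitioned, key));
        System.out.println("misplaced layout:   " + getEntry(misplaced, key));
    }
}
```

Running main prints myvalue for the first layout and null for the second, even though both layouts contain the key somewhere.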