Posted to user@hadoop.apache.org by Abhijit Sarkar <ab...@gmail.com> on 2013/08/12 05:28:47 UTC

RE: FileNotFoundException trying to uncompress local cache archive

Can someone please advise?

> From: abhijit.sarcar@gmail.com
> To: user@hadoop.apache.org
> Subject: FileNotFoundException trying to uncompress local cache archive
> Date: Sun, 11 Aug 2013 11:43:02 -0400
> 
> Hi,
> As a learning exercise, I'm taking a simple text file URI as an argument, compressing it with GzipCodec and placing it in the Distributed Cache. In the Reducer, I'm retrieving the archive, uncompressing it and processing the text file. Well, at least that's the idea. My uncompression code is unable to find the local cache archive and throws FileNotFoundException.
> I'm not using any GenericOptionsParser features like -copyFromLocal and am trying to keep it all in the code.
> 
> Driver:
> public int run(String[] args) throws Exception {
> Configuration conf = getConf();
> 
> final URI compressedFileURI = compressFile(new Path(args[2]).toUri(), "gzip", conf); //implementation later
> 
> DistributedCache.addCacheArchive(compressedFileURI, conf);
> 
> Reducer:
> final Path[] cacheFiles = DistributedCache.getLocalCacheArchives(conf);
> 
> // some sanity check code
> cacheFileURI = uncompressFile(cacheFiles[0].toUri(), conf); //implementation later
> 
> Utility:
> public static URI compressFile(final URI uncompressedURI,
> 		final String codecName, final Configuration conf)
> 		throws IOException {
> 	final FileSystem fs = FileSystem.get(conf);
> 	final CompressionCodec codec = new GzipCodec();
> 	final Path uncompressedPath = new Path(uncompressedURI);
> 
> 	String archiveName = addExtension(uncompressedPath.getName(),
> 			codec.getDefaultExtension(), true);
> 
> 	final Path archivePath = new Path(uncompressedPath.getParent(),
> 			archiveName);
> 
> 	final OutputStream outputStream = new FileOutputStream(archivePath
> 			.toUri().getPath());
> 	final InputStream inputStream = new FileInputStream(
> 			uncompressedURI.getPath());
> 	final CompressionOutputStream out = codec
> 			.createOutputStream(outputStream);
> 	org.apache.hadoop.io.IOUtils.copyBytes(inputStream, out, conf, false);
> 	// clean up
> 
> public static URI uncompressFile(final URI archiveURI,
> 		final Configuration conf) throws IOException {
> 	final Path archivePath = new Path(archiveURI);
> 
> 	final FileSystem fs = FileSystem.get(conf);
> 
> 	final CompressionCodec codec = new CompressionCodecFactory(conf)
> 			.getCodec(archivePath);
> 	final Path uncompressedPath = new Path(
> 			CompressionCodecFactory.removeSuffix(archiveURI.getPath(),
> 					codec.getDefaultExtension()));
> 	
> 	final OutputStream outputStream = fs.create(uncompressedPath);
> 
> 	// FileNotFoundException is thrown on the next statement
> 	final InputStream inputStream = new FileInputStream(
> 			archiveURI.getPath());
> 
> Regards,
> Abhijit
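
One way to narrow this down might be to check what the localized path actually is before opening it. java.io.FileInputStream throws FileNotFoundException not only when a path is missing but also when it names a directory. The sketch below is plain Java with no Hadoop calls; the class LocalCacheProbe and its methods are hypothetical names, and the idea that the cache may hand back a directory containing the .gz file is an assumption that this thread does not confirm.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.zip.GZIPInputStream;

// Hypothetical helper class, not part of Hadoop.
class LocalCacheProbe {

    // Returns a regular file to read for the given local path.
    // If the path is a directory (assumption: the cache may localize an
    // "archive" as a directory), the first regular file directly inside it
    // whose name ends with the given suffix is returned.
    public static File resolveRegularFile(File localPath, String suffix)
            throws FileNotFoundException {
        if (localPath.isFile()) {
            return localPath;
        }
        if (localPath.isDirectory()) {
            File[] children = localPath.listFiles();
            if (children != null) {
                for (File child : children) {
                    if (child.isFile() && child.getName().endsWith(suffix)) {
                        return child;
                    }
                }
            }
            throw new FileNotFoundException(
                    "No *" + suffix + " file inside directory " + localPath);
        }
        throw new FileNotFoundException("Path does not exist: " + localPath);
    }

    // Decompresses a .gz file next to itself and returns the output file.
    public static File gunzip(File gz) throws IOException {
        File abs = gz.getAbsoluteFile();
        String name = abs.getName();
        if (!name.endsWith(".gz")) {
            throw new IOException("Not a .gz file: " + abs);
        }
        File out = new File(abs.getParentFile(),
                name.substring(0, name.length() - ".gz".length()));
        try (InputStream in = new GZIPInputStream(new FileInputStream(abs));
             OutputStream os = new FileOutputStream(out)) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                os.write(buf, 0, n);
            }
        }
        return out;
    }
}
```

In the Reducer this could be called as resolveRegularFile(new File(cacheFiles[0].toUri().getPath()), ".gz"), followed by gunzip on the result. The message of the FileNotFoundException should then say whether the path is missing altogether or is a directory without a .gz file in it.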