Posted to common-user@hadoop.apache.org by Nathan Marz <na...@rapleaf.com> on 2009/02/11 03:06:43 UTC
Testing with Distributed Cache
I have some unit tests which run MapReduce jobs and test the inputs/
outputs in standalone mode. I recently started using DistributedCache
in one of these jobs, but now my tests fail with errors such as:
Caused by: java.io.IOException: Incomplete HDFS URI, no host: hdfs:///tmp/file.data
    at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:70)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1367)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:56)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1379)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:215)
    at org.apache.hadoop.filecache.DistributedCache.getTimestamp(DistributedCache.java:472)
    at org.apache.hadoop.mapred.JobClient.configureCommandLineOptions(JobClient.java:676)
Does anyone know of a way to get DistributedCache working in a test
environment?
Re: Testing with Distributed Cache
Posted by Amareshwari Sriramadasu <am...@yahoo-inc.com>.
Nathan Marz wrote:
> I have some unit tests which run MapReduce jobs and test the
> inputs/outputs in standalone mode. I recently started using
> DistributedCache in one of these jobs, but now my tests fail with
> errors such as:
>
> Caused by: java.io.IOException: Incomplete HDFS URI, no host: hdfs:///tmp/file.data
>     at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:70)
>     at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1367)
>     at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:56)
>     at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1379)
>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:215)
>     at org.apache.hadoop.filecache.DistributedCache.getTimestamp(DistributedCache.java:472)
>     at org.apache.hadoop.mapred.JobClient.configureCommandLineOptions(JobClient.java:676)
>
> Does anyone know of a way to get DistributedCache working in a test
> environment?
You can look at the source code for
org.apache.hadoop.mapred.TestMiniMRDFSCaching, which exercises
DistributedCache against in-process mini DFS and MapReduce clusters.
Note that DistributedCache does not work with LocalJobRunner; see
http://issues.apache.org/jira/browse/HADOOP-2914
-Amareshwari
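
A note on the error message itself: "Incomplete HDFS URI, no host" suggests
that the cache URI hdfs:///tmp/file.data was parsed and found to have no
host component. The sketch below uses only java.net.URI from the JDK to show
how such a URI parses. It is not Hadoop's code; the class name, the helper
hostOf, and the address localhost:9000 are hypothetical and chosen only for
illustration.

```java
import java.net.URI;

// Illustration only: shows how java.net.URI splits an hdfs:// URI.
// This is not Hadoop code; the names below are hypothetical.
final class HdfsUriHostCheck {

    // Returns the host component of the given URI string, or null if it has none.
    static String hostOf(String uri) {
        return URI.create(uri).getHost();
    }

    public static void main(String[] args) {
        String[] candidates = {
            // The URI from the stack trace: three slashes, so the authority is empty.
            "hdfs:///tmp/file.data",
            // Assumed example of a fully qualified URI; host and port are made up.
            "hdfs://localhost:9000/tmp/file.data"
        };
        for (String candidate : candidates) {
            System.out.println(candidate + " -> host=" + hostOf(candidate));
        }
    }
}
```

The first URI reports a null host and the second reports localhost. Under
that reading, a cache URI with an explicit host and port would get past this
particular check, but a filesystem would still need to be listening at that
address, which is what the mini clusters used by TestMiniMRDFSCaching provide.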