Posted to common-user@hadoop.apache.org by Nathan Marz <na...@rapleaf.com> on 2009/02/11 03:06:43 UTC

Testing with Distributed Cache

I have some unit tests which run MapReduce jobs and test the
inputs/outputs in standalone mode. I recently started using
DistributedCache in one of these jobs, but now my tests fail with
errors such as:

Caused by: java.io.IOException: Incomplete HDFS URI, no host: hdfs:///tmp/file.data
	at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:70)
	at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1367)
	at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:56)
	at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1379)
	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:215)
	at org.apache.hadoop.filecache.DistributedCache.getTimestamp(DistributedCache.java:472)
	at org.apache.hadoop.mapred.JobClient.configureCommandLineOptions(JobClient.java:676)


Does anyone know of a way to get DistributedCache working in a test  
environment?

Re: Testing with Distributed Cache

Posted by Amareshwari Sriramadasu <am...@yahoo-inc.com>.
Nathan Marz wrote:
> I have some unit tests which run MapReduce jobs and test the 
> inputs/outputs in standalone mode. I recently started using 
> DistributedCache in one of these jobs, but now my tests fail with 
> errors such as:
>
> Caused by: java.io.IOException: Incomplete HDFS URI, no host: hdfs:///tmp/file.data
>     at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:70)
>     at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1367)
>     at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:56)
>     at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1379)
>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:215)
>     at org.apache.hadoop.filecache.DistributedCache.getTimestamp(DistributedCache.java:472)
>     at org.apache.hadoop.mapred.JobClient.configureCommandLineOptions(JobClient.java:676)
>
> Does anyone know of a way to get DistributedCache working in a test 
> environment?

You can look at the source code for
org.apache.hadoop.mapred.TestMiniMRDFSCaching for an example.
Note that DistributedCache does not work with LocalJobRunner; see
http://issues.apache.org/jira/browse/HADOOP-2914
-Amareshwari
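
A note on the error itself: the message suggests that the HDFS client
rejects an hdfs URI that has no host component. The sketch below is not
Hadoop code. It uses only java.net.URI from the JDK to show that
hdfs:///tmp/file.data parses with a null host, while a fully qualified
URI parses with one. The class name HdfsUriHostCheck and the
localhost:9000 address are hypothetical, chosen only for illustration.

```java
import java.net.URI;

// Illustration only: shows how java.net.URI parses the URI from the
// stack trace. This is not part of Hadoop.
class HdfsUriHostCheck {

    // Returns true when the parsed URI carries a host component.
    static boolean hasHost(String uri) {
        return URI.create(uri).getHost() != null;
    }

    public static void main(String[] args) {
        String[] candidates = {
            // The URI from the error message: empty authority, so no host.
            "hdfs:///tmp/file.data",
            // Hypothetical fully qualified form (address is an assumption).
            "hdfs://localhost:9000/tmp/file.data",
        };
        for (String candidate : candidates) {
            System.out.println(candidate + " -> host present: " + hasHost(candidate));
        }
    }
}
```

In a test that runs against mini DFS and MapReduce clusters, the cache
file URI would presumably need to be qualified against that cluster's
filesystem rather than written as hdfs:/// with an empty authority.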