Posted to common-user@hadoop.apache.org by Steve Lewis <lo...@gmail.com> on 2010/08/03 17:44:43 UTC

Problems accessing the local file system on AWS MapReduce

I am trying to use the DistributedCache on AWS Elastic MapReduce.

Using S3Fox I can see two files in my bucket lordjoe:
FeeFie.txt
and
Hello


In my main() I have:
public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
    Job job = new Job(conf, "WholeFiles");
    .....
    DistributedCache.addCacheFile(new Path("s3n://lordjoe/Hello#Hello.x").toUri(), conf);
    DistributedCache.addCacheFile(new Path("s3n://lordjoe/FeeFie.txt#FeeFie.txt").toUri(), conf);
    .....
    boolean ans = job.waitForCompletion(true);
}
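As a side note on the URIs above: the part after `#` is the fragment, and it is what the framework uses as the link name for the localized file in the task's working directory (with the 0.20-era API, only when symlinking is enabled via `DistributedCache.createSymlink(conf)`). A minimal JDK-only sketch, with a hypothetical helper `linkName` (not a Hadoop API), showing how that name is derived from a cache URI:

```java
import java.io.File;
import java.net.URI;

public class CacheUriDemo {
    // Returns the fragment after '#' if present, otherwise the last
    // path component -- the name the cached file would be exposed under.
    public static String linkName(String uri) {
        URI u = URI.create(uri);
        if (u.getFragment() != null) {
            return u.getFragment();
        }
        return new File(u.getPath()).getName();
    }

    public static void main(String[] args) {
        System.out.println(linkName("s3n://lordjoe/Hello#Hello.x"));  // Hello.x
        System.out.println(linkName("s3n://lordjoe/FeeFie.txt"));     // FeeFie.txt
    }
}
```

So `s3n://lordjoe/Hello#Hello.x` should appear in the working directory as `Hello.x`, provided symlink creation is on.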


In my reducer I have:
protected void reduce(final KEYIN key, final Iterable<VALUEIN> values, final Context context)
        throws IOException, InterruptedException {
    Configuration configuration = context.getConfiguration();
    Path[] files = DistributedCache.getLocalCacheFiles(configuration);
    if (files != null) {
        for (int i = 0; i < files.length; i++) {
            Path file = files[i];
            .....
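Once `getLocalCacheFiles` returns the localized paths, they point at ordinary files on the task node's local disk, so plain java.io can read them. A small stand-alone sketch of that reading step, using a temp file as a stand-in for a localized cache file (the helper name `readFirstLine` is mine, not from Hadoop):

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.nio.file.Files;

public class LocalCacheRead {
    // Reads the first line of a localized cache file with plain java.io.
    public static String readFirstLine(File file) throws IOException {
        try (BufferedReader r = new BufferedReader(new FileReader(file))) {
            return r.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a file DistributedCache has already localized.
        java.nio.file.Path tmp = Files.createTempFile("FeeFie", ".txt");
        Files.write(tmp, "hello cache".getBytes());
        System.out.println(readFirstLine(tmp.toFile())); // prints "hello cache"
        Files.delete(tmp);
    }
}
```

In a real task you would pass `new File(file.toString())` for each Hadoop `Path` returned above instead of a temp file.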
-- 
Steven M. Lewis PhD
Institute for Systems Biology
Seattle WA