Posted to dev@mahout.apache.org by Sebastian Schelter <se...@zalando.de> on 2010/08/05 14:49:14 UTC

Mahout on ElasticMapReduce

Hi,

I'm currently evaluating ItemSimilarityJob and RecommenderJob on Elastic
MapReduce. We seem to have some small problems with S3, mostly because
the code needs to use FileSystem.get(path.toUri(), conf) instead of
FileSystem.get(conf). I will create a patch for that in the next few
days.

I'm writing this mail because I encountered another problem that I
currently can't solve. RecommenderJob emulates MultipleInputs (which is
currently missing in Hadoop 0.20, AFAIK) by reading data from a combined
path that is built like this:

    new Path(prePartialMultiplyPath1 + "," + prePartialMultiplyPath2)

My job always fails with this exception:

    java.lang.IllegalArgumentException: Invalid hostname in URI s3:/testingbucket-12345/tmp/prePartialMultiply2
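
One possible reading of the exception, sketched below under stated assumptions: the URI in the message has a single slash after "s3:", so it carries no host (bucket) component. That would be consistent with the combined string being normalized as one path, with "//" collapsed to "/" after the first authority, and then split on the comma later. The sketch uses only java.net.URI; normalizeLikePath is a hypothetical, simplified stand-in and not Hadoop's actual Path code, and the name of the first directory (prePartialMultiply1) is illustrative.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

class CombinedPathDemo {

    // Hypothetical stand-in (an assumption, not Hadoop code): keep the
    // leading "scheme://authority" and collapse every later "//" to "/".
    // Assumes the input contains "://" followed by a path component.
    static String normalizeLikePath(String s) {
        int authorityStart = s.indexOf("://") + 3;
        int pathStart = s.indexOf('/', authorityStart);
        String prefix = s.substring(0, pathStart);
        String path = s.substring(pathStart).replace("//", "/");
        return prefix + path;
    }

    // Splits the normalized string on commas and returns the host that
    // java.net.URI sees for each part (null when there is no authority).
    static List<String> hostsAfterSplit(String combined) {
        List<String> hosts = new ArrayList<>();
        for (String part : normalizeLikePath(combined).split(",")) {
            hosts.add(URI.create(part).getHost());
        }
        return hosts;
    }

    public static void main(String[] args) {
        String p1 = "s3://testingbucket-12345/tmp/prePartialMultiply1";
        String p2 = "s3://testingbucket-12345/tmp/prePartialMultiply2";
        for (String part : normalizeLikePath(p1 + "," + p2).split(",")) {
            System.out.println(part + " -> host=" + URI.create(part).getHost());
        }
    }
}
```

If this reading is right, the second input path loses its bucket before the S3 filesystem is initialized, which would explain why only prePartialMultiply2 appears in the error.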

Any ideas how to fix this?

Thanks,
Sebastian

###

Stacktrace (line numbers might not correspond with the latest version
from HEAD):


Exception in thread "main" java.lang.IllegalArgumentException: Invalid hostname in URI s3:/testingbucket-12345/tmp/prePartialMultiply2
    at org.apache.hadoop.fs.s3.S3Credentials.initialize(S3Credentials.java:41)
    at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.initialize(Jets3tNativeFileSystemStore.java:53)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
    at org.apache.hadoop.fs.s3native.$Proxy2.initialize(Unknown Source)
    at org.apache.hadoop.fs.s3native.NativeS3FileSystem.initialize(NativeS3FileSystem.java:278)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1418)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:1443)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1431)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:196)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:175)
    at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:203)
    at org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat.listStatus(SequenceFileInputFormat.java:55)
    at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:241)
    at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:908)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:802)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:432)
    at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:447)
    at org.apache.mahout.cf.taste.hadoop.item.RecommenderJob.run(RecommenderJob.java:241)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.mahout.cf.taste.hadoop.item.RecommenderJob.main(RecommenderJob.java:286)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:156)