Posted to dev@mahout.apache.org by Sebastian Schelter <ss...@googlemail.com> on 2010/08/05 14:51:08 UTC
Mahout on Elastic MapReduce
Hi,
I'm currently evaluating ItemSimilarityJob and RecommenderJob on Elastic
MapReduce. It seems we have some small problems with S3, mostly because
the code needs to call FileSystem.get(path.toUri(), conf) instead of
FileSystem.get(conf). I will create a patch for that in the next few
days.
I'm writing this mail because I ran into another problem that I
currently can't solve. RecommenderJob emulates MultipleInputs (which is
currently missing in Hadoop 0.20, AFAIK) by reading data from a combined
path that is built like this:
new Path(prePartialMultiplyPath1 + "," + prePartialMultiplyPath2)
My job always fails with this exception:
java.lang.IllegalArgumentException: Invalid hostname in URI s3:/testingbucket-12345/tmp/prePartialMultiply2
Any ideas how to fix this?
Thanks,
Sebastian
(I used the wrong email address in my last mail, sorry.)
###
Stacktrace (line numbers might not correspond with the latest version
from HEAD):
Exception in thread "main" java.lang.IllegalArgumentException: Invalid hostname in URI s3:/testingbucket-12345/tmp/prePartialMultiply2
    at org.apache.hadoop.fs.s3.S3Credentials.initialize(S3Credentials.java:41)
    at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.initialize(Jets3tNativeFileSystemStore.java:53)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
    at org.apache.hadoop.fs.s3native.$Proxy2.initialize(Unknown Source)
    at org.apache.hadoop.fs.s3native.NativeS3FileSystem.initialize(NativeS3FileSystem.java:278)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1418)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:1443)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1431)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:196)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:175)
    at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:203)
    at org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat.listStatus(SequenceFileInputFormat.java:55)
    at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:241)
    at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:908)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:802)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:432)
    at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:447)
    at org.apache.mahout.cf.taste.hadoop.item.RecommenderJob.run(RecommenderJob.java:241)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.mahout.cf.taste.hadoop.item.RecommenderJob.main(RecommenderJob.java:286)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
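One possible reading of this trace, offered as an assumption rather than a confirmed diagnosis: when two locations are joined with a comma and wrapped in a single Path, the path part gets normalized and every "//" after the first bucket name collapses to "/". The second entry then loses its "s3://" prefix before the input format splits the string on commas again. The sketch below only imitates that behaviour with plain string handling. The names CombinedPathSketch, normalize and splitInputs are hypothetical stand-ins and not Hadoop APIs, and the name prePartialMultiply1 is invented for illustration.

```java
import java.util.Arrays;
import java.util.List;

final class CombinedPathSketch {

    // Hypothetical imitation of path normalization: keep the leading
    // "scheme://authority" untouched and collapse "//" to "/" in the rest.
    static String normalize(String combined) {
        int schemeEnd = combined.indexOf("://");
        if (schemeEnd < 0) {
            return combined.replace("//", "/");
        }
        int pathStart = combined.indexOf('/', schemeEnd + 3);
        if (pathStart < 0) {
            return combined; // only scheme and authority, nothing to normalize
        }
        String prefix = combined.substring(0, pathStart);
        String path = combined.substring(pathStart).replace("//", "/");
        return prefix + path;
    }

    // Normalize the combined string first, then split it on commas,
    // the way a comma-separated list of input paths would be read back.
    static List<String> splitInputs(String combined) {
        return Arrays.asList(normalize(combined).split(","));
    }

    public static void main(String[] args) {
        String first = "s3://testingbucket-12345/tmp/prePartialMultiply1"; // assumed name
        String second = "s3://testingbucket-12345/tmp/prePartialMultiply2";
        for (String part : splitInputs(first + "," + second)) {
            // The second entry comes out as s3:/testingbucket-12345/tmp/prePartialMultiply2
            System.out.println(part);
        }
    }
}
```

If this is indeed what happens, handing the two paths to the job separately (for example with FileInputFormat.addInputPath) instead of as one comma-joined Path might sidestep the problem.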
Re: Mahout on Elastic MapReduce
Posted by Andrew Hitchcock <ad...@gmail.com>.
I think you are missing a slash in the URI:
s3:/testingbucket-12345/tmp/prePartialMultiply2
What happens if you try:
s3://testingbucket-12345/tmp/prePartialMultiply2
Andrew
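To illustrate why the single slash matters: with "s3:/..." the bucket name becomes part of the path and the URI has no host at all, which fits the "Invalid hostname in URI" message. Here is a small check using only java.net.URI. The class name UriHostCheck and the helper hostOf are made up for this illustration, and it does not claim to reproduce what Hadoop's S3 code does internally.

```java
import java.net.URI;

final class UriHostCheck {

    // Returns the host component of the given location, or null if it has none.
    static String hostOf(String location) {
        return URI.create(location).getHost();
    }

    public static void main(String[] args) {
        String oneSlash = "s3:/testingbucket-12345/tmp/prePartialMultiply2";
        String twoSlashes = "s3://testingbucket-12345/tmp/prePartialMultiply2";

        // With one slash the bucket name is parsed as part of the path: host is null.
        System.out.println(oneSlash + " -> host: " + hostOf(oneSlash));
        // With two slashes the bucket name is the host: testingbucket-12345.
        System.out.println(twoSlashes + " -> host: " + hostOf(twoSlashes));
    }
}
```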