Posted to user@mahout.apache.org by motta <mo...@gmail.com> on 2011/11/09 14:27:05 UTC

Running Mahout SVD on Amazon Elastic Map Reduce

Hi everybody,
I have tried to run my first Mahout SVD job (DistributedLanczosSolver) on
Elastic MapReduce.
Before going to Amazon I ran it on my local Hadoop installation and it
works, so the problem occurs only on Amazon.

Using the online Amazon Management Console, I passed the following parameters:
JAR: s3n://<my-bucket-name>/mahout-examples-0.5-job.jar
JAR ARGUMENTS:
org.apache.mahout.math.hadoop.decomposer.DistributedLanczosSolver --input
s3n://<my-bucket-name>/input --output s3n://<my-bucket-name>/output
--numCols 24 --numRows 913327 --cleansvd true --rank 24

The reason it fails is reported below. Thank you for your help.

Exception in thread "main" java.lang.IllegalArgumentException: This file
system object (hdfs://ip-10-227-133-187.eu-west-1.compute.internal:9000)
does not support access to the request path 's3n://polimi-neodata-svd/input'
You possibly called FileSystem.get(conf) when you should have called
FileSystem.get(uri, conf) to obtain a file system supporting your path.
	at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:351)
	at org.apache.hadoop.hdfs.DistributedFileSystem.checkPath(DistributedFileSystem.java:99)
	at org.apache.hadoop.fs.FileSystem.makeQualified(FileSystem.java:263)
	at org.apache.hadoop.hdfs.DistributedFileSystem.makeQualified(DistributedFileSystem.java:116)
	at org.apache.mahout.math.hadoop.DistributedRowMatrix.setConf(DistributedRowMatrix.java:106)
	at org.apache.mahout.math.hadoop.decomposer.DistributedLanczosSolver.run(DistributedLanczosSolver.java:198)
	at org.apache.mahout.math.hadoop.decomposer.DistributedLanczosSolver.run(DistributedLanczosSolver.java:161)
	at org.apache.mahout.math.hadoop.decomposer.DistributedLanczosSolver.run(DistributedLanczosSolver.java:120)
	at org.apache.mahout.math.hadoop.decomposer.DistributedLanczosSolver$DistributedLanczosSolverJob.run(DistributedLanczosSolver.java:290)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
	at org.apache.mahout.math.hadoop.decomposer.DistributedLanczosSolver.main(DistributedLanczosSolver.java:296)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:156)


Re: Running Mahout SVD on Amazon Elastic Map Reduce

Posted by Sean Owen <sr...@gmail.com>.
A better interpretation of the error is probably that the Hadoop
parameter "fs.default.name" is set to some hdfs: URL. You will almost
certainly need to set it to your S3 bucket, like "s3n://bucketname" (or
something very similar to that, I forget). The point is that you have to
tell the system what the default namespace for paths is.
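
Something like this is a minimal sketch of what I mean (the bucket name
is a placeholder, and it assumes the S3 credentials
fs.s3n.awsAccessKeyId / fs.s3n.awsSecretAccessKey are already configured):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class DefaultFsSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Point the default filesystem at the bucket, so unqualified paths
    // and FileSystem.get(conf) both resolve to s3n rather than hdfs.
    conf.set("fs.default.name", "s3n://bucketname");
    FileSystem fs = FileSystem.get(conf);
    System.out.println(fs.getUri()); // expected: s3n://bucketname
  }
}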

On Thu, Nov 10, 2011 at 7:52 AM, Alfredo Motta <al...@live.com> wrote:
> I didn't hard-code any hdfs prefix; I just used mahout-examples-0.5-job.jar
> (downloaded from the Mahout website) to run DistributedLanczosSolver.
>
> The output suggests that the jar invoked FileSystem.get(conf) instead
> of FileSystem.get(uri, conf) to get my input matrix.
>
> Is that possible?
>
> 2011/11/10 Ted Dunning <te...@gmail.com>
>
>> This looks like a hard-coded hdfs prefix in a path name construction
>> somewhere.
>>
>> On Wed, Nov 9, 2011 at 8:27 AM, motta <mo...@gmail.com> wrote:
>>
>> > [original post and stack trace snipped; see the first message above]
>>
>

Re: Running Mahout SVD on Amazon Elastic Map Reduce

Posted by Mat Kelcey <ma...@gmail.com>.
I haven't checked the code yet, but I think this might be the same problem:
https://forums.aws.amazon.com/thread.jspa?messageID=124779&#124779

Mat

Re: Running Mahout SVD on Amazon Elastic Map Reduce

Posted by Ted Dunning <te...@gmail.com>.
Yes, that is possible. What you are describing is a way of putting in a
hard-coded prefix by implication: FileSystem.get(conf) pins every path to
whatever the configured default filesystem happens to be.

Sent from my iPhone

On Nov 10, 2011, at 2:52, Alfredo Motta <al...@live.com> wrote:

> The output suggests that the jar invoked FileSystem.get(conf) instead
> of FileSystem.get(uri, conf) to get my input matrix.
>
> Is that possible?
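
To make that concrete, here is a minimal sketch of the failing pattern
the stack trace points at (the class name and bucket are placeholders):
qualifying an s3n:// path against the default HDFS filesystem.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CheckPathSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Path input = new Path("s3n://my-bucket-name/input");

    // On EMR, fs.default.name points at HDFS, so this returns a
    // DistributedFileSystem rather than the filesystem owning the path.
    FileSystem defaultFs = FileSystem.get(conf);
    try {
      // checkPath() rejects a path whose scheme does not match the
      // filesystem; this is the IllegalArgumentException seen above.
      defaultFs.makeQualified(input);
    } catch (IllegalArgumentException e) {
      System.out.println(e.getMessage());
    }

    // The fix: ask for the filesystem that owns the path (this call
    // assumes S3 credentials are configured).
    FileSystem s3Fs = FileSystem.get(input.toUri(), conf);
    System.out.println(s3Fs.makeQualified(input));
  }
}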

Re: Running Mahout SVD on Amazon Elastic Map Reduce

Posted by Alfredo Motta <al...@live.com>.
I didn't hard-code any hdfs prefix; I just used mahout-examples-0.5-job.jar
(downloaded from the Mahout website) to run DistributedLanczosSolver.

The output suggests that the jar invoked FileSystem.get(conf) instead
of FileSystem.get(uri, conf) to get my input matrix.

Is that possible?

2011/11/10 Ted Dunning <te...@gmail.com>

> This looks like a hard-coded hdfs prefix in a path name construction
> somewhere.
>
> On Wed, Nov 9, 2011 at 8:27 AM, motta <mo...@gmail.com> wrote:
>
> > [original post and stack trace snipped; see the first message above]
>

Re: Running Mahout SVD on Amazon Elastic Map Reduce

Posted by Ted Dunning <te...@gmail.com>.
This looks like a hard-coded hdfs prefix in a path name construction
somewhere.
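
For illustration only (hypothetical code, not something found in Mahout),
a hard-coded prefix in path construction would look like this:

import org.apache.hadoop.fs.Path;

public class HardCodedPrefixSketch {
  public static void main(String[] args) {
    String inputDir = "input";
    // The hdfs:// scheme is baked into the string, so an s3n://
    // location supplied by the user can never be reached.
    Path input = new Path("hdfs://namenode:9000/user/hadoop/" + inputDir);
    System.out.println(input);
  }
}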

On Wed, Nov 9, 2011 at 8:27 AM, motta <mo...@gmail.com> wrote:

> [original post and stack trace snipped; see the first message above]