Posted to user@mahout.apache.org by Matt Spitz <ms...@meebo-inc.com> on 2010/11/01 16:58:35 UTC

Re: Getting mahout to run on the DFS

Blast!  I ran it as another user, and no dice.  Same error.

I guess my question for you was to figure out what your classpath was and
see if there was anything different.  bin/mahout is just a simple script,
and I was just adding a quick 'echo' to it.
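
Concretely, what I stuck in was something like the two lines below, right
before the script hands off to hadoop/java. Treat it as a sketch: the
variable names are only my guess at what bin/mahout actually uses, so check
your copy of the script.

  # rough sketch; variable names are guesses at what bin/mahout defines
  echo "Classpath: $CLASSPATH"
  echo "Command: $HADOOP_HOME/bin/hadoop jar $MAHOUT_JOB $CLASS $@"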

What version of hadoop are you running?  I wonder if the "Path" class is
defined differently for different versions.

Thanks,
Matt

On Sat, Oct 30, 2010 at 4:56 PM, Jeff Eastman <jd...@windwardsolutions.com> wrote:

> On 10/29/10 10:09 AM, Jeff Eastman wrote:
>
>> Ok, very interesting. I think you are onto the root cause. I can't work on
>> this until the weekend but will investigate further then.
>>
> I tried creating another user on my CDH3 box and, for a minute, thought I
> could duplicate something like your problem. But it was a permission problem
> in examples/bin/work that resulted in 0 vectors being output from
> seq2sparse. That caused an array indexing error in RandomSeedGenerator but
> it went away when I made /work be 777. Even in that situation, I got the
> same error (of course) running kmeans -xm sequential.
>
> You can modify bin/mahout to your heart's content. I hope you are having
> better luck than I am. Build-reuters works perfectly under both userIds.
>

Re: Getting mahout to run on the DFS

Posted by Matt Spitz <ms...@meebo-inc.com>.
Gah, that's ridiculous.  I didn't specify MAHOUT_HOME, which makes our
HADOOP_CLASSPATH identical.

So, running locally (no HADOOP_HOME/HADOOP_CONF_DIR set), kmeans runs fine
with -xm mapreduce and -xm sequential.

Running on hadoop (using HADOOP_HOME/HADOOP_CONF_DIR), kmeans runs fine with
-xm sequential, but runs into the exception mentioned above with -xm
mapreduce.
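
To spell out the four cases (the elided kmeans arguments are just whatever
build-reuters passes; the Hadoop paths below are placeholders for wherever
your CDH3 install actually lives):

  # local mode: no Hadoop environment set; both execution methods work
  unset HADOOP_HOME HADOOP_CONF_DIR
  bin/mahout kmeans ... -xm sequential    # ok
  bin/mahout kmeans ... -xm mapreduce     # ok

  # cluster mode: point bin/mahout at the CDH3 install (paths are placeholders)
  export HADOOP_HOME=/usr/lib/hadoop
  export HADOOP_CONF_DIR=$HADOOP_HOME/conf
  bin/mahout kmeans ... -xm sequential    # ok
  bin/mahout kmeans ... -xm mapreduce     # dies with the exception above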

There's gotta be something different about the way we're accessing our
filesystems on the DFS.  Or perhaps the permissions with which these things
are created?

Looks like the clusters/part-randomSeed is -rw-r--r--, which is the same as
all of the chunk-* files in reuters-out-seqdir.
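
(That's from a plain listing along these lines; <workdir> is a placeholder
for wherever build-reuters put its output on HDFS:)

  hadoop fs -ls <workdir>/clusters             # part-randomSeed: -rw-r--r--
  hadoop fs -ls <workdir>/reuters-out-seqdir   # chunk-* files: -rw-r--r--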

I'm stumped.

-Matt

On Mon, Nov 1, 2010 at 12:26 PM, Jeff Eastman <je...@narus.com> wrote:

> Frustrating. We're both running CDH3, right? That's Hadoop 0.20.2. I added
> the echos you suggested and my Classpath: output is empty. My Command:
> output is essentially the same as what you reported.
>
> -----Original Message-----
> From: Matt Spitz [mailto:mspitz@meebo-inc.com]
> Sent: Monday, November 01, 2010 8:59 AM
> To: user@mahout.apache.org
> Subject: Re: Getting mahout to run on the DFS
>
> Blast!  I ran it as another user, and no dice.  Same error.
>
> I guess my question for you was to figure out what your classpath was and
> see if there was anything different.  bin/mahout is just a simple script,
> and I was just adding a quick 'echo' to it.
>
> What version of hadoop are you running?  I wonder if the "Path" class is
> defined differently for different versions.
>
> Thanks,
> Matt
>
> On Sat, Oct 30, 2010 at 4:56 PM, Jeff Eastman <jdog@windwardsolutions.com> wrote:
>
> > On 10/29/10 10:09 AM, Jeff Eastman wrote:
> >
> >> Ok, very interesting. I think you are onto the root cause. I can't work on
> >> this until the weekend but will investigate further then.
> >>
> > I tried creating another user on my CDH3 box and, for a minute, thought I
> > could duplicate something like your problem. But it was a permission problem
> > in examples/bin/work that resulted in 0 vectors being output from
> > seq2sparse. That caused an array indexing error in RandomSeedGenerator but
> > it went away when I made /work be 777. Even in that situation, I got the
> > same error (of course) running kmeans -xm sequential.
> >
> > You can modify bin/mahout to your heart's content. I hope you are having
> > better luck than I am. Build-reuters works perfectly under both userIds.
> >
>

RE: Getting mahout to run on the DFS

Posted by Jeff Eastman <je...@Narus.com>.
Frustrating. We're both running CDH3, right? That's Hadoop 0.20.2. I added the echos you suggested and my Classpath: output is empty. My Command: output is essentially the same as what you reported.

-----Original Message-----
From: Matt Spitz [mailto:mspitz@meebo-inc.com] 
Sent: Monday, November 01, 2010 8:59 AM
To: user@mahout.apache.org
Subject: Re: Getting mahout to run on the DFS

Blast!  I ran it as another user, and no dice.  Same error.

I guess my question for you was to figure out what your classpath was and
see if there was anything different.  bin/mahout is just a simple script,
and I was just adding a quick 'echo' to it.

What version of hadoop are you running?  I wonder if the "Path" class is
defined differently for different versions.

Thanks,
Matt

On Sat, Oct 30, 2010 at 4:56 PM, Jeff Eastman <jd...@windwardsolutions.com> wrote:

> On 10/29/10 10:09 AM, Jeff Eastman wrote:
>
>> Ok, very interesting. I think you are onto the root cause. I can't work on
>> this until the weekend but will investigate further then.
>>
> I tried creating another user on my CDH3 box and, for a minute, thought I
> could duplicate something like your problem. But it was a permission problem
> in examples/bin/work that resulted in 0 vectors being output from
> seq2sparse. That caused an array indexing error in RandomSeedGenerator but
> it went away when I made /work be 777. Even in that situation, I got the
> same error (of course) running kmeans -xm sequential.
>
> You can modify bin/mahout to your heart's content. I hope you are having
> better luck than I am. Build-reuters works perfectly under both userIds.
>