Posted to user@mahout.apache.org by Jay Vyas <ja...@gmail.com> on 2014/03/29 01:52:49 UTC

(help!) Can someone scan this

Hi again mahout:

I'm wrapping a distributed recommender like this:

https://raw.githubusercontent.com/jayunit100/bigpetstore/master/src/main/java/org/bigtop/bigpetstore/clustering/BPSRecommnder.java

And it's not working.

Any thoughts on why?  The error message is simply that intermediate data
sets don't exist (e.g. numUsers.bin or /tmp/preparePreferencesMatrix...).

Basically it's clear that the intermediate jobs are failing, but I can't see
any reason why they would fail, and I don't see any meaningful stack
traces.

I've found a lot of good whitepapers and other material on how the algorithms
work, but it's not clear what Mahout really does for me and what I have to
do on my own for the distributed recommender APIs.

-- 
Jay Vyas
http://jayunit100.blogspot.com

Re: (help!) Can someone scan this

Posted by Jay Vyas <ja...@gmail.com>.
FYI I eventually got this working.  I'm not sure what the fix was, but here
is everything I tried (some combination of the steps below must have fixed it).

- created log4j.properties files and made sure all the necessary properties
were there
- exported some of the usual Hadoop HOME and HADOOP_CONF dir env variables.
- exported MAHOUT_HOME.

In any case, I think something about the way Mahout nests jobs, or else the
way it logs, makes it tricky to debug failures in local mode,
but I was never able to put my finger on just what.
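
A minimal preflight sketch for the environment variables mentioned in the list above. It assumes that "hadoop HOME and HADOOP_CONF dir" means HADOOP_HOME and HADOOP_CONF_DIR; the class name EnvPreflight and its methods are hypothetical and not part of Mahout or Hadoop. It only reports which variables are unset, so it cannot confirm that this was the actual fix.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: reports which expected environment variables are unset. */
final class EnvPreflight {

    // Assumed names; the thread only says "hadoop HOME and HADOOP_CONF dir" plus MAHOUT_HOME.
    static final List<String> REQUIRED =
            Arrays.asList("HADOOP_HOME", "HADOOP_CONF_DIR", "MAHOUT_HOME");

    /** Returns the names from REQUIRED that are absent or blank in the given environment. */
    static List<String> missing(Map<String, String> env) {
        List<String> result = new ArrayList<>();
        for (String name : REQUIRED) {
            String value = env.get(name);
            if (value == null || value.trim().isEmpty()) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> missing = missing(System.getenv());
        if (missing.isEmpty()) {
            System.out.println("All expected variables are set.");
        } else {
            System.out.println("Not set: " + String.join(", ", missing));
        }
    }
}
```

Running this before launching the recommender in local mode gives an explicit message up front instead of a silent failure in a nested job.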




On Sat, Mar 29, 2014 at 11:34 AM, Jay Vyas <ja...@gmail.com> wrote:

> 0.9.0 ... What do you mean by explicitly setting the /tmp path?
>
> Thanks for the feedback.  FYI, after the job is run, I see that it fails
> IMMEDIATELY when starting the PreparePreferenceMatrix job, and I see this
> in my local hadoop /tmp dir:
>
> ├── [        102]  local
> │   └── [        102]  localRunner
> │       └── [        170]  jay
> │           ├── [         68]  job_local1531736937_0001
> │           ├── [         68]  job_local218993552_0002
> │           └── [        136]  jobcache
> │               ├── [        102]  job_local1531736937_0001
> │               │   └── [        102]  attempt_local1531736937_0001_m_000000_0
> │               │       └── [        136]  output
> │               │           ├── [         14]  file.out
> │               │           └── [         32]  file.out.index
> │               └── [        102]  job_local218993552_0002
> │                   └── [        102]  attempt_local218993552_0002_m_000000_0
> │                       └── [        136]  output
> │                           ├── [         14]  file.out
> │                           └── [         32]  file.out.index
> └── [        136]  staging
>     ├── [        102]  jay1531736937
>     └── [        102]  jay218993552
>
>
>



-- 
Jay Vyas
http://jayunit100.blogspot.com

Re: (help!) Can someone scan this

Posted by Jay Vyas <ja...@gmail.com>.
0.9.0 ... What do you mean by explicitly setting the /tmp path?

Thanks for the feedback.  FYI, after the job is run, I see that it fails
IMMEDIATELY when starting the PreparePreferenceMatrix job, and I see this
in my local hadoop /tmp dir:

├── [        102]  local
│   └── [        102]  localRunner
│       └── [        170]  jay
│           ├── [         68]  job_local1531736937_0001
│           ├── [         68]  job_local218993552_0002
│           └── [        136]  jobcache
│               ├── [        102]  job_local1531736937_0001
│               │   └── [        102]  attempt_local1531736937_0001_m_000000_0
│               │       └── [        136]  output
│               │           ├── [         14]  file.out
│               │           └── [         32]  file.out.index
│               └── [        102]  job_local218993552_0002
│                   └── [        102]  attempt_local218993552_0002_m_000000_0
│                       └── [        136]  output
│                           ├── [         14]  file.out
│                           └── [         32]  file.out.index
└── [        136]  staging
    ├── [        102]  jay1531736937
    └── [        102]  jay218993552



On Sat, Mar 29, 2014 at 2:01 AM, Sebastian Schelter <ss...@apache.org> wrote:

> Jay,
>
> which version of Mahout are you using? Have you tried to explicitly set
> the temp path?
>
> --sebastian
>
>


-- 
Jay Vyas
http://jayunit100.blogspot.com

Re: (help!) Can someone scan this

Posted by Sebastian Schelter <ss...@apache.org>.
Jay,

which version of Mahout are you using? Have you tried to explicitly set 
the temp path?

--sebastian
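
One way to read "explicitly set the temp path", sketched under assumptions: Mahout's RecommenderJob accepts a --tempDir option alongside --input, --output and --similarityClassname, so the wrapper could pass its own directory instead of relying on the default. The helper class RecommenderArgs and all paths below are hypothetical; the sketch only builds the argument array so that it stays self-contained, and the comment in main shows where it would be handed to the job.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: builds RecommenderJob arguments with an explicit temp directory. */
final class RecommenderArgs {

    /** Returns the command-line style arguments for one recommender run. */
    static String[] build(String input, String output, String tempDir) {
        List<String> args = new ArrayList<>();
        args.addAll(Arrays.asList("--input", input));
        args.addAll(Arrays.asList("--output", output));
        // Intermediate data (for example the preference-matrix output) is written under this directory.
        args.addAll(Arrays.asList("--tempDir", tempDir));
        args.addAll(Arrays.asList("--similarityClassname", "SIMILARITY_LOGLIKELIHOOD"));
        return args.toArray(new String[0]);
    }

    public static void main(String[] unused) {
        // Illustrative paths only; they do not come from the thread.
        String[] args = build("input/ratings.csv", "output/recs", "/tmp/bps-recommender-temp");
        System.out.println(String.join(" ", args));
        // With Mahout and Hadoop on the classpath, the array would be passed on like this:
        // ToolRunner.run(new Configuration(), new RecommenderJob(), args);
    }
}
```

Pointing --tempDir at a known, empty directory also makes it easier to see which intermediate outputs were actually written before a failure.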



Re: (help!) Can someone scan this

Posted by Ted Dunning <te...@gmail.com>.
Have you run the component jobs by hand successfully?



