Posted to user@mahout.apache.org by Florent Empis <fl...@gmail.com> on 2010/05/10 15:45:02 UTC

Error while trying to use mahout/examples/bin/build-reuters.sh

Hello,

I'm trying to run the Dirichlet example via this shell:
mahout/examples/bin/build-reuters.sh
I'm fairly new to Mahout and to Maven, and I think something is wrong with
the dependencies, and probably with the repositories I should add to my
settings.xml?

Thanks for your help!

-------------------------------
[INFO]
------------------------------------------------------------------------
[ERROR] BUILD ERROR
[INFO]
------------------------------------------------------------------------
[INFO] An exception occured while executing the Java class. null

org.apache.commons.compress.compressors.CompressorException
[INFO]
------------------------------------------------------------------------
[INFO] Trace
org.apache.maven.lifecycle.LifecycleExecutionException: An exception occured
while executing the Java class. null
        at
org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoals(DefaultLifecycleExecutor.java:719)

-------------------------------
[INFO]
------------------------------------------------------------------------
[ERROR] BUILD ERROR
[INFO]
------------------------------------------------------------------------
[INFO] An exception occured while executing the Java class. null

org.apache.commons.compress.compressors.CompressorException
[INFO]
------------------------------------------------------------------------
[INFO] Trace
org.apache.maven.lifecycle.LifecycleExecutionException: An exception occured
while executing the Java class. null
        at
org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoals(DefaultLifecycleExecutor.java:719)

Re: Clustering users

Posted by Sean Owen <sr...@gmail.com>.
I believe those jobs will internally create whatever they need along
the way, including user vectors if needed.

To just create them by themselves, you could run ToItemPrefsMapper and
ToUserVectorReducer from org.apache.mahout.cf.taste.hadoop.item.


On Tue, May 11, 2010 at 5:51 AM, First Qaxy <qa...@yahoo.ca> wrote:
> Hello,
> Just started looking into clustering - KMeansDriver - and have a question on clustering users. Considering that overall I have a huge number of items, how do I create the vectors for the users? Is there any code in Mahout to support that, or do I need to write it by turning all userN, itemM pairs (boolean pref) into a userN vector? I'm not exactly sure what this vector would look like. Also, at the end of the clustering I need to show the original user. Is that possible? Any pointers would be great.
> Thanks, -qf
>
>

Re: Clustering users

Posted by Ted Dunning <te...@gmail.com>.
Generally, you want to do a bit of projection on these data before
clustering.

One option is random projection.  This maps each item to a sparse binary
vector based on a few independent hashes of the original item id.  This
gives you a moderate-dimensional vector to do clustering in (say 100,000
dimensions instead of the original gazillion).

The other option is SVD.  With a gazillion columns in your matrix, you may
want to do the random projection trick first and then do an SVD.  The
resulting 10-30 dimensional representation for users is likely to cluster
much better than the original data.

The random projection you would need to implement.  The SVD can be done once
you have Mahout vectors to play with.
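
Ted's random-projection trick can be sketched in a few lines of plain Java. Everything below (class name, dimensionality, hash-mixing constants) is an illustrative assumption, not a Mahout API; it only shows the mechanics of hashing an item id into a few positions of a sparse binary vector:

```java
import java.util.BitSet;

// Sketch of random projection via hashing: each item id sets a few bit
// positions in a large-but-moderate-dimensional binary vector. A user's
// vector is the union of the bit patterns of the items they prefer.
public class RandomIndexer {
    static final int DIM = 100_000;  // target dimensionality ("say 100,000")
    static final int HASHES = 3;     // a few independent hashes per item

    // Derive HASHES deterministic bit positions from one item id by mixing
    // it with different odd constants (illustrative, not Mahout's hashing).
    static int[] positions(long itemId) {
        int[] pos = new int[HASHES];
        for (int k = 0; k < HASHES; k++) {
            long h = itemId * 0x9E3779B97F4A7C15L + k * 0xC2B2AE3D27D4EB4FL;
            h ^= (h >>> 33);
            pos[k] = (int) Math.floorMod(h, (long) DIM);
        }
        return pos;
    }

    // OR together the bit patterns of all items the user has a boolean
    // preference for, yielding the projected user vector.
    static BitSet userVector(long[] itemIds) {
        BitSet v = new BitSet(DIM);
        for (long id : itemIds) {
            for (int p : positions(id)) v.set(p);
        }
        return v;
    }

    public static void main(String[] args) {
        BitSet u = userVector(new long[] {1001, 1002, 1003});
        System.out.println("non-zeros: " + u.cardinality() + " of " + DIM);
    }
}
```

With DIM large relative to the item count, collisions are rare, so distances between projected vectors approximate distances in the original gazillion-dimensional space.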

On Mon, May 10, 2010 at 9:51 PM, First Qaxy <qa...@yahoo.ca> wrote:

> Considering that overall I have a huge number of items, how do I create the
> vectors for the users? Is there any code in Mahout to support that or do I
> need to write it by turning all userN, itemM pairs (boolean pref) into a
> userN vector ?

Re: Clustering users

Posted by Jeff Eastman <jd...@windwardsolutions.com>.
In general, you will need to write a custom Mapper that accepts whatever 
format your user data is in and converts it to Mahout Vectors in a 
preprocessing job. Look at the Synthetic Control clustering jobs in 
/examples/ for an instance in the Canopy package. Once you have your 
input data as a sequence file [key=Text(userId); 
value=VectorWritable(prefs)] you can run the KMeansDriver - or any of 
the other clustering jobs - on it directly.

The user vectors would indeed be constructed out of the M items for each 
user, e.g. user-i = [item-i0, item-i1, ..., item-iM]. If you wrap the 
user vector in a NamedVectorWritable you can attach the userId as the 
vector name and it will pass through the clustering and out the end in 
the clusteredPoints. Just map your boolean preferences to 0 and 1; a 
ManhattanDistanceMeasure would be a good place to start too.
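
The preprocessing Jeff describes - boolean (user, item) pairs in, one sparse 0/1 vector per user out - can be sketched with plain Java maps standing in for Mahout's Vector classes. The types and method names below are illustrative assumptions, not Mahout APIs:

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Sketch: turn (userId, itemId) boolean-preference pairs into one sparse
// 0/1 vector per user, keyed by user id so the id survives clustering.
public class UserVectorSketch {
    // itemId -> 1.0 for every item the user prefers; absent items are 0.
    static Map<Long, Map<Long, Double>> buildUserVectors(long[][] prefs) {
        Map<Long, Map<Long, Double>> users = new TreeMap<>();
        for (long[] p : prefs) {
            long userId = p[0], itemId = p[1];
            users.computeIfAbsent(userId, k -> new TreeMap<>())
                 .put(itemId, 1.0);  // boolean pref mapped to 1
        }
        return users;
    }

    // Manhattan distance between two sparse 0/1 vectors - the measure Jeff
    // suggests starting with: the count of items preferred by only one user.
    static double manhattan(Map<Long, Double> a, Map<Long, Double> b) {
        Set<Long> keys = new TreeSet<>(a.keySet());
        keys.addAll(b.keySet());
        double d = 0;
        for (long k : keys) {
            d += Math.abs(a.getOrDefault(k, 0.0) - b.getOrDefault(k, 0.0));
        }
        return d;
    }

    public static void main(String[] args) {
        long[][] prefs = {{101, 1001}, {101, 1002}, {102, 1002}, {102, 1003}};
        Map<Long, Map<Long, Double>> users = buildUserVectors(prefs);
        // users 101 and 102 share item 1002, differ on 1001 and 1003
        System.out.println(manhattan(users.get(101L), users.get(102L)));
    }
}
```

In the real job this logic would live in a Mapper emitting [Text(userId), VectorWritable] pairs to a sequence file, as described above.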

When you are done, you will have a 'clusteredPoints' directory with more 
sequence files [key=IntWritable(clusterId); value=VectorWritable(prefs)] 
which you can feed into subsequent processing or output with the 
ClusterDumper. Have fun and let me know if you need any more hints.

Jeff


On 5/10/10 9:51 PM, First Qaxy wrote:
> Hello,
> Just started looking into clustering - KMeansDriver - and have a question on clustering users. Considering that overall I have a huge number of items, how do I create the vectors for the users? Is there any code in Mahout to support that, or do I need to write it by turning all userN, itemM pairs (boolean pref) into a userN vector? I'm not exactly sure what this vector would look like. Also, at the end of the clustering I need to show the original user. Is that possible? Any pointers would be great.
> Thanks, -qf
>
>
>    


Re: RecommenderJob output

Posted by Sean Owen <sr...@gmail.com>.
The values are entries in the final recommendation vector. They don't
have a good interpretation by themselves, but larger values should
mean better recommendation. So the recommendations are ordered by this
value. It's included just in case it is useful. In other recommender
systems (like .pseudo), this would be the actual estimated preference.

However I don't immediately see why the result would be negative
infinity, ever. I'd have to look into that.
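
As a rough intuition for where a value like 1015:4.0 can come from in a co-occurrence recommender: score a candidate item by summing, over the items the user already prefers, the number of users who prefer both. This is an illustrative reconstruction that happens to reproduce the sample numbers in the quoted message, not the exact Mahout code:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Reconstruct the sample scores by summing co-occurrence counts.
public class CooccurrenceScore {
    // Number of users who prefer both items a and b.
    static int cooccur(Map<Integer, Set<Integer>> prefs, int a, int b) {
        int n = 0;
        for (Set<Integer> items : prefs.values()) {
            if (items.contains(a) && items.contains(b)) n++;
        }
        return n;
    }

    // Candidate's score for a user: sum of its co-occurrences with each
    // of the user's preferred items.
    static double score(Map<Integer, Set<Integer>> prefs, int user, int candidate) {
        double s = 0;
        for (int item : prefs.get(user)) {
            s += cooccur(prefs, candidate, item);
        }
        return s;
    }

    public static void main(String[] args) {
        // Boolean preferences from the thread's sample input, user -> items.
        Map<Integer, Set<Integer>> prefs = new HashMap<>();
        prefs.put(101, Set.of(1001, 1002, 1003, 1004, 1005));
        prefs.put(102, Set.of(1002, 1003));
        prefs.put(103, Set.of(1002, 1003, 1004));
        prefs.put(105, Set.of(1001, 1002, 1003, 1004, 1015));
        prefs.put(106, Set.of(1002, 1003, 1004, 1020, 1021));

        System.out.println("101 -> 1015: " + score(prefs, 101, 1015)); // 4.0
        System.out.println("101 -> 1021: " + score(prefs, 101, 1021)); // 3.0
    }
}
```

Item 1015 co-occurs (via user 105) once with each of 1001-1004 and never with 1005, giving 4.0 for user 101, which matches the output quoted below the reply.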

On Tue, May 11, 2010 at 7:11 AM, First Qaxy <qa...@yahoo.ca> wrote:
> Hello,
> When running the RecommenderJob with --booleanData false on this input:
> 101,1001
> 101,1002
> 101,1003
> 101,1004
> 101,1005
> 102,1002
> 102,1003
> 103,1002
> 103,1003
> 103,1004
> 105,1001
> 105,1002
> 105,1003
> 105,1004
> 105,1015
> 106,1002
> 106,1003
> 106,1004
> 106,1020
> 106,1021
> the output that I'm getting has:
> 101     [1015:4.0,1021:3.0,1020:3.0,1005:-Infinity,1004:-Infinity,1003:-Infinity,1001:-Infinity,1002:-Infinity]
> 102     [1004:10.0,1005:8.0,1020:2.0,1021:2.0,1015:2.0,1003:-Infinity,1002:-Infinity]
> 103     [1005:12.0,1021:3.0,1020:3.0,1015:3.0,1004:-Infinity,1002:-Infinity,1003:-Infinity]
> 105     [1005:14.0,1020:3.0,1021:3.0,1015:-Infinity,1004:-Infinity,1003:-Infinity,1001:-Infinity,1002:-Infinity]
> 106     [1005:12.0,1021:4.0,1015:3.0,1004:-Infinity,1002:-Infinity,1003:-Infinity,1020:-Infinity]
> What is the meaning (formula) of the float number, e.g. 101     [1015:4.0 <= what is 4.0?
> Thanks, -qf
>
>

Re: RecommenderJob output

Posted by First Qaxy <qa...@yahoo.ca>.
> the training model. I'm interested in the "model" deployed in production, not just for the purpose of training.

err, I meant to say : not just for the purpose of *testing*.
--- On Tue, 5/11/10, First Qaxy <qa...@yahoo.ca> wrote:

From: First Qaxy <qa...@yahoo.ca>
Subject: Re: RecommenderJob output
To: user@mahout.apache.org
Received: Tuesday, May 11, 2010, 10:01 PM

Great info. No, I'm not looking into having multiple active processes trying to update it. It's more of a single worker process that needs to update the "model" as new data becomes available (every few hours, days,... depending on the customer needs). Ideally I should be able to tell which users were affected so only their recommendations would end up being updated back to Solr. I am getting closer to the end of the evaluation process of Mahout and will soon proceed with the implementation, at which point I hope I'll be able to provide better feedback and contribute more.

On a different thread - I have a high-level / best-practices question: when doing clustering or classification with large datasets, is the expectation that the algorithms run on the whole available data set or on a (carefully selected) subset, i.e. the training model? I'm interested in the "model" deployed in production, not just for the purpose of training.
If the answer is a subset, what is usually a good size relative to the full data set, and how do people approach this in order to get a representative smaller set?
-qf
--- On Tue, 5/11/10, Sean Owen <sr...@gmail.com> wrote:

From: Sean Owen <sr...@gmail.com>
Subject: Re: RecommenderJob output
To: user@mahout.apache.org
Received: Tuesday, May 11, 2010, 5:08 PM

Can you update it while it's running? Not really. It's a multi-phase
batch job and I don't think you could meaningfully change it on the
fly.

Do you need to run the whole thing every time? No, not at all. Phase 1
(item IDs to item indices) doesn't need to run every time, nor does
phase 3 (count co-occurrence). It's OK if these are a little out of
date. Phase 2 is user vector generation; while I didn't write any
ability to simply append a new user vector to its output, it's easy to
write. So you don't have to run that every time.

Phase 4 and 5 are really where the recommendation happens. Those go
together. You can limit which users it processes though with a file of
user IDs, --usersFile.

I'd say the core job is nearing maturity -- think it's tuned and
debugged. But these kind of practical hooks, like being able to
incrementally update aspects of the pipeline, are exactly what's
needed next. I'd welcome your input and patches in this regard.

Sean


On Tue, May 11, 2010 at 10:00 PM, First Qaxy <qa...@yahoo.ca> wrote:
> One question on the recommendation lifecycle: once a RecommenderJob has been run and the intermediate/temp model created, what is the process of maintaining it? Can I update it, or parts of it, to reflect new data?
> For example, if I have a new user, or new preferences for an existing user, that I want to compute recommendations for, can I do that by incrementally updating the internal model and regenerating only the recommendations for the user I'm interested in?
>
> Thanks.
> -qf
> --- On Tue, 5/11/10, Sean Owen <sr...@gmail.com> wrote:
>
> From: Sean Owen <sr...@gmail.com>
> Subject: Re: RecommenderJob output
> To: user@mahout.apache.org
> Cc: mahout-user@lucene.apache.org
> Received: Tuesday, May 11, 2010, 3:55 AM
>
> I just committed more of my local changes, since I'm actively
> improving and fixing things here.
>
> My output looks more reasonable:
>
> 101     [1015:4.0,1021:3.0,1020:3.0]
> 102     [1004:10.0,1005:8.0,1021:2.0,1020:2.0,1015:2.0]
> 103     [1005:12.0,1021:3.0,1015:3.0,1020:3.0]
> 105     [1005:14.0,1021:3.0,1020:3.0]
> 106     [1005:12.0,1021:4.0,1015:3.0]
>
> So you might just try the code from head. booleanData doesn't really
> affect the output, it just enables optimizations for this case.
>
>
>





Re: RecommenderJob output

Posted by First Qaxy <qa...@yahoo.ca>.
Great info. No, I'm not looking into having multiple active processes trying to update it. It's more of a single worker process that needs to update the "model" as new data becomes available (every few hours, days,... depending on the customer needs). Ideally I should be able to tell which users were affected so only their recommendations would end up being updated back to Solr. I am getting closer to the end of the evaluation process of Mahout and will soon proceed with the implementation, at which point I hope I'll be able to provide better feedback and contribute more.

On a different thread - I have a high-level / best-practices question: when doing clustering or classification with large datasets, is the expectation that the algorithms run on the whole available data set or on a (carefully selected) subset, i.e. the training model? I'm interested in the "model" deployed in production, not just for the purpose of training.
If the answer is a subset, what is usually a good size relative to the full data set, and how do people approach this in order to get a representative smaller set?
-qf
--- On Tue, 5/11/10, Sean Owen <sr...@gmail.com> wrote:

From: Sean Owen <sr...@gmail.com>
Subject: Re: RecommenderJob output
To: user@mahout.apache.org
Received: Tuesday, May 11, 2010, 5:08 PM

Can you update it while it's running? Not really. It's a multi-phase
batch job and I don't think you could meaningfully change it on the
fly.

Do you need to run the whole thing every time? No, not at all. Phase 1
(item IDs to item indices) doesn't need to run every time, nor does
phase 3 (count co-occurrence). It's OK if these are a little out of
date. Phase 2 is user vector generation; while I didn't write any
ability to simply append a new user vector to its output, it's easy to
write. So you don't have to run that every time.

Phase 4 and 5 are really where the recommendation happens. Those go
together. You can limit which users it processes though with a file of
user IDs, --usersFile.

I'd say the core job is nearing maturity -- think it's tuned and
debugged. But these kind of practical hooks, like being able to
incrementally update aspects of the pipeline, are exactly what's
needed next. I'd welcome your input and patches in this regard.

Sean


On Tue, May 11, 2010 at 10:00 PM, First Qaxy <qa...@yahoo.ca> wrote:
> One question on the recommendation lifecycle: once a RecommenderJob has been run and the intermediate/temp model created, what is the process of maintaining it? Can I update it, or parts of it, to reflect new data?
> For example, if I have a new user, or new preferences for an existing user, that I want to compute recommendations for, can I do that by incrementally updating the internal model and regenerating only the recommendations for the user I'm interested in?
>
> Thanks.
> -qf
> --- On Tue, 5/11/10, Sean Owen <sr...@gmail.com> wrote:
>
> From: Sean Owen <sr...@gmail.com>
> Subject: Re: RecommenderJob output
> To: user@mahout.apache.org
> Cc: mahout-user@lucene.apache.org
> Received: Tuesday, May 11, 2010, 3:55 AM
>
> I just committed more of my local changes, since I'm actively
> improving and fixing things here.
>
> My output looks more reasonable:
>
> 101     [1015:4.0,1021:3.0,1020:3.0]
> 102     [1004:10.0,1005:8.0,1021:2.0,1020:2.0,1015:2.0]
> 103     [1005:12.0,1021:3.0,1015:3.0,1020:3.0]
> 105     [1005:14.0,1021:3.0,1020:3.0]
> 106     [1005:12.0,1021:4.0,1015:3.0]
>
> So you might just try the code from head. booleanData doesn't really
> affect the output, it just enables optimizations for this case.
>
>
>



Re: RecommenderJob output

Posted by Sean Owen <sr...@gmail.com>.
Can you update it while it's running? Not really. It's a multi-phase
batch job and I don't think you could meaningfully change it on the
fly.

Do you need to run the whole thing every time? No, not at all. Phase 1
(item IDs to item indices) doesn't need to run every time, nor does
phase 3 (count co-occurrence). It's OK if these are a little out of
date. Phase 2 is user vector generation; while I didn't write any
ability to simply append a new user vector to its output, it's easy to
write. So you don't have to run that every time.

Phase 4 and 5 are really where the recommendation happens. Those go
together. You can limit which users it processes though with a file of
user IDs, --usersFile.

I'd say the core job is nearing maturity -- think it's tuned and
debugged. But these kind of practical hooks, like being able to
incrementally update aspects of the pipeline, are exactly what's
needed next. I'd welcome your input and patches in this regard.

Sean


On Tue, May 11, 2010 at 10:00 PM, First Qaxy <qa...@yahoo.ca> wrote:
> One question on the recommendation lifecycle: once a RecommenderJob has been run and the intermediate/temp model created, what is the process of maintaining it? Can I update it, or parts of it, to reflect new data?
> For example, if I have a new user, or new preferences for an existing user, that I want to compute recommendations for, can I do that by incrementally updating the internal model and regenerating only the recommendations for the user I'm interested in?
>
> Thanks.
> -qf
> --- On Tue, 5/11/10, Sean Owen <sr...@gmail.com> wrote:
>
> From: Sean Owen <sr...@gmail.com>
> Subject: Re: RecommenderJob output
> To: user@mahout.apache.org
> Cc: mahout-user@lucene.apache.org
> Received: Tuesday, May 11, 2010, 3:55 AM
>
> I just committed more of my local changes, since I'm actively
> improving and fixing things here.
>
> My output looks more reasonable:
>
> 101     [1015:4.0,1021:3.0,1020:3.0]
> 102     [1004:10.0,1005:8.0,1021:2.0,1020:2.0,1015:2.0]
> 103     [1005:12.0,1021:3.0,1015:3.0,1020:3.0]
> 105     [1005:14.0,1021:3.0,1020:3.0]
> 106     [1005:12.0,1021:4.0,1015:3.0]
>
> So you might just try the code from head. booleanData doesn't really
> affect the output, it just enables optimizations for this case.
>
>
>

Re: RecommenderJob output

Posted by First Qaxy <qa...@yahoo.ca>.
Thanks, I've tested it and it did stop showing the -Infinity values.

-qf
--- On Tue, 5/11/10, Sean Owen <sr...@gmail.com> wrote:

From: Sean Owen <sr...@gmail.com>
Subject: Re: RecommenderJob output
To: user@mahout.apache.org
Cc: mahout-user@lucene.apache.org
Received: Tuesday, May 11, 2010, 3:55 AM

I just committed more of my local changes, since I'm actively
improving and fixing things here.

My output looks more reasonable:

101     [1015:4.0,1021:3.0,1020:3.0]
102     [1004:10.0,1005:8.0,1021:2.0,1020:2.0,1015:2.0]
103     [1005:12.0,1021:3.0,1015:3.0,1020:3.0]
105     [1005:14.0,1021:3.0,1020:3.0]
106     [1005:12.0,1021:4.0,1015:3.0]

So you might just try the code from head. booleanData doesn't really
affect the output, it just enables optimizations for this case.



Re: RecommenderJob output

Posted by First Qaxy <qa...@yahoo.ca>.
One question on the recommendation lifecycle: once a RecommenderJob has been run and the intermediate/temp model created, what is the process of maintaining it? Can I update it, or parts of it, to reflect new data?
For example, if I have a new user, or new preferences for an existing user, that I want to compute recommendations for, can I do that by incrementally updating the internal model and regenerating only the recommendations for the user I'm interested in?

Thanks.
-qf
--- On Tue, 5/11/10, Sean Owen <sr...@gmail.com> wrote:

From: Sean Owen <sr...@gmail.com>
Subject: Re: RecommenderJob output
To: user@mahout.apache.org
Cc: mahout-user@lucene.apache.org
Received: Tuesday, May 11, 2010, 3:55 AM

I just committed more of my local changes, since I'm actively
improving and fixing things here.

My output looks more reasonable:

101     [1015:4.0,1021:3.0,1020:3.0]
102     [1004:10.0,1005:8.0,1021:2.0,1020:2.0,1015:2.0]
103     [1005:12.0,1021:3.0,1015:3.0,1020:3.0]
105     [1005:14.0,1021:3.0,1020:3.0]
106     [1005:12.0,1021:4.0,1015:3.0]

So you might just try the code from head. booleanData doesn't really
affect the output, it just enables optimizations for this case.



Re: RecommenderJob output

Posted by Sean Owen <sr...@gmail.com>.
I just committed more of my local changes, since I'm actively
improving and fixing things here.

My output looks more reasonable:

101     [1015:4.0,1021:3.0,1020:3.0]
102     [1004:10.0,1005:8.0,1021:2.0,1020:2.0,1015:2.0]
103     [1005:12.0,1021:3.0,1015:3.0,1020:3.0]
105     [1005:14.0,1021:3.0,1020:3.0]
106     [1005:12.0,1021:4.0,1015:3.0]

So you might just try the code from head. booleanData doesn't really
affect the output, it just enables optimizations for this case.

Re: RecommenderJob output

Posted by First Qaxy <qa...@yahoo.ca>.
Sorry, typed the wrong thing - yes, it is true in fact.

--- On Tue, 5/11/10, Sean Owen <sr...@gmail.com> wrote:

From: Sean Owen <sr...@gmail.com>
Subject: Re: RecommenderJob output
To: user@mahout.apache.org
Cc: mahout-user@lucene.apache.org
Received: Tuesday, May 11, 2010, 3:23 AM

Er, wait why are you setting booleanData = false? Though the
formatting got messed up here, it looks like you do not have explicit
ratings. So you should set it to true.

On Tue, May 11, 2010 at 7:11 AM, First Qaxy <qa...@yahoo.ca> wrote:
> Hello,
> When running the RecommenderJob with --booleanData false on this input:
> 101,1001
> 101,1002
> 101,1003
> 101,1004
> 101,1005
> 102,1002
> 102,1003
> 103,1002
> 103,1003
> 103,1004
> 105,1001
> 105,1002
> 105,1003
> 105,1004
> 105,1015
> 106,1002
> 106,1003
> 106,1004
> 106,1020
> 106,1021
> the output that I'm getting has:
> 101     [1015:4.0,1021:3.0,1020:3.0,1005:-Infinity,1004:-Infinity,1003:-Infinity,1001:-Infinity,1002:-Infinity]
> 102     [1004:10.0,1005:8.0,1020:2.0,1021:2.0,1015:2.0,1003:-Infinity,1002:-Infinity]
> 103     [1005:12.0,1021:3.0,1020:3.0,1015:3.0,1004:-Infinity,1002:-Infinity,1003:-Infinity]
> 105     [1005:14.0,1020:3.0,1021:3.0,1015:-Infinity,1004:-Infinity,1003:-Infinity,1001:-Infinity,1002:-Infinity]
> 106     [1005:12.0,1021:4.0,1015:3.0,1004:-Infinity,1002:-Infinity,1003:-Infinity,1020:-Infinity]
> What is the meaning (formula) of the float number, e.g. 101     [1015:4.0 <= what is 4.0?
> Thanks, -qf
>
>



Re: RecommenderJob output

Posted by Sean Owen <sr...@gmail.com>.
Er, wait why are you setting booleanData = false? Though the
formatting got messed up here, it looks like you do not have explicit
ratings. So you should set it to true.

On Tue, May 11, 2010 at 7:11 AM, First Qaxy <qa...@yahoo.ca> wrote:
> Hello,
> When running the RecommenderJob with --booleanData false on this input:
> 101,1001
> 101,1002
> 101,1003
> 101,1004
> 101,1005
> 102,1002
> 102,1003
> 103,1002
> 103,1003
> 103,1004
> 105,1001
> 105,1002
> 105,1003
> 105,1004
> 105,1015
> 106,1002
> 106,1003
> 106,1004
> 106,1020
> 106,1021
> the output that I'm getting has:
> 101     [1015:4.0,1021:3.0,1020:3.0,1005:-Infinity,1004:-Infinity,1003:-Infinity,1001:-Infinity,1002:-Infinity]
> 102     [1004:10.0,1005:8.0,1020:2.0,1021:2.0,1015:2.0,1003:-Infinity,1002:-Infinity]
> 103     [1005:12.0,1021:3.0,1020:3.0,1015:3.0,1004:-Infinity,1002:-Infinity,1003:-Infinity]
> 105     [1005:14.0,1020:3.0,1021:3.0,1015:-Infinity,1004:-Infinity,1003:-Infinity,1001:-Infinity,1002:-Infinity]
> 106     [1005:12.0,1021:4.0,1015:3.0,1004:-Infinity,1002:-Infinity,1003:-Infinity,1020:-Infinity]
> What is the meaning (formula) of the float number, e.g. 101     [1015:4.0 <= what is 4.0?
> Thanks, -qf
>
>

RE: Thread Hijacking Re: RecommenderJob output

Posted by Grant Ingersoll <gs...@apache.org>.
On May 11, 2010, at 8:15 AM, Sean Owen wrote:

> (Did that happen? I only see my three replies to the original message
> -- sure, maybe that could have been one -- but all were directly
> relevant to the first message.)
> 
> (Or is this somehow looking connected to another thread because it
> shares the same subject? didn't happen for me in Gmail at least)

There were actually a few hijacks on this thread, AFAICT.  This one, the Safari one and the Clustering one.   If you look at the full headers, you'll see either a common message-id or a common reply-to header, which causes some (but not all) mail clients to automatically group those threads.

-Grant



Re: RecommenderJob output

Posted by Sean Owen <sr...@gmail.com>.
(Did that happen? I only see my three replies to the original message
-- sure, maybe that could have been one -- but all were directly
relevant to the first message.)

(Or is this somehow looking connected to another thread because it
shares the same subject? didn't happen for me in Gmail at least)

On Tue, May 11, 2010 at 1:10 PM, Grant Ingersoll <gs...@apache.org> wrote:
> Please, when starting a new thread, start a new message.

Re: RecommenderJob output

Posted by First Qaxy <qa...@yahoo.ca>.
Hi Grant,
I wasn't aware of that. Thanks. I'll do that going forward.
-qf

--- On Tue, 5/11/10, Grant Ingersoll <gs...@apache.org> wrote:

From: Grant Ingersoll <gs...@apache.org>
Subject: Re: RecommenderJob output
To: user@mahout.apache.org
Cc: mahout-user@lucene.apache.org
Received: Tuesday, May 11, 2010, 8:10 AM

Please, when starting a new thread, start a new message.  

See http://people.apache.org/~hossman/#threadhijack
<snip>
When starting a new discussion on a mailing list, please do not reply to 
an existing message, instead start a fresh email.  Even if you change the 
subject line of your email, other mail headers still track which thread 
you replied to and your question is "hidden" in that thread and gets less 
attention.   It makes following discussions in the mailing list archives 
particularly difficult.
See Also:  
http://en.wikipedia.org/wiki/User:DonDiego/Thread_hijacking
</snip>


On May 11, 2010, at 2:11 AM, First Qaxy wrote:

> Hello,
> When running the RecommenderJob with --booleanData false on this input:
> 101,1001
> 101,1002
> 101,1003
> 101,1004
> 101,1005
> 102,1002
> 102,1003
> 103,1002
> 103,1003
> 103,1004
> 105,1001
> 105,1002
> 105,1003
> 105,1004
> 105,1015
> 106,1002
> 106,1003
> 106,1004
> 106,1020
> 106,1021
> the output that I'm getting has:
> 101     [1015:4.0,1021:3.0,1020:3.0,1005:-Infinity,1004:-Infinity,1003:-Infinity,1001:-Infinity,1002:-Infinity]
> 102     [1004:10.0,1005:8.0,1020:2.0,1021:2.0,1015:2.0,1003:-Infinity,1002:-Infinity]
> 103     [1005:12.0,1021:3.0,1020:3.0,1015:3.0,1004:-Infinity,1002:-Infinity,1003:-Infinity]
> 105     [1005:14.0,1020:3.0,1021:3.0,1015:-Infinity,1004:-Infinity,1003:-Infinity,1001:-Infinity,1002:-Infinity]
> 106     [1005:12.0,1021:4.0,1015:3.0,1004:-Infinity,1002:-Infinity,1003:-Infinity,1020:-Infinity]
> What is the meaning (formula) of the float number, e.g. 101     [1015:4.0 <= what is 4.0?
> Thanks, -qf
> 




Re: RecommenderJob output

Posted by Grant Ingersoll <gs...@apache.org>.
Please, when starting a new thread, start a new message.  

See http://people.apache.org/~hossman/#threadhijack
<snip>
When starting a new discussion on a mailing list, please do not reply to 
an existing message, instead start a fresh email.  Even if you change the 
subject line of your email, other mail headers still track which thread 
you replied to and your question is "hidden" in that thread and gets less 
attention.   It makes following discussions in the mailing list archives 
particularly difficult.
See Also:  
http://en.wikipedia.org/wiki/User:DonDiego/Thread_hijacking
</snip>


On May 11, 2010, at 2:11 AM, First Qaxy wrote:

> Hello,
> When running the RecommenderJob with --booleanData false on this input:
> 101,1001
> 101,1002
> 101,1003
> 101,1004
> 101,1005
> 102,1002
> 102,1003
> 103,1002
> 103,1003
> 103,1004
> 105,1001
> 105,1002
> 105,1003
> 105,1004
> 105,1015
> 106,1002
> 106,1003
> 106,1004
> 106,1020
> 106,1021
> the output that I'm getting has:
> 101     [1015:4.0,1021:3.0,1020:3.0,1005:-Infinity,1004:-Infinity,1003:-Infinity,1001:-Infinity,1002:-Infinity]
> 102     [1004:10.0,1005:8.0,1020:2.0,1021:2.0,1015:2.0,1003:-Infinity,1002:-Infinity]
> 103     [1005:12.0,1021:3.0,1020:3.0,1015:3.0,1004:-Infinity,1002:-Infinity,1003:-Infinity]
> 105     [1005:14.0,1020:3.0,1021:3.0,1015:-Infinity,1004:-Infinity,1003:-Infinity,1001:-Infinity,1002:-Infinity]
> 106     [1005:12.0,1021:4.0,1015:3.0,1004:-Infinity,1002:-Infinity,1003:-Infinity,1020:-Infinity]
> What is the meaning (formula) of the float number, e.g. 101     [1015:4.0 <= what is 4.0?
> Thanks, -qf
> 


RecommenderJob output

Posted by First Qaxy <qa...@yahoo.ca>.
Hello,
When running the RecommenderJob with --booleanData false on this input:
101,1001
101,1002
101,1003
101,1004
101,1005
102,1002
102,1003
103,1002
103,1003
103,1004
105,1001
105,1002
105,1003
105,1004
105,1015
106,1002
106,1003
106,1004
106,1020
106,1021
the output that I'm getting has:
101     [1015:4.0,1021:3.0,1020:3.0,1005:-Infinity,1004:-Infinity,1003:-Infinity,1001:-Infinity,1002:-Infinity]
102     [1004:10.0,1005:8.0,1020:2.0,1021:2.0,1015:2.0,1003:-Infinity,1002:-Infinity]
103     [1005:12.0,1021:3.0,1020:3.0,1015:3.0,1004:-Infinity,1002:-Infinity,1003:-Infinity]
105     [1005:14.0,1020:3.0,1021:3.0,1015:-Infinity,1004:-Infinity,1003:-Infinity,1001:-Infinity,1002:-Infinity]
106     [1005:12.0,1021:4.0,1015:3.0,1004:-Infinity,1002:-Infinity,1003:-Infinity,1020:-Infinity]
What is the meaning (formula) of the float number, e.g. 101     [1015:4.0 <= what is 4.0?
Thanks, -qf


Clustering users

Posted by First Qaxy <qa...@yahoo.ca>.
Hello,
Just started looking into clustering - KMeansDriver - and have a question on clustering users. Considering that overall I have a huge number of items, how do I create the vectors for the users? Is there any code in Mahout to support that, or do I need to write it by turning all userN, itemM pairs (boolean pref) into a userN vector? I'm not exactly sure what this vector would look like. Also, at the end of the clustering I need to show the original user. Is that possible? Any pointers would be great.
Thanks, -qf


Re: Wiki and Safari

Posted by Ted Dunning <te...@gmail.com>.
Renders fine for me using Chrome

On Mon, May 10, 2010 at 11:35 AM, Jeff Eastman
<jd...@windwardsolutions.com>wrote:

> Anybody else have problems viewing our Wiki using Safari? The pictures on
> this page don't render for me and I also cannot see many of the {code}
> blocks on other pages.  Firefox seems to be just fine.
>
>
> On 5/10/10 7:39 AM, Robin Anil wrote:
>
>> https://cwiki.apache.org/confluence/display/MAHOUT/k-Means
>>
>>
>>
>
>

Re: Wiki and Safari

Posted by Sebastian Schelter <se...@zalando.de>.
Looks fine for me in Safari on OS X 10.4.11

2010/5/10 Jeff Eastman <jd...@windwardsolutions.com>

> Anybody else have problems viewing our Wiki using Safari? The pictures on
> this page don't render for me and I also cannot see many of the {code}
> blocks on other pages.  Firefox seems to be just fine.
>
>
> On 5/10/10 7:39 AM, Robin Anil wrote:
>
>> https://cwiki.apache.org/confluence/display/MAHOUT/k-Means
>>
>>
>>
>
>

Wiki and Safari

Posted by Jeff Eastman <jd...@windwardsolutions.com>.
Anybody else have problems viewing our Wiki using Safari? The pictures 
on this page don't render for me and I also cannot see many of the 
{code} blocks on other pages.  Firefox seems to be just fine.


On 5/10/10 7:39 AM, Robin Anil wrote:
> https://cwiki.apache.org/confluence/display/MAHOUT/k-Means
>
>    


Re: Error while trying to use mahout/examples/bin/build-reuters.sh

Posted by Robin Anil <ro...@gmail.com>.
https://cwiki.apache.org/confluence/display/MAHOUT/k-Means

Re: Error while trying to use mahout/examples/bin/build-reuters.sh

Posted by Jeff Eastman <jd...@windwardsolutions.com>.
That seems to have done it. I ran the script with the latest commit and 
there is nothing new in /. Quick work :)


On 5/10/10 2:09 PM, Sean Owen wrote:
> Yes that sounds like a symptom of the last issue I fixed up. I just
> committed a similar fix.
>
> Basically wherever we did...
>
> String path = parent + "/foo"
>
> and it became
>
> Path path = new Path(parent, "/foo")
>
> the "/" needs to go away. Perhaps surprisingly, in something like the
> above, parent gets ignored.
>
> On Mon, May 10, 2010 at 10:06 PM, Jeff Eastman
> <jd...@windwardsolutions.com>  wrote:
>    
>> Yup, I drive a Mac and I've got /tokenized-documents on my root too. Need to
>> investigate this further...
>>
>> On 5/10/10 1:43 PM, Florent Empis wrote:
>>      
>>> Hi,
>>>
>>> It might help for the build part, but probably won't fix the 2nd issue?
>>> The / is not writeable on most systems so creation of
>>> /tokenized-documents/_temporary
>>> will still fail?
>>>
>>>        
>>
>>      
>    


Re: Error while trying to use mahout/examples/bin/build-reuters.sh

Posted by Sean Owen <sr...@gmail.com>.
Yes that sounds like a symptom of the last issue I fixed up. I just
committed a similar fix.

Basically wherever we did...

String path = parent + "/foo"

and it became

Path path = new Path(parent, "/foo")

the "/" needs to go away. Perhaps surprisingly, in something like the
above, parent gets ignored.
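
The same resolution rule exists in java.nio, which makes the gotcha easy to demonstrate without Hadoop on the classpath (Hadoop's Path(parent, child) behaves analogously when the child is absolute):

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Demonstrates why new Path(parent, "/foo") ignores the parent: joining a
// parent with an *absolute* child discards the parent entirely. Shown with
// java.nio.Path as a stand-in for Hadoop's Path; the directory names are
// illustrative.
public class AbsoluteChildDemo {
    public static void main(String[] args) {
        Path parent = Paths.get("/user/hadoop/reuters");

        // Relative child: resolved under the parent, as expected.
        System.out.println(parent.resolve("tokenized-documents"));
        // -> /user/hadoop/reuters/tokenized-documents

        // Absolute child: the parent is silently dropped, which is why the
        // script ended up writing /tokenized-documents at the filesystem root.
        System.out.println(parent.resolve("/tokenized-documents"));
        // -> /tokenized-documents
    }
}
```

So the fix Sean describes is simply to pass "foo" rather than "/foo" as the child component.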

On Mon, May 10, 2010 at 10:06 PM, Jeff Eastman
<jd...@windwardsolutions.com> wrote:
> Yup, I drive a Mac and I've got /tokenized-documents on my root too. Need to
> investigate this further...
>
> On 5/10/10 1:43 PM, Florent Empis wrote:
>>
>> Hi,
>>
>> It might help for the build part, but probably won't fix the 2nd issue?
>> The / is not writeable on most systems so creation of
>> /tokenized-documents/_temporary
>> will still fail?
>>
>
>

Re: Error while trying to use mahout/examples/bin/build-reuters.sh

Posted by Jeff Eastman <jd...@windwardsolutions.com>.
Yup, I drive a Mac and I've got /tokenized-documents on my root too. 
Need to investigate this further...

On 5/10/10 1:43 PM, Florent Empis wrote:
> Hi,
>
> It might help for the build part, but probably won't fix the 2nd issue?
> The / is not writeable on most systems so creation of
> /tokenized-documents/_temporary
> will still fail?
>    


Re: Error while trying to use mahout/examples/bin/build-reuters.sh

Posted by Florent Empis <fl...@gmail.com>.
Yep, I'm running trunk.
However, I *think* that when I try to build utils (via mvn -B) it's actually
pulling mahout-core/mahout-test jars from a repo, which seems strange to
me...


2010/5/10 Jeff Eastman <jd...@windwardsolutions.com>

> Sean posted something about that recently (5/5/10: Re: Installation problem
> in utils) in which he claims to have fixed it. At least, all the tests ran.
> But then, you are running the reuters script and that does not get exercised
> in the build. I suspect there are some more issues with the recent temp file
> allocation patch. Are you running trunk?
>
>
> On 5/10/10 1:43 PM, Florent Empis wrote:
>
>> Hi,
>>
>> It might help for the build part, but probably won't fix the 2nd issue?
>> The / is not writeable on most systems so creation of
>> /tokenized-documents/_temporary
>> will still fail?
>>
>> 2010/5/10 Jeff Eastman<jd...@windwardsolutions.com>
>>
>>
>>
>>> Hi Florent,
>>>
>>> I successfully ran the new build-reuters.sh before I committed it this
>>> morning, so I suspect you must have some other problem in your system.
>>> Have
>>> you tried deleting your Maven repository (.m2) and doing a full mvn clean
>>> install?
>>>
>>> Jeff
>>>
>>>
>>> On 5/10/10 12:50 PM, Florent Empis wrote:
>>>
>>>
>>>
>>>> Hi,
>>>>
>>>> I've seen the commit from Robin this afternoon so I gave it another try.
>>>> Using the new shell I still run into a few problems
>>>> At first, in order to satisfy a dependency to slf4j I've had to add the
>>>> following to examples/pom.xml (once again I'm not a maven expert, so
>>>> this
>>>> may not be the correct way to do it)
>>>>
>>>> <dependency>
>>>>   <groupId>org.slf4j</groupId>
>>>>   <artifactId>slf4j-nop</artifactId>
>>>>   <version>1.5.8</version>
>>>>   <classifier>sources</classifier>
>>>> </dependency>
>>>>
>>>> Then, after a succesful mvn -B
>>>> I've launched the shell:
>>>> florent@florent-laptop:~/workspace/mahout$
>>>> ./examples/bin/build-reuters.sh
>>>>
>>>> It fails with the following error:
>>>> 10/05/10 21:28:06 WARN mapred.LocalJobRunner: job_local_0001
>>>> java.io.IOException: The temporary job-output directory
>>>> file:/tokenized-documents/_temporary doesn't exist!
>>>> at
>>>>
>>>>
>>>> org.apache.hadoop.mapred.FileOutputCommitter.getWorkPath(FileOutputCommitter.java:204)
>>>> at
>>>>
>>>>
>>>> org.apache.hadoop.mapred.FileOutputFormat.getTaskOutputPath(FileOutputFormat.java:234)
>>>> at
>>>>
>>>>
>>>> org.apache.hadoop.mapred.SequenceFileOutputFormat.getRecordWriter(SequenceFileOutputFormat.java:48)
>>>> at
>>>>
>>>>
>>>> org.apache.hadoop.mapred.MapTask$DirectMapOutputCollector.<init>(MapTask.java:662)
>>>> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:352)
>>>> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
>>>> at
>>>> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
>>>> 10/05/10 21:28:07 INFO mapred.JobClient:  map 0% reduce 0%
>>>> 10/05/10 21:28:07 INFO mapred.JobClient: Job complete: job_local_0001
>>>> 10/05/10 21:28:07 INFO mapred.JobClient: Counters: 0
>>>> 10/05/10 21:28:07 ERROR driver.MahoutDriver: MahoutDriver failed with
>>>> args:
>>>> [-i, ./examples/bin/work/reuters-out-seqdir/, -o,
>>>> ./examples/bin/work/reuters-out-seqdir-sparse, null]
>>>> Job failed!
>>>> Exception in thread "main" java.io.IOException: Job failed!
>>>> at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1252)
>>>> at
>>>>
>>>>
>>>> org.apache.mahout.utils.vectors.text.DocumentProcessor.tokenizeDocuments(DocumentProcessor.java:97)
>>>> at
>>>>
>>>>
>>>> org.apache.mahout.text.SparseVectorsFromSequenceFiles.main(SparseVectorsFromSequenceFiles.java:215)
>>>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>> at
>>>>
>>>>
>>>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>> at
>>>>
>>>>
>>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>> at java.lang.reflect.Method.invoke(Method.java:597)
>>>> at
>>>>
>>>>
>>>> org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
>>>> at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
>>>> at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:172)
>>>>
>>>> A find makes me think that the issue is
>>>> in
>>>>
>>>> /utils/src/main/java/org/apache/mahout/utils/vectors/text/DocumentProcessor.java
>>>>
>>>>
>>>> /utils/src/main/java/org/apache/mahout/utils/vectors/text/DocumentProcessor.java:
>>>>  public static final String TOKENIZED_DOCUMENT_OUTPUT_FOLDER =
>>>> "/tokenized-documents";
>>>>
>>>> I tried changing this value, but it did not solve my problem, although I
>>>> did
>>>> a mvn -B on utils afterwards.... it looks like the mahout-utils used by
>>>> the
>>>> test comes from somewhere else: I guess there's something I'm
>>>> missing....
>>>>
>>>>
>>>>
>>>>
>>>> 2010/5/10 Jeff Eastman<jd...@windwardsolutions.com>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>> I will commit once I verify it completes.  It's running now...
>>>>> Jeff
>>>>>
>>>>>
>>>>> On 5/10/10 7:50 AM, Robin Anil wrote:
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>> +1. Should be using bin/mahout script for all these.
>>>>>>
>>>>>>
>>>>>> Robin
>>>>>>
>>>>>>
>>>>>> On Mon, May 10, 2010 at 8:12 PM, Jeff Eastman<
>>>>>> jdog@windwardsolutions.com
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>> wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>> Well, thanks for the info. Perhaps we should replace the script then.
>>>>>>> Leaving time bombs around like this is not good.
>>>>>>> Jeff
>>>>>>>
>>>>>>>
>>>>>>> On 5/10/10 7:32 AM, Robin Anil wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> thats been broken for a long time, it was used by David while he
>>>>>>>> developed
>>>>>>>> LDA, It didn't get updated to work post 0.2 . Use Sisir's script to
>>>>>>>> convert
>>>>>>>> reuters to vectors, its up on the wiki
>>>>>>>>
>>>>>>>> Robin
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>
>>
>
>

Re: Error while trying to use mahout/examples/bin/build-reuters.sh

Posted by Sean Owen <sr...@gmail.com>.
I am indeed suspicious that, among the extensive changes I made, some
assumption got violated somewhere about who's creating which files
where. It may have been necessary to make the tests happy.

If that's the case I hope it's isolated, and should be easy to fix.

On Mon, May 10, 2010 at 10:00 PM, Jeff Eastman
<jd...@windwardsolutions.com> wrote:
> Sean posted something about that recently (5/5/10: Re: Installation problem
> in utils) in which he claims to have fixed it. At least, all the tests ran.
> But then, you are running the reuters script and that does not get exercised
> in the build. I suspect there are some more issues with the recent temp file
> allocation patch. Are you running trunk?
>

Re: Error while trying to use mahout/examples/bin/build-reuters.sh

Posted by Jeff Eastman <jd...@windwardsolutions.com>.
Sean posted something about that recently (5/5/10: Re: Installation 
problem in utils) in which he claims to have fixed it. At least, all the 
tests ran. But then, you are running the reuters script and that does 
not get exercised in the build. I suspect there are some more issues 
with the recent temp file allocation patch. Are you running trunk?

On 5/10/10 1:43 PM, Florent Empis wrote:
> Hi,
>
> It might help for the build part, but probably won't fix the 2nd issue?
> The / is not writeable on most systems so creation of
> /tokenized-documents/_temporary
> will still fail?
>
> 2010/5/10 Jeff Eastman<jd...@windwardsolutions.com>
>
>    
>> Hi Florent,
>>
>> I successfully ran the new build-reuters.sh before I committed it this
>> morning, so I suspect you must have some other problem in your system. Have
>> you tried deleting your Maven repository (.m2) and doing a full mvn clean
>> install?
>>
>> Jeff
>>
>>
>> On 5/10/10 12:50 PM, Florent Empis wrote:
>>
>>      
>>> Hi,
>>>
>>> I've seen the commit from Robin this afternoon so I gave it another try.
>>> Using the new shell I still run into a few problems
>>> At first, in order to satisfy a dependency to slf4j I've had to add the
>>> following to examples/pom.xml (once again I'm not a maven expert, so this
>>> may not be the correct way to do it)
>>>
>>> <dependency>
>>>    <groupId>org.slf4j</groupId>
>>>    <artifactId>slf4j-nop</artifactId>
>>>    <version>1.5.8</version>
>>>    <classifier>sources</classifier>
>>> </dependency>
>>>
>>> Then, after a succesful mvn -B
>>> I've launched the shell:
>>> florent@florent-laptop:~/workspace/mahout$
>>> ./examples/bin/build-reuters.sh
>>>
>>> It fails with the following error:
>>> 10/05/10 21:28:06 WARN mapred.LocalJobRunner: job_local_0001
>>> java.io.IOException: The temporary job-output directory
>>> file:/tokenized-documents/_temporary doesn't exist!
>>> at
>>>
>>> org.apache.hadoop.mapred.FileOutputCommitter.getWorkPath(FileOutputCommitter.java:204)
>>> at
>>>
>>> org.apache.hadoop.mapred.FileOutputFormat.getTaskOutputPath(FileOutputFormat.java:234)
>>> at
>>>
>>> org.apache.hadoop.mapred.SequenceFileOutputFormat.getRecordWriter(SequenceFileOutputFormat.java:48)
>>> at
>>>
>>> org.apache.hadoop.mapred.MapTask$DirectMapOutputCollector.<init>(MapTask.java:662)
>>> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:352)
>>> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
>>> at
>>> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
>>> 10/05/10 21:28:07 INFO mapred.JobClient:  map 0% reduce 0%
>>> 10/05/10 21:28:07 INFO mapred.JobClient: Job complete: job_local_0001
>>> 10/05/10 21:28:07 INFO mapred.JobClient: Counters: 0
>>> 10/05/10 21:28:07 ERROR driver.MahoutDriver: MahoutDriver failed with
>>> args:
>>> [-i, ./examples/bin/work/reuters-out-seqdir/, -o,
>>> ./examples/bin/work/reuters-out-seqdir-sparse, null]
>>> Job failed!
>>> Exception in thread "main" java.io.IOException: Job failed!
>>> at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1252)
>>> at
>>>
>>> org.apache.mahout.utils.vectors.text.DocumentProcessor.tokenizeDocuments(DocumentProcessor.java:97)
>>> at
>>>
>>> org.apache.mahout.text.SparseVectorsFromSequenceFiles.main(SparseVectorsFromSequenceFiles.java:215)
>>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>> at
>>>
>>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>> at
>>>
>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>> at java.lang.reflect.Method.invoke(Method.java:597)
>>> at
>>>
>>> org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
>>> at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
>>> at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:172)
>>>
>>> A find makes me think that the issue is
>>> in
>>> /utils/src/main/java/org/apache/mahout/utils/vectors/text/DocumentProcessor.java
>>>
>>> /utils/src/main/java/org/apache/mahout/utils/vectors/text/DocumentProcessor.java:
>>>   public static final String TOKENIZED_DOCUMENT_OUTPUT_FOLDER =
>>> "/tokenized-documents";
>>>
>>> I tried changing this value, but it did not solve my problem, although I
>>> did
>>> a mvn -B on utils afterwards.... it looks like the mahout-utils used by
>>> the
>>> test comes from somewhere else: I guess there's something I'm missing....
>>>
>>>
>>>
>>>
>>> 2010/5/10 Jeff Eastman<jd...@windwardsolutions.com>
>>>
>>>
>>>
>>>        
>>>> I will commit once I verify it completes.  It's running now...
>>>> Jeff
>>>>
>>>>
>>>> On 5/10/10 7:50 AM, Robin Anil wrote:
>>>>
>>>>
>>>>
>>>>          
>>>>> +1. Should be using bin/mahout script for all these.
>>>>>
>>>>>
>>>>> Robin
>>>>>
>>>>>
>>>>> On Mon, May 10, 2010 at 8:12 PM, Jeff Eastman<
>>>>> jdog@windwardsolutions.com
>>>>>
>>>>>
>>>>>            
>>>>>> wrote:
>>>>>>
>>>>>>
>>>>>>              
>>>>>
>>>>>
>>>>>
>>>>>            
>>>>>> Well, thanks for the info. Perhaps we should replace the script then.
>>>>>> Leaving time bombs around like this is not good.
>>>>>> Jeff
>>>>>>
>>>>>>
>>>>>> On 5/10/10 7:32 AM, Robin Anil wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>              
>>>>>>> thats been broken for a long time, it was used by David while he
>>>>>>> developed
>>>>>>> LDA, It didn't get updated to work post 0.2 . Use Sisir's script to
>>>>>>> convert
>>>>>>> reuters to vectors, its up on the wiki
>>>>>>>
>>>>>>> Robin
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>                
>>>>>>
>>>>>>
>>>>>>
>>>>>>              
>>>>>
>>>>>
>>>>>            
>>>>
>>>>
>>>>          
>>>
>>>        
>>
>>      
>    


Re: Error while trying to use mahout/examples/bin/build-reuters.sh

Posted by Florent Empis <fl...@gmail.com>.
Hi,

It might help for the build part, but probably won't fix the 2nd issue?
The / is not writable on most systems, so creation of
/tokenized-documents/_temporary
will still fail?

2010/5/10 Jeff Eastman <jd...@windwardsolutions.com>

> Hi Florent,
>
> I successfully ran the new build-reuters.sh before I committed it this
> morning, so I suspect you must have some other problem in your system. Have
> you tried deleting your Maven repository (.m2) and doing a full mvn clean
> install?
>
> Jeff
>
>
> On 5/10/10 12:50 PM, Florent Empis wrote:
>
>> Hi,
>>
>> I've seen the commit from Robin this afternoon so I gave it another try.
>> Using the new shell I still run into a few problems
>> At first, in order to satisfy a dependency to slf4j I've had to add the
>> following to examples/pom.xml (once again I'm not a maven expert, so this
>> may not be the correct way to do it)
>>
>> <dependency>
>>   <groupId>org.slf4j</groupId>
>>   <artifactId>slf4j-nop</artifactId>
>>   <version>1.5.8</version>
>>   <classifier>sources</classifier>
>> </dependency>
>>
>> Then, after a succesful mvn -B
>> I've launched the shell:
>> florent@florent-laptop:~/workspace/mahout$
>> ./examples/bin/build-reuters.sh
>>
>> It fails with the following error:
>> 10/05/10 21:28:06 WARN mapred.LocalJobRunner: job_local_0001
>> java.io.IOException: The temporary job-output directory
>> file:/tokenized-documents/_temporary doesn't exist!
>> at
>>
>> org.apache.hadoop.mapred.FileOutputCommitter.getWorkPath(FileOutputCommitter.java:204)
>> at
>>
>> org.apache.hadoop.mapred.FileOutputFormat.getTaskOutputPath(FileOutputFormat.java:234)
>> at
>>
>> org.apache.hadoop.mapred.SequenceFileOutputFormat.getRecordWriter(SequenceFileOutputFormat.java:48)
>> at
>>
>> org.apache.hadoop.mapred.MapTask$DirectMapOutputCollector.<init>(MapTask.java:662)
>> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:352)
>> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
>> at
>> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
>> 10/05/10 21:28:07 INFO mapred.JobClient:  map 0% reduce 0%
>> 10/05/10 21:28:07 INFO mapred.JobClient: Job complete: job_local_0001
>> 10/05/10 21:28:07 INFO mapred.JobClient: Counters: 0
>> 10/05/10 21:28:07 ERROR driver.MahoutDriver: MahoutDriver failed with
>> args:
>> [-i, ./examples/bin/work/reuters-out-seqdir/, -o,
>> ./examples/bin/work/reuters-out-seqdir-sparse, null]
>> Job failed!
>> Exception in thread "main" java.io.IOException: Job failed!
>> at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1252)
>> at
>>
>> org.apache.mahout.utils.vectors.text.DocumentProcessor.tokenizeDocuments(DocumentProcessor.java:97)
>> at
>>
>> org.apache.mahout.text.SparseVectorsFromSequenceFiles.main(SparseVectorsFromSequenceFiles.java:215)
>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> at
>>
>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>> at
>>
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>> at java.lang.reflect.Method.invoke(Method.java:597)
>> at
>>
>> org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
>> at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
>> at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:172)
>>
>> A find makes me think that the issue is
>> in
>> /utils/src/main/java/org/apache/mahout/utils/vectors/text/DocumentProcessor.java
>>
>> /utils/src/main/java/org/apache/mahout/utils/vectors/text/DocumentProcessor.java:
>>  public static final String TOKENIZED_DOCUMENT_OUTPUT_FOLDER =
>> "/tokenized-documents";
>>
>> I tried changing this value, but it did not solve my problem, although I
>> did
>> a mvn -B on utils afterwards.... it looks like the mahout-utils used by
>> the
>> test comes from somewhere else: I guess there's something I'm missing....
>>
>>
>>
>>
>> 2010/5/10 Jeff Eastman<jd...@windwardsolutions.com>
>>
>>
>>
>>> I will commit once I verify it completes.  It's running now...
>>> Jeff
>>>
>>>
>>> On 5/10/10 7:50 AM, Robin Anil wrote:
>>>
>>>
>>>
>>>> +1. Should be using bin/mahout script for all these.
>>>>
>>>>
>>>> Robin
>>>>
>>>>
>>>> On Mon, May 10, 2010 at 8:12 PM, Jeff Eastman<
>>>> jdog@windwardsolutions.com
>>>>
>>>>
>>>>> wrote:
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>>
>>>>> Well, thanks for the info. Perhaps we should replace the script then.
>>>>> Leaving time bombs around like this is not good.
>>>>> Jeff
>>>>>
>>>>>
>>>>> On 5/10/10 7:32 AM, Robin Anil wrote:
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>> thats been broken for a long time, it was used by David while he
>>>>>> developed
>>>>>> LDA, It didn't get updated to work post 0.2 . Use Sisir's script to
>>>>>> convert
>>>>>> reuters to vectors, its up on the wiki
>>>>>>
>>>>>> Robin
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>
>>
>
>

Re: Error while trying to use mahout/examples/bin/build-reuters.sh

Posted by Jeff Eastman <jd...@windwardsolutions.com>.
Hi Florent,

I successfully ran the new build-reuters.sh before I committed it this 
morning, so I suspect you must have some other problem in your system. 
Have you tried deleting your Maven repository (.m2) and doing a full mvn 
clean install?

Jeff

On 5/10/10 12:50 PM, Florent Empis wrote:
> Hi,
>
> I've seen the commit from Robin this afternoon so I gave it another try.
> Using the new shell I still run into a few problems
> At first, in order to satisfy a dependency to slf4j I've had to add the
> following to examples/pom.xml (once again I'm not a maven expert, so this
> may not be the correct way to do it)
>
> <dependency>
>    <groupId>org.slf4j</groupId>
>    <artifactId>slf4j-nop</artifactId>
>    <version>1.5.8</version>
>    <classifier>sources</classifier>
> </dependency>
>
> Then, after a succesful mvn -B
> I've launched the shell:
> florent@florent-laptop:~/workspace/mahout$ ./examples/bin/build-reuters.sh
>
> It fails with the following error:
> 10/05/10 21:28:06 WARN mapred.LocalJobRunner: job_local_0001
> java.io.IOException: The temporary job-output directory
> file:/tokenized-documents/_temporary doesn't exist!
> at
> org.apache.hadoop.mapred.FileOutputCommitter.getWorkPath(FileOutputCommitter.java:204)
> at
> org.apache.hadoop.mapred.FileOutputFormat.getTaskOutputPath(FileOutputFormat.java:234)
> at
> org.apache.hadoop.mapred.SequenceFileOutputFormat.getRecordWriter(SequenceFileOutputFormat.java:48)
> at
> org.apache.hadoop.mapred.MapTask$DirectMapOutputCollector.<init>(MapTask.java:662)
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:352)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
> at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
> 10/05/10 21:28:07 INFO mapred.JobClient:  map 0% reduce 0%
> 10/05/10 21:28:07 INFO mapred.JobClient: Job complete: job_local_0001
> 10/05/10 21:28:07 INFO mapred.JobClient: Counters: 0
> 10/05/10 21:28:07 ERROR driver.MahoutDriver: MahoutDriver failed with args:
> [-i, ./examples/bin/work/reuters-out-seqdir/, -o,
> ./examples/bin/work/reuters-out-seqdir-sparse, null]
> Job failed!
> Exception in thread "main" java.io.IOException: Job failed!
> at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1252)
> at
> org.apache.mahout.utils.vectors.text.DocumentProcessor.tokenizeDocuments(DocumentProcessor.java:97)
> at
> org.apache.mahout.text.SparseVectorsFromSequenceFiles.main(SparseVectorsFromSequenceFiles.java:215)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at
> org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
> at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
> at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:172)
>
> A find makes me think that the issue is
> in /utils/src/main/java/org/apache/mahout/utils/vectors/text/DocumentProcessor.java
> /utils/src/main/java/org/apache/mahout/utils/vectors/text/DocumentProcessor.java:
>   public static final String TOKENIZED_DOCUMENT_OUTPUT_FOLDER =
> "/tokenized-documents";
>
> I tried changing this value, but it did not solve my problem, although I did
> a mvn -B on utils afterwards.... it looks like the mahout-utils used by the
> test comes from somewhere else: I guess there's something I'm missing....
>
>
>
>
> 2010/5/10 Jeff Eastman<jd...@windwardsolutions.com>
>
>    
>> I will commit once I verify it completes.  It's running now...
>> Jeff
>>
>>
>> On 5/10/10 7:50 AM, Robin Anil wrote:
>>
>>      
>>> +1. Should be using bin/mahout script for all these.
>>>
>>>
>>> Robin
>>>
>>>
>>> On Mon, May 10, 2010 at 8:12 PM, Jeff Eastman<jdog@windwardsolutions.com
>>>        
>>>> wrote:
>>>>          
>>>
>>>
>>>        
>>>> Well, thanks for the info. Perhaps we should replace the script then.
>>>> Leaving time bombs around like this is not good.
>>>> Jeff
>>>>
>>>>
>>>> On 5/10/10 7:32 AM, Robin Anil wrote:
>>>>
>>>>
>>>>
>>>>          
>>>>> thats been broken for a long time, it was used by David while he
>>>>> developed
>>>>> LDA, It didn't get updated to work post 0.2 . Use Sisir's script to
>>>>> convert
>>>>> reuters to vectors, its up on the wiki
>>>>>
>>>>> Robin
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>            
>>>>
>>>>
>>>>          
>>>
>>>        
>>
>>      
>    


Re: Error while trying to use mahout/examples/bin/build-reuters.sh

Posted by Florent Empis <fl...@gmail.com>.
Hi,

I've seen the commit from Robin this afternoon, so I gave it another try.
Using the new shell I still run into a few problems.
At first, in order to satisfy a dependency on slf4j, I had to add the
following to examples/pom.xml (once again, I'm not a Maven expert, so this
may not be the correct way to do it):

<dependency>
  <groupId>org.slf4j</groupId>
  <artifactId>slf4j-nop</artifactId>
  <version>1.5.8</version>
  <classifier>sources</classifier>
</dependency>
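One thing worth noting about the snippet above: the `<classifier>sources</classifier>` element makes Maven fetch the slf4j-nop *sources* jar, which contains .java files rather than compiled classes, so it cannot satisfy a runtime need for `org.slf4j.impl.StaticLoggerBinder`. A hedged guess at the intended dependency (same coordinates, no classifier):

```xml
<!-- Assumption: the binary slf4j-nop artifact, not its sources jar,
     is what provides org.slf4j.impl.StaticLoggerBinder at runtime. -->
<dependency>
  <groupId>org.slf4j</groupId>
  <artifactId>slf4j-nop</artifactId>
  <version>1.5.8</version>
</dependency>
```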

Then, after a successful mvn -B,
I launched the shell:
florent@florent-laptop:~/workspace/mahout$ ./examples/bin/build-reuters.sh

It fails with the following error:
10/05/10 21:28:06 WARN mapred.LocalJobRunner: job_local_0001
java.io.IOException: The temporary job-output directory
file:/tokenized-documents/_temporary doesn't exist!
at
org.apache.hadoop.mapred.FileOutputCommitter.getWorkPath(FileOutputCommitter.java:204)
at
org.apache.hadoop.mapred.FileOutputFormat.getTaskOutputPath(FileOutputFormat.java:234)
at
org.apache.hadoop.mapred.SequenceFileOutputFormat.getRecordWriter(SequenceFileOutputFormat.java:48)
at
org.apache.hadoop.mapred.MapTask$DirectMapOutputCollector.<init>(MapTask.java:662)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:352)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
10/05/10 21:28:07 INFO mapred.JobClient:  map 0% reduce 0%
10/05/10 21:28:07 INFO mapred.JobClient: Job complete: job_local_0001
10/05/10 21:28:07 INFO mapred.JobClient: Counters: 0
10/05/10 21:28:07 ERROR driver.MahoutDriver: MahoutDriver failed with args:
[-i, ./examples/bin/work/reuters-out-seqdir/, -o,
./examples/bin/work/reuters-out-seqdir-sparse, null]
Job failed!
Exception in thread "main" java.io.IOException: Job failed!
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1252)
at
org.apache.mahout.utils.vectors.text.DocumentProcessor.tokenizeDocuments(DocumentProcessor.java:97)
at
org.apache.mahout.text.SparseVectorsFromSequenceFiles.main(SparseVectorsFromSequenceFiles.java:215)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at
org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:172)

A find makes me think that the issue is
in /utils/src/main/java/org/apache/mahout/utils/vectors/text/DocumentProcessor.java
/utils/src/main/java/org/apache/mahout/utils/vectors/text/DocumentProcessor.java:
 public static final String TOKENIZED_DOCUMENT_OUTPUT_FOLDER =
"/tokenized-documents";

I tried changing this value, but it did not solve my problem, although I did
an mvn -B on utils afterwards... it looks like the mahout-utils used by the
test comes from somewhere else; I guess there's something I'm missing...
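For what it's worth, dropping the leading "/" from the constant and joining it to the job's output directory keeps everything under the work folder instead of the filesystem root. A hedged sketch (the names mirror `DocumentProcessor`, but this is an illustration of the idea, not the actual patch):

```java
public class OutputFolderSketch {
    // No leading '/': the folder stays relative to whatever parent it joins.
    static final String TOKENIZED_DOCUMENT_OUTPUT_FOLDER = "tokenized-documents";

    // Join the folder name under the configured output directory.
    static String tokenizedOutputPath(String outputDir) {
        // Strip any trailing '/' on the parent so the join is clean.
        String parent = outputDir.endsWith("/")
                ? outputDir.substring(0, outputDir.length() - 1)
                : outputDir;
        return parent + "/" + TOKENIZED_DOCUMENT_OUTPUT_FOLDER;
    }

    public static void main(String[] args) {
        System.out.println(
            tokenizedOutputPath("./examples/bin/work/reuters-out-seqdir-sparse"));
        // ./examples/bin/work/reuters-out-seqdir-sparse/tokenized-documents
    }
}
```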




2010/5/10 Jeff Eastman <jd...@windwardsolutions.com>

> I will commit once I verify it completes.  It's running now...
> Jeff
>
>
> On 5/10/10 7:50 AM, Robin Anil wrote:
>
>> +1. Should be using bin/mahout script for all these.
>>
>>
>> Robin
>>
>>
>> On Mon, May 10, 2010 at 8:12 PM, Jeff Eastman<jdog@windwardsolutions.com
>> >wrote:
>>
>>
>>
>>> Well, thanks for the info. Perhaps we should replace the script then.
>>> Leaving time bombs around like this is not good.
>>> Jeff
>>>
>>>
>>> On 5/10/10 7:32 AM, Robin Anil wrote:
>>>
>>>
>>>
>>>> thats been broken for a long time, it was used by David while he
>>>> developed
>>>> LDA, It didn't get updated to work post 0.2 . Use Sisir's script to
>>>> convert
>>>> reuters to vectors, its up on the wiki
>>>>
>>>> Robin
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>
>>
>
>

Re: Error while trying to use mahout/examples/bin/build-reuters.sh

Posted by Jeff Eastman <jd...@windwardsolutions.com>.
I will commit once I verify it completes.  It's running now...
Jeff

On 5/10/10 7:50 AM, Robin Anil wrote:
> +1. Should be using bin/mahout script for all these.
>
>
> Robin
>
>
> On Mon, May 10, 2010 at 8:12 PM, Jeff Eastman<jd...@windwardsolutions.com>wrote:
>
>    
>> Well, thanks for the info. Perhaps we should replace the script then.
>> Leaving time bombs around like this is not good.
>> Jeff
>>
>>
>> On 5/10/10 7:32 AM, Robin Anil wrote:
>>
>>      
>>> thats been broken for a long time, it was used by David while he developed
>>> LDA, It didn't get updated to work post 0.2 . Use Sisir's script to
>>> convert
>>> reuters to vectors, its up on the wiki
>>>
>>> Robin
>>>
>>>
>>>        
>>
>>      
>    


Re: Error while trying to use mahout/examples/bin/build-reuters.sh

Posted by Robin Anil <ro...@gmail.com>.
+1. Should be using bin/mahout script for all these.


Robin


On Mon, May 10, 2010 at 8:12 PM, Jeff Eastman <jd...@windwardsolutions.com>wrote:

> Well, thanks for the info. Perhaps we should replace the script then.
> Leaving time bombs around like this is not good.
> Jeff
>
>
> On 5/10/10 7:32 AM, Robin Anil wrote:
>
>> thats been broken for a long time, it was used by David while he developed
>> LDA, It didn't get updated to work post 0.2 . Use Sisir's script to
>> convert
>> reuters to vectors, its up on the wiki
>>
>> Robin
>>
>>
>
>

Re: Error while trying to use mahout/examples/bin/build-reuters.sh

Posted by Jeff Eastman <jd...@windwardsolutions.com>.
Well, thanks for the info. Perhaps we should replace the script then. 
Leaving time bombs around like this is not good.
Jeff

On 5/10/10 7:32 AM, Robin Anil wrote:
> thats been broken for a long time, it was used by David while he developed
> LDA, It didn't get updated to work post 0.2 . Use Sisir's script to convert
> reuters to vectors, its up on the wiki
>
> Robin
>    


Re: Error while trying to use mahout/examples/bin/build-reuters.sh

Posted by Robin Anil <ro...@gmail.com>.
That's been broken for a long time; it was used by David while he developed
LDA, and it didn't get updated to work post-0.2. Use Sisir's script to convert
Reuters to vectors; it's up on the wiki.

Robin

On Mon, May 10, 2010 at 7:53 PM, Florent Empis <fl...@gmail.com>wrote:

> Try adding a -X on your mvn calls you'll get a much more verbose output.
>
> 2010/5/10 Jeff Eastman <jd...@windwardsolutions.com>
>
> > I'm also getting a build error running that job but do not see the
> > additional text you are seeing. How do you enable that extra information?
> > Jeff
> >
> > $ ./build-reuters.sh
> > Downloading Reuters-21578
> >  % Total    % Received % Xferd  Average Speed   Time    Time     Time
> >  Current
> >                                 Dload  Upload   Total   Spent    Left
> >  Speed
> > 100 7959k  100 7959k    0     0   697k      0  0:00:11  0:00:11 --:--:--
> >  667k
> > Extracting...
> > Converting to plain text.
> > + Error stacktraces are turned on.
> > [ERROR] BUILD ERROR
> >
> >
> >
> >
> >
> > On 5/10/10 7:07 AM, Florent Empis wrote:
> >
> >> Sorry, I made a mistake
> >> I have a problem with compressors and with SLF4J:
> >> [INFO]
> >> ------------------------------------------------------------------------
> >> [ERROR] BUILD ERROR
> >> [INFO]
> >> ------------------------------------------------------------------------
> >> [INFO] An exception occured while executing the Java class.
> >> org/slf4j/impl/StaticLoggerBinder
> >>
> >> org.slf4j.impl.StaticLoggerBinder
> >> [INFO]
> >> ------------------------------------------------------------------------
> >> [INFO] Trace
> >> org.apache.maven.lifecycle.LifecycleExecutionException: An exception
> >> occured
> >> while executing the Java class. org/slf4j/impl/StaticLoggerBinder
> >> at
> >>
> >>
> org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoals(DefaultLifecycleExecutor.java:719)
> >>
> >>
> >> 2010/5/10 Florent Empis<fl...@gmail.com>
> >>
> >>
> >>
> >>> Hello,
> >>>
> >>> I try to run the Dirichlet example via this shell:
> >>> mahout/examples/bin/build-reuters.sh
> >>> I'm fairly new to Mahout and to Maven and I think something is wrong
> with
> >>> dependencies and probably with the repositories I should add to my
> >>> settings.xml ?
> >>>
> >>> Thanks for your help!
> >>>
> >>> -------------------------------
> >>> [INFO]
> >>>
> ------------------------------------------------------------------------
> >>> [ERROR] BUILD ERROR
> >>> [INFO]
> >>>
> ------------------------------------------------------------------------
> >>> [INFO] An exception occured while executing the Java class. null
> >>>
> >>> org.apache.commons.compress.compressors.CompressorException
> >>> [INFO]
> >>>
> ------------------------------------------------------------------------
> >>> [INFO] Trace
> >>> org.apache.maven.lifecycle.LifecycleExecutionException: An exception
> >>> occured while executing the Java class. null
> >>>         at
> >>>
> >>>
> org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoals(DefaultLifecycleExecutor.java:719)
> >>>
> >>> -------------------------------
> >>> [INFO]
> >>>
> ------------------------------------------------------------------------
> >>> [ERROR] BUILD ERROR
> >>> [INFO]
> >>>
> ------------------------------------------------------------------------
> >>> [INFO] An exception occured while executing the Java class. null
> >>>
> >>> org.apache.commons.compress.compressors.CompressorException
> >>> [INFO]
> >>>
> ------------------------------------------------------------------------
> >>> [INFO] Trace
> >>> org.apache.maven.lifecycle.LifecycleExecutionException: An exception
> >>> occured while executing the Java class. null
> >>>         at
> >>>
> >>>
> org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoals(DefaultLifecycleExecutor.java:719)
> >>>
> >>>
> >>>
> >>>
> >>
> >>
> >
> >
>

Re: Error while trying to use mahout/examples/bin/build-reuters.sh

Posted by Jeff Eastman <jd...@windwardsolutions.com>.
I moved up to examples to run the script and now I'm getting this:

[INFO] 
------------------------------------------------------------------------
[ERROR] BUILD ERROR
[INFO] 
------------------------------------------------------------------------
[INFO] An exception occured while executing the Java class. null

org.apache.commons.compress.compressors.CompressorException
[INFO] 
------------------------------------------------------------------------
[INFO] Trace
org.apache.maven.lifecycle.LifecycleExecutionException: An exception 
occured while executing the Java class. null
     at 
org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoals(DefaultLifecycleExecutor.java:719)
     at 
org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeStandaloneGoal(DefaultLifecycleExecutor.java:569)
     at 
org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoal(DefaultLifecycleExecutor.java:539)
     at 
org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoalAndHandleFailures(DefaultLifecycleExecutor.java:387)
     at 
org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeTaskSegments(DefaultLifecycleExecutor.java:348)
     at 
org.apache.maven.lifecycle.DefaultLifecycleExecutor.execute(DefaultLifecycleExecutor.java:180)
     at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:328)
     at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:138)
     at org.apache.maven.cli.MavenCli.main(MavenCli.java:362)
     at 
org.apache.maven.cli.compat.CompatibleMain.main(CompatibleMain.java:60)
     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
     at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
     at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
     at java.lang.reflect.Method.invoke(Method.java:597)
     at org.codehaus.classworlds.Launcher.launchEnhanced(Launcher.java:315)
     at org.codehaus.classworlds.Launcher.launch(Launcher.java:255)
     at 
org.codehaus.classworlds.Launcher.mainWithExitCode(Launcher.java:430)
     at org.codehaus.classworlds.Launcher.main(Launcher.java:375)
Caused by: org.apache.maven.plugin.MojoExecutionException: An exception 
occured while executing the Java class. null
     at org.codehaus.mojo.exec.ExecJavaMojo.execute(ExecJavaMojo.java:338)
     at 
org.apache.maven.plugin.DefaultPluginManager.executeMojo(DefaultPluginManager.java:490)
     at 
org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoals(DefaultLifecycleExecutor.java:694)
     ... 17 more
Caused by: java.lang.reflect.InvocationTargetException
     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
     at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
     at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
     at java.lang.reflect.Method.invoke(Method.java:597)
     at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:283)
     at java.lang.Thread.run(Thread.java:637)
Caused by: java.lang.NoClassDefFoundError: 
org/apache/commons/compress/compressors/CompressorException
     at java.lang.Class.forName0(Native Method)
     at java.lang.Class.forName(Class.java:169)
     at 
org.apache.lucene.benchmark.byTask.feeds.DocMaker.setConfig(DocMaker.java:373)
     at 
org.apache.lucene.benchmark.byTask.PerfRunData.<init>(PerfRunData.java:84)
     at 
org.apache.lucene.benchmark.byTask.Benchmark.<init>(Benchmark.java:53)
     at org.apache.lucene.benchmark.byTask.Benchmark.main(Benchmark.java:98)
     ... 6 more
Caused by: java.lang.ClassNotFoundException: 
org.apache.commons.compress.compressors.CompressorException
     at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
     at java.security.AccessController.doPrivileged(Native Method)
     at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
     at java.lang.ClassLoader.loadClass(ClassLoader.java:315)
     at java.lang.ClassLoader.loadClass(ClassLoader.java:250)
     at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:398)
     ... 12 more
[INFO] 
------------------------------------------------------------------------
[INFO] Total time: 3 seconds
[INFO] Finished at: Mon May 10 07:34:46 PDT 2010
[INFO] Final Memory: 21M/80M
[INFO] 
------------------------------------------------------------------------
Creating vectors from index

core-job:
+ Error stacktraces are turned on.
[INFO] Scanning for projects...
[INFO] Searching repository for plugin with prefix: 'exec'.
[INFO] 
------------------------------------------------------------------------
[INFO] Building Mahout Utilities
[INFO]    task-segment: [exec:java]
[INFO] 
------------------------------------------------------------------------
[INFO] Preparing exec:java
[INFO] No goals needed for project - skipping
[INFO] [exec:java {execution: default-cli}]
[INFO] 
------------------------------------------------------------------------
[INFO] BUILD SUCCESSFUL
[INFO] 
------------------------------------------------------------------------
[INFO] Total time: 2 seconds
[INFO] Finished at: Mon May 10 07:35:08 PDT 2010
[INFO] Final Memory: 20M/80M
[INFO] 
------------------------------------------------------------------------
Running LDA
+ Error stacktraces are turned on.
[INFO] Scanning for projects...
[INFO] Searching repository for plugin with prefix: 'exec'.
[INFO] 
------------------------------------------------------------------------
[INFO] Building Mahout Core
[INFO]    task-segment: [exec:java]
[INFO] 
------------------------------------------------------------------------
[INFO] Preparing exec:java
[INFO] No goals needed for project - skipping
[INFO] [exec:java {execution: default-cli}]
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for 
further details.
[INFO] 
------------------------------------------------------------------------
[ERROR] BUILD ERROR
[INFO] 
------------------------------------------------------------------------
[INFO] An exception occured while executing the Java class. 
org/slf4j/impl/StaticLoggerBinder

org.slf4j.impl.StaticLoggerBinder
[INFO] 
------------------------------------------------------------------------
[INFO] Trace
org.apache.maven.lifecycle.LifecycleExecutionException: An exception 
occured while executing the Java class. org/slf4j/impl/StaticLoggerBinder
     at 
org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoals(DefaultLifecycleExecutor.java:719)
     at 
org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeStandaloneGoal(DefaultLifecycleExecutor.java:569)
     at 
org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoal(DefaultLifecycleExecutor.java:539)
     at 
org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoalAndHandleFailures(DefaultLifecycleExecutor.java:387)
     at 
org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeTaskSegments(DefaultLifecycleExecutor.java:348)
     at 
org.apache.maven.lifecycle.DefaultLifecycleExecutor.execute(DefaultLifecycleExecutor.java:180)
     at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:328)
     at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:138)
     at org.apache.maven.cli.MavenCli.main(MavenCli.java:362)
     at 
org.apache.maven.cli.compat.CompatibleMain.main(CompatibleMain.java:60)
     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
     at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
     at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
     at java.lang.reflect.Method.invoke(Method.java:597)
     at org.codehaus.classworlds.Launcher.launchEnhanced(Launcher.java:315)
     at org.codehaus.classworlds.Launcher.launch(Launcher.java:255)
     at 
org.codehaus.classworlds.Launcher.mainWithExitCode(Launcher.java:430)
     at org.codehaus.classworlds.Launcher.main(Launcher.java:375)
Caused by: org.apache.maven.plugin.MojoExecutionException: An exception 
occured while executing the Java class. org/slf4j/impl/StaticLoggerBinder
     at org.codehaus.mojo.exec.ExecJavaMojo.execute(ExecJavaMojo.java:338)
     at 
org.apache.maven.plugin.DefaultPluginManager.executeMojo(DefaultPluginManager.java:490)
     at 
org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoals(DefaultLifecycleExecutor.java:694)
     ... 17 more
Caused by: java.lang.NoClassDefFoundError: org/slf4j/impl/StaticLoggerBinder
     at org.slf4j.LoggerFactory.getSingleton(LoggerFactory.java:223)
     at org.slf4j.LoggerFactory.bind(LoggerFactory.java:120)
     at 
org.slf4j.LoggerFactory.performInitialization(LoggerFactory.java:111)
     at org.slf4j.LoggerFactory.getILoggerFactory(LoggerFactory.java:269)
     at org.slf4j.LoggerFactory.getLogger(LoggerFactory.java:242)
     at org.slf4j.LoggerFactory.getLogger(LoggerFactory.java:255)
     at 
org.apache.mahout.clustering.lda.LDADriver.<clinit>(LDADriver.java:65)
     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
     at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
     at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
     at java.lang.reflect.Method.invoke(Method.java:597)
     at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:283)
     at java.lang.Thread.run(Thread.java:637)
Caused by: java.lang.ClassNotFoundException: 
org.slf4j.impl.StaticLoggerBinder
     at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
     at java.security.AccessController.doPrivileged(Native Method)
     at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
     at java.lang.ClassLoader.loadClass(ClassLoader.java:315)
     at java.lang.ClassLoader.loadClass(ClassLoader.java:250)
     at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:398)
     ... 13 more
[INFO] 
------------------------------------------------------------------------
[INFO] Total time: 3 seconds
[INFO] Finished at: Mon May 10 07:35:13 PDT 2010
[INFO] Final Memory: 23M/80M
[INFO] 
------------------------------------------------------------------------
Writing top words for each topic to to examples/work/topics/
ls: ../examples/work/lda/state-*: No such file or directory
[ERROR] BUILD ERROR


Re: Error while trying to use mahout/examples/bin/build-reuters.sh

Posted by Jeff Eastman <jd...@windwardsolutions.com>.
Thanks, I added -X to the first mvn call after "Converting to plain text". I 
don't know exactly what to make of this.

./build-reuters.sh
Converting to plain text.
+ Error stacktraces are turned on.
Apache Maven 2.2.1 (r801777; 2009-08-06 12:16:01-0700)
Java version: 1.6.0_17
Java home: /System/Library/Frameworks/JavaVM.framework/Versions/1.6.0/Home
Default locale: en_US, platform encoding: MacRoman
OS name: "mac os x" version: "10.6.3" arch: "x86_64" Family: "mac"
[DEBUG] Building Maven user-level plugin registry from: 
'/Users/jeff/.m2/plugin-registry.xml'
[DEBUG] Building Maven global-level plugin registry from: 
'/Users/jeff/apache-maven-2.2.1/conf/plugin-registry.xml'
[INFO] Scanning for projects...
[INFO] Searching repository for plugin with prefix: 'exec'.
[DEBUG] Loading plugin prefixes from group: org.apache.maven.plugins
[DEBUG] Loading plugin prefixes from group: org.codehaus.mojo
[DEBUG] exec-maven-plugin: resolved to version 1.1.1 from repository central
[DEBUG] Retrieving parent-POM: org.codehaus.mojo:mojo-parent:pom:20 for 
project: null:exec-maven-plugin:maven-plugin:1.1.1 from the repository.
[DEBUG] Adding managed dependencies for unknown:exec-maven-plugin
[DEBUG]   org.apache.maven:maven-plugin-api:jar:2.0
[DEBUG]   junit:junit:jar:3.8.2:test
[DEBUG] Wagons could not be registered as the extension container was 
never created
[INFO] 
------------------------------------------------------------------------
[INFO] Building Maven Default Project
[INFO]    task-segment: [exec:java]
[INFO] 
------------------------------------------------------------------------
[INFO] Preparing exec:java
[INFO] No goals needed for project - skipping
[INFO] 
------------------------------------------------------------------------
[ERROR] BUILD ERROR
[INFO] 
------------------------------------------------------------------------
[INFO] Cannot execute mojo: java. It requires a project with an existing 
pom.xml, but the build is not using one.
[INFO] 
------------------------------------------------------------------------
[DEBUG] Trace
org.apache.maven.lifecycle.LifecycleExecutionException: Cannot execute 
mojo: java. It requires a project with an existing pom.xml, but the 
build is not using one.
     at 
org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoals(DefaultLifecycleExecutor.java:719)
     at 
org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeStandaloneGoal(DefaultLifecycleExecutor.java:569)
     at 
org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoal(DefaultLifecycleExecutor.java:539)
     at 
org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoalAndHandleFailures(DefaultLifecycleExecutor.java:387)
     at 
org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeTaskSegments(DefaultLifecycleExecutor.java:348)
     at 
org.apache.maven.lifecycle.DefaultLifecycleExecutor.execute(DefaultLifecycleExecutor.java:180)
     at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:328)
     at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:138)
     at org.apache.maven.cli.MavenCli.main(MavenCli.java:362)
     at 
org.apache.maven.cli.compat.CompatibleMain.main(CompatibleMain.java:60)
     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
     at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
     at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
     at java.lang.reflect.Method.invoke(Method.java:597)
     at org.codehaus.classworlds.Launcher.launchEnhanced(Launcher.java:315)
     at org.codehaus.classworlds.Launcher.launch(Launcher.java:255)
     at 
org.codehaus.classworlds.Launcher.mainWithExitCode(Launcher.java:430)
     at org.codehaus.classworlds.Launcher.main(Launcher.java:375)
Caused by: org.apache.maven.plugin.MojoExecutionException: Cannot 
execute mojo: java. It requires a project with an existing pom.xml, but 
the build is not using one.
     at 
org.apache.maven.plugin.DefaultPluginManager.executeMojo(DefaultPluginManager.java:414)
     at 
org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoals(DefaultLifecycleExecutor.java:694)
     ... 17 more
[INFO] 
------------------------------------------------------------------------
[INFO] Total time: < 1 second
[INFO] Finished at: Mon May 10 07:27:00 PDT 2010
[INFO] Final Memory: 3M/80M
[INFO] 
------------------------------------------------------------------------



On 5/10/10 7:23 AM, Florent Empis wrote:
> Try adding a -X on your mvn calls you'll get a much more verbose output.
>
> 2010/5/10 Jeff Eastman<jd...@windwardsolutions.com>
>
>    
>> I'm also getting a build error running that job but do not see the
>> additional text you are seeing. How do you enable that extra information?
>> Jeff
>>
>> $ ./build-reuters.sh
>> Downloading Reuters-21578
>>   % Total    % Received % Xferd  Average Speed   Time    Time     Time
>>   Current
>>                                  Dload  Upload   Total   Spent    Left
>>   Speed
>> 100 7959k  100 7959k    0     0   697k      0  0:00:11  0:00:11 --:--:--
>>   667k
>> Extracting...
>> Converting to plain text.
>> + Error stacktraces are turned on.
>> [ERROR] BUILD ERROR
>>
>>
>>
>>
>>
>> On 5/10/10 7:07 AM, Florent Empis wrote:
>>
>>      
>>> Sorry, I made a mistake
>>> I have a problem with compressors and with SLF4J:
>>> [INFO]
>>> ------------------------------------------------------------------------
>>> [ERROR] BUILD ERROR
>>> [INFO]
>>> ------------------------------------------------------------------------
>>> [INFO] An exception occured while executing the Java class.
>>> org/slf4j/impl/StaticLoggerBinder
>>>
>>> org.slf4j.impl.StaticLoggerBinder
>>> [INFO]
>>> ------------------------------------------------------------------------
>>> [INFO] Trace
>>> org.apache.maven.lifecycle.LifecycleExecutionException: An exception
>>> occured
>>> while executing the Java class. org/slf4j/impl/StaticLoggerBinder
>>> at
>>>
>>> org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoals(DefaultLifecycleExecutor.java:719)
>>>
>>>
>>> 2010/5/10 Florent Empis<fl...@gmail.com>
>>>
>>>
>>>
>>>        
> >>>> [...]
>>>
>>>        
>>
>>      
>    


Re: Error while trying to use mahout/examples/bin/build-reuters.sh

Posted by Florent Empis <fl...@gmail.com>.
Try adding -X to your mvn calls; you'll get much more verbose output.

2010/5/10 Jeff Eastman <jd...@windwardsolutions.com>

> I'm also getting a build error running that job but do not see the
> additional text you are seeing. How do you enable that extra information?
> Jeff
>
> $ ./build-reuters.sh
> Downloading Reuters-21578
>  % Total    % Received % Xferd  Average Speed   Time    Time     Time
>  Current
>                                 Dload  Upload   Total   Spent    Left
>  Speed
> 100 7959k  100 7959k    0     0   697k      0  0:00:11  0:00:11 --:--:--
>  667k
> Extracting...
> Converting to plain text.
> + Error stacktraces are turned on.
> [ERROR] BUILD ERROR
>
>
>
>
>
> On 5/10/10 7:07 AM, Florent Empis wrote:
>
>> Sorry, I made a mistake
>> I have a problem with compressors and with SLF4J:
>> [INFO]
>> ------------------------------------------------------------------------
>> [ERROR] BUILD ERROR
>> [INFO]
>> ------------------------------------------------------------------------
>> [INFO] An exception occured while executing the Java class.
>> org/slf4j/impl/StaticLoggerBinder
>>
>> org.slf4j.impl.StaticLoggerBinder
>> [INFO]
>> ------------------------------------------------------------------------
>> [INFO] Trace
>> org.apache.maven.lifecycle.LifecycleExecutionException: An exception
>> occured
>> while executing the Java class. org/slf4j/impl/StaticLoggerBinder
>> at
>>
>> org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoals(DefaultLifecycleExecutor.java:719)
>>
>>
>> 2010/5/10 Florent Empis<fl...@gmail.com>
>>
>>
>>
>>> [...]
>>>
>>>
>>>
>>
>>
>
>

Re: Error while trying to use mahout/examples/bin/build-reuters.sh

Posted by Jeff Eastman <jd...@windwardsolutions.com>.
I'm also getting a build error running that job, but I do not see the 
additional text you are seeing. How do you enable that extra information?
Jeff

$ ./build-reuters.sh
Downloading Reuters-21578
   % Total    % Received % Xferd  Average Speed   Time    Time     Time  
Current
                                  Dload  Upload   Total   Spent    Left  
Speed
100 7959k  100 7959k    0     0   697k      0  0:00:11  0:00:11 
--:--:--  667k
Extracting...
Converting to plain text.
+ Error stacktraces are turned on.
[ERROR] BUILD ERROR




On 5/10/10 7:07 AM, Florent Empis wrote:
> Sorry, I made a mistake
> I have a problem with compressors and with SLF4J:
> [INFO]
> ------------------------------------------------------------------------
> [ERROR] BUILD ERROR
> [INFO]
> ------------------------------------------------------------------------
> [INFO] An exception occured while executing the Java class.
> org/slf4j/impl/StaticLoggerBinder
>
> org.slf4j.impl.StaticLoggerBinder
> [INFO]
> ------------------------------------------------------------------------
> [INFO] Trace
> org.apache.maven.lifecycle.LifecycleExecutionException: An exception occured
> while executing the Java class. org/slf4j/impl/StaticLoggerBinder
> at
> org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoals(DefaultLifecycleExecutor.java:719)
>
>
> 2010/5/10 Florent Empis<fl...@gmail.com>
>
>    
>> Hello,
>>
>> I try to run the Dirichlet example via this shell:
>> mahout/examples/bin/build-reuters.sh
>> I'm fairly new to Mahout and to Maven and I think something is wrong with
>> dependencies and probably with the repositories I should add to my
>> settings.xml ?
>>
>> Thanks for your help!
>>
>> -------------------------------
>> [INFO]
>> ------------------------------------------------------------------------
>> [ERROR] BUILD ERROR
>> [INFO]
>> ------------------------------------------------------------------------
>> [INFO] An exception occured while executing the Java class. null
>>
>> org.apache.commons.compress.compressors.CompressorException
>> [INFO]
>> ------------------------------------------------------------------------
>> [INFO] Trace
>> org.apache.maven.lifecycle.LifecycleExecutionException: An exception
>> occured while executing the Java class. null
>>          at
>> org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoals(DefaultLifecycleExecutor.java:719)
>>
>> -------------------------------
>> [INFO]
>> ------------------------------------------------------------------------
>> [ERROR] BUILD ERROR
>> [INFO]
>> ------------------------------------------------------------------------
>> [INFO] An exception occured while executing the Java class. null
>>
>> org.apache.commons.compress.compressors.CompressorException
>> [INFO]
>> ------------------------------------------------------------------------
>> [INFO] Trace
>> org.apache.maven.lifecycle.LifecycleExecutionException: An exception
>> occured while executing the Java class. null
>>          at
>> org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoals(DefaultLifecycleExecutor.java:719)
>>
>>
>>      
>    


Re: Error while trying to use mahout/examples/bin/build-reuters.sh

Posted by Florent Empis <fl...@gmail.com>.
Sorry, I made a mistake. I have problems with both commons-compress and SLF4J:
[INFO]
------------------------------------------------------------------------
[ERROR] BUILD ERROR
[INFO]
------------------------------------------------------------------------
[INFO] An exception occured while executing the Java class.
org/slf4j/impl/StaticLoggerBinder

org.slf4j.impl.StaticLoggerBinder
[INFO]
------------------------------------------------------------------------
[INFO] Trace
org.apache.maven.lifecycle.LifecycleExecutionException: An exception occured
while executing the Java class. org/slf4j/impl/StaticLoggerBinder
at
org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoals(DefaultLifecycleExecutor.java:719)


2010/5/10 Florent Empis <fl...@gmail.com>

> Hello,
>
> I try to run the Dirichlet example via this shell:
> mahout/examples/bin/build-reuters.sh
> I'm fairly new to Mahout and to Maven and I think something is wrong with
> dependencies and probably with the repositories I should add to my
> settings.xml ?
>
> Thanks for your help!
>
> -------------------------------
> [INFO]
> ------------------------------------------------------------------------
> [ERROR] BUILD ERROR
> [INFO]
> ------------------------------------------------------------------------
> [INFO] An exception occured while executing the Java class. null
>
> org.apache.commons.compress.compressors.CompressorException
> [INFO]
> ------------------------------------------------------------------------
> [INFO] Trace
> org.apache.maven.lifecycle.LifecycleExecutionException: An exception
> occured while executing the Java class. null
>         at
> org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoals(DefaultLifecycleExecutor.java:719)
>
> -------------------------------
> [INFO]
> ------------------------------------------------------------------------
> [ERROR] BUILD ERROR
> [INFO]
> ------------------------------------------------------------------------
> [INFO] An exception occured while executing the Java class. null
>
> org.apache.commons.compress.compressors.CompressorException
> [INFO]
> ------------------------------------------------------------------------
> [INFO] Trace
> org.apache.maven.lifecycle.LifecycleExecutionException: An exception
> occured while executing the Java class. null
>         at
> org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoals(DefaultLifecycleExecutor.java:719)
>
>