Posted to dev@mahout.apache.org by "Jeff Eastman (Reopened) (JIRA)" <ji...@apache.org> on 2012/01/12 19:59:40 UTC

[jira] [Reopened] (MAHOUT-854) Add MinHash to build-reuters.sh example

     [ https://issues.apache.org/jira/browse/MAHOUT-854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeff Eastman reopened MAHOUT-854:
---------------------------------


Reopening because this appears to be related to the current Jenkins build failure, in which the job aborts because its output directory already exists:

Exception in thread "main" org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory /tmp/mahout-work-jenkins/reuters-minhash already exists
	at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:134)
	at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:846)
	at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:807)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
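
The trace shows Hadoop's output check rejecting the job because its output directory already exists, so one possible remedy is to clear stale output before the MinHash step runs. The following is only a sketch under assumptions: the variable names WORK_DIR and OUT_DIR are hypothetical and not taken from build-reuters.sh, and it assumes the work directory is on the local filesystem (on HDFS the removal would go through the hadoop fs shell instead of rm).

```shell
#!/bin/sh
# Sketch only: WORK_DIR and OUT_DIR are illustrative names, not the
# variables used by the real build-reuters.sh.
WORK_DIR="${WORK_DIR:-$(mktemp -d)}"
OUT_DIR="${WORK_DIR}/reuters-minhash"

# Simulate output left behind by a previous run.
mkdir -p "${OUT_DIR}"

# Remove the stale directory so the job's output check does not fail.
# Assumes a local filesystem; an HDFS path would need a hadoop fs removal.
if [ -d "${OUT_DIR}" ]; then
  echo "Removing stale output directory ${OUT_DIR}"
  rm -rf "${OUT_DIR}"
fi

# The MinHash job would be launched here, writing to ${OUT_DIR}.
```

Deleting before each run keeps the example script repeatable on a shared build machine, at the cost of discarding any earlier results in that directory.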

                
> Add MinHash to build-reuters.sh example
> ---------------------------------------
>
>                 Key: MAHOUT-854
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-854
>             Project: Mahout
>          Issue Type: Improvement
>          Components: Clustering, Examples
>            Reporter: Varun Thacker
>            Assignee: Grant Ingersoll
>            Priority: Minor
>             Fix For: 0.6
>
>         Attachments: MAHOUT-854.patch
>
>
> The Reuters data set can be used for MinHash clustering, so it would be useful to add the MinHash algorithm to build-reuters.sh.
