Posted to dev@mahout.apache.org by "Paritosh Ranjan (Created) (JIRA)" <ji...@apache.org> on 2012/02/23 08:51:51 UTC
[jira] [Created] (MAHOUT-982) Refactor Canopy Clustering into a
separate post process with outlier pruning
Refactor Canopy Clustering into a separate post process with outlier pruning
----------------------------------------------------------------------------
Key: MAHOUT-982
URL: https://issues.apache.org/jira/browse/MAHOUT-982
Project: Mahout
Issue Type: Sub-task
Components: Clustering
Affects Versions: 0.6
Reporter: Paritosh Ranjan
Assignee: Paritosh Ranjan
Fix For: 0.7
Use ClusterClassificationDriver to refactor clustering out of CanopyDriver with outlier pruning support.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAHOUT-982) Refactor Canopy Clustering into a
separate post process with outlier pruning
Posted by "Paritosh Ranjan (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAHOUT-982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13223992#comment-13223992 ]
Paritosh Ranjan commented on MAHOUT-982:
----------------------------------------
I plan to commit this in a day or two. Please object if you see any concern.
Then I will do similar refactorings for FuzzyK, KMeans and Dirichlet.
> Refactor Canopy Clustering into a separate post process with outlier pruning
> ----------------------------------------------------------------------------
>
> Key: MAHOUT-982
> URL: https://issues.apache.org/jira/browse/MAHOUT-982
> Project: Mahout
> Issue Type: Sub-task
> Components: Clustering
> Affects Versions: 0.6
> Reporter: Paritosh Ranjan
> Assignee: Paritosh Ranjan
> Labels: clustering
> Fix For: 0.7
>
> Attachments: MAHOUT-982.txt
>
>
> Use ClusterClassificationDriver to refactor clustering out of CanopyDriver with outlier pruning support.
[jira] [Commented] (MAHOUT-982) Refactor Canopy Clustering into a
separate post process with outlier pruning
Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAHOUT-982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226906#comment-13226906 ]
Hudson commented on MAHOUT-982:
-------------------------------
Integrated in Mahout-Quality #1388 (See [https://builds.apache.org/job/Mahout-Quality/1388/])
MAHOUT-982, Added method and CLI option to remove outliers. (Revision 1299207)
Result = SUCCESS
pranjan : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1299207
Files :
* /mahout/trunk/core/src/main/java/org/apache/mahout/clustering/canopy/CanopyDriver.java
* /mahout/trunk/core/src/main/java/org/apache/mahout/common/commandline/DefaultOptionCreator.java
* /mahout/trunk/core/src/test/java/org/apache/mahout/clustering/canopy/TestCanopyCreation.java
* /mahout/trunk/core/src/test/java/org/apache/mahout/clustering/classify/ClusterClassificationDriverTest.java
* /mahout/trunk/core/src/test/java/org/apache/mahout/clustering/kmeans/TestKmeansClustering.java
* /mahout/trunk/core/src/test/java/org/apache/mahout/clustering/topdown/postprocessor/ClusterCountReaderTest.java
* /mahout/trunk/core/src/test/java/org/apache/mahout/clustering/topdown/postprocessor/ClusterOutputPostProcessorTest.java
* /mahout/trunk/examples/src/main/java/org/apache/mahout/clustering/syntheticcontrol/canopy/Job.java
* /mahout/trunk/examples/src/main/java/org/apache/mahout/clustering/syntheticcontrol/fuzzykmeans/Job.java
* /mahout/trunk/examples/src/main/java/org/apache/mahout/clustering/syntheticcontrol/kmeans/Job.java
* /mahout/trunk/integration/src/test/java/org/apache/mahout/clustering/TestClusterDumper.java
* /mahout/trunk/integration/src/test/java/org/apache/mahout/clustering/TestClusterEvaluator.java
* /mahout/trunk/integration/src/test/java/org/apache/mahout/clustering/cdbw/TestCDbwEvaluator.java
[jira] [Resolved] (MAHOUT-982) Refactor Canopy Clustering into a
separate post process with outlier pruning
Posted by "Paritosh Ranjan (Resolved) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAHOUT-982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Paritosh Ranjan resolved MAHOUT-982.
------------------------------------
Resolution: Fixed
The clustering has been refactored into a separate process and there is support for outlier pruning now. All the code is committed.
Resolving the issue.
[jira] [Commented] (MAHOUT-982) Refactor Canopy Clustering into a
separate post process with outlier pruning
Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAHOUT-982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13222169#comment-13222169 ]
jiraposter@reviews.apache.org commented on MAHOUT-982:
------------------------------------------------------
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4174/
-----------------------------------------------------------
Review request for mahout.
Summary
-------
Executing clustering using ClusterClassificationDriver in CanopyDriver.
This replaces the existing functionality. If this refactoring is marked ok, then we can add a threshold as a method parameter/CLI argument to support outlier removal in CanopyClustering.
This patch is the first of its kind for the ClusteringDrivers. If this is okayed, then a similar refactoring can easily be done for KMeans, FuzzyK and Dirichlet.
This addresses bug MAHOUT-982.
https://issues.apache.org/jira/browse/MAHOUT-982
Diffs
-----
trunk/core/src/test/java/org/apache/mahout/clustering/canopy/TestCanopyCreation.java 1294137
trunk/core/src/main/java/org/apache/mahout/clustering/canopy/ClusterMapper.java 1294137
trunk/core/src/main/java/org/apache/mahout/clustering/canopy/CanopyClusterer.java 1294137
trunk/core/src/main/java/org/apache/mahout/clustering/canopy/CanopyDriver.java 1294137
Diff: https://reviews.apache.org/r/4174/diff
Testing
-------
Thanks,
Paritosh
Re: [jira] [Commented] (MAHOUT-982) Refactor Canopy Clustering into a separate post process with outlier pruning
Posted by Isabel Drost <is...@apache.org>.
On 23.02.2012 Jeff Eastman wrote:
> Only JIRA committers can be assigned an issue but anybody can
> contribute.
Just for reference: Our JIRA features a role "contributor" - issues can be
assigned to anyone who has this role. For any Mahout JIRA admin (should be any
committer AFAIK), making someone a "contributor" is as simple as going to the
project administration page, clicking on members, and adding the respective
user name to that role.
Isabel
Re: [jira] [Commented] (MAHOUT-982) Refactor Canopy Clustering into
a separate post process with outlier pruning
Posted by Paritosh Ranjan <pr...@xebia.com>.
I have assigned all the stories related to clustering refactoring to myself.
If anyone wants to contribute to the refactoring stories, please feel
free to submit the patch and drop a comment on the JIRA issue.
On 23-02-2012 18:02, Jeff Eastman wrote:
> Only JIRA committers can be assigned an issue but anybody can
> contribute. Paritosh, since you are riding herd on these refactorings
> and will most likely be committing the patches, why don't you assign
> them to yourself?
Re: [jira] [Commented] (MAHOUT-982) Refactor Canopy Clustering into
a separate post process with outlier pruning
Posted by Jeff Eastman <jd...@windwardsolutions.com>.
Only JIRA committers can be assigned an issue but anybody can
contribute. Paritosh, since you are riding herd on these refactorings
and will most likely be committing the patches, why don't you assign
them to yourself?
On 2/23/12 2:25 AM, Paritosh Ranjan (Commented) (JIRA) wrote:
> [ https://issues.apache.org/jira/browse/MAHOUT-982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13214494#comment-13214494 ]
>
> Paritosh Ranjan commented on MAHOUT-982:
> ----------------------------------------
>
> Suneel, thanks for this initiative.
> Please feel free to assign it to yourself (if it's possible) or to submit patches.
[jira] [Commented] (MAHOUT-982) Refactor Canopy Clustering into a
separate post process with outlier pruning
Posted by "Paritosh Ranjan (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAHOUT-982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13214494#comment-13214494 ]
Paritosh Ranjan commented on MAHOUT-982:
----------------------------------------
Suneel, thanks for this initiative.
Please feel free to assign it to yourself (if it's possible) or to submit patches.
[jira] [Updated] (MAHOUT-982) Refactor Canopy Clustering into a
separate post process with outlier pruning
Posted by "Paritosh Ranjan (Updated) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAHOUT-982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Paritosh Ranjan updated MAHOUT-982:
-----------------------------------
Attachment: MAHOUT-982.txt
Jeff, will you please review this patch?
Used ClusterClassificationDriver in CanopyDriver to clusterData.
This replaces the existing functionality. If this is ok, then we can add a threshold as a method parameter/CLI argument to support outlier removal in CanopyClustering.
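For readers unfamiliar with the idea, the separation described here can be sketched as two independent phases: one that produces cluster centers, and one that assigns points to whatever centers it is given. The sketch below is not the Mahout code. The class and method names are hypothetical, and the canopy step is a simplified single-threshold variant (it uses only a T2-style distance and takes the first uncovered point as the center) chosen to keep the example short.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative only: hypothetical names, not the Mahout classes. */
final class TwoPhaseCanopySketch {

  /** Euclidean distance between two points of equal dimension. */
  static double distance(double[] a, double[] b) {
    double sum = 0.0;
    for (int i = 0; i < a.length; i++) {
      double d = a[i] - b[i];
      sum += d * d;
    }
    return Math.sqrt(sum);
  }

  /**
   * Phase 1 (cluster creation), simplified: a point closer than t2 to an
   * existing center is considered covered; otherwise it starts a new canopy.
   */
  static List<double[]> buildCanopyCenters(List<double[]> points, double t2) {
    List<double[]> centers = new ArrayList<>();
    for (double[] p : points) {
      boolean covered = false;
      for (double[] c : centers) {
        if (distance(p, c) < t2) {
          covered = true;
          break;
        }
      }
      if (!covered) {
        centers.add(p);
      }
    }
    return centers;
  }

  /**
   * Phase 2 (classification): knows nothing about how the centers were
   * produced, so the same step could follow any clustering algorithm.
   * Returns, for each point, the index of its nearest center.
   */
  static int[] classify(List<double[]> points, List<double[]> centers) {
    int[] assignment = new int[points.size()];
    for (int i = 0; i < points.size(); i++) {
      int best = -1;
      double bestDistance = Double.POSITIVE_INFINITY;
      for (int c = 0; c < centers.size(); c++) {
        double d = distance(points.get(i), centers.get(c));
        if (d < bestDistance) {
          bestDistance = d;
          best = c;
        }
      }
      assignment[i] = best;
    }
    return assignment;
  }

  public static void main(String[] args) {
    List<double[]> points = Arrays.asList(
        new double[] {0, 0}, new double[] {1, 0},
        new double[] {10, 10}, new double[] {11, 10});
    List<double[]> centers = buildCanopyCenters(points, 3.0);
    System.out.println(Arrays.toString(classify(points, centers)));
  }
}
```

Because phase 2 depends only on a list of centers, it can be tested on its own and reused after other clustering algorithms, which is the motivation given in this issue for moving it out of the driver.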
[jira] [Commented] (MAHOUT-982) Refactor Canopy Clustering into a
separate post process with outlier pruning
Posted by "Jeff Eastman (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAHOUT-982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13225326#comment-13225326 ]
Jeff Eastman commented on MAHOUT-982:
-------------------------------------
+1 I like the way the driver was compressed and the mapper disappeared. Also less time-consuming unit tests since classification is tested on its own.
[jira] [Commented] (MAHOUT-982) Refactor Canopy Clustering into a
separate post process with outlier pruning
Posted by "Paritosh Ranjan (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAHOUT-982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13225324#comment-13225324 ]
Paritosh Ranjan commented on MAHOUT-982:
----------------------------------------
I plan to commit this in a day or two. If you see any concerns, please raise them. I have changed the signature of CanopyDriver.run by adding an argument, clusterClassificationThreshold. The default is 0.0 and it's not mandatory.
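As a rough illustration of what a classification threshold of this kind does, the sketch below assigns each point to its best-matching center and drops points whose best membership score falls below the threshold. It is not the committed code. All names are hypothetical, and the score function 1 / (1 + distance) is an assumption made for the example. With a threshold of 0.0 nothing is pruned, which matches the default described above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: hypothetical names, not the Mahout classes. */
final class OutlierPruningSketch {

  /** Euclidean distance between two points of equal dimension. */
  static double distance(double[] a, double[] b) {
    double sum = 0.0;
    for (int i = 0; i < a.length; i++) {
      double d = a[i] - b[i];
      sum += d * d;
    }
    return Math.sqrt(sum);
  }

  /** Assumed membership score in (0, 1]: 1 at the center, falling with distance. */
  static double score(double[] point, double[] center) {
    return 1.0 / (1.0 + distance(point, center));
  }

  /**
   * Assigns each point to its best-scoring center and skips points whose
   * best score is below the threshold. Returns center index -> members.
   */
  static Map<Integer, List<double[]>> classify(List<double[]> points,
                                               List<double[]> centers,
                                               double threshold) {
    Map<Integer, List<double[]>> result = new LinkedHashMap<>();
    for (double[] p : points) {
      int best = -1;
      double bestScore = -1.0;
      for (int c = 0; c < centers.size(); c++) {
        double s = score(p, centers.get(c));
        if (s > bestScore) {
          bestScore = s;
          best = c;
        }
      }
      // Points scoring below the threshold are treated as outliers and dropped.
      if (best >= 0 && bestScore >= threshold) {
        result.computeIfAbsent(best, k -> new ArrayList<>()).add(p);
      }
    }
    return result;
  }

  /** Total number of points that were assigned to some cluster. */
  static int assignedCount(Map<Integer, List<double[]>> clusters) {
    int n = 0;
    for (List<double[]> members : clusters.values()) {
      n += members.size();
    }
    return n;
  }

  public static void main(String[] args) {
    List<double[]> centers = Arrays.asList(new double[] {0, 0}, new double[] {10, 10});
    List<double[]> points = Arrays.asList(
        new double[] {0.5, 0}, new double[] {10, 9}, new double[] {50, 50});
    System.out.println(assignedCount(classify(points, centers, 0.0)));
    System.out.println(assignedCount(classify(points, centers, 0.3)));
  }
}
```

In this example the point at (50, 50) scores about 0.017 against its nearest center, so it is kept at threshold 0.0 and dropped at threshold 0.3.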
[jira] [Commented] (MAHOUT-982) Refactor Canopy Clustering into a
separate post process with outlier pruning
Posted by "Suneel Marthi (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAHOUT-982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13214453#comment-13214453 ]
Suneel Marthi commented on MAHOUT-982:
--------------------------------------
Paritosh, I can take a crack at this, please assign this to me.
[jira] [Commented] (MAHOUT-982) Refactor Canopy Clustering into a
separate post process with outlier pruning
Posted by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAHOUT-982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13225319#comment-13225319 ]
jiraposter@reviews.apache.org commented on MAHOUT-982:
------------------------------------------------------
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4245/
-----------------------------------------------------------
Review request for mahout.
Summary
-------
Added outlier removal capability to Canopy Clustering.
This addresses bug Mahout-982.
https://issues.apache.org/jira/browse/Mahout-982
Diffs
-----
trunk/core/src/main/java/org/apache/mahout/common/commandline/DefaultOptionCreator.java 1294137
trunk/core/src/test/java/org/apache/mahout/clustering/canopy/TestCanopyCreation.java 1298406
trunk/core/src/main/java/org/apache/mahout/clustering/canopy/CanopyDriver.java 1298408
trunk/core/src/test/java/org/apache/mahout/clustering/classify/ClusterClassificationDriverTest.java 1294454
trunk/core/src/test/java/org/apache/mahout/clustering/kmeans/TestKmeansClustering.java 1294137
trunk/core/src/test/java/org/apache/mahout/clustering/topdown/postprocessor/ClusterCountReaderTest.java 1294137
trunk/core/src/test/java/org/apache/mahout/clustering/topdown/postprocessor/ClusterOutputPostProcessorTest.java 1294137
trunk/examples/src/main/java/org/apache/mahout/clustering/syntheticcontrol/canopy/Job.java 1294137
trunk/examples/src/main/java/org/apache/mahout/clustering/syntheticcontrol/fuzzykmeans/Job.java 1294137
trunk/examples/src/main/java/org/apache/mahout/clustering/syntheticcontrol/kmeans/Job.java 1294137
trunk/integration/src/test/java/org/apache/mahout/clustering/TestClusterDumper.java 1294137
trunk/integration/src/test/java/org/apache/mahout/clustering/TestClusterEvaluator.java 1294137
trunk/integration/src/test/java/org/apache/mahout/clustering/cdbw/TestCDbwEvaluator.java 1294137
Diff: https://reviews.apache.org/r/4245/diff
Testing
-------
Added test cases for both sequential and mapreduce version.
Thanks,
Paritosh
[jira] [Commented] (MAHOUT-982) Refactor Canopy Clustering into a
separate post process with outlier pruning
Posted by "Paritosh Ranjan (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAHOUT-982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13216532#comment-13216532 ]
Paritosh Ranjan commented on MAHOUT-982:
----------------------------------------
Succeeded in using ClusterClassificationDriver to clusterData for Canopy Clustering. I have tried it for the sequential version so far; the mapreduce version still needs to be done.
[jira] [Updated] (MAHOUT-982) Refactor Canopy Clustering into a
separate post process with outlier pruning
Posted by "Suneel Marthi (Updated) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAHOUT-982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Suneel Marthi updated MAHOUT-982:
---------------------------------
Comment: was deleted
(was: Paritosh, I can take a crack at this, please assign this to me.)