You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@mahout.apache.org by "Isabel Drost (JIRA)" <ji...@apache.org> on 2009/08/11 21:50:14 UTC

[jira] Commented: (MAHOUT-157) Frequent Pattern Mining using Parallel FP-Growth

    [ https://issues.apache.org/jira/browse/MAHOUT-157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12742033#action_12742033 ] 

Isabel Drost commented on MAHOUT-157:
-------------------------------------

Patch applies to trunk but I run into problems when trying to get it to compile. I needed to apply MAHOUT-124 first. Then I got an error that indicated that you are using the Combinations class not only in the tests (where it is put by the diff) but also in the regular source code. After copying the class to src/main/java, I get the following error: 

MAHOUT-157/core/src/main/java/org/apache/mahout/fpm/pfpgrowth/ParallelFPGrowthReducer.java:[92,41] get(java.lang.String) in org.apache.mahout.common.Parameters cannot be applied to (java.lang.String,java.lang.String)

I guess I have done something wrong when applying the patches one after another?


Other than that I only have some general comments before going into more detail for the review: I am missing some documentation, both JavaDoc and package.html, at least a link to the original paper would be nice to have.

PFPGrowth - seems like you do quite a lot of work in your constructor. I think it is no good idea to start map reduce jobs from within a constructor. Maybe I am reading something wrong here?

Is it possible to break up the test into unit tests? I think that would make changing the code and tracking where the change actually broke the code by far easier.

AggregatorReducer, line 88: Please avoid calls to <Exception>.printStackTrace() - usually those messages get lost when the system is in production. Better log the message with Logger.<fatal|error|warn>("your message", <Exception>) - maybe rethrow the exception if you cannot handle it properly.

TreeNode - the class seems to contain public attributes only but no methods. Please at least explain which type of tree these nodes are supposed to be a part of. From the code alone I am not able to understand the its usage... 

> Frequent Pattern Mining using Parallel FP-Growth
> ------------------------------------------------
>
>                 Key: MAHOUT-157
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-157
>             Project: Mahout
>          Issue Type: New Feature
>    Affects Versions: 0.2
>            Reporter: Robin Anil
>             Fix For: 0.2
>
>         Attachments: MAHOUT-157-August-6.patch, MAHOUT-157-Combinations-BSD-License.patch, MAHOUT-157-Combinations-BSD-License.patch, MAHOUT-157-inProgress-August-5.patch
>
>
> Implement: http://infolab.stanford.edu/~echang/recsys08-69.pdf

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.