Posted to dev@mahout.apache.org by "Sean Owen (JIRA)" <ji...@apache.org> on 2011/03/01 12:50:36 UTC
[jira] Commented: (MAHOUT-293) Add more tunable parameters to PFPGrowth implementation
[ https://issues.apache.org/jira/browse/MAHOUT-293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13000843#comment-13000843 ]
Sean Owen commented on MAHOUT-293:
----------------------------------
Praveen, if you'd like to supply a patch, that's great. Otherwise I think we will mark this as won't-fix as of 0.5 since nobody seems to be taking it up.
> Add more tunable parameters to PFPGrowth implementation
> -------------------------------------------------------
>
> Key: MAHOUT-293
> URL: https://issues.apache.org/jira/browse/MAHOUT-293
> Project: Mahout
> Issue Type: Improvement
> Components: Frequent Itemset/Association Rule Mining
> Affects Versions: 0.4
> Reporter: Robin Anil
> Assignee: Robin Anil
> Fix For: 0.5
>
>
> Objective is to add more tunable parameters to the PFPGrowth algorithm.
> From Neal on Mahout User list:
> I often use Christian Borgelt's itemset implementations for playing
> with data. He's implemented a nice set of switches, see below.
> Setting a minimum support threshold and minimum itemset size are both
> convenient and tend to make the algorithm run a bit faster.
> http://www.borgelt.net/software.html
> nealr@nrichter-laptop:~$ fpgrowth_fim
> usage: fpgrowth_fim [options] infile outfile
> find frequent item sets with the fpgrowth algorithm
> version 1.13 (2008.05.02) (c) 2004-2008 Christian Borgelt
> -m# minimal number of items per item set (default: 1)
> -n# maximal number of items per item set (default: no limit)
> -s# minimal support of an item set (default: 10%)
> (positive: percentage, negative: absolute number)
> -d# minimal binary logarithm of support quotient (default: none)
> -p# output format for the item set support (default: "%.1f")
> -a print absolute support (number of transactions)
> -g write output in scanable form (quote certain characters)
> -q# sort items w.r.t. their frequency (default: -2)
> (1: ascending, -1: descending, 0: do not sort,
> 2: ascending, -2: descending w.r.t. transaction size sum)
> -u use alternative tree projection method
> -z do not prune tree projections to bonsai
> -j use quicksort to sort the transactions (default: heapsort)
> -i# ignore records starting with a character in the given string
> -b/f/r# blank characters, field and record separators
> (default: " \t\r", " \t", "\n")
> infile file to read transactions from
> outfile file to write frequent item se
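
For anyone picking this up, the -s, -m and -n switches quoted above could translate into filtering logic roughly as in the Python sketch below. This is only an illustration under stated assumptions: the names min_support_count and keep_itemset are hypothetical and are not part of Mahout's PFPGrowth or of Borgelt's code, and rounding a percentage threshold up to the next whole transaction is my own choice, not something the usage text specifies.

```python
import math
from typing import FrozenSet, Optional


def min_support_count(s: float, num_transactions: int) -> int:
    """Turn an -s style value into an absolute transaction count.

    Following the quoted usage text, a positive value is read as a
    percentage of all transactions and a negative value as an absolute
    number. Rounding percentages up is an assumption of this sketch.
    """
    if s < 0:
        return int(-s)
    return math.ceil(num_transactions * s / 100.0)


def keep_itemset(
    itemset: FrozenSet[str],
    support: int,
    min_count: int,
    min_items: int = 1,
    max_items: Optional[int] = None,
) -> bool:
    """Apply size limits (like -m / -n) and a support threshold (like -s)."""
    size = len(itemset)
    if size < min_items:
        return False
    if max_items is not None and size > max_items:
        return False
    return support >= min_count


# Hypothetical mining result: itemset -> number of supporting transactions.
found = {
    frozenset({"a"}): 120,
    frozenset({"a", "b"}): 30,
    frozenset({"a", "b", "c"}): 12,
}

# 10% of 200 transactions, keeping only itemsets with at least 2 items.
threshold = min_support_count(10, 200)
kept = [
    sorted(items)
    for items, supp in found.items()
    if keep_itemset(items, supp, threshold, min_items=2)
]
```

In a real implementation the size and support checks would more usefully be applied during tree growth, so that branches are pruned early instead of being filtered after the fact; that early pruning is presumably where the speedup Neal mentions comes from.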