Posted to user@mahout.apache.org by psun <ps...@gmail.com> on 2012/11/16 07:07:03 UTC

Getting started with frequent pattern mining

Hi,

I'm completely new to Mahout and machine learning. I would like to use the FP
algorithm in my Java code.

I have installed Hadoop and Mahout on Ubuntu, and I'm running some of the FP
examples.

I understand that the FP algorithm has to be imported into the Java code in
order to call it from there. Can anyone please point me to resources on
getting started with running the FP algorithm using MapReduce from Java? A
library reference describing the classes, methods, and their parameters would
be useful too.

Regards,
psun



--
View this message in context: http://lucene.472066.n3.nabble.com/Getting-started-with-frequent-pattern-mining-tp4020655.html
Sent from the Mahout User List mailing list archive at Nabble.com.

Re: Getting started with frequent pattern mining

Posted by psun <ps...@gmail.com>.

Thank you for the information and the link to the data sets.

Is there documentation for the library that would help me understand how to
use it? For example:
1. Understanding the output. I see a list of folders in the output. The files
in "frequentpatterns/part-?" and "fpgrowth/part-?" seem to have the same
content.
2. Knowing the list of methods provided by the library.

How does one find all that out? If we have to do a code walkthrough, can
anyone please share a link for it?

Thank you.

Regards,
psun




Re: Getting started with frequent pattern mining

Posted by zhipeng zhao <zh...@yahoo.com>.

To better understand the algorithm, its parameters, etc., I would suggest reading these papers:

The original FP-Growth paper, which describes the algorithm and its parameters:
"Mining Frequent Patterns without Candidate Generation"
http://www.cs.uiuc.edu/~hanj/pdf/sigmod00.pdf

The parallel version of the FP-Growth algorithm, on which the Mahout implementation is built:
"PFP: Parallel FP-Growth for Query Recommendation"
infolab.stanford.edu/~echang/recsys08-69.pdf


-zhipeng
--- On Fri, 11/16/12, Qinghao <ro...@gmail.com> wrote:

> From: Qinghao <ro...@gmail.com>
> Subject: Re: Getting started with frequent pattern mining
> To: mahout-user@lucene.apache.org
> Date: Friday, November 16, 2012, 2:02 AM
> Well, this algorithm requires an
> input like transaction dataset.
> There are some parameters you need to set:
> support value (same as FP)
> groups (indicates how many groups, into which F-List will be
> divided)
> You can download some dataset from http://fimi.ua.ac.be/data/
> 
> Regards,
> Q

Re: Getting started with frequent pattern mining

Posted by Qinghao <ro...@gmail.com>.

This algorithm takes a transaction dataset as input.
There are some parameters you need to set:
support value (the minimum support, same as in FP-Growth)
groups (the number of groups into which the F-list will be divided)
You can download some datasets from http://fimi.ua.ac.be/data/
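
To make the two parameters concrete, here is a minimal Java sketch of what
they control. It is NOT Mahout's code and does not use the Mahout API: the
class name FListSketch and its methods are made up for illustration, and the
input format (one transaction per line, items separated by whitespace) is an
assumption. It counts item supports, keeps the items that reach the support
value to form the F-list (most frequent first), and then splits the F-list
into contiguous groups.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;

// Illustrative only: hypothetical class, not part of Mahout.
public class FListSketch {

    // Assumed input format: one transaction per line, items separated by whitespace.
    public static List<List<String>> parse(String text) {
        List<List<String>> transactions = new ArrayList<>();
        for (String line : text.split("\n")) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                transactions.add(Arrays.asList(trimmed.split("\\s+")));
            }
        }
        return transactions;
    }

    // F-list: items whose support is at least minSupport, most frequent first.
    // Ties are broken by item name so the order is deterministic.
    public static List<String> buildFList(List<List<String>> transactions, int minSupport) {
        Map<String, Integer> counts = new HashMap<>();
        for (List<String> transaction : transactions) {
            // An item counts once per transaction, even if it is repeated in it.
            for (String item : new HashSet<>(transaction)) {
                counts.merge(item, 1, Integer::sum);
            }
        }
        List<String> fList = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            if (entry.getValue() >= minSupport) {
                fList.add(entry.getKey());
            }
        }
        fList.sort((a, b) -> {
            int byCount = Integer.compare(counts.get(b), counts.get(a));
            return byCount != 0 ? byCount : a.compareTo(b);
        });
        return fList;
    }

    // Splits the F-list into at most numGroups contiguous groups (numGroups > 0).
    public static List<List<String>> splitIntoGroups(List<String> fList, int numGroups) {
        List<List<String>> groups = new ArrayList<>();
        int perGroup = (fList.size() + numGroups - 1) / numGroups; // ceiling division
        for (int start = 0; start < fList.size(); start += perGroup) {
            int end = Math.min(start + perGroup, fList.size());
            groups.add(new ArrayList<>(fList.subList(start, end)));
        }
        return groups;
    }

    public static void main(String[] args) {
        List<List<String>> transactions = parse("1 2 3\n1 2\n2 4\n1 2 4\n5\n");
        List<String> fList = buildFList(transactions, 2);
        System.out.println(fList);                     // prints [2, 1, 4]
        System.out.println(splitIntoGroups(fList, 2)); // prints [[2, 1], [4]]
    }
}
```

In this toy run, items 3 and 5 appear in only one transaction each, so a
support value of 2 drops them from the F-list. How Mahout itself assigns
items to groups may differ from the contiguous split shown here.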

Regards,
Q


