Posted to user@mahout.apache.org by Robin Anil <ro...@apache.org> on 2010/04/20 13:14:27 UTC

Fwd: Database connectivity support for PFPGrowth data mining

+mahout-user (Moving the discussion to mahout user mailing list)


Instead of a StringRecordIterator, you can substitute your own
DBRecordIterator, which returns one transaction (Pair<List<String>,Long>)
per next() call.

The first element is the unordered item list and the second is the number of
times that transaction occurs (its count) in the database. The count is just
an optimisation: it doesn't matter whether you provide a transaction repeated
3 times with count=1 or once with count=3.

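The data is read twice: once by generateFList and once by the FPGrowth
algorithm itself. So after the generateFList call you will need to either
create a second iterator or reset the first one before calling the FPGrowth
algorithm.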
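
As a rough sketch of what such an iterator might look like (this is not
Mahout code): the DBRecordIterator class and its toTransaction helper are
hypothetical, the small Pair class is only a stand-in for Mahout's own Pair
so that the example compiles on its own, and the row layout (column 1 holds a
comma-separated item list, column 2 holds the count) is an assumption about
your table.

```java
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Stand-in for Mahout's Pair, included only to keep this sketch self-contained. */
final class Pair<A, B> {
  private final A first;
  private final B second;
  Pair(A first, B second) { this.first = first; this.second = second; }
  A getFirst() { return first; }
  B getSecond() { return second; }
}

/** Hypothetical iterator that turns each row of a JDBC ResultSet into one transaction. */
final class DBRecordIterator implements Iterator<Pair<List<String>, Long>> {
  private final ResultSet rows;
  private boolean lookedAhead; // true once rows.next() has been called for the pending row
  private boolean hasRow;      // result of that rows.next() call

  DBRecordIterator(ResultSet rows) { this.rows = rows; }

  /** Assumed layout: items are comma-separated in one string; count is how often they occur. */
  static Pair<List<String>, Long> toTransaction(String items, long count) {
    List<String> itemList = new ArrayList<>();
    if (items != null) {
      for (String item : items.split(",")) {
        String trimmed = item.trim();
        if (!trimmed.isEmpty()) {
          itemList.add(trimmed);
        }
      }
    }
    return new Pair<>(itemList, count);
  }

  @Override
  public boolean hasNext() {
    if (!lookedAhead) {
      try {
        hasRow = rows.next(); // advance the cursor exactly once per row
      } catch (SQLException e) {
        throw new IllegalStateException(e);
      }
      lookedAhead = true;
    }
    return hasRow;
  }

  @Override
  public Pair<List<String>, Long> next() {
    if (!hasNext()) {
      throw new NoSuchElementException();
    }
    lookedAhead = false; // the pending row is consumed by this call
    try {
      return toTransaction(rows.getString(1), rows.getLong(2));
    } catch (SQLException e) {
      throw new IllegalStateException(e);
    }
  }

  @Override
  public void remove() {
    throw new UnsupportedOperationException();
  }
}
```

For the second pass, one option is to run the query again and wrap the new
ResultSet in a second DBRecordIterator. If your table has one row per
transaction and no count column, passing a count of 1 for every row gives
the same result.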

Robin





---------- Forwarded message ----------
From: Rashmi Paliwal <ra...@globallogic.com>
Date: Tue, Apr 20, 2010 at 4:28 PM
Subject: Database connectivity support for PFPGrowth data mining
To: robinanil@apache.org, srowen@gmail.com


 Hi Robin and Sean,



I am in the process of exploring PFPGrowth data mining for generating patterns
and recommendations out of the input data. I want to use it in my project.

I am running a sample example in which I am getting the input data from a
txt file. My requirement is to read the data from the database table.

I am using DB2 and MySQL for that. Below is the code snippet:



generateFList = fp.generateFList(new StringRecordIterator(
        new FileLineIterable(new File(input), encoding, false),
        pattern), minSupport);

StringRecordIterator transactions = null;

transactions = new StringRecordIterator(new FileLineIterable(
        new File(input), encoding, false), pattern);



This particular code uses FileLineIterable and reads the input from a
file. Instead, I need to read the input data from a relational database.

I did not find relevant help on the internet.



Please guide me in the right direction.

Thanks and regards,

Rashmi Paliwal | Sr Software Engineer | GlobalLogic Inc.
