Posted to dev@mahout.apache.org by ray <rt...@gmail.com> on 2015/04/27 14:48:37 UTC

bringing back the fp-growth code in mahout

I had it in mind to volunteer to maintain the fp-growth code in Mahout, 
but I see that Spark has an fp-growth implementation.  So now that I 
have the time to work on this, I'm wondering if there is any point, or 
if there is still any interest in the Mahout community.

If not, so be it.  If so, I volunteer.

Regards, Ray.

Re: bringing back the fp-growth code in mahout

Posted by Ted Dunning <te...@gmail.com>.
On Mon, Apr 27, 2015 at 8:13 PM, ray <rt...@gmail.com> wrote:

> What is the best way to tell if Apache code is being maintained, in
> particular the fp-growth algorithm in Spark's MLlib?
>

Ask on the appropriate mailing list.

Re: bringing back the fp-growth code in mahout

Posted by ray <rt...@gmail.com>.
What is the best way to tell if Apache code is being maintained, in 
particular the fp-growth algorithm in Spark's MLlib?

My original intent (5 months ago) was to replace the MapReduce portion
of the fp-growth code with an alternative, though I wasn't sure what that
alternative should be.

My motivation for wanting frequent itemsets is that they are closed under
taking subsets (and hence under intersections), so they form simplicial
complexes.  I've written software for mining simplicial complexes for
their geometry, specifically their 2-dimensional persistent homology.
That means I can look at how the geometry changes as both the support and
confidence parameters vary.  I'm hoping to take at least some of the
guesswork out of choosing these parameters, which seems to be something
of an open question.
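
To illustrate the property described above: every subset of a frequent
itemset is itself frequent, which is what lets the family of frequent
itemsets be read as a simplicial complex.  The following is only a minimal
brute-force sketch on toy data, not fp-growth and not the Mahout or Spark
code; the function names, the transactions, and the absolute support
threshold are all made up for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Brute-force enumeration (not fp-growth) of all itemsets contained
    in at least `min_support` transactions. Returns {itemset: count}."""
    items = sorted(set().union(*transactions))
    frequent = {}
    for size in range(1, len(items) + 1):
        found_any = False
        for candidate in combinations(items, size):
            cset = frozenset(candidate)
            count = sum(1 for t in transactions if cset <= t)
            if count >= min_support:
                frequent[cset] = count
                found_any = True
        if not found_any:
            # No frequent itemset of this size means none of a larger size.
            break
    return frequent


def is_simplicial_complex(itemsets):
    """True if every non-empty proper subset (face) of each itemset
    is also in the family, i.e. the family is downward closed."""
    family = set(itemsets)
    return all(
        frozenset(face) in family
        for s in family
        for k in range(1, len(s))
        for face in combinations(s, k)
    )


# Hypothetical toy transactions, chosen only for illustration.
transactions = [
    frozenset({"a", "b", "c"}),
    frozenset({"a", "b"}),
    frozenset({"a", "c"}),
    frozenset({"b", "c"}),
    frozenset({"a", "b", "c", "d"}),
]

freq = frequent_itemsets(transactions, min_support=2)
print(is_simplicial_complex(freq))
```

Raising or lowering min_support in this sketch shrinks or grows the
complex, which is the kind of parameter dependence mentioned above.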

So for now I'll see whether Spark's implementation generates usable
frequent itemsets, have some fun learning Scala, and look into getting
fp-growth running on top of Flink.


On 04/27/2015 07:59 AM, Ted Dunning wrote:
>
> Ray,
>
> Is the Spark implementation usable?  Is it maintained?  If not, there is
> a decent reason to move forward.
>
> I don't think that we want to revive the old map-reduce implementation.
>
>
>
> On Mon, Apr 27, 2015 at 5:48 AM, ray <rt...@gmail.com> wrote:
>
>     I had it in mind to volunteer to maintain the fp-growth code in
>     Mahout, but I see that Spark has an fp-growth implementation.  So
>     now that I have the time to work on this, I'm wondering if there is
>     any point, or if there is still any interest in the Mahout community.
>
>     If not, so be it.  If so, I volunteer.
>
>     Regards, Ray.
>
>

Re: bringing back the fp-growth code in mahout

Posted by Ted Dunning <te...@gmail.com>.
Ray,

Is the Spark implementation usable?  Is it maintained?  If not, there is a
decent reason to move forward.

I don't think that we want to revive the old map-reduce implementation.



On Mon, Apr 27, 2015 at 5:48 AM, ray <rt...@gmail.com> wrote:

> I had it in mind to volunteer to maintain the fp-growth code in Mahout,
> but I see that Spark has an fp-growth implementation.  So now that I have
> the time to work on this, I'm wondering if there is any point, or if there
> is still any interest in the Mahout community.
>
> If not, so be it.  If so, I volunteer.
>
> Regards, Ray.
>