
Posted to issues@spark.apache.org by "Daniel Erenrich (JIRA)" <ji...@apache.org> on 2014/11/26 05:40:12 UTC

[jira] [Commented] (SPARK-4001) Add Apriori algorithm to Spark MLlib

    [ https://issues.apache.org/jira/browse/SPARK-4001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14225728#comment-14225728 ] 

Daniel Erenrich commented on SPARK-4001:
----------------------------------------

I was about to start coding something like this when I noticed this ticket. What's the status here?

Association rule algorithms in general (and Apriori in particular) are useful in collaborative filtering contexts (which MLlib already has code for).

As for library cohesiveness, my thought is that we can frame the inputs to look nearly identical to those of the matrix factorization code, though with (basket_id, item_id) instead of (user_id, item_id). That input format would be inefficient, though, so maybe we'd support a second, more natural input format. This would sidestep the concern the sklearn folks had.
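
To make the two input shapes concrete, here is a minimal sketch in plain Python (not Spark or MLlib code, and not a proposed API). The function names pairs_to_baskets and frequent_itemsets are hypothetical, and min_support is assumed to be an absolute basket count. It groups (basket_id, item_id) pairs into one item set per basket, which is the "more natural" format, and then runs a small Apriori-style level-wise search over those baskets.

```python
from collections import defaultdict
from itertools import combinations


def pairs_to_baskets(pairs):
    """Group (basket_id, item_id) pairs into one item set per basket.

    Hypothetical helper: mirrors the (user_id, item_id) shape used for
    matrix factorization inputs, but keyed by basket instead of user.
    """
    baskets = defaultdict(set)
    for basket_id, item_id in pairs:
        baskets[basket_id].add(item_id)
    return dict(baskets)


def frequent_itemsets(baskets, min_support):
    """Apriori-style search; returns {itemset: number of baskets containing it}.

    min_support is an absolute count (an assumption of this sketch).
    """
    transactions = [frozenset(items) for items in baskets.values()]

    # Level 1: count single items.
    counts = defaultdict(int)
    for t in transactions:
        for item in t:
            counts[frozenset([item])] += 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(current)

    k = 2
    while current:
        # Build size-k candidates from frequent (k-1)-itemsets and prune
        # any candidate that has an infrequent (k-1)-subset.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in current for sub in combinations(union, k - 1)
            ):
                candidates.add(union)

        # Count how many baskets contain each surviving candidate.
        counts = defaultdict(int)
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        current = {s: c for s, c in counts.items() if c >= min_support}
        result.update(current)
        k += 1
    return result


# Pair-format input; the repeated (1, "bread") row collapses in the basket set.
pairs = [
    (1, "bread"), (1, "milk"),
    (2, "bread"), (2, "milk"), (2, "eggs"),
    (3, "milk"), (3, "eggs"),
    (1, "bread"),
]
baskets = pairs_to_baskets(pairs)
itemsets = frequent_itemsets(baskets, min_support=2)
```

The pair format needs a grouping pass before any counting can happen, which is one way to see why a basket-per-row input would be the cheaper option for this kind of algorithm.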

> Add Apriori algorithm to Spark MLlib
> ------------------------------------
>
>                 Key: SPARK-4001
>                 URL: https://issues.apache.org/jira/browse/SPARK-4001
>             Project: Spark
>          Issue Type: New Feature
>          Components: MLlib
>            Reporter: Jacky Li
>            Assignee: Jacky Li
>
> Apriori is the classic algorithm for frequent itemset mining in a transactional data set. It would be useful to add the Apriori algorithm to MLlib in Spark.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
