Posted to user@madlib.apache.org by Frank McQuillan <fm...@pivotal.io> on 2016/10/27 22:00:33 UTC

Proposed improvement to association rules (Apriori) algorithm

Here is a comment from a MADlib user that I recently heard:

“No apparent way to set an upper bound for itemset size in assoc_rules
function. This results in it running forever with larger data sets. In the
R "arules" package, you can set a max itemset size so that it doesn't look
for unnecessarily large associations.”
https://cran.r-project.org/web/packages/arules/arules.pdf

Does it make sense to add a single optional parameter to assoc_rules
http://madlib.incubator.apache.org/docs/latest/group__grp__assoc__rules.html
similar to the maxlen parameter in “arules”?

Any other considerations here, or other improvements to make to this algorithm
at the same time? For example, a minlen parameter?
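
To make the idea concrete, here is a minimal Python sketch of a level-wise
Apriori search with an upper bound on itemset size. It is not MADlib's
implementation: the function name frequent_itemsets and its max_len argument
are hypothetical, chosen only to mirror maxlen in "arules", and the sample
transactions are made up for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_len=None):
    """Return {itemset: support}, stopping once itemsets reach max_len items.

    Hypothetical helper for illustration; not part of MADlib or arules.
    """
    baskets = [frozenset(t) for t in transactions]
    n = len(baskets)
    # Level 1: every distinct item is a candidate itemset of size 1.
    candidates = {frozenset([item]) for basket in baskets for item in basket}
    frequent = {}
    k = 1
    while candidates and (max_len is None or k <= max_len):
        # Count how many baskets contain each candidate.
        counts = {c: sum(1 for b in baskets if c <= b) for c in candidates}
        level = {c: cnt / n for c, cnt in counts.items() if cnt / n >= min_support}
        frequent.update(level)
        # Join frequent k-itemsets into (k+1)-candidates, then prune any
        # candidate that has an infrequent k-subset (the Apriori property).
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


# Made-up sample data.
tx = [
    ["beer", "diapers", "chips"],
    ["beer", "diapers"],
    ["beer", "chips"],
    ["diapers", "chips"],
    ["beer", "diapers", "chips"],
]
unbounded = frequent_itemsets(tx, min_support=0.3)
bounded = frequent_itemsets(tx, min_support=0.3, max_len=2)
```

With max_len set, the loop exits before generating larger candidates, which
is where the run time grows on large data sets. A lower bound such as minlen
would not shorten the search, since smaller itemsets are still needed to
build larger ones; it would only filter what gets reported.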

Thanks,
Frank

Re: Proposed improvement to association rules (Apriori) algorithm

Posted by Frank McQuillan <fm...@pivotal.io>.
I created a JIRA on this
https://issues.apache.org/jira/browse/MADLIB-1031



On Thu, Oct 27, 2016 at 3:00 PM, Frank McQuillan <fm...@pivotal.io>
wrote:

> Here is a comment from a MADlib user that I recently heard:
>
> “No apparent way to set an upper bound for itemset size in assoc_rules
> function. This results in it running forever with larger data sets. In the
> R "arules" package, you can set a max itemset size so that it doesn't look
> for unnecessarily large associations.”
> https://cran.r-project.org/web/packages/arules/arules.pdf
>
> Does a single optional parameter make sense to add to
> http://madlib.incubator.apache.org/docs/latest/group__grp__assoc__rules.html
> similar to the maxlen parameter in “arules” ?
>
> Any other considerations here, or other improvements to make to this
> algorithm at the same time? For example, a minlen parameter?
>
> Thanks,
> Frank
