Posted to commits@madlib.apache.org by nj...@apache.org on 2019/04/25 21:47:40 UTC

[madlib] branch master updated: Association Rule: Change default max_itemset_size to 10

This is an automated email from the ASF dual-hosted git repository.

njayaram pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/madlib.git


The following commit(s) were added to refs/heads/master by this push:
     new ca3bb8d  Association Rule: Change default max_itemset_size to 10
ca3bb8d is described below

commit ca3bb8d3e3e184563ad7495da22166e7f8515b09
Author: Himanshu Pandey <hp...@pivotal.io>
AuthorDate: Mon Apr 22 15:17:24 2019 -0700

    Association Rule: Change default max_itemset_size to 10
    
    JIRA: MADLIB-1288
    Change default value of max_itemset_size optional param from infinity to
    10, so that the module does not run for too long. The default value is
    based on the corresponding R library.
    
    Closes #373
---
 src/ports/postgres/modules/assoc_rules/assoc_rules.py_in  | 2 +-
 src/ports/postgres/modules/assoc_rules/assoc_rules.sql_in | 6 +++---
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/ports/postgres/modules/assoc_rules/assoc_rules.py_in b/src/ports/postgres/modules/assoc_rules/assoc_rules.py_in
index 243851d..e67d887 100644
--- a/src/ports/postgres/modules/assoc_rules/assoc_rules.py_in
+++ b/src/ports/postgres/modules/assoc_rules/assoc_rules.py_in
@@ -64,7 +64,7 @@ def assoc_rules(madlib_schema, support, confidence, tid_col,
     begin_step_exec = time.time();
     cal_itemsets_time = 0;
     if max_itemset_size is None:
-        max_itemset_size = float('inf')
+        max_itemset_size = 10
     elif max_itemset_size <= 1:
         plpy.error("ERROR: max_itemset_size has to be greater than 1.")
 
diff --git a/src/ports/postgres/modules/assoc_rules/assoc_rules.sql_in b/src/ports/postgres/modules/assoc_rules/assoc_rules.sql_in
index bcd5464..dafe117 100644
--- a/src/ports/postgres/modules/assoc_rules/assoc_rules.sql_in
+++ b/src/ports/postgres/modules/assoc_rules/assoc_rules.sql_in
@@ -268,7 +268,7 @@ This generates all association rules that satisfy the specified minimum
   as the algorithm progresses.</dd>
 
   <dt>max_itemset_size (optional)</dt>
-  <dd>INTEGER, default: generate itemsets of all sizes. Determines the maximum size of frequent
+  <dd>INTEGER, default: 10. Determines the maximum size of frequent
   itemsets that are used for generating association rules. Must be 2 or more.
   This parameter can be used to reduce run time for data sets where itemset size is large,
   which is a common situation. If your query is not returning or is running too long,
@@ -549,7 +549,7 @@ AS $$
                                        input_table,
                                        output_schema,
                                        False,
-                                       'NULL');
+                                       10);
 
 $$ LANGUAGE plpythonu
 m4_ifdef(`__HAS_FUNCTION_PROPERTIES__', `MODIFIES SQL DATA', `');
@@ -579,7 +579,7 @@ AS $$
                                        input_table,
                                        output_schema,
                                        verbose,
-                                       'NULL');
+                                       10);
 $$ LANGUAGE plpythonu
 m4_ifdef(`__HAS_FUNCTION_PROPERTIES__', `MODIFIES SQL DATA', `');
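
For readers who want to try the new default handling outside the database, here is a minimal standalone sketch of the patched branch in assoc_rules.py_in. The helper names resolve_max_itemset_size and candidate_itemset_upper_bound are hypothetical and are not part of MADlib. ValueError stands in for plpy.error, which is only available inside PL/Python. The second helper illustrates why an unbounded itemset size can run for a long time: it computes a simple combinatorial upper bound on the number of candidate itemsets, not the count the module actually generates.

```python
from math import comb

# New default introduced by this commit (previously float('inf')).
DEFAULT_MAX_ITEMSET_SIZE = 10


def resolve_max_itemset_size(max_itemset_size=None):
    """Hypothetical helper mirroring the patched check in assoc_rules.py_in."""
    if max_itemset_size is None:
        return DEFAULT_MAX_ITEMSET_SIZE
    if max_itemset_size <= 1:
        # The module calls plpy.error here; ValueError is a stand-in.
        raise ValueError("max_itemset_size has to be greater than 1.")
    return max_itemset_size


def candidate_itemset_upper_bound(n_items, max_size):
    """Upper bound on the number of itemsets of size 1..max_size
    that can be formed from n_items distinct items."""
    largest = min(n_items, max_size)
    return sum(comb(n_items, k) for k in range(1, largest + 1))


if __name__ == "__main__":
    print(resolve_max_itemset_size())      # default applies when no value is given
    print(resolve_max_itemset_size(3))     # an explicit value is kept as is
    # Compare the bound for 30 distinct items with and without the cap.
    print(candidate_itemset_upper_bound(30, 10))
    print(candidate_itemset_upper_bound(30, 30))
```

Callers who relied on the old behavior can still pass a larger max_itemset_size explicitly; only the value used when the argument is omitted has changed.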