Posted to commits@madlib.apache.org by nj...@apache.org on 2019/04/25 21:47:40 UTC
[madlib] branch master updated: Association Rule: Change default max_itemset_size to 10
This is an automated email from the ASF dual-hosted git repository.
njayaram pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/madlib.git
The following commit(s) were added to refs/heads/master by this push:
new ca3bb8d Association Rule: Change default max_itemset_size to 10
ca3bb8d is described below
commit ca3bb8d3e3e184563ad7495da22166e7f8515b09
Author: Himanshu Pandey <hp...@pivotal.io>
AuthorDate: Mon Apr 22 15:17:24 2019 -0700
Association Rule: Change default max_itemset_size to 10
JIRA: MADLIB-1288
Change default value of max_itemset_size optional param from infinity to
10, so that the module does not run for too long. The default value is
based on the corresponding R library.
Closes #373
---
src/ports/postgres/modules/assoc_rules/assoc_rules.py_in | 2 +-
src/ports/postgres/modules/assoc_rules/assoc_rules.sql_in | 6 +++---
2 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/src/ports/postgres/modules/assoc_rules/assoc_rules.py_in b/src/ports/postgres/modules/assoc_rules/assoc_rules.py_in
index 243851d..e67d887 100644
--- a/src/ports/postgres/modules/assoc_rules/assoc_rules.py_in
+++ b/src/ports/postgres/modules/assoc_rules/assoc_rules.py_in
@@ -64,7 +64,7 @@ def assoc_rules(madlib_schema, support, confidence, tid_col,
     begin_step_exec = time.time();
     cal_itemsets_time = 0;
     if max_itemset_size is None:
-        max_itemset_size = float('inf')
+        max_itemset_size = 10
     elif max_itemset_size <= 1:
         plpy.error("ERROR: max_itemset_size has to be greater than 1.")
diff --git a/src/ports/postgres/modules/assoc_rules/assoc_rules.sql_in b/src/ports/postgres/modules/assoc_rules/assoc_rules.sql_in
index bcd5464..dafe117 100644
--- a/src/ports/postgres/modules/assoc_rules/assoc_rules.sql_in
+++ b/src/ports/postgres/modules/assoc_rules/assoc_rules.sql_in
@@ -268,7 +268,7 @@ This generates all association rules that satisfy the specified minimum
as the algorithm progresses.</dd>
<dt>max_itemset_size (optional)</dt>
- <dd>INTEGER, default: generate itemsets of all sizes. Determines the maximum size of frequent
+ <dd>INTEGER, default: 10. Determines the maximum size of frequent
itemsets that are used for generating association rules. Must be 2 or more.
This parameter can be used to reduce run time for data sets where itemset size is large,
which is a common situation. If your query is not returning or is running too long,
@@ -549,7 +549,7 @@ AS $$
input_table,
output_schema,
False,
- 'NULL');
+ 10);
$$ LANGUAGE plpythonu
m4_ifdef(`__HAS_FUNCTION_PROPERTIES__', `MODIFIES SQL DATA', `');
@@ -579,7 +579,7 @@ AS $$
input_table,
output_schema,
verbose,
- 'NULL');
+ 10);
$$ LANGUAGE plpythonu
m4_ifdef(`__HAS_FUNCTION_PROPERTIES__', `MODIFIES SQL DATA', `');
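
The defaulting behavior introduced by this commit can be sketched as a small standalone function. This is an illustration only, not the module's code: the name resolve_max_itemset_size and the constant DEFAULT_MAX_ITEMSET_SIZE are hypothetical, and ValueError stands in for plpy.error, which is only available inside PL/Python.

```python
# Hypothetical constant; the commit hard-codes the literal 10 instead.
DEFAULT_MAX_ITEMSET_SIZE = 10


def resolve_max_itemset_size(max_itemset_size=None):
    """Return the effective itemset size cap (hypothetical helper).

    None falls back to the new default of 10 (previously float('inf')).
    Values of 1 or less are rejected, mirroring the check in the diff.
    """
    if max_itemset_size is None:
        return DEFAULT_MAX_ITEMSET_SIZE
    if max_itemset_size <= 1:
        # Stand-in for plpy.error(), which exists only inside PL/Python.
        raise ValueError("max_itemset_size has to be greater than 1.")
    return max_itemset_size


if __name__ == "__main__":
    print(resolve_max_itemset_size())   # falls back to the default
    print(resolve_max_itemset_size(4))  # explicit value is kept
```

Under this sketch, callers who relied on the old unbounded behavior would need to pass an explicit, sufficiently large max_itemset_size.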