Posted to commits@madlib.apache.org by nj...@apache.org on 2017/12/12 20:41:37 UTC

madlib git commit: Correlation: Fix bug with international characters

Repository: madlib
Updated Branches:
  refs/heads/master dd906f7c7 -> 47e357446


Correlation: Fix bug with international characters

JIRA:MADLIB-1186

Additional Author: Nandish Jayaram <nj...@apache.org>

If the name of an independent variable column passed to
madlib.correlation(...) is a quoted identifier, as can happen with
names containing international characters, the query fails. The
intermediate column that holds the column's average was named by
plain string concatenation, which puts the prefix outside the quotes.

This commit uses add_postfix() to create that column name instead.
The name was originally `avg_{column_name}`; it is now
add_postfix(column_name, '_avg'), so the prefix `avg_` becomes the
suffix `_avg`. This is only an intermediate column that never
appears in the output, so the exact form of its name does not matter.
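
The difference between the two naming schemes can be illustrated with
a minimal sketch. The helper add_postfix_sketch() below is a
hypothetical stand-in written for this illustration; it is assumed to
behave like MADlib's add_postfix() for quoted names, but it is not
the library's actual implementation.

```python
def naive_avg_name(col):
    # Old approach: plain string concatenation with a prefix.
    # For a quoted identifier the prefix lands outside the quotes.
    return "avg_{0}".format(col)


def add_postfix_sketch(quoted_string, postfix):
    """Hypothetical stand-in for add_postfix(); an assumption, not MADlib code."""
    name = quoted_string.strip()
    if len(name) >= 2 and name.startswith('"') and name.endswith('"'):
        # Keep the postfix inside the closing quote so the result
        # is still a single quoted identifier.
        return name[:-1] + postfix + '"'
    # Unquoted names can simply be extended.
    return name + postfix


col = '"température"'
print(naive_avg_name(col))              # avg_"température"
print(add_postfix_sketch(col, "_avg"))  # "température_avg"
print(add_postfix_sketch("age", "_avg"))  # age_avg
```

The first result is not a valid SQL identifier, which is why the
generated query fails; the second remains a single quoted identifier.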

Closes #214


Project: http://git-wip-us.apache.org/repos/asf/madlib/repo
Commit: http://git-wip-us.apache.org/repos/asf/madlib/commit/47e35744
Tree: http://git-wip-us.apache.org/repos/asf/madlib/tree/47e35744
Diff: http://git-wip-us.apache.org/repos/asf/madlib/diff/47e35744

Branch: refs/heads/master
Commit: 47e357446e39be6662eb472b1b523280c831b360
Parents: dd906f7
Author: Swati Soni <so...@gmail.com>
Authored: Mon Dec 11 14:09:46 2017 -0800
Committer: Nandish Jayaram <nj...@apache.org>
Committed: Tue Dec 12 12:38:37 2017 -0800

----------------------------------------------------------------------
 src/ports/postgres/modules/stats/correlation.py_in | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/madlib/blob/47e35744/src/ports/postgres/modules/stats/correlation.py_in
----------------------------------------------------------------------
diff --git a/src/ports/postgres/modules/stats/correlation.py_in b/src/ports/postgres/modules/stats/correlation.py_in
index f658b48..d524eb1 100644
--- a/src/ports/postgres/modules/stats/correlation.py_in
+++ b/src/ports/postgres/modules/stats/correlation.py_in
@@ -179,9 +179,11 @@ def _populate_output_table(schema_madlib, source_table, output_table,
             function_name = "Correlation"
             agg_str = "{0}.correlation_agg(x, mean)".format(schema_madlib)
 
-        cols = ','.join(["coalesce({0}, avg_{0})".format(col) for col in col_names])
-        avgs = ','.join(["avg({0}) AS avg_{0}".format(col) for col in col_names])
-        avg_array = ','.join(["avg_{0}".format(col) for col in col_names])
+        cols = ','.join(["coalesce({0}, {1})".format(col, add_postfix(col, "_avg"))
+                        for col in col_names])
+        avgs = ','.join(["avg({0}) AS {1}".format(col, add_postfix(col, "_avg"))
+                        for col in col_names])
+        avg_array = ','.join([str(add_postfix(col, "_avg")) for col in col_names])
         # actual computation
         sql1 = """