Posted to issues@madlib.apache.org by "Jingyi Mei (JIRA)" <ji...@apache.org> on 2018/05/30 00:11:00 UTC

[jira] [Created] (MADLIB-1243) Encode_categorical_variables doesn't work with column name with special characters when specifying top

Jingyi Mei created MADLIB-1243:
----------------------------------

             Summary: Encode_categorical_variables doesn't work with column name with special characters when specifying top
                 Key: MADLIB-1243
                 URL: https://issues.apache.org/jira/browse/MADLIB-1243
             Project: Apache MADlib
          Issue Type: Bug
          Components: Module: Utilities
            Reporter: Jingyi Mei
             Fix For: v1.15


Encode_categorical_variables fails when a column name contains special characters and a 'top' value is specified as an input parameter. Here is the repro:

1. Create a table with special characters in the column names:
{code:java}
DROP TABLE IF EXISTS abalone_special_char;
CREATE TABLE abalone_special_char (
    id serial,
    "se''x" character varying,
    "len'%*()gth" double precision,
    diameter double precision,
    height double precision,
    "ClaЖss" integer
);
COPY abalone_special_char ("se''x", "len'%*()gth", diameter, height, "ClaЖss") FROM stdin WITH DELIMITER '|' NULL as '@';
F"F|0.475|0.37|0.125|2
F'F|0.475|0.37|0.125|2
F$F|0.475|0.37|0.125|2
MЖM|0.475|0.37|0.125|2
M@[}(:*;M|0.475|0.37|0.125|2
M,M|0.475|0.37|0.125|2
\.{code}
2. Call encode_categorical_variables with "se''x" as the categorical column name and 3 as the top value:
{code:java}
select encode_categorical_variables('abalone_special_char', 'abalone_special_char_out2',
'"se''''x"', '',
NULL, '3'
);{code}
Here is the error message:
{code:java}
ERROR: KeyError: '"se\'\'x"' (plpython.c:4960)
CONTEXT: Traceback (most recent call last):
PL/Python function "encode_categorical_variables", line 23, in <module>
return encode_categorical.encode_categorical_variables(**globals())
PL/Python function "encode_categorical_variables", line 611, in encode_categorical_variables
PL/Python function "encode_categorical_variables", line 104, in build_output_table
PL/Python function "encode_categorical_variables", line 342, in _build_encoding_str
PL/Python function "encode_categorical_variables"{code}
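
One possible reading of the KeyError above, offered as an assumption rather than a confirmed diagnosis: the key shown in the message is exactly the Python repr of the quoted name "se''x", which is what a dictionary lookup reports when the quoted form of the name is used against a mapping keyed by some other form. The sketch below reproduces that message text in plain Python. The helper unquote_ident and the dict top_values are hypothetical illustrations and are not taken from MADlib's encode_categorical module.

```python
# Hypothetical illustration only; this is not MADlib's actual code.

def unquote_ident(name):
    """Strip one layer of SQL double quotes and unescape doubled double quotes."""
    if len(name) >= 2 and name[0] == '"' and name[-1] == '"':
        return name[1:-1].replace('""', '"')
    return name


# The SQL literal '"se''''x"' evaluates to this string: "se''x" including the
# surrounding double quotes.
quoted = '"se\'\'x"'

# Assumption: the per-column mapping is keyed by the bare column name se''x.
top_values = {unquote_ident(quoted): 3}

# Looking up the quoted form misses, and the KeyError text is the repr of the key.
try:
    top_values[quoted]
    error_text = None
except KeyError as exc:
    error_text = "KeyError: " + str(exc)

print(error_text)

# Normalizing the name the same way on both sides makes the lookup succeed.
normalized_lookup = top_values[unquote_ident(quoted)]
print(normalized_lookup)
```

If the mapping and the lookup both pass through the same normalization, the lookup succeeds. Whether this matches the actual cause in _build_encoding_str is not established by the traceback alone.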



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)