Posted to issues@madlib.apache.org by "Frank McQuillan (JIRA)" <ji...@apache.org> on 2018/06/04 05:44:00 UTC

[jira] [Resolved] (MADLIB-1237) Mini-batch preprocessor fails for dt_golf dataset

     [ https://issues.apache.org/jira/browse/MADLIB-1237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Frank McQuillan resolved MADLIB-1237.
-------------------------------------
    Resolution: Fixed

> Mini-batch preprocessor fails for dt_golf dataset 
> --------------------------------------------------
>
>                 Key: MADLIB-1237
>                 URL: https://issues.apache.org/jira/browse/MADLIB-1237
>             Project: Apache MADlib
>          Issue Type: Bug
>          Components: Module: Utilities
>            Reporter: Frank McQuillan
>            Assignee: Jingyi Mei
>            Priority: Major
>             Fix For: v1.15
>
>
> For the dt_golf data set from
> http://madlib.apache.org/docs/latest/group__grp__decision__tree.html#examples
> the mini-batch preprocessor fails with a syntax error:
> {code}
> SELECT madlib.minibatch_preprocessor('dt_golf',
> 'dt_golf_packed_2', 
> 'class', 
> '"Temp_Humidity"', NULL ,1, True);
> ERROR: spiexceptions.SyntaxError: syntax error at or near "t"
> LINE 8: ...T madlib.array_contains_null(ARRAY[(class) = 'Don't Play', (...
>  ^
> QUERY:
>  SELECT SUM(source_table_row_count_by_group) AS source_table_row_count,
>  SUM(num_rows_processed_by_group) AS total_num_rows_processed,
>  AVG(num_rows_processed_by_group) AS avg_num_rows_processed
>  FROM (
>  SELECT COUNT(*) AS source_table_row_count_by_group,
>  SUM(CASE
>  WHEN NOT madlib.array_contains_null(ARRAY[(class) = 'Don't Play', (class) = 'Play']::INTEGER[]) AND
>  NOT madlib.array_contains_null(("Temp_Humidity")::DOUBLE PRECISION[])
>  THEN 1
>  ELSE 0
>  END) AS num_rows_processed_by_group
>  FROM dt_golf
> ) AS s
> CONTEXT: Traceback (most recent call last):
>  PL/Python function "minibatch_preprocessor", line 24, in <module>
>  minibatch_preprocessor_obj.minibatch_preprocessor()
>  PL/Python function "minibatch_preprocessor", line 45, in wrapper
>  PL/Python function "minibatch_preprocessor", line 104, in minibatch_preprocessor
>  PL/Python function "minibatch_preprocessor", line 236, in _get_skipped_rows_processed_count
> PL/Python function "minibatch_preprocessor"
> {code}
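
The failing query above embeds the class value Don't Play directly inside a single-quoted SQL literal, so the apostrophe ends the string early and the parser stops at "t". The following is a minimal sketch of one way to avoid that when building such an expression in Python: double every embedded single quote before wrapping the value in quotes. The helper names quote_literal and one_hot_expr are hypothetical and are not taken from the MADlib source; this is not necessarily how the issue was fixed in v1.15. The sketch also assumes standard_conforming_strings is on, so backslashes need no special handling.

```python
def quote_literal(value):
    """Return value as a single-quoted SQL string literal.

    Hypothetical helper: embedded single quotes are doubled, which is the
    standard SQL way to represent an apostrophe inside a string literal.
    """
    return "'" + value.replace("'", "''") + "'"


def one_hot_expr(dep_col, levels):
    """Build an ARRAY[...]::INTEGER[] expression comparing dep_col to each level.

    Hypothetical helper that mirrors the shape of the expression seen in the
    failing query; dep_col is assumed to be an already-valid column expression.
    """
    comparisons = ", ".join(
        "({0}) = {1}".format(dep_col, quote_literal(level)) for level in levels
    )
    return "ARRAY[{0}]::INTEGER[]".format(comparisons)


expr = one_hot_expr("class", ["Don't Play", "Play"])
print(expr)
# prints: ARRAY[(class) = 'Don''t Play', (class) = 'Play']::INTEGER[]
```

Inside PL/Python on PostgreSQL releases that provide it, plpy.quote_literal serves the same purpose; the pure-Python version is shown here only so the sketch runs outside the database.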


