Posted to issues@madlib.apache.org by "Frank McQuillan (JIRA)" <ji...@apache.org> on 2019/04/10 20:02:00 UTC
[jira] [Created] (MADLIB-1322) MLP with minibatch fails for integer
dependent variable
Frank McQuillan created MADLIB-1322:
---------------------------------------
Summary: MLP with minibatch fails for integer dependent variable
Key: MADLIB-1322
URL: https://issues.apache.org/jira/browse/MADLIB-1322
Project: Apache MADlib
Issue Type: Bug
Components: Module: Neural Networks
Reporter: Frank McQuillan
Fix For: v1.16
(1)
If I have an integer dependent variable and run the mini-batch preprocessor on it:
{code}
select madlib.minibatch_preprocessor(
    'classification_train',     -- input table
    'mini_batch_packed_train',  -- output table
    'response',                 -- response INTEGER
    'feature_vector',           -- indep vars
    NULL,                       -- grouping
    NULL,                       -- buffer size (or size of the mini-batch)
    TRUE                        -- Encode scalar int dependent variable (if response is integer instead of boolean or char)
);
{code}
Then the summary table looks like:
{code}
madlib=# \d+ mini_batch_packed_train_summary
                      Table "public.mini_batch_packed_train_summary"
          Column          |   Type    | Modifiers | Storage  | Stats target | Description
--------------------------+-----------+-----------+----------+--------------+-------------
 source_table             | text      |           | extended |              |
 output_table             | text      |           | extended |              |
 dependent_varname        | text      |           | extended |              |
 independent_varname      | text      |           | extended |              |
 dependent_vartype        | text      |           | extended |              |
 buffer_size              | integer   |           | plain    |              |
 class_values             | integer[] |           | extended |              |
 num_rows_processed       | integer   |           | plain    |              |
 num_missing_rows_skipped | integer   |           | plain    |              |
 grouping_cols            | text      |           | extended |              |
{code}
Then MLP classification fails with:
{code}
InternalError: (psycopg2.InternalError) TypeError: must be string, not int
CONTEXT: Traceback (most recent call last):
PL/Python function "mlp_classification", line 33, in <module>
grouping_col)
PL/Python function "mlp_classification", line 42, in wrapper
PL/Python function "mlp_classification", line 147, in mlp
PL/Python function "mlp_classification", line 74, in quote_literal
{code}
(2)
If I cast to text explicitly:
{code}
select madlib.minibatch_preprocessor(
    'classification_train',     -- input table
    'mini_batch_packed_train',  -- output table
    'response::TEXT',           -- response
    'feature_vector',           -- indep vars
    NULL,                       -- grouping
    NULL,                       -- buffer size (or size of the mini-batch)
    TRUE                        -- Encode scalar int dependent variable (if response is integer instead of boolean or char)
);
{code}
The summary table looks like:
{code}
madlib=# \d+ mini_batch_packed_train_summary
                     Table "public.mini_batch_packed_train_summary"
          Column          |  Type   | Modifiers | Storage  | Stats target | Description
--------------------------+---------+-----------+----------+--------------+-------------
 source_table             | text    |           | extended |              |
 output_table             | text    |           | extended |              |
 dependent_varname        | text    |           | extended |              |
 independent_varname      | text    |           | extended |              |
 dependent_vartype        | text    |           | extended |              |
 buffer_size              | integer |           | plain    |              |
 class_values             | text[]  |           | extended |              |
 num_rows_processed       | integer |           | plain    |              |
 num_missing_rows_skipped | integer |           | plain    |              |
 grouping_cols            | text    |           | extended |              |
{code}
And MLP training works OK.
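The difference between (1) and (2) suggests that the integer entries of class_values reach a quoting routine that accepts only strings. Below is a minimal Python sketch of that failure mode. It is not MADlib's actual code: quote_literal_strict and quote_class_values are hypothetical stand-ins, and the assumption that class values are quoted one by one without conversion is inferred only from the traceback above.

```python
def quote_literal_strict(value):
    """Hypothetical stand-in for a quoting helper that accepts only strings."""
    if not isinstance(value, str):
        raise TypeError("must be string, not %s" % type(value).__name__)
    # Double any embedded single quotes, then wrap the value in single quotes.
    return "'" + value.replace("'", "''") + "'"


def quote_class_values(class_values):
    """Hypothetical workaround: convert each value to text before quoting."""
    return [quote_literal_strict(str(v)) for v in class_values]


int_classes = [0, 1, 2]          # like class_values integer[] in case (1)
text_classes = ["0", "1", "2"]   # like class_values text[] in case (2)

# Case (1): integers passed straight to the string-only helper raise TypeError.
try:
    [quote_literal_strict(v) for v in int_classes]
    failed = False
    message = ""
except TypeError as exc:
    failed = True
    message = str(exc)

# Case (2): text values are quoted without error.
quoted_text = [quote_literal_strict(v) for v in text_classes]

# Converting to text first makes the integer case match the text case.
quoted_int = quote_class_values(int_classes)
```

Under this assumption, converting each class value to text before quoting would give the integer case the same result as the explicit response::TEXT cast in (2).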