Posted to issues@madlib.apache.org by "Rahul Iyer (JIRA)" <ji...@apache.org> on 2016/10/24 17:22:58 UTC
[jira] [Commented] (MADLIB-1029) Decision Tree's output summary table does not contain the right list independent variables
[ https://issues.apache.org/jira/browse/MADLIB-1029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15602627#comment-15602627 ]
Rahul Iyer commented on MADLIB-1029:
------------------------------------
I'm not able to reproduce the error, possibly because I don't have the exact dataset (the abalone dataset on UCI does not have a {{color}} feature):
{code}
madlib-pg94=# select * from adaboost_output_test_summary;
-[ RECORD 1 ]---------+--------------------------------------------------------------------------------------------------------------------------------------
method | tree_train
is_classification | t
source_table | abalone
model_table | adaboost_output_test
id_col_name | id
dependent_varname | sex
independent_varnames | rings, length, diameter, height, whole, shucked, viscera, shell
cat_features | rings
con_features | length,diameter,height,whole,shucked,viscera,shell
grouping_cols |
num_all_groups | 1
num_failed_groups | 0
total_rows_processed | 4177
total_rows_skipped | 0
dependent_var_levels | "F","I","M"
dependent_var_type | text
input_cp | 0.01
independent_var_types | integer, double precision, double precision, double precision, double precision, double precision, double precision, double precision
{code}
> Decision Tree's output summary table does not contain the right list independent variables
> ------------------------------------------------------------------------------------------
>
> Key: MADLIB-1029
> URL: https://issues.apache.org/jira/browse/MADLIB-1029
> Project: Apache MADlib
> Issue Type: Bug
> Components: Module: Decision Tree
> Reporter: April Song
>
> Decision Tree's output summary table does not contain the right list independent variables.
> Steps to reproduce:
> select madlib.tree_train('abalone_2', -- source table
> 'adaboost_output_test', -- output model table
> 'rowid', -- id column
> 'sex', -- response
> 'length,diam,height,whole,shucked,viscera,shell,rings,color', -- features
> NULL::text, -- exclude columns
> 'gini', -- split criterion
> NULL::text, -- no grouping
> NULL::text, -- no weights
> 5, -- max depth
> 3, -- min split
> 1, -- min bucket
> 10 -- number of bins
> );
> gpadmin=# select * from adaboost_output_test_summary;
> -[ RECORD 1 ]---------+---------------------
> method | tree_train
> is_classification | t
> source_table | abalone_2
> model_table | adaboost_output_test
> id_col_name | rowid
> dependent_varname | sex
> independent_varnames | color
> cat_features | color
> con_features |
> grouping_cols |
> num_all_groups | 1
> num_failed_groups | 0
> total_rows_processed | 2835
> total_rows_skipped | 0
> dependent_var_levels | "0","1"
> dependent_var_type | integer
> input_cp | 0.01
> independent_var_types | text
> Abalone data can be found here: https://archive.ics.uci.edu/ml/datasets/Abalone