Posted to issues@madlib.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2016/03/15 21:17:34 UTC

[jira] [Commented] (MADLIB-975) Random Forest training error message

    [ https://issues.apache.org/jira/browse/MADLIB-975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15196149#comment-15196149 ] 

ASF GitHub Bot commented on MADLIB-975:
---------------------------------------

GitHub user orhankislal opened a pull request:

    https://github.com/apache/incubator-madlib/pull/32

    Recursive_partitioning: Fix random forest error message

    JIRA: MADLIB-975
    Random forest train gave a nondescript error if every data point had a NULL feature. Added an if check and an assert to give a proper description of the situation.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/orhankislal/incubator-madlib bugfix/random_forest

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-madlib/pull/32.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #32
    
----
commit 795723a421276b9c46aa7b90c096cd5ddfd5bc12
Author: Orhan Kislal <ok...@pivotal.io>
Date:   2016-03-10T19:30:03Z

    Recursive_partitioning: Fix random forest error message
    
    JIRA: MADLIB-975
    Random forest train gave a nondescript error if every data point had a NULL feature. Added an if check and an assert to give a proper description of the situation.

----


> Random Forest training error message
> ------------------------------------
>
>                 Key: MADLIB-975
>                 URL: https://issues.apache.org/jira/browse/MADLIB-975
>             Project: Apache MADlib
>          Issue Type: Bug
>            Reporter: Rashmi Raghu
>            Assignee: Orhan Kislal
>            Priority: Minor
>             Fix For: v1.9
>
>
> The error message raised during random forest (RF) training is not interpretable. See the example query below; sample data for the query was sent offline.
> SELECT madlib.forest_train(
>     'dev.training_data_take_1',  -- training_table_name
>     'dev.models_random_forest',  -- output_table_name
>     'id',                        -- id_col_name
>     'event',                     -- dependent_variable
>     '*',                         -- list_of_features
>     'id,regionname,wellid,day',  -- list_of_features_to_exclude
>     NULL,                        -- grouping_cols
>     100                          -- num_trees
>     -- num_random_features,
>     -- importance,
>     -- num_permutations,
>     -- max_tree_depth,
>     -- min_split,
>     -- min_bucket,
>     -- num_splits,
>     -- surrogate_params,
>     -- verbose,
>     -- sample_ratio
> );
> ERROR:  AttributeError: 'NoneType' object has no attribute 'sort' (plpython.c:4648)
> CONTEXT:  Traceback (most recent call last):
>   PL/Python function "forest_train", line 42, in <module>
>     sample_ratio
>   PL/Python function "forest_train", line 337, in forest_train
> PL/Python function "forest_train"
> ********** Error **********
> ERROR: AttributeError: 'NoneType' object has no attribute 'sort' (plpython.c:4648)
> SQL state: XX000
> Context: Traceback (most recent call last):
>   PL/Python function "forest_train", line 42, in <module>
>     sample_ratio
>   PL/Python function "forest_train", line 337, in forest_train
> PL/Python function "forest_train"
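
The traceback above is consistent with `.sort()` being called on a value that turned out to be `None`. The following is a minimal Python sketch of that failure mode and of the kind of guard the pull request describes. It is not MADlib's code: `collect_feature_levels`, `train_unguarded`, and `train_guarded` are hypothetical names, and the assumption that rows with a NULL feature are filtered out before training is made only for illustration. The pull request mentions an if check and an assert; the sketch raises an exception explicitly instead.

```python
def collect_feature_levels(rows):
    """Return the distinct feature values of usable rows, or None if none remain.

    Hypothetical helper: a row is dropped when any of its features is NULL
    (None in Python), imitating a routine that skips incomplete rows.
    """
    usable = [r for r in rows if all(v is not None for v in r)]
    if not usable:
        return None
    return list({v for r in usable for v in r})


def train_unguarded(rows):
    # Hypothetical stand-in for the training step before the fix.
    levels = collect_feature_levels(rows)
    levels.sort()  # raises AttributeError when levels is None
    return levels


def train_guarded(rows):
    # Same step with an explicit check that names the actual problem.
    levels = collect_feature_levels(rows)
    if levels is None:
        raise ValueError(
            "Random forest error: every row contains a NULL feature, "
            "so no data is left to train on.")
    levels.sort()
    return levels
```

With input such as `[(1, None), (None, 2)]`, where every row has at least one NULL feature, `train_unguarded` fails with the same `AttributeError` text shown in the report, while `train_guarded` fails with a message that describes the data problem.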


