Posted to issues@madlib.apache.org by "Frank McQuillan (JIRA)" <ji...@apache.org> on 2016/08/27 00:17:20 UTC

[jira] [Commented] (MADLIB-994) RF - improve docs for describing memory usage

    [ https://issues.apache.org/jira/browse/MADLIB-994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15440286#comment-15440286 ] 

Frank McQuillan commented on MADLIB-994:
----------------------------------------

I put this comment in the decision tree user docs:

"The main parameters that affect memory usage are:  depth of tree, number
of features, and number of values per feature.  If you are hitting VMEM limits,
consider reducing one or more of these parameters."
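
A rough way to see how these three parameters interact is the back-of-the-envelope sketch below. It is not MADlib's actual memory accounting. It assumes a binary tree trained one level at a time, with a few statistics kept for every (leaf, feature, bin) combination. The function name and the n_stats and bytes_per_value defaults are hypothetical and chosen only for illustration.

```python
def estimate_split_stats_bytes(max_depth, n_features, n_bins,
                               n_stats=2, bytes_per_value=8):
    """Rough estimate of split-statistics memory at the deepest tree level.

    Assumed model (NOT taken from the MADlib source): a binary tree has up
    to 2**max_depth leaves at its deepest level, and each leaf keeps
    n_stats values per feature per bin, at bytes_per_value bytes each.
    """
    n_leaves = 2 ** max_depth
    return n_leaves * n_features * n_bins * n_stats * bytes_per_value


if __name__ == "__main__":
    # Compare a baseline against reducing each parameter in turn.
    configs = {
        "baseline":       dict(max_depth=10, n_features=800, n_bins=100),
        "shallower tree": dict(max_depth=8, n_features=800, n_bins=100),
        "fewer features": dict(max_depth=10, n_features=200, n_bins=100),
        "fewer bins":     dict(max_depth=10, n_features=800, n_bins=20),
    }
    for name, cfg in configs.items():
        mb = estimate_split_stats_bytes(**cfg) / 1e6
        print(f"{name:15s} {mb:10.1f} MB")
```

Under this model the estimate grows linearly with the number of features and with the number of values per feature, but doubles with each additional level of depth, so reducing tree depth is usually the first thing to try.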

> RF - improve docs for describing memory usage
> ---------------------------------------------
>
>                 Key: MADLIB-994
>                 URL: https://issues.apache.org/jira/browse/MADLIB-994
>             Project: Apache MADlib
>          Issue Type: Documentation
>          Components: Module: Random Forest
>            Reporter: Frank McQuillan
>            Priority: Minor
>             Fix For: v1.9.1
>
>
> Some users are hitting VMEM limits, so the docs need to give more guidance on which parameters affect memory usage.
> For example, a data set with relatively few rows but a large number of features (500 - 800) fails with:
> ERROR:  plpy.SPIError: Out of memory  (seg46 slice4 awsaiuirl1179:40006 pid=449659) (plpython.c:4648)
> DETAIL:  VM Protect failed to allocate 1028374648 bytes, 690 MB available
> CONTEXT:  Traceback (most recent call last):
>   PL/Python function "forest_train", line 39, in <module>
>     sample_ratio
>   PL/Python function "forest_train", line 565, in forest_train
>   PL/Python function "forest_train", line 2248, in _tree_train_grps_using_bins
>   PL/Python function "forest_train", line 1324, in _one_step_for_grps
> PL/Python function "forest_train"
>  
> ********** Error **********
>  
> ERROR: plpy.SPIError: Out of memory  (seg46 slice4 awsaiuirl1179:40006 pid=449659) (plpython.c:4648)
> SQL state: XX000
> Detail: VM Protect failed to allocate 1028374648 bytes, 690 MB available
> Context: Traceback (most recent call last):
>   PL/Python function "forest_train", line 39, in <module>
>     sample_ratio
>   PL/Python function "forest_train", line 565, in forest_train
>   PL/Python function "forest_train", line 2248, in _tree_train_grps_using_bins
>   PL/Python function "forest_train", line 1324, in _one_step_for_grps
> PL/Python function "forest_train"


