Posted to issues@madlib.apache.org by "Orhan Kislal (Jira)" <ji...@apache.org> on 2022/07/07 15:22:00 UTC

[jira] [Closed] (MADLIB-1501) Can not train model larger than 1GB.

     [ https://issues.apache.org/jira/browse/MADLIB-1501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Orhan Kislal closed MADLIB-1501.
--------------------------------
    Fix Version/s:     (was: v1.19.0)
       Resolution: Invalid

> Can not train model larger than 1GB.
> ------------------------------------
>
>                 Key: MADLIB-1501
>                 URL: https://issues.apache.org/jira/browse/MADLIB-1501
>             Project: Apache MADlib
>          Issue Type: Bug
>          Components: Deep Learning
>            Reporter: Xinyi Zhang
>            Priority: Major
>
> When I try to train a model larger than 1GB on Greenplum, I get the error below:
> CONTEXT: PL/Python function "madlib_keras_fit"
> ERROR: spiexceptions.InternalError: invalid memory alloc request size 1100478264 (plpy_elog.c:121)
>  
> But if I use a smaller model, training runs successfully.
> It seems that "SELECT {schema_madlib}.fit_step()" cannot execute when the model is larger than 1GB.
> I set shared_buffers to 32GB, and the instance has 290GB of memory available, so the failure looks like a memory allocation problem inside MADlib rather than a lack of memory.
> I did not find any parameter that works around this. Since large models are quite common, I think MADlib should offer a way to train models larger than 1GB.
>  
>  
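The 1GB figure in the error is PostgreSQL's MaxAllocSize, the hard cap on a single palloc request; any individual value, such as the serialized model state that fit_step() moves through SQL, is bounded by it, and exceeding it produces exactly this "invalid memory alloc request size" message. Below is a minimal sketch of checking a model against that cap before calling madlib_keras_fit. The weight_bytes helper is illustrative, not MADlib API, and it assumes TensorFlow/Keras is installed and that the dominant payload is the flattened weight buffer.

    # Sketch: estimate whether a Keras model's weights, packed into a single
    # PostgreSQL value, would exceed the per-allocation cap.
    from tensorflow import keras

    PG_MAX_ALLOC = 0x3fffffff  # PostgreSQL MaxAllocSize: just under 1GB

    def weight_bytes(model):
        # Sum the raw byte size of every weight array in the model.
        return sum(w.size * w.dtype.itemsize for w in model.get_weights())

    # A deliberately oversized example: the first Dense layer alone holds
    # 16384 * 16384 float32 weights, i.e. about 1 GiB.
    model = keras.Sequential([
        keras.layers.Dense(16384, activation="relu", input_shape=(16384,)),
        keras.layers.Dense(10, activation="softmax"),
    ])

    n = weight_bytes(model)
    print("approx. weight payload: %.1f MiB" % (n / 2.0**20))
    if n > PG_MAX_ALLOC:
        print("weights exceed PostgreSQL's 1GB allocation cap; expect "
              "'invalid memory alloc request size' from madlib_keras_fit")

Since the cap is enforced by PostgreSQL/Greenplum itself rather than by a MADlib parameter, settings such as shared_buffers cannot lift it, which is consistent with the issue being closed rather than fixed on the MADlib side.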



--
This message was sent by Atlassian Jira
(v8.20.10#820010)