Posted to user@madlib.apache.org by Dmitry Dorofeev <di...@luxmsbi.com> on 2017/05/16 09:58:59 UTC

greenplum: Memory allocation failed. madlib.__bernoulli_vector

Hello, we are getting

psql:06_svm_train.sql:11: ERROR:  plpy.SPIError: plpy.SPIError: Function "madlib.__bernoulli_vector(integer,double precision,double precision,double precision,integer)": Memory allocation failed. Typically, this indicates that Greenplum Database limits the available memory to less than what is needed for this input.  (entry db greenplum.luxms:5432 pid=12385) (plpython.c:4648)
CONTEXT:  Traceback (most recent call last):
  PL/Python function "svm_classification", line 26, in <module>
    return svm.svm(**globals())
  PL/Python function "svm_classification", line 983, in svm
  PL/Python function "svm_classification", line 1103, in _transform_w_kernel
  PL/Python function "svm_classification", line 277, in fit
PL/Python function "svm_classification"

for the following SQL:

SELECT madlib.svm_classification ('stanford.train_input',
                                  'stanford.model_table',
                                  'label=4',
                                  'ind',
                                  'polynomial',   --linear | gaussian
                                  'coef0=0',             -- kernel params
                                  '',             -- grouping_col
                                  'max_iter=1,validation_result=stanford.validation_result');   -- max_iter=200


MADlib 1.9.1
Greenplum DB version 4.3ORCA
CentOS Linux release 7.2.1511

Any hints on how we can tune memory settings to avoid such errors?
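For reference, the settings we assume are relevant here (a minimal sketch with placeholder values, not our actual configuration; gpconfig has to be run as the Greenplum admin user):

-- Per-segment memory ceiling enforced by the VMEM protector; set on all hosts
-- with gpconfig and requires a restart (8192 MB is a placeholder, not advice):
--   gpconfig -c gp_vmem_protect_limit -v 8192
-- Memory granted to a single statement; can be raised for one session:
SET statement_mem = '2000MB';    -- placeholder value
-- Check the current values:
SHOW gp_vmem_protect_limit;
SHOW statement_mem;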

Thanks.

Re: greenplum: Memory allocation failed. madlib.__bernoulli_vector

Posted by Dmitry Dorofeev <di...@luxmsbi.com>.
Thanks all!

Frank, here is our data size:

demo=# select count(*) from stanford.train_input;
  count
---------
 1593954
(1 row)

demo=# \d stanford.train_input
      Table "stanford.train_input"
 Column |        Type        | Modifiers
--------+--------------------+-----------
 id     | bigint             |
 ind    | double precision[] |
 label  | integer            |
Distributed randomly
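If it helps, here is a sketch of how we can check the width of the ind vectors and the on-disk size of the table (standard catalog functions only; output omitted):

-- Length of the feature array (assuming all rows have the same width):
SELECT array_upper(ind, 1) AS n_features
FROM stanford.train_input
LIMIT 1;

-- Total on-disk size of the table:
SELECT pg_size_pretty(pg_total_relation_size('stanford.train_input')) AS table_size;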

----- Original Message -----
From: "Frank McQuillan" <fm...@pivotal.io>
To: user@madlib.incubator.apache.org
Sent: Tuesday, May 16, 2017 8:10:48 PM
Subject: Re: greenplum: Memory allocation failed. madlib.__bernoulli_vector

How big is 'stanford.train_input'?

Did you try it with a small sample dataset, and if so, did that work OK?

On Tue, May 16, 2017 at 9:16 AM, Frank McQuillan <fm...@pivotal.io>
wrote:

> This does not look like a MADlib error.
>
> There are a lot of Greenplum experts who respond to the questions on this
> mailing list:
> https://groups.google.com/a/greenplum.org/forum/#!forum/gpdb-users
> so I would suggest you post your question there.
>
> Frank
>
> On Tue, May 16, 2017 at 6:44 AM, Luis Macedo <lm...@pivotal.io> wrote:
>
>> Try diminishing the scope of your data.
>>
>> There was not enough memory to run the Python code, it seems...
>>
>>
>> Luis Macedo | Sr Platform Architect | Pivotal Inc
>>
>> Mobile: +55 11 97616-6438
>> Pivotal.io <http://pivotal.io>
>> Take care of the customers and the rest takes care of itself
>>
>> 2017-05-16 6:58 GMT-03:00 Dmitry Dorofeev <di...@luxmsbi.com>:
>>
>>> Hello, we are getting
>>>
>>> psql:06_svm_train.sql:11: ERROR:  plpy.SPIError: plpy.SPIError: Function
>>> "madlib.__bernoulli_vector(integer,double precision,double
>>> precision,double precision,integer)": Memory allocation failed. Typically,
>>> this indicates that Greenplum Database limits the available memory to less
>>> than what is needed for this input.  (entry db greenplum.luxms:5432
>>> pid=12385) (plpython.c:4648)
>>> CONTEXT:  Traceback (most recent call last):
>>>   PL/Python function "svm_classification", line 26, in <module>
>>>     return svm.svm(**globals())
>>>   PL/Python function "svm_classification", line 983, in svm
>>>   PL/Python function "svm_classification", line 1103, in
>>> _transform_w_kernel
>>>   PL/Python function "svm_classification", line 277, in fit
>>> PL/Python function "svm_classification"
>>>
>>> for the following SQL:
>>>
>>> SELECT madlib.svm_classification ('stanford.train_input',
>>>                                   'stanford.model_table',
>>>                                   'label=4',
>>>                                   'ind',
>>>                                   'polynomial',   --linear | gaussian
>>>                                   'coef0=0',             -- kernel params
>>>                                   '',             -- grouping_col
>>>                                   'max_iter=1,validation_result=stanford.validation_result');
>>>  -- max_iter=200
>>>
>>>
>>> MADlib 1.9.1
>>> Greenplum DB version 4.3ORCA
>>> CentOS Linux release 7.2.1511
>>>
>>> Any hints on how we can tune memory settings to avoid such errors?
>>>
>>> Thanks.
>>>
>>
>>
>

Re: greenplum: Memory allocation failed. madlib.__bernoulli_vector

Posted by Frank McQuillan <fm...@pivotal.io>.
How big is 'stanford.train_input'?

Did you try it with a small sample dataset, and if so, did that work OK?
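For example, something along these lines (just a sketch; the 1% sample fraction and the *_sample table names are arbitrary placeholders):

-- Build a small random sample to test with:
CREATE TABLE stanford.train_input_sample AS
SELECT *
FROM stanford.train_input
WHERE random() < 0.01
DISTRIBUTED RANDOMLY;

-- Re-run the same call against the sample (output table name is also a placeholder):
SELECT madlib.svm_classification ('stanford.train_input_sample',
                                  'stanford.model_table_sample',
                                  'label=4',
                                  'ind',
                                  'polynomial',
                                  'coef0=0',
                                  '',
                                  'max_iter=1');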

On Tue, May 16, 2017 at 9:16 AM, Frank McQuillan <fm...@pivotal.io>
wrote:

> This does not look like a MADlib error.
>
> There are a lot of Greenplum experts who respond to the questions on this
> mailing list:
> https://groups.google.com/a/greenplum.org/forum/#!forum/gpdb-users
> so I would suggest you post your question there.
>
> Frank
>
> On Tue, May 16, 2017 at 6:44 AM, Luis Macedo <lm...@pivotal.io> wrote:
>
>> Try diminishing the scope of your data.
>>
>> There was not enough memory to run the Python code, it seems...
>>
>>
>> Luis Macedo | Sr Platform Architect | Pivotal Inc
>>
>> Mobile: +55 11 97616-6438
>> Pivotal.io <http://pivotal.io>
>> Take care of the customers and the rest takes care of itself
>>
>> 2017-05-16 6:58 GMT-03:00 Dmitry Dorofeev <di...@luxmsbi.com>:
>>
>>> Hello, we are getting
>>>
>>> psql:06_svm_train.sql:11: ERROR:  plpy.SPIError: plpy.SPIError: Function
>>> "madlib.__bernoulli_vector(integer,double precision,double
>>> precision,double precision,integer)": Memory allocation failed. Typically,
>>> this indicates that Greenplum Database limits the available memory to less
>>> than what is needed for this input.  (entry db greenplum.luxms:5432
>>> pid=12385) (plpython.c:4648)
>>> CONTEXT:  Traceback (most recent call last):
>>>   PL/Python function "svm_classification", line 26, in <module>
>>>     return svm.svm(**globals())
>>>   PL/Python function "svm_classification", line 983, in svm
>>>   PL/Python function "svm_classification", line 1103, in
>>> _transform_w_kernel
>>>   PL/Python function "svm_classification", line 277, in fit
>>> PL/Python function "svm_classification"
>>>
>>> for the following SQL:
>>>
>>> SELECT madlib.svm_classification ('stanford.train_input',
>>>                                   'stanford.model_table',
>>>                                   'label=4',
>>>                                   'ind',
>>>                                   'polynomial',   --linear | gaussian
>>>                                   'coef0=0',             -- kernel params
>>>                                   '',             -- grouping_col
>>>                                   'max_iter=1,validation_result=stanford.validation_result');
>>>  -- max_iter=200
>>>
>>>
>>> MADlib 1.9.1
>>> Greenplum DB version 4.3ORCA
>>> CentOS Linux release 7.2.1511
>>>
>>> Any hints on how we can tune memory settings to avoid such errors?
>>>
>>> Thanks.
>>>
>>
>>
>

Re: greenplum: Memory allocation failed. madlib.__bernoulli_vector

Posted by Frank McQuillan <fm...@pivotal.io>.
This does not look like a MADlib error.

There are a lot of Greenplum experts who respond to the questions on this
mailing list:
https://groups.google.com/a/greenplum.org/forum/#!forum/gpdb-users
so I would suggest you post your question there.

Frank

On Tue, May 16, 2017 at 6:44 AM, Luis Macedo <lm...@pivotal.io> wrote:

> Try diminishing the scope of your data.
>
> There was not enough memory to run the Python code, it seems...
>
>
> Luis Macedo | Sr Platform Architect | Pivotal Inc
>
> Mobile: +55 11 97616-6438
> Pivotal.io <http://pivotal.io>
> Take care of the customers and the rest takes care of itself
>
> 2017-05-16 6:58 GMT-03:00 Dmitry Dorofeev <di...@luxmsbi.com>:
>
>> Hello, we are getting
>>
>> psql:06_svm_train.sql:11: ERROR:  plpy.SPIError: plpy.SPIError: Function
>> "madlib.__bernoulli_vector(integer,double precision,double
>> precision,double precision,integer)": Memory allocation failed. Typically,
>> this indicates that Greenplum Database limits the available memory to less
>> than what is needed for this input.  (entry db greenplum.luxms:5432
>> pid=12385) (plpython.c:4648)
>> CONTEXT:  Traceback (most recent call last):
>>   PL/Python function "svm_classification", line 26, in <module>
>>     return svm.svm(**globals())
>>   PL/Python function "svm_classification", line 983, in svm
>>   PL/Python function "svm_classification", line 1103, in
>> _transform_w_kernel
>>   PL/Python function "svm_classification", line 277, in fit
>> PL/Python function "svm_classification"
>>
>> for the following SQL:
>>
>> SELECT madlib.svm_classification ('stanford.train_input',
>>                                   'stanford.model_table',
>>                                   'label=4',
>>                                   'ind',
>>                                   'polynomial',   --linear | gaussian
>>                                   'coef0=0',             -- kernel params
>>                                   '',             -- grouping_col
>>                                   'max_iter=1,validation_result=stanford.validation_result');
>>  -- max_iter=200
>>
>>
>> MADlib 1.9.1
>> Greenplum DB version 4.3ORCA
>> CentOS Linux release 7.2.1511
>>
>> Any hints on how we can tune memory settings to avoid such errors?
>>
>> Thanks.
>>
>
>

Re: greenplum: Memory allocation failed. madlib.__bernoulli_vector

Posted by Luis Macedo <lm...@pivotal.io>.
Try diminishing the scope of your data.

There was not enough memory to run the Python code, it seems...


Luis Macedo | Sr Platform Architect | Pivotal Inc

Mobile: +55 11 97616-6438
Pivotal.io <http://pivotal.io>
Take care of the customers and the rest takes care of itself

2017-05-16 6:58 GMT-03:00 Dmitry Dorofeev <di...@luxmsbi.com>:

> Hello, we are getting
>
> psql:06_svm_train.sql:11: ERROR:  plpy.SPIError: plpy.SPIError: Function
> "madlib.__bernoulli_vector(integer,double precision,double
> precision,double precision,integer)": Memory allocation failed. Typically,
> this indicates that Greenplum Database limits the available memory to less
> than what is needed for this input.  (entry db greenplum.luxms:5432
> pid=12385) (plpython.c:4648)
> CONTEXT:  Traceback (most recent call last):
>   PL/Python function "svm_classification", line 26, in <module>
>     return svm.svm(**globals())
>   PL/Python function "svm_classification", line 983, in svm
>   PL/Python function "svm_classification", line 1103, in
> _transform_w_kernel
>   PL/Python function "svm_classification", line 277, in fit
> PL/Python function "svm_classification"
>
> for the following SQL:
>
> SELECT madlib.svm_classification ('stanford.train_input',
>                                   'stanford.model_table',
>                                   'label=4',
>                                   'ind',
>                                   'polynomial',   --linear | gaussian
>                                   'coef0=0',             -- kernel params
>                                   '',             -- grouping_col
>                                   'max_iter=1,validation_result=stanford.validation_result');
>  -- max_iter=200
>
>
> MADlib 1.9.1
> Greenplum DB version 4.3ORCA
> CentOS Linux release 7.2.1511
>
> Any hints on how we can tune memory settings to avoid such errors?
>
> Thanks.
>