Posted to user@madlib.apache.org by Dmitry Dorofeev <di...@luxmsbi.com> on 2017/05/16 09:58:59 UTC
greenplum: Memory allocation failed. madlib.__bernoulli_vector
Hello, we are getting
psql:06_svm_train.sql:11: ERROR: plpy.SPIError: plpy.SPIError: Function "madlib.__bernoulli_vector(integer,double precision,double precision,double precision,integer)": Memory allocation failed. Typically, this indicates that Greenplum Database limits the available memory to less than what is needed for this input. (entry db greenplum.luxms:5432 pid=12385) (plpython.c:4648)
CONTEXT: Traceback (most recent call last):
PL/Python function "svm_classification", line 26, in <module>
return svm.svm(**globals())
PL/Python function "svm_classification", line 983, in svm
PL/Python function "svm_classification", line 1103, in _transform_w_kernel
PL/Python function "svm_classification", line 277, in fit
PL/Python function "svm_classification"
for the following SQL:
SELECT madlib.svm_classification ('stanford.train_input',
'stanford.model_table',
'label=4',
'ind',
'polynomial', --linear | gaussian
'coef0=0', -- kernel params
'', -- grouping_col
'max_iter=1,validation_result=stanford.validation_result'); -- max_iter=200
MADlib 1.9.1
Greenplum DB version 4.3ORCA
CentOS Linux release 7.2.1511
Any hints on how we can tune the memory settings to avoid such errors?
Thanks.
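[For reference, not from the thread itself: the Greenplum knobs usually involved in a "Memory allocation failed" error from a server-side function are the per-query statement memory and the per-segment vmem protect limit. A sketch only; the values below are placeholders and must be sized against segment-host RAM and concurrency, and statement_mem is capped by max_statement_mem:]

```sql
-- Sketch: raise the per-query memory budget for this session
-- before running the training call (value is a placeholder).
SET statement_mem = '1000MB';

-- Cluster-wide, the per-segment ceiling is gp_vmem_protect_limit (MB),
-- changed with gpconfig and a restart, e.g.:
--   gpconfig -c gp_vmem_protect_limit -v 8192
--   gpstop -ar
```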
Re: greenplum: Memory allocation failed. madlib.__bernoulli_vector
Posted by Dmitry Dorofeev <di...@luxmsbi.com>.
Thanks all!
Frank, here is our data size:
demo=# select count(*) from stanford.train_input;
count
---------
1593954
(1 row)
demo=# \d stanford.train_input
Table "stanford.train_input"
Column | Type | Modifiers
--------+--------------------+-----------
id | bigint |
ind | double precision[] |
label | integer |
Distributed randomly
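[As an aside, a rough back-of-envelope shows why a dense matrix of 1.6M double precision[] rows can exceed per-segment memory. The array width of the 'ind' column is not stated in the thread, so the feature count below is a hypothetical placeholder:]

```python
# Back-of-envelope size of the dense training matrix.
# N_FEATURES is hypothetical; the actual length of the 'ind'
# arrays is not given in the thread.
ROWS = 1_593_954           # from SELECT count(*) above
N_FEATURES = 1000          # hypothetical placeholder
BYTES_PER_DOUBLE = 8

def matrix_bytes(rows: int, cols: int) -> int:
    """Bytes needed for a dense rows x cols matrix of doubles."""
    return rows * cols * BYTES_PER_DOUBLE

gib = matrix_bytes(ROWS, N_FEATURES) / 2**30
print(f"~{gib:.1f} GiB before any kernel expansion")  # ~11.9 GiB
```

Non-linear kernels (polynomial, gaussian) are approximated via an expanded feature space, so the transformed data can be substantially larger still.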
----- Original Message -----
From: "Frank McQuillan" <fm...@pivotal.io>
To: user@madlib.incubator.apache.org
Sent: Tuesday, May 16, 2017 8:10:48 PM
Subject: Re: greenplum: Memory allocation failed. madlib.__bernoulli_vector
Re: greenplum: Memory allocation failed. madlib.__bernoulli_vector
Posted by Frank McQuillan <fm...@pivotal.io>.
How big is 'stanford.train_input'?
Did you try it with a small sample dataset, and if so, did that work OK?
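[One way to run that experiment, sketched here; the sample table name and the 1% fraction are made up for illustration:]

```sql
-- Sketch: train on roughly 1% of the rows to see whether the
-- error is data-volume related. Table name is hypothetical.
CREATE TABLE stanford.train_input_sample AS
    SELECT * FROM stanford.train_input
    WHERE random() < 0.01
DISTRIBUTED RANDOMLY;
```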
Re: greenplum: Memory allocation failed. madlib.__bernoulli_vector
Posted by Frank McQuillan <fm...@pivotal.io>.
This does not look like a MADlib error.
There are a lot of Greenplum experts who respond to the questions on this
mailing list:
https://groups.google.com/a/greenplum.org/forum/#!forum/gpdb-users
so I would suggest you post your question there.
Frank
Re: greenplum: Memory allocation failed. madlib.__bernoulli_vector
Posted by Luis Macedo <lm...@pivotal.io>.
Try reducing the scope of your data.
It seems there was not enough memory to run the Python code...
Luis Macedo | Sr Platform Architect | Pivotal Inc
Mobile: +55 11 97616-6438
Pivotal.io <http://pivotal.io>
Take care of the customers and the rest takes care of itself