Posted to issues@madlib.apache.org by "Daniel Daniel (Jira)" <ji...@apache.org> on 2020/10/27 10:05:00 UTC
[jira] [Created] (MADLIB-1460) Prevent an "integer out of range"
exception in linear regression train
Daniel Daniel created MADLIB-1460:
-------------------------------------
Summary: Prevent an "integer out of range" exception in linear regression train
Key: MADLIB-1460
URL: https://issues.apache.org/jira/browse/MADLIB-1460
Project: Apache MADlib
Issue Type: Bug
Components: Module: Linear Regression
Reporter: Daniel Daniel
Linear regression training produces two output tables (*neither is optional*):
* The *primary* output table, which contains the computed coefficients.
* A *summary* output table, which contains a single row.
+Scenario+
Running linear regression training in PostgreSQL on an input table that has *2^31 or more records* fails with an "*integer out of range*" exception, even if a grouping column is specified.
+Source+
*The summary table* has a column that stores *the total number of records* involved in the computation. The column's data type is a *signed integer* (32-bit). However, the total number of records is computed as a *BIGINT*. Therefore, when the total number of records in the input table exceeds the maximum value of a signed integer (i.e., 2^31 - 1), an "integer out of range" exception is thrown.
+Solution+
A simple solution is to change the data type of the column from a *signed integer* to a *BIGINT*.
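The overflow and the effect of widening the column can be illustrated with a minimal Python sketch. This is not MADlib code: `store_row_count` and `TYPE_RANGES` are hypothetical names introduced here, and the function only mimics the range check PostgreSQL applies when a value is written to an INTEGER or BIGINT column.

```python
# Value ranges of PostgreSQL's INTEGER (4-byte) and BIGINT (8-byte) types.
TYPE_RANGES = {
    "INTEGER": (-2**31, 2**31 - 1),
    "BIGINT": (-2**63, 2**63 - 1),
}


def store_row_count(row_count, column_type):
    """Hypothetical stand-in for writing the row count to the summary table.

    Raises OverflowError when the value does not fit the column type,
    mirroring the "integer out of range" error described in this report.
    """
    low, high = TYPE_RANGES[column_type]
    if not low <= row_count <= high:
        raise OverflowError("integer out of range")
    return row_count


total_rows = 2**31  # smallest row count that no longer fits in INTEGER

try:
    store_row_count(total_rows, "INTEGER")
    integer_ok = True
except OverflowError:
    integer_ok = False

bigint_ok = store_row_count(total_rows, "BIGINT") == total_rows
print(integer_ok, bigint_ok)  # prints: False True
```

Under these assumptions, the same row count that is rejected by the INTEGER range is accepted by the BIGINT range, which is the behavior the proposed change relies on.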
+Test+
We executed the linear regression training function with and without the suggested modification on an input table containing between 2^31 and 2^32 records. Without the modification, an "integer out of range" exception was thrown. With the modification, training completed without error.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)