Posted to issues@impala.apache.org by "Thomas Tauber-Marshall (JIRA)" <ji...@apache.org> on 2017/12/07 23:53:00 UTC
[jira] [Created] (IMPALA-6295) Inconsistent handling of 'nan' and 'inf' with min/max analytic fns
Thomas Tauber-Marshall created IMPALA-6295:
----------------------------------------------
Summary: Inconsistent handling of 'nan' and 'inf' with min/max analytic fns
Key: IMPALA-6295
URL: https://issues.apache.org/jira/browse/IMPALA-6295
Project: IMPALA
Issue Type: Bug
Components: Backend
Affects Versions: Impala 2.11.0
Reporter: Thomas Tauber-Marshall
Priority: Critical
Incorrect results are returned in some cases where 'nan'/'inf' are the only values in the group and codegen is enabled:
{noformat}
> set DISABLE_CODEGEN_ROWS_THRESHOLD set to 0
> select * from test1 order by col1
+------+-----------+
| col0 | col1      |
+------+-----------+
| 0    | NaN       |
| 2    | -Infinity |
| 3    | 0         |
| 1    | Infinity  |
+------+-----------+
> set DISABLE_CODEGEN set to true
> select col0, min(col1) from test1 group by col0 order by col0
+------+-----------+
| col0 | min(col1) |
+------+-----------+
| 0    | NaN       |
| 1    | Infinity  |
| 2    | -Infinity |
| 3    | 0         |
+------+-----------+
> set DISABLE_CODEGEN set to false
> select col0, min(col1) from test1 group by col0 order by col0
+------+------------------------+
| col0 | min(col1)              |
+------+------------------------+
| 0    | 1.797693134862316e+308 |
| 1    | 1.797693134862316e+308 |
| 2    | -Infinity              |
| 3    | 0                      |
+------+------------------------+
> set DISABLE_CODEGEN set to true
> select col0, max(col1) from test1 group by col0 order by col0
+------+-----------+
| col0 | max(col1) |
+------+-----------+
| 0    | NaN       |
| 1    | Infinity  |
| 2    | -Infinity |
| 3    | 0         |
+------+-----------+
> set DISABLE_CODEGEN set to false
> select col0, max(col1) from test1 group by col0 order by col0
+------+-------------------------+
| col0 | max(col1)               |
+------+-------------------------+
| 0    | -1.797693134862316e+308 |
| 1    | Infinity                |
| 2    | -1.797693134862316e+308 |
| 3    | 0                       |
+------+-------------------------+
{noformat}
We also appear to never return 'nan' as a min or max value despite sorting it as the lowest value when ordering a table (perhaps this is the intended behavior?):
{noformat}
> set DISABLE_CODEGEN_ROWS_THRESHOLD set to 0
> select * from test2 order by col1
+------+-----------+
| col0 | col1      |
+------+-----------+
| 0    | NaN       |
| 2    | -Infinity |
| 0    | 0         |
| 3    | 0         |
| 1    | 1         |
| 2    | 2         |
| 3    | 3         |
| 1    | Infinity  |
+------+-----------+
> set DISABLE_CODEGEN set to true
> select col0, min(col1) from test2 group by col0 order by col0
+------+-----------+
| col0 | min(col1) |
+------+-----------+
| 0    | 0         |
| 1    | 1         |
| 2    | -Infinity |
| 3    | 0         |
+------+-----------+
> set DISABLE_CODEGEN set to false
> select col0, min(col1) from test2 group by col0 order by col0
+------+-----------+
| col0 | min(col1) |
+------+-----------+
| 0    | 0         |
| 1    | 1         |
| 2    | -Infinity |
| 3    | 0         |
+------+-----------+
> set DISABLE_CODEGEN set to true
> select col0, max(col1) from test2 group by col0 order by col0
+------+-----------+
| col0 | max(col1) |
+------+-----------+
| 0    | 0         |
| 1    | Infinity  |
| 2    | 2         |
| 3    | 3         |
+------+-----------+
> set DISABLE_CODEGEN set to false
> select col0, max(col1) from test2 group by col0 order by col0
+------+-----------+
| col0 | max(col1) |
+------+-----------+
| 0    | 0         |
| 1    | Infinity  |
| 2    | 2         |
| 3    | 3         |
+------+-----------+
{noformat}
Changing LlvmCodeGen::CodegenMinMax to use the ordered OLT/OGT float comparison predicates appears to fix the first case (at least for 'nan'), but causes us to return 'nan' as a max value in the second case.
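
To illustrate why the choice of comparison matters, here is a minimal Python sketch (not Impala code) that models LLVM's ordered and unordered fcmp predicates and a select-based min/max step. The helper names (fcmp_olt, fcmp_ult, fcmp_ogt, fcmp_ugt, select, fold) and the operand order used inside fold are assumptions made for illustration only; the actual CodegenMinMax implementation may differ.

```python
import math

NAN = float("nan")


def fcmp_olt(a, b):
    """Ordered less-than: false if either operand is NaN."""
    return not (math.isnan(a) or math.isnan(b)) and a < b


def fcmp_ult(a, b):
    """Unordered less-than: true if either operand is NaN."""
    return math.isnan(a) or math.isnan(b) or a < b


def fcmp_ogt(a, b):
    """Ordered greater-than: false if either operand is NaN."""
    return not (math.isnan(a) or math.isnan(b)) and a > b


def fcmp_ugt(a, b):
    """Unordered greater-than: true if either operand is NaN."""
    return math.isnan(a) or math.isnan(b) or a > b


def select(pred, a, b):
    """Model of 'select (fcmp pred a, b), a, b'."""
    return a if pred(a, b) else b


def fold(pred, values):
    """Hypothetical aggregation loop: acc = select(pred, v, acc)."""
    acc = values[0]
    for v in values[1:]:
        acc = select(pred, v, acc)
    return acc


# With a NaN operand, an ordered predicate always yields the second
# operand and an unordered predicate always yields the first.
print(select(fcmp_olt, NAN, 0.0), select(fcmp_olt, 0.0, NAN))
print(select(fcmp_ult, NAN, 0.0), select(fcmp_ult, 0.0, NAN))

# In this model, max with OGT depends on row order when a NaN is present.
print(fold(fcmp_ogt, [NAN, 0.0]), fold(fcmp_ogt, [0.0, NAN]))
```

In this model, both predicate families give a result that depends on operand and row order once a NaN is involved, which suggests that a consistent answer may need an explicit NaN check rather than only a change of predicate.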