Posted to issues@impala.apache.org by "Thomas Tauber-Marshall (JIRA)" <ji...@apache.org> on 2017/04/27 18:23:04 UTC
[jira] [Resolved] (IMPALA-397) ORDER BY rand() does not work.
[ https://issues.apache.org/jira/browse/IMPALA-397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Thomas Tauber-Marshall resolved IMPALA-397.
-------------------------------------------
Resolution: Fixed
Fix Version/s: Impala 2.9.0
commit 6cddb952cefedd373b2a1ce71a1b3cff2e774d70
Author: Thomas Tauber-Marshall <tm...@cloudera.com>
Date: Tue Jan 31 10:33:07 2017 -0800
IMPALA-4731/IMPALA-397/IMPALA-4728: Materialize sort exprs
Previously, exprs used in sorts were evaluated lazily. This can hurt
performance if the exprs are expensive to evaluate, and it can lead
to crashes if the exprs are non-deterministic, because
non-determinism violates assumptions of our sorting algorithm.
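As a rough illustration of the lazy-evaluation problem (a minimal Python sketch, not Impala code), the hypothetical helper `lazy_sort` below evaluates the ordering expr once to decide the order and again to produce the output column. With a non-deterministic expr the displayed values need not be in sorted order, which matches the symptom in the issue description. Python's `sorted` calls the key function only once per element, so this sketch shows the mismatch but cannot reproduce the crash.

```python
import random

def lazy_sort(rows, expr):
    """Sort rows by expr, then evaluate expr AGAIN for the output column.

    Hypothetical helper for illustration; it mimics an ordering expr that
    is not materialized, so the sort and the output see different values.
    """
    ordered = sorted(rows, key=expr)             # first evaluation: decides order
    return [(row, expr(row)) for row in ordered]  # second evaluation: displayed

_rng = random.Random(0)

def rand_expr(_row):
    """Non-deterministic expr: returns a fresh value on every call."""
    return _rng.random()

# The displayed r values come from the second evaluation, so they are
# unrelated to the order the rows were sorted in.
for name, r in lazy_sort(["Canada", "Australia", "Ireland"], rand_expr):
    print(name, r)
```

With a deterministic expr both evaluations agree, so the output column is sorted as expected.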
This patch addresses these issues by materializing ordering exprs.
It does so when the expr is non-deterministic (including when it
contains a UDF, since we cannot currently tell whether a UDF is
non-deterministic), or when its cost exceeds a threshold (or the
cost is unknown).
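The rule described above can be sketched in Python as follows. This is only an illustration under stated assumptions: `SortExpr`, `should_materialize`, `sort_rows`, and the `COST_THRESHOLD` value are hypothetical names and numbers chosen for the example, not Impala's planner API or its actual threshold.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical threshold; the value Impala uses is not stated in this message.
COST_THRESHOLD = 10.0

@dataclass
class SortExpr:
    """Hypothetical description of an ordering expr."""
    fn: Callable[[object], object]
    deterministic: bool = True
    contains_udf: bool = False
    cost: Optional[float] = None  # None means the cost is unknown

def should_materialize(expr: SortExpr) -> bool:
    # Non-deterministic exprs, and UDFs (whose determinism is unknown),
    # are always materialized.
    if not expr.deterministic or expr.contains_udf:
        return True
    # Otherwise materialize when the cost is unknown or above the threshold.
    return expr.cost is None or expr.cost > COST_THRESHOLD

def sort_rows(rows, expr: SortExpr):
    """Return (row, value) pairs ordered by the expr."""
    if should_materialize(expr):
        # Evaluate once per row; sort and output share the stored value.
        keyed = [(expr.fn(row), row) for row in rows]
        keyed.sort(key=lambda pair: pair[0])
        return [(row, value) for value, row in keyed]
    # Cheap deterministic expr: re-evaluating for the output is harmless.
    ordered = sorted(rows, key=expr.fn)
    return [(row, expr.fn(row)) for row in ordered]
```

Because the materialized value is computed once and carried with the row, the output column is always consistent with the sort order, even for an expr such as `rand()`.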
Testing:
- Added e2e tests in test_sort.py.
- Updated planner tests.
Change-Id: Ifefdaff8557a30ac44ea82ed428e6d1ffbca2e9e
Reviewed-on: http://gerrit.cloudera.org:8080/6322
Reviewed-by: Thomas Tauber-Marshall <tm...@cloudera.com>
Tested-by: Impala Public Jenkins
> ORDER BY rand() does not work.
> ------------------------------
>
> Key: IMPALA-397
> URL: https://issues.apache.org/jira/browse/IMPALA-397
> Project: IMPALA
> Issue Type: Bug
> Components: Frontend
> Affects Versions: Impala 1.0.1, Impala 2.3.0
> Reporter: Alexander Behm
> Assignee: Thomas Tauber-Marshall
> Priority: Minor
> Labels: correctness, downgraded, planner, usability
> Fix For: Impala 2.9.0
>
>
> The cause of the issue below is that r is not materialized.
> {code}
> select id, name, rand() r from countries order by r limit 10;
> Query: select id, name, rand() r from countries order by r limit 10
> Query finished, fetching results ...
> +----+----------------+-----------------------+
> | id | name           | r                     |
> +----+----------------+-----------------------+
> | 3  | Canada         | 0.0004714746030380365 |
> | 5  | Australia      | 0.5895895192351144    |
> | 1  | United States  | 0.4431900859080209    |
> | 4  | Ireland        | 0.0739258840093044    |
> | 6  | Netherlands    | 0.4621509646354946    |
> | 2  | United Kingdom | 0.6679162032287178    |
> | 9  | France         | 0.8352529978543767    |
> | 8  | Germany        | 0.1610932858479644    |
> | 7  | New Zealand    | 0.4815021690360746    |
> | 91 | Antigua        | 0.5511845208477156    |
> +----+----------------+-----------------------+
> Returned 10 row(s) in 0.48s
> {code}