You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@impala.apache.org by jo...@apache.org on 2018/05/10 16:25:34 UTC

[2/5] impala git commit: IMPALA-6781: expand ORDER BY in some TPCH queries

IMPALA-6781: expand ORDER BY in some TPCH queries

Fix determinism in TPCH Q3, Q10, Q18 by adding another column to the
queries' ORDER BY to guarantee deterministic results. With TPCH 10000
these queries were producing differing results across stress test runs.
They were all valid, but the LIMIT without the more specific ORDER BY
meant that several different result sets were possible. By adding these
columns, we sort by a column that has uniqueness across all rows
returned.

Testing:

Repeated runs of these specific TPCH queries via:

  impala-py.test -k Q18 tests/query_test/test_tpch_queries.py

Stress test on a 140-node cluster with TPCH 10000 loaded. Previously when
using these queries, the stress test would incorrectly report incorrect
results.

Change-Id: If74d127fb57546b1948a34aa6d2e68cdc6880fae
Reviewed-on: http://gerrit.cloudera.org:8080/10351
Reviewed-by: Michael Brown <mi...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>


Project: http://git-wip-us.apache.org/repos/asf/impala/repo
Commit: http://git-wip-us.apache.org/repos/asf/impala/commit/ca0b8eb1
Tree: http://git-wip-us.apache.org/repos/asf/impala/tree/ca0b8eb1
Diff: http://git-wip-us.apache.org/repos/asf/impala/diff/ca0b8eb1

Branch: refs/heads/2.x
Commit: ca0b8eb11cc34657ae894ed6285057cdda7bfe4f
Parents: 6ba41b6
Author: Michael Brown <mi...@cloudera.com>
Authored: Fri May 4 09:42:09 2018 -0700
Committer: Impala Public Jenkins <im...@gerrit.cloudera.org>
Committed: Wed May 9 23:10:16 2018 +0000

----------------------------------------------------------------------
 testdata/workloads/tpch/queries/tpch-q10.test | 6 ++++--
 testdata/workloads/tpch/queries/tpch-q18.test | 6 ++++--
 testdata/workloads/tpch/queries/tpch-q3.test  | 6 ++++--
 3 files changed, 12 insertions(+), 6 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/impala/blob/ca0b8eb1/testdata/workloads/tpch/queries/tpch-q10.test
----------------------------------------------------------------------
diff --git a/testdata/workloads/tpch/queries/tpch-q10.test b/testdata/workloads/tpch/queries/tpch-q10.test
index 45038d2..b877487 100644
--- a/testdata/workloads/tpch/queries/tpch-q10.test
+++ b/testdata/workloads/tpch/queries/tpch-q10.test
@@ -2,6 +2,7 @@
 ---- QUERY: TPCH-Q10
 # Q10 - Returned Item Reporting Query
 # Converted select from multiple tables to joins
+# IMPALA-6781: add c_custkey to ORDER BY list for determinstic results
 select
   c_custkey,
   c_name,
@@ -32,7 +33,8 @@ group by
   c_address,
   c_comment
 order by
-  revenue desc
+  revenue desc,
+  c_custkey
 limit 20
 ---- RESULTS
 57040,'Customer#000057040',734235.2455,632.87,'JAPAN','Eioyzjf4pp','22-895-641-3466','sits. slyly regular requests sleep alongside of the regular inst'
@@ -57,4 +59,4 @@ limit 20
 23431,'Customer#000023431',554269.5360,3381.86,'ROMANIA','HgiV0phqhaIa9aydNoIlb','29-915-458-2654','nusual, even instructions: furiously stealthy n'
 ---- TYPES
 bigint, string, decimal, decimal, string, string, string, string
-====
\ No newline at end of file
+====

http://git-wip-us.apache.org/repos/asf/impala/blob/ca0b8eb1/testdata/workloads/tpch/queries/tpch-q18.test
----------------------------------------------------------------------
diff --git a/testdata/workloads/tpch/queries/tpch-q18.test b/testdata/workloads/tpch/queries/tpch-q18.test
index db39222..bdf74da 100644
--- a/testdata/workloads/tpch/queries/tpch-q18.test
+++ b/testdata/workloads/tpch/queries/tpch-q18.test
@@ -1,6 +1,7 @@
 ====
 ---- QUERY: TPCH-Q18
 # Q18 - Large Value Customer Query
+# IMPALA-6781: add o_orderkey to ORDER BY list for determinstic results
 select
   c_name,
   c_custkey,
@@ -33,7 +34,8 @@ group by
   o_totalprice
 order by
   o_totalprice desc,
-  o_orderdate
+  o_orderdate,
+  o_orderkey
 limit 100
 ---- RESULTS
 'Customer#000128120',128120,4722021,'1994-04-07',544089.09,323.00
@@ -95,4 +97,4 @@ limit 100
 'Customer#000088703',88703,2995076,'1994-01-30',363812.12,302.00
 ---- TYPES
 STRING, BIGINT, BIGINT, STRING, DECIMAL, DECIMAL
-====
\ No newline at end of file
+====

http://git-wip-us.apache.org/repos/asf/impala/blob/ca0b8eb1/testdata/workloads/tpch/queries/tpch-q3.test
----------------------------------------------------------------------
diff --git a/testdata/workloads/tpch/queries/tpch-q3.test b/testdata/workloads/tpch/queries/tpch-q3.test
index 3decb17..9b57607 100644
--- a/testdata/workloads/tpch/queries/tpch-q3.test
+++ b/testdata/workloads/tpch/queries/tpch-q3.test
@@ -1,6 +1,7 @@
 ====
 ---- QUERY: TPCH-Q3
 # Q3 - Shipping Priority Query
+# IMPALA-6781: add order by l_orderkey for determinstic results
 select
   l_orderkey,
   sum(l_extendedprice * (1 - l_discount)) as revenue,
@@ -22,7 +23,8 @@ group by
   o_shippriority
 order by
   revenue desc,
-  o_orderdate
+  o_orderdate,
+  l_orderkey
 limit 10
 ---- RESULTS
 2456423,406181.0111,'1995-03-05',0
@@ -37,4 +39,4 @@ limit 10
 2300070,367371.1452,'1995-03-13',0
 ---- TYPES
 BIGINT, DECIMAL, STRING, INT
-====
\ No newline at end of file
+====