You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@impala.apache.org by ta...@apache.org on 2020/02/18 01:40:28 UTC
[impala] branch master updated (d75562a -> 796f5cc)
This is an automated email from the ASF dual-hosted git repository.
tarmstrong pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/impala.git.
from d75562a IMPALA-9256: Refactor constraint information into a separate class.
new ba903de Add log of created files for data load
new 796f5cc IMPALA-8909: Fix incorrectly omitting implicit cast in pre-insert sort
The 2 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails. The revisions
listed as "add" were already present in the repository and have only
been added to this reference.
Summary of changes:
fe/src/main/java/org/apache/impala/planner/SortNode.java | 8 +++++---
testdata/bin/create-load-data.sh | 3 +++
.../functional-query/queries/QueryTest/kudu_insert.test | 13 +++++++++++++
3 files changed, 21 insertions(+), 3 deletions(-)
[impala] 02/02: IMPALA-8909: Fix incorrectly omitting implicit cast
in pre-insert sort
Posted by ta...@apache.org.
This is an automated email from the ASF dual-hosted git repository.
tarmstrong pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/impala.git
commit 796f5cc43d545a645f9eed3d13934375f31a3ce8
Author: Csaba Ringhofer <cs...@cloudera.com>
AuthorDate: Thu Feb 13 18:59:07 2020 +0100
IMPALA-8909: Fix incorrectly omitting implicit cast in pre-insert sort
Inserts can add a sort node that orders the rows by partitioning
and Kudu primary key columns (aka. clustered insert). The issue
occurred when the target column was a timestamp and the source
was an expression that returned a string (e.g. concat()). Impala
adds an implicit cast to convert the strings to timestamps before
sorting, but this cast was incorrectly removed later during expression
substitution.
This led to hitting a DCHECK in debug builds and a (not too
informative) error message in release mode.
Note that the cast in question is not visible in EXPLAIN outputs.
Explain should contain implicit casts from explain_level=2 since
https://gerrit.cloudera.org/#/c/11719/ , but it is still not shown
in some expressions. I consider this to be a separate issue.
Testing:
- added an EE test that used to crash
- ran planner / sort / kudu_insert tests
Change-Id: Icca8ab1456a3b840a47833119c9d4fd31a1fff90
Reviewed-on: http://gerrit.cloudera.org:8080/15217
Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
fe/src/main/java/org/apache/impala/planner/SortNode.java | 8 +++++---
.../functional-query/queries/QueryTest/kudu_insert.test | 13 +++++++++++++
2 files changed, 18 insertions(+), 3 deletions(-)
diff --git a/fe/src/main/java/org/apache/impala/planner/SortNode.java b/fe/src/main/java/org/apache/impala/planner/SortNode.java
index c05969c..ba894c3 100644
--- a/fe/src/main/java/org/apache/impala/planner/SortNode.java
+++ b/fe/src/main/java/org/apache/impala/planner/SortNode.java
@@ -143,13 +143,15 @@ public class SortNode extends PlanNode {
outputSmap_.put(slotExprs.get(i), new SlotRef(sortTupleSlots.get(i)));
}
ExprSubstitutionMap childSmap = getCombinedChildSmap();
+ // Preserve type as resolvedTupleExprs_ will be used to materialize the tuple and the
+ // layout is already calculated.
resolvedTupleExprs_ =
- Expr.substituteList(resolvedTupleExprs_, childSmap, analyzer, false);
+ Expr.substituteList(resolvedTupleExprs_, childSmap, analyzer, true);
// Remap the ordering exprs to the tuple materialized by this sort node. The mapping
// is a composition of the childSmap and the outputSmap_ because the child node may
- // have also remapped its input (e.g., as in a a series of (sort->analytic)* nodes).
- // Parent nodes have have to do the same so set the composition as the outputSmap_.
+ // have also remapped its input (e.g., as in a series of (sort->analytic)* nodes).
+ // Parent nodes have to do the same so set the composition as the outputSmap_.
outputSmap_ = ExprSubstitutionMap.compose(childSmap, outputSmap_, analyzer);
info_.substituteSortExprs(outputSmap_, analyzer);
diff --git a/testdata/workloads/functional-query/queries/QueryTest/kudu_insert.test b/testdata/workloads/functional-query/queries/QueryTest/kudu_insert.test
index 981a734..d2b734e 100644
--- a/testdata/workloads/functional-query/queries/QueryTest/kudu_insert.test
+++ b/testdata/workloads/functional-query/queries/QueryTest/kudu_insert.test
@@ -320,6 +320,19 @@ NumRowErrors: 0
row_regex: .*EXEC_SINGLE_NODE_ROWS_THRESHOLD=0.*
====
---- QUERY
+create table timestampkey (t timestamp primary key)
+ partition by hash(t) PARTITIONS 2 stored as kudu;
+====
+---- QUERY
+# Regression test for IMPALA-8909 (used to hit a DCHECK in debug builds).
+insert into timestampkey select concat(string_col, " ") from functional.alltypessmall;
+---- RESULTS
+: 0
+---- RUNTIME_PROFILE
+NumModifiedRows: 0
+NumRowErrors: 100
+====
+---- QUERY
# Table with default values
create table tbl_with_defaults (a int primary key, b int null default 10,
c int not null default 100, d int default 1000, e int null, f int not null,
[impala] 01/02: Add log of created files for data load
Posted by ta...@apache.org.
This is an automated email from the ASF dual-hosted git repository.
tarmstrong pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/impala.git
commit ba903deba7e529df6d3136589086105a99c878c1
Author: norbert.luksa <no...@cloudera.com>
AuthorDate: Mon Feb 17 15:36:10 2020 +0100
Add log of created files for data load
As Joe pointed out in IMPALA-9351, it would help debugging issues with
missing files if we had logged the created files when loading the data.
With this commit, running create-load-data.sh now logs the created
files into created-files.log.
Change-Id: I4f413810c6202a07c19ad1893088feedd9f7278f
Reviewed-on: http://gerrit.cloudera.org:8080/15234
Reviewed-by: Zoltan Borok-Nagy <bo...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
testdata/bin/create-load-data.sh | 3 +++
1 file changed, 3 insertions(+)
diff --git a/testdata/bin/create-load-data.sh b/testdata/bin/create-load-data.sh
index 65d18e3..48ba7f5 100755
--- a/testdata/bin/create-load-data.sh
+++ b/testdata/bin/create-load-data.sh
@@ -719,6 +719,9 @@ if [ "${TARGET_FILESYSTEM}" = "hdfs" ]; then
create-internal-hbase-table
run-step "Checking HDFS health" check-hdfs-health.log check-hdfs-health
+
+ # Saving the list of created files can help in debugging missing files.
+ run-step "Logging created files" created-files.log hdfs dfs -ls -R /test-warehouse
fi
# TODO: Investigate why all stats are not preserved. Theoretically, we only need to