Posted to commits@impala.apache.org by ta...@apache.org on 2019/04/24 23:47:16 UTC
[impala] branch master updated: IMPALA-8386: Fix incorrect equivalence conjuncts not treated as identity

This is an automated email from the ASF dual-hosted git repository.

tarmstrong pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/impala.git


The following commit(s) were added to refs/heads/master by this push:
     new f25a899  IMPALA-8386: Fix incorrect equivalence conjuncts not treated as identity
f25a899 is described below

commit f25a899924856f705ecb581b237a149003279473
Author: stiga-huang <hu...@gmail.com>
AuthorDate: Fri Apr 5 09:20:44 2019 -0700

    IMPALA-8386: Fix incorrect equivalence conjuncts not treated as identity
    
    When generating single-node plans for inline views, Impala creates
    equivalence conjuncts based on slot equivalences. However, after
    substitution these conjuncts may end up as identities (e.g. a = a), which
    may incorrectly reject rows with nulls. We already have logic to remove
    such conjuncts, but the existing checks miss some cases.
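
    To illustrate why an identity conjunct is not harmless, here is a minimal
    Java sketch of SQL three-valued equality. It is not Impala code: the class
    and method names are hypothetical, and a Java null stands in for SQL NULL.
    A conjunct keeps a row only when it evaluates to TRUE, so "amount = amount"
    drops every row whose amount is NULL.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration; not part of the Impala code base.
final class IdentityPredicateNullDemo {
  // SQL "x = y" under three-valued logic: a null result means UNKNOWN.
  static Boolean sqlEquals(Integer x, Integer y) {
    if (x == null || y == null) return null;
    return x.equals(y);
  }

  // Applies the identity conjunct "amount = amount" as a filter.
  // A row survives only if the conjunct evaluates to TRUE.
  static List<Integer> filterByIdentity(List<Integer> amounts) {
    List<Integer> kept = new ArrayList<>();
    for (Integer amount : amounts) {
      if (Boolean.TRUE.equals(sqlEquals(amount, amount))) kept.add(amount);
    }
    return kept;
  }

  public static void main(String[] args) {
    // The amount column of table B in the example: 10, 20, NULL.
    List<Integer> amounts = Arrays.asList(10, 20, null);
    System.out.println(filterByIdentity(amounts)); // the NULL row is dropped
  }
}
```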
    
    For example, consider the following tables and a query:
    table A        table B            table C
    +------+  +------+--------+  +------+------+
    | a_id |  | b_id | amount |  | a_id | b_id |
    +------+  +------+--------+  +------+------+
    | 1    |  | 1    | 10     |  | 1    | 1    |
    | 2    |  | 1    | 20     |  | 2    | 2    |
    +------+  | 2    | NULL   |  +------+------+
              +------+--------+
        select * from (select t2.a_id, t2.amount1, t2.amount2
            from a
            left outer join (
                select c.a_id, amount as amount1, amount as amount2
                from b join c on b.b_id = c.b_id
            ) t2
            on a.a_id = t2.a_id
        ) t1;
    
    The query has 11 slots. The valueTransferGraph (slot equivalences) has
    3 strongly connected components:
     * {slot0 (b.b_id), slot1 (c.b_id)}
     * {slot2 (c.a_id), slot4 (t2.a_id), slot8 (t1.a_id)}
     * {slot3 (b.amount), slot5 (t2.amount1), slot6 (t2.amount2),
    slot9 (t1.amount1), slot10 (t1.amount2)}
    
    In SingleNodePlanner#migrateConjunctsToInlineView, when dealing with
    inline view t1, a predicate "t1.amount1 = t1.amount2" is first created
    by Analyzer#createEquivConjuncts, then substituted using the smap_ of
    the inline view to become "t2.amount1 = t2.amount2". Its two operands
    still differ, so the IdentityPredicate check does not remove it.
    However, the substituted predicate is finally resolved to
    "amount = amount" and assigned to the left outer join node, so nulls
    are incorrectly rejected.
    
    The fix is to check IdentityPredicates on their final resolved form,
    using base table slots (baseTblSmap_). The predicate
    "t1.amount1 = t1.amount2" is then resolved to "amount = amount",
    recognized as an identity, and removed.
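
    The difference between the two checks can be sketched as follows. This is
    a simplified model under stated assumptions, not the actual planner code:
    plain strings stand in for Expr objects, java.util.Map stands in for
    ExprSubstitutionMap, and the class and method names are hypothetical. The
    map contents follow the example query above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration; plain strings stand in for Impala's Expr objects.
final class IdentityCheckSketch {
  // Models smap_ of inline view t1: maps t1's slots one level down, to t2's slots.
  static Map<String, String> viewSmap() {
    Map<String, String> m = new HashMap<>();
    m.put("t1.amount1", "t2.amount1");
    m.put("t1.amount2", "t2.amount2");
    return m;
  }

  // Models baseTblSmap_ of inline view t1: maps t1's slots to base table slots.
  static Map<String, String> baseTblSmap() {
    Map<String, String> m = new HashMap<>();
    m.put("t1.amount1", "b.amount");
    m.put("t1.amount2", "b.amount");
    return m;
  }

  // Substitutes both operands of "lhs = rhs" and reports whether they become identical.
  static boolean isIdentityAfter(Map<String, String> smap, String lhs, String rhs) {
    String l = smap.getOrDefault(lhs, lhs);
    String r = smap.getOrDefault(rhs, rhs);
    return l.equals(r);
  }

  public static void main(String[] args) {
    // Old check, one substitution level: the operands still differ.
    System.out.println(isIdentityAfter(viewSmap(), "t1.amount1", "t1.amount2"));
    // New check, fully resolved: both operands are b.amount, so it is an identity.
    System.out.println(isIdentityAfter(baseTblSmap(), "t1.amount1", "t1.amount2"));
  }
}
```

    In this model the first call reports false and the second reports true,
    which mirrors why the predicate escaped the old check and is caught by the
    new one.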
    
    Tests:
     * Add plan tests in PlannerTest/inline-view.test
     * Run all tests locally in CORE exploration strategy
    
    Change-Id: Ia87aa9db2de85f0716e4854a88727aad593773fa
    Reviewed-on: http://gerrit.cloudera.org:8080/12939
    Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
    Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
 .../java/org/apache/impala/analysis/Analyzer.java  |  17 +-
 .../apache/impala/analysis/BinaryPredicate.java    |   4 +
 .../impala/analysis/ExprSubstitutionMap.java       |  37 +++
 .../org/apache/impala/analysis/InlineViewRef.java  |   1 +
 .../apache/impala/planner/SingleNodePlanner.java   |  65 +++--
 .../queries/PlannerTest/inline-view.test           | 306 +++++++++++++++++++++
 6 files changed, 410 insertions(+), 20 deletions(-)

diff --git a/fe/src/main/java/org/apache/impala/analysis/Analyzer.java b/fe/src/main/java/org/apache/impala/analysis/Analyzer.java
index 5fd2b73..eded0d3 100644
--- a/fe/src/main/java/org/apache/impala/analysis/Analyzer.java
+++ b/fe/src/main/java/org/apache/impala/analysis/Analyzer.java
@@ -1813,7 +1813,11 @@ public class Analyzer {
         for (int j = 0; j < i; ++j) {
           SlotId lhs = slotIds.get(j);
           if (!partialEquivSlots.union(lhs, rhs)) continue;
-          conjuncts.add((T) createInferredEqPred(lhs, rhs));
+          T pred = (T) createInferredEqPred(lhs, rhs);
+          conjuncts.add(pred);
+          if (LOG.isTraceEnabled()) {
+            LOG.trace("Created inferred predicate: " + pred.debugString());
+          }
           // Check for early termination.
           if (partialEquivSlots.get(lhs).size() == slotIds.size()) {
             done = true;
@@ -2027,15 +2031,24 @@ public class Analyzer {
         Analyzer firstBlock = globalState_.blockBySlot.get(slotIds.first);
         Analyzer secondBlock = globalState_.blockBySlot.get(slotIds.second);
         if (LOG.isTraceEnabled()) {
-          LOG.trace("value transfer: from " + slotIds.first.toString());
+          LOG.trace("Considering value transfer between " + slotIds.first.toString() +
+              " and " + slotIds.second.toString());
         }
         if (!(secondBlock.hasLimitOffsetClause_ &&
             secondBlock.ancestors_.contains(firstBlock))) {
           g.addEdge(slotIds.first.asInt(), slotIds.second.asInt());
+          if (LOG.isTraceEnabled()) {
+            LOG.trace("value transfer: from " + slotIds.first.toString() + " to " +
+                slotIds.second.toString());
+          }
         }
         if (!(firstBlock.hasLimitOffsetClause_ &&
             firstBlock.ancestors_.contains(secondBlock))) {
           g.addEdge(slotIds.second.asInt(), slotIds.first.asInt());
+          if (LOG.isTraceEnabled()) {
+            LOG.trace("value transfer: from " + slotIds.second.toString() + " to " +
+                    slotIds.first.toString());
+          }
         }
         continue;
       }
diff --git a/fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java b/fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
index 0650c2b..2bb6625 100644
--- a/fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
+++ b/fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
@@ -135,6 +135,10 @@ public class BinaryPredicate extends Predicate {
   public boolean isInferred() { return isInferred_; }
   public void setIsInferred() { isInferred_ = true; }
 
+  public boolean hasIdenticalOperands() {
+    return getChild(0) != null && getChild(0).equals(getChild(1));
+  }
+
   @Override
   public String toSqlImpl(ToSqlOptions options) {
     return getChild(0).toSql(options) + " " + op_.toString() + " "
diff --git a/fe/src/main/java/org/apache/impala/analysis/ExprSubstitutionMap.java b/fe/src/main/java/org/apache/impala/analysis/ExprSubstitutionMap.java
index 5e9f875..baf7b61 100644
--- a/fe/src/main/java/org/apache/impala/analysis/ExprSubstitutionMap.java
+++ b/fe/src/main/java/org/apache/impala/analysis/ExprSubstitutionMap.java
@@ -176,4 +176,41 @@ public final class ExprSubstitutionMap {
   public ExprSubstitutionMap clone() {
     return new ExprSubstitutionMap(Expr.cloneList(lhs_), Expr.cloneList(rhs_));
   }
+
+  /**
+   * Returns whether we are composed from the other map.
+   */
+  public boolean checkComposedFrom(ExprSubstitutionMap other) {
+    // If g is composed from f, then g(x) = h(f(x)), so "f(a) == f(b)" => "g(a) == g(b)".
+    for (int i = 0; i < other.lhs_.size() - 1; ++i) {
+      for (int j = i + 1; j < other.lhs_.size(); ++j) {
+        Expr a = other.lhs_.get(i);
+        Expr b = other.lhs_.get(j);
+        Expr finalA = get(a);
+        Expr finalB = get(b);
+        if (finalA == null || finalB == null) {
+          if (LOG.isTraceEnabled()) {
+            if (finalA == null) {
+              LOG.trace("current smap misses item for " + a.debugString());
+            }
+            if (finalB == null) {
+              LOG.trace("current smap misses item for " + b.debugString());
+            }
+          }
+          return false;
+        }
+        if (other.rhs_.get(i).equals(other.rhs_.get(j)) && !finalA.equals(finalB)) {
+          // f(a) == f(b) but g(a) != g(b)
+          if (LOG.isTraceEnabled()) {
+            LOG.trace(String.format("smap conflicts in substituting %s and %s. Result" +
+                " of the base map: %s. Results of current map: %s and %s",
+                a.debugString(), b.debugString(), other.rhs_.get(i).debugString(),
+                finalA.debugString(), finalB.debugString()));
+          }
+          return false;
+        }
+      }
+    }
+    return true;
+  }
 }
diff --git a/fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java b/fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java
index 5c9503f..50c70de 100644
--- a/fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java
+++ b/fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java
@@ -215,6 +215,7 @@ public class InlineViewRef extends TableRef {
       LOG.trace("inline view " + getUniqueAlias() + " baseTblSmap: " +
           baseTblSmap_.debugString());
     }
+    Preconditions.checkState(baseTblSmap_.checkComposedFrom(smap_));
 
     analyzeTableSample(analyzer);
     analyzeHints(analyzer);
diff --git a/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java b/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
index 9d2d382..6de0116 100644
--- a/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
+++ b/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
@@ -1146,10 +1146,13 @@ public class SingleNodePlanner {
    * makes the *output* of the computation visible to the enclosing scope, so that
    * filters from the enclosing scope can be safely applied (to the grouping cols, say).
    */
-  public void migrateConjunctsToInlineView(Analyzer analyzer,
-      InlineViewRef inlineViewRef) throws ImpalaException {
+  public void migrateConjunctsToInlineView(final Analyzer analyzer,
+      final InlineViewRef inlineViewRef) throws ImpalaException {
     List<Expr> unassignedConjuncts =
         analyzer.getUnassignedConjuncts(inlineViewRef.getId().asList(), true);
+    if (LOG.isTraceEnabled()) {
+      LOG.trace("unassignedConjuncts: " + Expr.debugString(unassignedConjuncts));
+    }
     if (!canMigrateConjuncts(inlineViewRef)) {
       // mark (fully resolve) slots referenced by unassigned conjuncts as
       // materialized
@@ -1163,35 +1166,61 @@ public class SingleNodePlanner {
     for (Expr e: unassignedConjuncts) {
       if (analyzer.canEvalPredicate(inlineViewRef.getId().asList(), e)) {
         preds.add(e);
+        if (LOG.isTraceEnabled()) {
+          LOG.trace(String.format("Can evaluate %s in inline view %s", e.debugString(),
+                  inlineViewRef.getExplicitAlias()));
+        }
       }
     }
     unassignedConjuncts.removeAll(preds);
+    // Migrate the conjuncts by marking the original ones as assigned. They will either
+    // be ignored if they are identity predicates (e.g. a = a), or be substituted into
+    // new ones (viewPredicates below). The substituted ones will be re-registered.
+    analyzer.markConjunctsAssigned(preds);
     // Generate predicates to enforce equivalences among slots of the inline view
     // tuple. These predicates are also migrated into the inline view.
     analyzer.createEquivConjuncts(inlineViewRef.getId(), preds);
 
+    // Remove unregistered predicates that finally resolve to predicates referencing
+    // the same slot on both sides (e.g. a = a). Such predicates have been generated from
+    // slot equivalences and may incorrectly reject rows with nulls
+    // (IMPALA-1412/IMPALA-2643/IMPALA-8386).
+    Predicate<Expr> isIdentityPredicate = new Predicate<Expr>() {
+      @Override
+      public boolean apply(Expr e) {
+        if (!org.apache.impala.analysis.Predicate.isEquivalencePredicate(e)
+            || !((BinaryPredicate) e).isInferred()) {
+          return false;
+        }
+        try {
+          BinaryPredicate finalExpr = (BinaryPredicate) e.trySubstitute(
+              inlineViewRef.getBaseTblSmap(), analyzer, false);
+          boolean isIdentity = finalExpr.hasIdenticalOperands();
+
+          // Verify that "smap[e1] == smap[e2]" => "baseTblSmap[e1] == baseTblSmap[e2]"
+          // in case we have bugs in generating baseTblSmap.
+          BinaryPredicate midExpr = (BinaryPredicate) e.trySubstitute(
+              inlineViewRef.getSmap(), analyzer, false);
+          Preconditions.checkState(!midExpr.hasIdenticalOperands() || isIdentity);
+
+          if (LOG.isTraceEnabled() && isIdentity) {
+            LOG.trace("Removed identity predicate: " + finalExpr.debugString());
+          }
+          return isIdentity;
+        } catch (Exception ex) {
+          throw new IllegalStateException(
+                  "Failed analysis after expr substitution.", ex);
+        }
+      }
+    };
+    Iterables.removeIf(preds, isIdentityPredicate);
+
     // create new predicates against the inline view's unresolved result exprs, not
     // the resolved result exprs, in order to avoid skipping scopes (and ignoring
     // limit clauses on the way)
     List<Expr> viewPredicates =
         Expr.substituteList(preds, inlineViewRef.getSmap(), analyzer, false);
 
-    // Remove unregistered predicates that reference the same slot on
-    // both sides (e.g. a = a). Such predicates have been generated from slot
-    // equivalences and may incorrectly reject rows with nulls (IMPALA-1412/IMPALA-2643).
-    Predicate<Expr> isIdentityPredicate = new Predicate<Expr>() {
-      @Override
-      public boolean apply(Expr expr) {
-        return org.apache.impala.analysis.Predicate.isEquivalencePredicate(expr)
-            && ((BinaryPredicate) expr).isInferred()
-            && expr.getChild(0).equals(expr.getChild(1));
-      }
-    };
-    Iterables.removeIf(viewPredicates, isIdentityPredicate);
-
-    // Migrate the conjuncts by marking the original ones as assigned, and
-    // re-registering the substituted ones with new ids.
-    analyzer.markConjunctsAssigned(preds);
     // Unset the On-clause flag of the migrated conjuncts because the migrated conjuncts
     // apply to the post-join/agg/analytic result of the inline view.
     for (Expr e: viewPredicates) e.setIsOnClauseConjunct(false);
diff --git a/testdata/workloads/functional-planner/queries/PlannerTest/inline-view.test b/testdata/workloads/functional-planner/queries/PlannerTest/inline-view.test
index a181af9..e2b7d98 100644
--- a/testdata/workloads/functional-planner/queries/PlannerTest/inline-view.test
+++ b/testdata/workloads/functional-planner/queries/PlannerTest/inline-view.test
@@ -1534,3 +1534,309 @@ PLAN-ROOT SINK
    partitions=4/4 files=4 size=460B
    row-size=8B cardinality=8
 ====
+# IMPALA-8386: Predicates generated from slot equivalences won't be identities.
+# Without this patch, there will be a predicate "sum(c.int_col) = sum(c.int_col)"
+# in node 05, which may incorrectly reject rows with nulls.
+select count(1) from (
+    select t2.bigint_col, t2.amount1, t2.amount2
+    from functional.alltypes a
+    left outer join (
+        select c.bigint_col, sum(c.int_col) as amount1, sum(c.int_col) as amount2
+        from functional.alltypessmall b
+        join functional.alltypestiny c
+        on b.bigint_col = c.bigint_col
+        group by c.bigint_col
+    ) t2
+    on a.bigint_col = t2.bigint_col
+) t1;
+---- PLAN
+PLAN-ROOT SINK
+|
+06:AGGREGATE [FINALIZE]
+|  output: count(*)
+|  row-size=8B cardinality=1
+|
+05:HASH JOIN [LEFT OUTER JOIN]
+|  hash predicates: a.bigint_col = c.bigint_col
+|  row-size=16B cardinality=7.30K
+|
+|--04:AGGREGATE [FINALIZE]
+|  |  group by: c.bigint_col
+|  |  row-size=8B cardinality=2
+|  |
+|  03:HASH JOIN [INNER JOIN]
+|  |  hash predicates: b.bigint_col = c.bigint_col
+|  |  runtime filters: RF000 <- c.bigint_col
+|  |  row-size=16B cardinality=80
+|  |
+|  |--02:SCAN HDFS [functional.alltypestiny c]
+|  |     partitions=4/4 files=4 size=460B
+|  |     row-size=8B cardinality=8
+|  |
+|  01:SCAN HDFS [functional.alltypessmall b]
+|     partitions=4/4 files=4 size=6.32KB
+|     runtime filters: RF000 -> b.bigint_col
+|     row-size=8B cardinality=100
+|
+00:SCAN HDFS [functional.alltypes a]
+   partitions=24/24 files=24 size=478.45KB
+   row-size=8B cardinality=7.30K
+====
+# IMPALA-8386: Predicates generated from slot equivalences won't be identities.
+# Without this patch, there will be a predicate "c.int_col = c.int_col" in node 04,
+# which may incorrectly reject rows with nulls.
+select * from (
+    select t2.bigint_col, t2.amount1, t2.amount2
+    from functional.alltypes a
+    left outer join (
+        select c.bigint_col, c.int_col as amount1, c.int_col as amount2
+        from functional.alltypessmall b
+        join functional.alltypestiny c
+        on b.bigint_col = c.bigint_col
+    ) t2
+    on a.bigint_col = t2.bigint_col
+) t1;
+---- PLAN
+PLAN-ROOT SINK
+|
+04:HASH JOIN [LEFT OUTER JOIN]
+|  hash predicates: a.bigint_col = c.bigint_col
+|  row-size=28B cardinality=58.40K
+|
+|--03:HASH JOIN [INNER JOIN]
+|  |  hash predicates: b.bigint_col = c.bigint_col
+|  |  runtime filters: RF000 <- c.bigint_col
+|  |  row-size=20B cardinality=80
+|  |
+|  |--02:SCAN HDFS [functional.alltypestiny c]
+|  |     partitions=4/4 files=4 size=460B
+|  |     row-size=12B cardinality=8
+|  |
+|  01:SCAN HDFS [functional.alltypessmall b]
+|     partitions=4/4 files=4 size=6.32KB
+|     runtime filters: RF000 -> b.bigint_col
+|     row-size=8B cardinality=100
+|
+00:SCAN HDFS [functional.alltypes a]
+   partitions=24/24 files=24 size=478.45KB
+   row-size=8B cardinality=7.30K
+====
+# A deeper inline view test for IMPALA-8386. No predicate "int_col = int_col" will
+# be generated.
+select * from (
+    select t2.id, t2.amount1, t2.amount2
+    from functional.alltypestiny a
+    left outer join (
+        select t3.id, t3.amount1, t3.amount2
+        from functional.alltypestiny b
+        left outer join (
+            select c.id, c.int_col as amount1, c.int_col as amount2
+            from functional.alltypestiny c
+            join functional.alltypestiny d
+            on c.id = d.id
+        ) t3
+        on b.id = t3.id
+    ) t2
+    on a.id = t2.id
+) t1;
+---- PLAN
+PLAN-ROOT SINK
+|
+06:HASH JOIN [RIGHT OUTER JOIN]
+|  hash predicates: c.id = a.id
+|  runtime filters: RF000 <- a.id
+|  row-size=20B cardinality=8
+|
+|--00:SCAN HDFS [functional.alltypestiny a]
+|     partitions=4/4 files=4 size=460B
+|     row-size=4B cardinality=8
+|
+05:HASH JOIN [RIGHT OUTER JOIN]
+|  hash predicates: c.id = b.id
+|  runtime filters: RF002 <- b.id
+|  row-size=16B cardinality=8
+|
+|--01:SCAN HDFS [functional.alltypestiny b]
+|     partitions=4/4 files=4 size=460B
+|     row-size=4B cardinality=8
+|
+04:HASH JOIN [INNER JOIN]
+|  hash predicates: c.id = d.id
+|  runtime filters: RF004 <- d.id
+|  row-size=12B cardinality=8
+|
+|--03:SCAN HDFS [functional.alltypestiny d]
+|     partitions=4/4 files=4 size=460B
+|     runtime filters: RF000 -> d.id, RF002 -> d.id
+|     row-size=4B cardinality=8
+|
+02:SCAN HDFS [functional.alltypestiny c]
+   partitions=4/4 files=4 size=460B
+   runtime filters: RF000 -> c.id, RF002 -> c.id, RF004 -> c.id
+   row-size=8B cardinality=8
+====
+# A minimal reproduction for IMPALA-8386. Though the query results are correct, without
+# this patch there's a wrong inferred predicate "int_col = int_col" assigned at the
+# Join node.
+select * from (
+    select t2.bigint_col, t2.amount1, t2.amount2
+    from functional.alltypessmall a
+    left outer join (
+        select bigint_col, int_col as amount1, int_col as amount2
+        from functional.alltypestiny
+    ) t2
+    on a.bigint_col = t2.bigint_col
+) t1;
+---- PLAN
+PLAN-ROOT SINK
+|
+02:HASH JOIN [LEFT OUTER JOIN]
+|  hash predicates: a.bigint_col = bigint_col
+|  row-size=20B cardinality=100
+|
+|--01:SCAN HDFS [functional.alltypestiny]
+|     partitions=4/4 files=4 size=460B
+|     row-size=12B cardinality=8
+|
+00:SCAN HDFS [functional.alltypessmall a]
+   partitions=4/4 files=4 size=6.32KB
+   row-size=8B cardinality=100
+====
+# IMPALA-8386: test coverage for ORDER BY/LIMIT
+select * from (
+    select t2.bigint_col, t2.amount1, t2.amount2
+    from functional.alltypessmall a
+    left outer join (
+        select bigint_col, int_col as amount1, int_col as amount2
+        from functional.alltypestiny
+        order by bigint_col limit 10
+    ) t2
+    on a.bigint_col = t2.bigint_col
+    order by 1 limit 10
+) t1;
+---- PLAN
+PLAN-ROOT SINK
+|
+04:TOP-N [LIMIT=10]
+|  order by: bigint_col ASC
+|  row-size=16B cardinality=10
+|
+03:HASH JOIN [LEFT OUTER JOIN]
+|  hash predicates: a.bigint_col = bigint_col
+|  row-size=20B cardinality=100
+|
+|--02:TOP-N [LIMIT=10]
+|  |  order by: bigint_col ASC
+|  |  row-size=12B cardinality=8
+|  |
+|  01:SCAN HDFS [functional.alltypestiny]
+|     HDFS partitions=4/4 files=4 size=460B
+|     row-size=12B cardinality=8
+|
+00:SCAN HDFS [functional.alltypessmall a]
+   HDFS partitions=4/4 files=4 size=6.32KB
+   row-size=8B cardinality=100
+====
+# IMPALA-8386: test coverage for analytic functions
+select * from (
+    select t2.bigint_col, t2.amount1, t2.amount2
+    from functional.alltypessmall a
+    left outer join (
+        select bigint_col, max(int_col) over (partition by bigint_col) as amount1,
+            max(int_col) over (partition by bigint_col) as amount2
+        from functional.alltypestiny
+    ) t2
+    on a.bigint_col = t2.bigint_col
+) t1;
+---- PLAN
+PLAN-ROOT SINK
+|
+04:HASH JOIN [LEFT OUTER JOIN]
+|  hash predicates: a.bigint_col = bigint_col
+|  row-size=24B cardinality=100
+|
+|--03:ANALYTIC
+|  |  functions: max(int_col)
+|  |  partition by: bigint_col
+|  |  row-size=16B cardinality=8
+|  |
+|  02:SORT
+|  |  order by: bigint_col ASC NULLS FIRST
+|  |  row-size=12B cardinality=8
+|  |
+|  01:SCAN HDFS [functional.alltypestiny]
+|     HDFS partitions=4/4 files=4 size=460B
+|     row-size=12B cardinality=8
+|
+00:SCAN HDFS [functional.alltypessmall a]
+   HDFS partitions=4/4 files=4 size=6.32KB
+   row-size=8B cardinality=100
+====
+# IMPALA-8386: test coverage for unions
+select * from (
+    select t2.bigint_col, t2.amount1, t2.amount2
+    from functional.alltypessmall a
+    left outer join (
+        select bigint_col, int_col as amount1, int_col as amount2
+        from (
+            select * from functional.alltypestiny where id < 4
+            union all
+            select * from functional.alltypestiny where id >= 4
+        ) t3
+    ) t2
+    on a.bigint_col = t2.bigint_col
+) t1;
+---- PLAN
+PLAN-ROOT SINK
+|
+04:HASH JOIN [LEFT OUTER JOIN]
+|  hash predicates: a.bigint_col = bigint_col
+|  row-size=20B cardinality=100
+|
+|--01:UNION
+|  |  row-size=12B cardinality=2
+|  |
+|  |--03:SCAN HDFS [functional.alltypestiny]
+|  |     HDFS partitions=4/4 files=4 size=460B
+|  |     predicates: id >= 4
+|  |     row-size=16B cardinality=1
+|  |
+|  02:SCAN HDFS [functional.alltypestiny]
+|     HDFS partitions=4/4 files=4 size=460B
+|     predicates: id < 4
+|     row-size=16B cardinality=1
+|
+00:SCAN HDFS [functional.alltypessmall a]
+   HDFS partitions=4/4 files=4 size=6.32KB
+   row-size=8B cardinality=100
+====
+# IMPALA-8386: test coverage for unions
+select * from (
+    select t2.bigint_col, t2.amount1, t2.amount2
+    from functional.alltypessmall a
+    left join (
+        select bigint_col, int_col as amount1, int_col as amount2
+        from functional.alltypestiny
+        union all values (NULL, NULL, NULL)
+    ) t2
+    on a.bigint_col = t2.bigint_col
+) t1;
+---- PLAN
+PLAN-ROOT SINK
+|
+03:HASH JOIN [LEFT OUTER JOIN]
+|  hash predicates: a.bigint_col = bigint_col
+|  row-size=24B cardinality=100
+|
+|--01:UNION
+|  |  constant-operands=1
+|  |  row-size=16B cardinality=9
+|  |
+|  02:SCAN HDFS [functional.alltypestiny]
+|     HDFS partitions=4/4 files=4 size=460B
+|     row-size=12B cardinality=8
+|
+00:SCAN HDFS [functional.alltypessmall a]
+   HDFS partitions=4/4 files=4 size=6.32KB
+   row-size=8B cardinality=100
+====