Posted to commits@hive.apache.org by gu...@apache.org on 2014/10/21 23:12:49 UTC

svn commit: r1633468 - in /hive/branches/branch-0.14: data/files/ ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/ ql/src/java/org/apache/hadoop/hive/ql/plan/ ql/src/java/org/apache/hadoop/hive/ql/stats/ ql/src/test/queries/clientposit...

Author: gunther
Date: Tue Oct 21 21:12:49 2014
New Revision: 1633468

URL: http://svn.apache.org/r1633468
Log:
HIVE-8168: With dynamic partition enabled, fact table selectivity is not taken into account when generating the physical plan (Use CBO cardinality during physical plan generation) (Prasanth J via Gunther Hagleitner)
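
The core estimate introduced by this patch can be summarized with a minimal, self-contained sketch (the class and method names below are hypothetical, not the actual Hive classes from this commit): when a single-attribute join looks like a fact table (foreign key side) joined to filtered dimension tables (primary key side), the joined row count is taken as the fact-side row count scaled by the product of the dimension-side selectivities, instead of the generic NDV-based denominator.

    // Hypothetical illustration only; not part of this commit.
    public class PkFkCardinalitySketch {

      // selectivity of one dimension branch = rows surviving its filters / base table rows
      static double selectivity(long baseRows, long filteredRows) {
        return (double) filteredRows / (double) baseRows;
      }

      // estimated join cardinality = fact-side rows * product of dimension selectivities
      static long estimateJoinRows(long factRows, double... dimSelectivities) {
        double prod = 1.0;
        for (double s : dimSelectivities) {
          prod *= s;
        }
        return (long) (factRows * prod);
      }

      public static void main(String[] args) {
        // e.g. 964 fact rows joined to a dimension table filtered from 12 rows down to 4
        System.out.println(estimateJoinRows(964, selectivity(12, 4))); // 321
      }
    }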

Added:
    hive/branches/branch-0.14/data/files/customer_address.txt
    hive/branches/branch-0.14/data/files/store.txt
    hive/branches/branch-0.14/ql/src/test/queries/clientpositive/annotate_stats_join_pkfk.q
    hive/branches/branch-0.14/ql/src/test/results/clientpositive/annotate_stats_join_pkfk.q.out
Modified:
    hive/branches/branch-0.14/ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java
    hive/branches/branch-0.14/ql/src/java/org/apache/hadoop/hive/ql/plan/ColStatistics.java
    hive/branches/branch-0.14/ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java

Added: hive/branches/branch-0.14/data/files/customer_address.txt
URL: http://svn.apache.org/viewvc/hive/branches/branch-0.14/data/files/customer_address.txt?rev=1633468&view=auto
==============================================================================
--- hive/branches/branch-0.14/data/files/customer_address.txt (added)
+++ hive/branches/branch-0.14/data/files/customer_address.txt Tue Oct 21 21:12:49 2014
@@ -0,0 +1,20 @@
+1|AAAAAAAABAAAAAAA|18|Jackson |Parkway|Suite 280|Fairfield|Maricopa County|AZ|86192|United States|-7|condo|
+2|AAAAAAAACAAAAAAA|362|Washington 6th|RD|Suite 80|Fairview|Taos County|NM|85709|United States|-7|condo|
+3|AAAAAAAADAAAAAAA|585|Dogwood Washington|Circle|Suite Q|Pleasant Valley|York County|PA|12477|United States|-5|single family|
+4|AAAAAAAAEAAAAAAA|111|Smith |Wy|Suite A|Oak Ridge|Kit Carson County|CO|88371|United States|-7|condo|
+5|AAAAAAAAFAAAAAAA|31|College |Blvd|Suite 180|Glendale|Barry County|MO|63951|United States|-6|single family|
+6|AAAAAAAAGAAAAAAA|59|Williams Sixth|Parkway|Suite 100|Lakeview|Chelan County|WA|98579|United States|-8|single family|
+7|AAAAAAAAHAAAAAAA||Hill 7th|Road|Suite U|Farmington|||39145|United States|||
+8|AAAAAAAAIAAAAAAA|875|Lincoln |Ct.|Suite Y|Union|Bledsoe County|TN|38721|United States|-5|apartment|
+9|AAAAAAAAJAAAAAAA|819|1st Laurel|Ave|Suite 70|New Hope|Perry County|AL|39431|United States|-6|condo|
+10|AAAAAAAAKAAAAAAA|851|Woodland Poplar|ST|Suite Y|Martinsville|Haines Borough|AK|90419|United States|-9|condo|
+11|AAAAAAAALAAAAAAA|189|13th 2nd|Street|Suite 470|Maple Grove|Madison County|MT|68252|United States|-7|single family|
+12|AAAAAAAAMAAAAAAA|76|Ash 8th|Ct.|Suite O|Edgewood|Mifflin County|PA|10069|United States|-5|apartment|
+13|AAAAAAAANAAAAAAA|424|Main Second|Ln|Suite 130|Greenville|Noxubee County|MS|51387|United States|-6|single family|
+14|AAAAAAAAOAAAAAAA|923|Pine Oak|Dr.|Suite 100||Lipscomb County|TX|77752||-6||
+15|AAAAAAAAPAAAAAAA|314|Spring |Ct.|Suite B|Oakland|Washington County|OH|49843|United States|-5|apartment|
+16|AAAAAAAAABAAAAAA|576|Adams Center|Street|Suite J|Valley View|Oldham County|TX|75124|United States|-6|condo|
+17|AAAAAAAABBAAAAAA|801|Green |Dr.|Suite 0|Montpelier|Richland County|OH|48930|United States|-5|single family|
+18|AAAAAAAACBAAAAAA|460|Maple Spruce|Court|Suite 480|Somerville|Potter County|SD|57783|United States|-7|condo|
+19|AAAAAAAADBAAAAAA|611|Wilson |Way|Suite O|Oakdale|Tangipahoa Parish|LA|79584|United States|-6|apartment|
+20|AAAAAAAAEBAAAAAA|675|Elm Wilson|Street|Suite I|Hopewell|Williams County|OH|40587|United States|-5|condo|

Added: hive/branches/branch-0.14/data/files/store.txt
URL: http://svn.apache.org/viewvc/hive/branches/branch-0.14/data/files/store.txt?rev=1633468&view=auto
==============================================================================
--- hive/branches/branch-0.14/data/files/store.txt (added)
+++ hive/branches/branch-0.14/data/files/store.txt Tue Oct 21 21:12:49 2014
@@ -0,0 +1,12 @@
+1|AAAAAAAABAAAAAAA|1997-03-13||2451189|ought|245|5250760|8AM-4PM|William Ward|2|Unknown|Enough high areas stop expectations. Elaborate, local is|Charles Bartley|1|Unknown|1|Unknown|767|Spring |Wy|Suite 250|Midway|Williamson County|TN|31904|United States|-5|0.03|
+2|AAAAAAAACAAAAAAA|1997-03-13|2000-03-12||able|236|5285950|8AM-4PM|Scott Smith|8|Unknown|Parliamentary candidates wait then heavy, keen mil|David Lamontagne|1|Unknown|1|Unknown|255|Sycamore |Dr.|Suite 410|Midway|Williamson County|TN|31904|United States|-5|0.03|
+3|AAAAAAAACAAAAAAA|2000-03-13|||able|236|7557959|8AM-4PM|Scott Smith|7|Unknown|Impossible, true arms can treat constant, complete w|David Lamontagne|1|Unknown|1|Unknown|877|Park Laurel|Road|Suite T|Midway|Williamson County|TN|31904|United States|-5|0.03|
+4|AAAAAAAAEAAAAAAA|1997-03-13|1999-03-13|2451044|ese|218|9341467|8AM-4PM|Edwin Adams|4|Unknown|Events would achieve other, eastern hours. Mechanisms must not eat other, new org|Thomas Pollack|1|Unknown|1|Unknown|27|Lake |Ln|Suite 260|Midway|Williamson County|TN|31904|United States|-5|0.03|
+5|AAAAAAAAEAAAAAAA|1999-03-14|2001-03-12|2450910|anti|288|9078805|8AM-4PM|Edwin Adams|8|Unknown|Events would achieve other, eastern hours. Mechanisms must not eat other, new org|Thomas Pollack|1|Unknown|1|Unknown|27|Lee 6th|Court|Suite 80|Fairview|Williamson County|TN|35709|United States|-5|0.03|
+6|AAAAAAAAEAAAAAAA|2001-03-13|||cally|229|9026222|8AM-4PM|Edwin Adams|10|Unknown|Events would achieve other, eastern hours. Mechanisms must not eat other, new org|Thomas Pollack|1|Unknown|1|Unknown|220|6th |Lane|Suite 140|Midway|Williamson County|TN|31904|United States|-5|0.03|
+7|AAAAAAAAHAAAAAAA|1997-03-13|||ation|297|8954883|8AM-4PM|David Thomas|9|Unknown|Architects coul|Thomas Benton|1|Unknown|1|Unknown|811|Lee |Circle|Suite T|Midway|Williamson County|TN|31904|United States|-5|0.01|
+8|AAAAAAAAIAAAAAAA|1997-03-13|2000-03-12||eing|278|6995995|8AM-4PM|Brett Yates|2|Unknown|Various bars make most. Difficult levels introduce at a boots. Buildings welcome only never el|Dean Morrison|1|Unknown|1|Unknown|226|12th |Lane|Suite D|Fairview|Williamson County|TN|35709|United States|-5|0.08|
+9|AAAAAAAAIAAAAAAA|2000-03-13|||eing|271|6995995|8AM-4PM|Brett Yates|2|Unknown|Formal, psychological pounds relate reasonable, young principles. Black, |Dean Morrison|1|Unknown|1|Unknown|226|Hill |Boulevard|Suite 190|Midway|Williamson County|TN|31904|United States|-5|0.08|
+10|AAAAAAAAKAAAAAAA|1997-03-13|1999-03-13||bar|294|9294113|8AM-4PM|Raymond Jacobs|8|Unknown|Little expectations include yet forward meetings.|Michael Wilson|1|Unknown|1|Unknown|175|4th |Court|Suite C|Midway|Williamson County|TN|31904|United States|-5|0.06|
+11|AAAAAAAAKAAAAAAA|1999-03-14|2001-03-12||ought|294|9294113|8AM-4PM|Raymond Jacobs|6|Unknown|Mysterious employe|Michael Wilson|1|Unknown|1|Unknown|175|Park Green|Court|Suite 160|Midway|Williamson County|TN|31904|United States|-5|0.11|
+12|AAAAAAAAKAAAAAAA|2001-03-13|||ought|294|5219562|8AM-12AM|Robert Thompson|6|Unknown|Events develop i|Dustin Kelly|1|Unknown|1|Unknown|337|College |Boulevard|Suite 100|Fairview|Williamson County|TN|31904|United States|-5|0.01|

Modified: hive/branches/branch-0.14/ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java
URL: http://svn.apache.org/viewvc/hive/branches/branch-0.14/ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java?rev=1633468&r1=1633467&r2=1633468&view=diff
==============================================================================
--- hive/branches/branch-0.14/ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java (original)
+++ hive/branches/branch-0.14/ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java Tue Oct 21 21:12:49 2014
@@ -1021,11 +1021,17 @@ public class StatsRulesProcFactory {
    */
   public static class JoinStatsRule extends DefaultStatsRule implements NodeProcessor {
 
+    private boolean pkfkInferred = false;
+    private long newNumRows = 0;
+    private List<Operator<? extends OperatorDesc>> parents;
+    private CommonJoinOperator<? extends JoinDesc> jop;
+    private int numAttr = 1;
+
     @Override
     public Object process(Node nd, Stack<Node> stack, NodeProcessorCtx procCtx,
         Object... nodeOutputs) throws SemanticException {
-      CommonJoinOperator<? extends JoinDesc> jop = (CommonJoinOperator<? extends JoinDesc>) nd;
-      List<Operator<? extends OperatorDesc>> parents = jop.getParentOperators();
+      jop = (CommonJoinOperator<? extends JoinDesc>) nd;
+      parents = jop.getParentOperators();
       AnnotateStatsProcCtx aspCtx = (AnnotateStatsProcCtx) procCtx;
       HiveConf conf = aspCtx.getConf();
       boolean allStatsAvail = true;
@@ -1052,22 +1058,25 @@ public class StatsRulesProcFactory {
           Statistics stats = new Statistics();
           Map<String, Long> rowCountParents = new HashMap<String, Long>();
           List<Long> distinctVals = Lists.newArrayList();
-
-          // 2 relations, multiple attributes
-          boolean multiAttr = false;
-          int numAttr = 1;
           int numParent = parents.size();
-
           Map<String, ColStatistics> joinedColStats = Maps.newHashMap();
           Map<Integer, List<String>> joinKeys = Maps.newHashMap();
           List<Long> rowCounts = Lists.newArrayList();
 
+          // detect if there are multiple attributes in the join key
+          ReduceSinkOperator rsOp = (ReduceSinkOperator) jop.getParentOperators().get(0);
+          List<ExprNodeDesc> keyExprs = rsOp.getConf().getKeyCols();
+          numAttr = keyExprs.size();
+
+          // infer PK-FK relationship in single attribute join case
+          inferPKFKRelationship();
+
           // get the join keys from parent ReduceSink operators
           for (int pos = 0; pos < parents.size(); pos++) {
             ReduceSinkOperator parent = (ReduceSinkOperator) jop.getParentOperators().get(pos);
 
             Statistics parentStats = parent.getStatistics();
-            List<ExprNodeDesc> keyExprs = parent.getConf().getKeyCols();
+            keyExprs = parent.getConf().getKeyCols();
 
             // Parent RS may have column statistics from multiple parents.
             // Populate table alias to row count map, this will be used later to
@@ -1082,12 +1091,6 @@ public class StatsRulesProcFactory {
             }
             rowCounts.add(parentStats.getNumRows());
 
-            // multi-attribute join key
-            if (keyExprs.size() > 1) {
-              multiAttr = true;
-              numAttr = keyExprs.size();
-            }
-
             // compute fully qualified join key column names. this name will be
             // used to quickly look-up for column statistics of join key.
             // TODO: expressions in join condition will be ignored. assign
@@ -1110,7 +1113,7 @@ public class StatsRulesProcFactory {
           // attribute join, else max(V(R,y1), V(S,y1)) * max(V(R,y2), V(S,y2))
           // in case of multi-attribute join
           long denom = 1;
-          if (multiAttr) {
+          if (numAttr > 1) {
             List<Long> perAttrDVs = Lists.newArrayList();
             for (int idx = 0; idx < numAttr; idx++) {
               for (Integer i : joinKeys.keySet()) {
@@ -1149,9 +1152,7 @@ public class StatsRulesProcFactory {
           }
 
           // Update NDV of joined columns to be min(V(R,y), V(S,y))
-          if (multiAttr) {
-            updateJoinColumnsNDV(joinKeys, joinedColStats, numAttr);
-          }
+          updateJoinColumnsNDV(joinKeys, joinedColStats, numAttr);
 
           // column statistics from different sources are put together and rename
           // fully qualified column names based on output schema of join operator
@@ -1181,10 +1182,9 @@ public class StatsRulesProcFactory {
 
           // update join statistics
           stats.setColumnStats(outColStats);
-          long newRowCount = computeNewRowCount(rowCounts, denom);
+          long newRowCount = pkfkInferred ? newNumRows : computeNewRowCount(rowCounts, denom);
 
-          updateStatsForJoinType(stats, newRowCount, jop, rowCountParents,
-              outInTabAlias);
+          updateStatsForJoinType(stats, newRowCount, jop, rowCountParents, outInTabAlias);
           jop.setStatistics(stats);
 
           if (isDebugEnabled) {
@@ -1229,6 +1229,146 @@ public class StatsRulesProcFactory {
       return null;
     }
 
+    private void inferPKFKRelationship() {
+      if (numAttr == 1) {
+        List<Integer> parentsWithPK = getPrimaryKeyCandidates(parents);
+
+        // In a fact-to-many-dimension-tables join, the join key in the fact table is usually a
+        // foreign key that has a corresponding primary key in a dimension table. The selectivity
+        // of the fact table in that case is the product of the selectivities of all dimension
+        // tables (assuming conjunctive predicates).
+        for (Integer id : parentsWithPK) {
+          ColStatistics csPK = null;
+          Operator<? extends OperatorDesc> parent = parents.get(id);
+          for (ColStatistics cs : parent.getStatistics().getColumnStats()) {
+            if (cs.isPrimaryKey()) {
+              csPK = cs;
+              break;
+            }
+          }
+
+          // infer the positions of foreign key candidates
+          List<Integer> parentsWithFK = getForeignKeyCandidates(parents, csPK);
+          if (parentsWithFK.size() == 1 &&
+              parentsWithFK.size() + parentsWithPK.size() == parents.size()) {
+            Operator<? extends OperatorDesc> parentWithFK = parents.get(parentsWithFK.get(0));
+            List<Float> parentsSel = getSelectivity(parents, parentsWithPK);
+            Float prodSelectivity = 1.0f;
+            for (Float selectivity : parentsSel) {
+              prodSelectivity *= selectivity;
+            }
+            newNumRows = (long) (parentWithFK.getStatistics().getNumRows() * prodSelectivity);
+            pkfkInferred = true;
+
+            // some debug information
+            if (isDebugEnabled) {
+              List<String> parentIds = Lists.newArrayList();
+
+              // print primary key containing parents
+              for (Integer i : parentsWithPK) {
+                parentIds.add(parents.get(i).toString());
+              }
+              LOG.debug("STATS-" + jop.toString() + ": PK parent id(s) - " + parentIds);
+              parentIds.clear();
+
+              // print foreign key containing parents
+              for (Integer i : parentsWithFK) {
+                parentIds.add(parents.get(i).toString());
+              }
+              LOG.debug("STATS-" + jop.toString() + ": FK parent id(s) - " + parentIds);
+            }
+          }
+        }
+      }
+    }
+
+    /**
+     * Get selectivity of reduce sink operators.
+     * @param ops - reduce sink operators
+     * @param opsWithPK - reduce sink operators with primary keys
+     * @return - list of selectivities for the primary-key-containing operators
+     */
+    private List<Float> getSelectivity(List<Operator<? extends OperatorDesc>> ops,
+        List<Integer> opsWithPK) {
+      List<Float> result = Lists.newArrayList();
+      for (Integer idx : opsWithPK) {
+        Operator<? extends OperatorDesc> op = ops.get(idx);
+        TableScanOperator tsOp = OperatorUtils
+            .findSingleOperatorUpstream(op, TableScanOperator.class);
+        long inputRow = tsOp.getStatistics().getNumRows();
+        long outputRow = op.getStatistics().getNumRows();
+        result.add((float) outputRow / (float) inputRow);
+      }
+      return result;
+    }
+
+    /**
+     * Returns the indexes of parents whose join key column statistics ranges are within the specified
+     * primary key range (inferred as foreign keys).
+     * @param ops - operators
+     * @param csPK - column statistics of primary key
+     * @return - list of foreign key containing parent ids
+     */
+    private List<Integer> getForeignKeyCandidates(List<Operator<? extends OperatorDesc>> ops,
+        ColStatistics csPK) {
+      List<Integer> result = Lists.newArrayList();
+      if (csPK == null || ops == null) {
+        return result;
+      }
+
+      for (int i = 0; i < ops.size(); i++) {
+        Operator<? extends OperatorDesc> op = ops.get(i);
+        if (op != null && op instanceof ReduceSinkOperator) {
+          ReduceSinkOperator rsOp = (ReduceSinkOperator) op;
+          List<ExprNodeDesc> keys = rsOp.getConf().getKeyCols();
+          List<String> fqCols = StatsUtils.getFullQualifedColNameFromExprs(keys,
+              rsOp.getColumnExprMap());
+          if (fqCols.size() == 1) {
+            String joinCol = fqCols.get(0);
+            if (rsOp.getStatistics() != null) {
+              ColStatistics cs = rsOp.getStatistics().getColumnStatisticsFromFQColName(joinCol);
+              if (cs != null && !cs.isPrimaryKey()) {
+                if (StatsUtils.inferForeignKey(csPK, cs)) {
+                  result.add(i);
+                }
+              }
+            }
+          }
+        }
+      }
+      return result;
+    }
+
+    /**
+     * Returns the indexes of parents whose join key columns are inferred as primary keys
+     * @param ops - operators
+     * @return - list of primary key containing parent ids
+     */
+    private List<Integer> getPrimaryKeyCandidates(List<Operator<? extends OperatorDesc>> ops) {
+      List<Integer> result = Lists.newArrayList();
+      if (ops != null && !ops.isEmpty()) {
+        for (int i = 0; i < ops.size(); i++) {
+          Operator<? extends OperatorDesc> op = ops.get(i);
+          if (op instanceof ReduceSinkOperator) {
+            ReduceSinkOperator rsOp = (ReduceSinkOperator) op;
+            List<ExprNodeDesc> keys = rsOp.getConf().getKeyCols();
+            List<String> fqCols = StatsUtils.getFullQualifedColNameFromExprs(keys,
+                rsOp.getColumnExprMap());
+            if (fqCols.size() == 1) {
+              String joinCol = fqCols.get(0);
+              if (rsOp.getStatistics() != null) {
+                ColStatistics cs = rsOp.getStatistics().getColumnStatisticsFromFQColName(joinCol);
+                if (cs != null && cs.isPrimaryKey()) {
+                  result.add(i);
+                }
+              }
+            }
+          }
+        }
+      }
+      return result;
+    }
+
     private Long getEasedOutDenominator(List<Long> distinctVals) {
       // Exponential back-off for NDVs.
       // 1) Descending order sort of NDVs

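As a side note on the denominator that the removed multiAttr flag used to guard (now selected by numAttr > 1), the comment in the hunk above gives the formula max(V(R,y1), V(S,y1)) * max(V(R,y2), V(S,y2)) for a multi-attribute join, where V(T, y) is the number of distinct values of column y in relation T. A small, hypothetical worked sketch of that computation (not code from this commit):

    import java.util.List;

    public class JoinDenominatorSketch {

      // perAttrNDVs.get(i) holds, for join attribute i, the NDVs seen on each join input
      static long denominator(List<List<Long>> perAttrNDVs) {
        long denom = 1L;
        for (List<Long> ndvsForOneAttr : perAttrNDVs) {
          long max = 1L;
          for (long ndv : ndvsForOneAttr) {
            max = Math.max(max, ndv);
          }
          denom *= max;
        }
        return denom;
      }

      public static void main(String[] args) {
        // V(R,y1)=10, V(S,y1)=40, V(R,y2)=5, V(S,y2)=3  ->  denom = 40 * 5 = 200
        System.out.println(denominator(List.of(List.of(10L, 40L), List.of(5L, 3L))));
      }
    }
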
Modified: hive/branches/branch-0.14/ql/src/java/org/apache/hadoop/hive/ql/plan/ColStatistics.java
URL: http://svn.apache.org/viewvc/hive/branches/branch-0.14/ql/src/java/org/apache/hadoop/hive/ql/plan/ColStatistics.java?rev=1633468&r1=1633467&r2=1633468&view=diff
==============================================================================
--- hive/branches/branch-0.14/ql/src/java/org/apache/hadoop/hive/ql/plan/ColStatistics.java (original)
+++ hive/branches/branch-0.14/ql/src/java/org/apache/hadoop/hive/ql/plan/ColStatistics.java Tue Oct 21 21:12:49 2014
@@ -33,12 +33,14 @@ public class ColStatistics {
   private long numTrues;
   private long numFalses;
   private Range range;
+  private boolean isPrimaryKey;
 
   public ColStatistics(String tabAlias, String colName, String colType) {
     this.setTableAlias(tabAlias);
     this.setColumnName(colName);
     this.setColumnType(colType);
     this.setFullyQualifiedColName(StatsUtils.getFullyQualifiedColumnName(tabAlias, colName));
+    this.setPrimaryKey(false);
   }
 
   public ColStatistics() {
@@ -150,6 +152,12 @@ public class ColStatistics {
     sb.append(numTrues);
     sb.append(" numFalses: ");
     sb.append(numFalses);
+    if (range != null) {
+      sb.append(" ");
+      sb.append(range);
+    }
+    sb.append(" isPrimaryKey: ");
+    sb.append(isPrimaryKey);
     return sb.toString();
   }
 
@@ -162,24 +170,47 @@ public class ColStatistics {
     clone.setNumNulls(numNulls);
     clone.setNumTrues(numTrues);
     clone.setNumFalses(numFalses);
+    clone.setPrimaryKey(isPrimaryKey);
     if (range != null ) {
       clone.setRange(range.clone());
     }
     return clone;
   }
 
+  public boolean isPrimaryKey() {
+    return isPrimaryKey;
+  }
+
+  public void setPrimaryKey(boolean isPrimaryKey) {
+    this.isPrimaryKey = isPrimaryKey;
+  }
+
   public static class Range {
     public final Number minValue;
     public final Number maxValue;
+
     Range(Number minValue, Number maxValue) {
       super();
       this.minValue = minValue;
       this.maxValue = maxValue;
     }
+
     @Override
     public Range clone() {
       return new Range(minValue, maxValue);
     }
+
+    @Override
+    public String toString() {
+      StringBuilder sb = new StringBuilder();
+      sb.append("Range: [");
+      sb.append(" min: ");
+      sb.append(minValue);
+      sb.append(" max: ");
+      sb.append(maxValue);
+      sb.append(" ]");
+      return sb.toString();
+    }
   }
 
 }

Modified: hive/branches/branch-0.14/ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java
URL: http://svn.apache.org/viewvc/hive/branches/branch-0.14/ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java?rev=1633468&r1=1633467&r2=1633468&view=diff
==============================================================================
--- hive/branches/branch-0.14/ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java (original)
+++ hive/branches/branch-0.14/ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java Tue Oct 21 21:12:49 2014
@@ -177,6 +177,9 @@ public class StatsUtils {
         colStats = getTableColumnStats(table, schema, neededColumns);
       }
 
+      // infer if any column can be a primary key based on column statistics
+      inferAndSetPrimaryKey(stats.getNumRows(), colStats);
+
       stats.setColumnStatsState(deriveStatType(colStats, neededColumns));
       stats.addToColumnStats(colStats);
     } else if (partList != null) {
@@ -263,6 +266,9 @@ public class StatsUtils {
           addParitionColumnStats(neededColumns, referencedColumns, schema, table, partList,
               columnStats);
 
+          // infer if any column can be a primary key based on column statistics
+          inferAndSetPrimaryKey(stats.getNumRows(), columnStats);
+
           stats.addToColumnStats(columnStats);
           State colState = deriveStatType(columnStats, referencedColumns);
           if (aggrStats.getPartsFound() != partNames.size() && colState != State.NONE) {
@@ -277,6 +283,58 @@ public class StatsUtils {
     return stats;
   }
 
+
+  /**
+   * Based on the provided column statistics and number of rows, this method infers whether a
+   * column can be a primary key. It checks whether the number of values in the column's range
+   * (max - min + 1) is equal to the specified number of rows.
+   * @param numRows - number of rows
+   * @param colStats - column statistics
+   */
+  public static void inferAndSetPrimaryKey(long numRows, List<ColStatistics> colStats) {
+    if (colStats != null) {
+      for (ColStatistics cs : colStats) {
+        if (cs != null && cs.getRange() != null && cs.getRange().minValue != null &&
+            cs.getRange().maxValue != null) {
+          if (numRows ==
+              ((cs.getRange().maxValue.longValue() - cs.getRange().minValue.longValue()) + 1)) {
+            cs.setPrimaryKey(true);
+          }
+        }
+      }
+    }
+  }
+
+  /**
+   * Infer foreign key relationship from given column statistics.
+   * @param csPK - column statistics of primary key
+   * @param csFK - column statistics of potential foreign key
+   * @return - true if a foreign key relationship can be inferred from the column statistics
+   */
+  public static boolean inferForeignKey(ColStatistics csPK, ColStatistics csFK) {
+    if (csPK != null && csFK != null) {
+      if (csPK.isPrimaryKey()) {
+        if (csPK.getRange() != null && csFK.getRange() != null) {
+          ColStatistics.Range pkRange = csPK.getRange();
+          ColStatistics.Range fkRange = csFK.getRange();
+          return isWithin(fkRange, pkRange);
+        }
+      }
+    }
+    return false;
+  }
+
+  private static boolean isWithin(ColStatistics.Range range1, ColStatistics.Range range2) {
+    if (range1.minValue != null && range2.minValue != null && range1.maxValue != null &&
+        range2.maxValue != null) {
+      if (range1.minValue.longValue() >= range2.minValue.longValue() &&
+          range1.maxValue.longValue() <= range2.maxValue.longValue()) {
+        return true;
+      }
+    }
+    return false;
+  }
+
   private static void addParitionColumnStats(List<String> neededColumns,
       List<String> referencedColumns, List<ColumnInfo> schema, Table table,
       PrunedPartitionList partList, List<ColStatistics> colStats)
@@ -533,6 +591,7 @@ public class StatsUtils {
       // Columns statistics for complex datatypes are not supported yet
       return null;
     }
+
     return cs;
   }
 

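The two inferences added to StatsUtils above can be illustrated with a minimal, hypothetical sketch (again, not code from this commit): a column is treated as a primary key when its value range spans exactly as many values as there are rows, and another column is treated as a foreign key when its range lies within that primary key's range.

    public class KeyInferenceSketch {

      // mirrors inferAndSetPrimaryKey: numRows == (max - min) + 1
      static boolean looksLikePrimaryKey(long numRows, long min, long max) {
        return numRows == (max - min) + 1;
      }

      // mirrors inferForeignKey/isWithin: the FK range must fall within the PK range
      static boolean looksLikeForeignKey(long pkMin, long pkMax, long fkMin, long fkMax) {
        return fkMin >= pkMin && fkMax <= pkMax;
      }

      public static void main(String[] args) {
        // e.g. store.s_store_sk: 12 rows spanning values 1..12 -> inferred primary key
        System.out.println(looksLikePrimaryKey(12, 1, 12));    // true
        // e.g. store_sales.ss_store_sk values within 1..12 -> inferred foreign key
        System.out.println(looksLikeForeignKey(1, 12, 1, 12)); // true
      }
    }
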
Added: hive/branches/branch-0.14/ql/src/test/queries/clientpositive/annotate_stats_join_pkfk.q
URL: http://svn.apache.org/viewvc/hive/branches/branch-0.14/ql/src/test/queries/clientpositive/annotate_stats_join_pkfk.q?rev=1633468&view=auto
==============================================================================
--- hive/branches/branch-0.14/ql/src/test/queries/clientpositive/annotate_stats_join_pkfk.q (added)
+++ hive/branches/branch-0.14/ql/src/test/queries/clientpositive/annotate_stats_join_pkfk.q Tue Oct 21 21:12:49 2014
@@ -0,0 +1,123 @@
+set hive.stats.fetch.column.stats=true;
+
+drop table store_sales;
+drop table store;
+drop table customer_address;
+
+-- s_store_sk is PK, ss_store_sk is FK
+-- ca_address_sk is PK, ss_addr_sk is FK
+
+create table store_sales
+(
+    ss_sold_date_sk           int,
+    ss_sold_time_sk           int,
+    ss_item_sk                int,
+    ss_customer_sk            int,
+    ss_cdemo_sk               int,
+    ss_hdemo_sk               int,
+    ss_addr_sk                int,
+    ss_store_sk               int,
+    ss_promo_sk               int,
+    ss_ticket_number          int,
+    ss_quantity               int,
+    ss_wholesale_cost         float,
+    ss_list_price             float,
+    ss_sales_price            float,
+    ss_ext_discount_amt       float,
+    ss_ext_sales_price        float,
+    ss_ext_wholesale_cost     float,
+    ss_ext_list_price         float,
+    ss_ext_tax                float,
+    ss_coupon_amt             float,
+    ss_net_paid               float,
+    ss_net_paid_inc_tax       float,
+    ss_net_profit             float
+)
+row format delimited fields terminated by '|';
+
+create table store
+(
+    s_store_sk                int,
+    s_store_id                string,
+    s_rec_start_date          string,
+    s_rec_end_date            string,
+    s_closed_date_sk          int,
+    s_store_name              string,
+    s_number_employees        int,
+    s_floor_space             int,
+    s_hours                   string,
+    s_manager                 string,
+    s_market_id               int,
+    s_geography_class         string,
+    s_market_desc             string,
+    s_market_manager          string,
+    s_division_id             int,
+    s_division_name           string,
+    s_company_id              int,
+    s_company_name            string,
+    s_street_number           string,
+    s_street_name             string,
+    s_street_type             string,
+    s_suite_number            string,
+    s_city                    string,
+    s_county                  string,
+    s_state                   string,
+    s_zip                     string,
+    s_country                 string,
+    s_gmt_offset              float,
+    s_tax_precentage          float
+)
+row format delimited fields terminated by '|';
+
+create table customer_address
+(
+    ca_address_sk             int,
+    ca_address_id             string,
+    ca_street_number          string,
+    ca_street_name            string,
+    ca_street_type            string,
+    ca_suite_number           string,
+    ca_city                   string,
+    ca_county                 string,
+    ca_state                  string,
+    ca_zip                    string,
+    ca_country                string,
+    ca_gmt_offset             float,
+    ca_location_type          string
+)
+row format delimited fields terminated by '|';
+
+load data local inpath '../../data/files/store.txt' overwrite into table store;
+load data local inpath '../../data/files/store_sales.txt' overwrite into table store_sales;
+load data local inpath '../../data/files/customer_address.txt' overwrite into table customer_address;
+
+analyze table store compute statistics;
+analyze table store compute statistics for columns s_store_sk, s_floor_space;
+analyze table store_sales compute statistics;
+analyze table store_sales compute statistics for columns ss_store_sk, ss_addr_sk, ss_quantity;
+analyze table customer_address compute statistics;
+analyze table customer_address compute statistics for columns ca_address_sk;
+
+explain select s.s_store_sk from store s join store_sales ss on (s.s_store_sk = ss.ss_store_sk);
+
+explain select s.s_store_sk from store s join store_sales ss on (s.s_store_sk = ss.ss_store_sk) where s.s_store_sk > 0;
+
+explain select s.s_store_sk from store s join store_sales ss on (s.s_store_sk = ss.ss_store_sk) where s.s_company_id > 0 and ss.ss_quantity > 10;
+
+explain select s.s_store_sk from store s join store_sales ss on (s.s_store_sk = ss.ss_store_sk) where s.s_floor_space > 0;
+
+explain select s.s_store_sk from store s join store_sales ss on (s.s_store_sk = ss.ss_store_sk) where ss.ss_quantity > 10;
+
+explain select s.s_store_sk from store s join store_sales ss on (s.s_store_sk = ss.ss_store_sk) join store s1 on (s1.s_store_sk = ss.ss_store_sk);
+
+explain select s.s_store_sk from store s join store_sales ss on (s.s_store_sk = ss.ss_store_sk) join store s1 on (s1.s_store_sk = ss.ss_store_sk) where s.s_store_sk > 1000;
+
+explain select s.s_store_sk from store s join store_sales ss on (s.s_store_sk = ss.ss_store_sk) join store s1 on (s1.s_store_sk = ss.ss_store_sk) where s.s_floor_space > 1000;
+
+explain select s.s_store_sk from store s join store_sales ss on (s.s_store_sk = ss.ss_store_sk) join store s1 on (s1.s_store_sk = ss.ss_store_sk) where ss.ss_quantity > 10;
+
+explain select s.s_store_sk from store s join store_sales ss on (s.s_store_sk = ss.ss_store_sk) join customer_address ca on (ca.ca_address_sk = ss.ss_addr_sk);
+
+drop table store_sales;
+drop table store;
+drop table customer_address;

Added: hive/branches/branch-0.14/ql/src/test/results/clientpositive/annotate_stats_join_pkfk.q.out
URL: http://svn.apache.org/viewvc/hive/branches/branch-0.14/ql/src/test/results/clientpositive/annotate_stats_join_pkfk.q.out?rev=1633468&view=auto
==============================================================================
--- hive/branches/branch-0.14/ql/src/test/results/clientpositive/annotate_stats_join_pkfk.q.out (added)
+++ hive/branches/branch-0.14/ql/src/test/results/clientpositive/annotate_stats_join_pkfk.q.out Tue Oct 21 21:12:49 2014
@@ -0,0 +1,987 @@
+PREHOOK: query: drop table store_sales
+PREHOOK: type: DROPTABLE
+POSTHOOK: query: drop table store_sales
+POSTHOOK: type: DROPTABLE
+PREHOOK: query: drop table store
+PREHOOK: type: DROPTABLE
+POSTHOOK: query: drop table store
+POSTHOOK: type: DROPTABLE
+PREHOOK: query: drop table customer_address
+PREHOOK: type: DROPTABLE
+POSTHOOK: query: drop table customer_address
+POSTHOOK: type: DROPTABLE
+PREHOOK: query: -- s_store_sk is PK, ss_store_sk is FK
+-- ca_address_sk is PK, ss_addr_sk is FK
+
+create table store_sales
+(
+    ss_sold_date_sk           int,
+    ss_sold_time_sk           int,
+    ss_item_sk                int,
+    ss_customer_sk            int,
+    ss_cdemo_sk               int,
+    ss_hdemo_sk               int,
+    ss_addr_sk                int,
+    ss_store_sk               int,
+    ss_promo_sk               int,
+    ss_ticket_number          int,
+    ss_quantity               int,
+    ss_wholesale_cost         float,
+    ss_list_price             float,
+    ss_sales_price            float,
+    ss_ext_discount_amt       float,
+    ss_ext_sales_price        float,
+    ss_ext_wholesale_cost     float,
+    ss_ext_list_price         float,
+    ss_ext_tax                float,
+    ss_coupon_amt             float,
+    ss_net_paid               float,
+    ss_net_paid_inc_tax       float,
+    ss_net_profit             float
+)
+row format delimited fields terminated by '|'
+PREHOOK: type: CREATETABLE
+PREHOOK: Output: database:default
+PREHOOK: Output: default@store_sales
+POSTHOOK: query: -- s_store_sk is PK, ss_store_sk is FK
+-- ca_address_sk is PK, ss_addr_sk is FK
+
+create table store_sales
+(
+    ss_sold_date_sk           int,
+    ss_sold_time_sk           int,
+    ss_item_sk                int,
+    ss_customer_sk            int,
+    ss_cdemo_sk               int,
+    ss_hdemo_sk               int,
+    ss_addr_sk                int,
+    ss_store_sk               int,
+    ss_promo_sk               int,
+    ss_ticket_number          int,
+    ss_quantity               int,
+    ss_wholesale_cost         float,
+    ss_list_price             float,
+    ss_sales_price            float,
+    ss_ext_discount_amt       float,
+    ss_ext_sales_price        float,
+    ss_ext_wholesale_cost     float,
+    ss_ext_list_price         float,
+    ss_ext_tax                float,
+    ss_coupon_amt             float,
+    ss_net_paid               float,
+    ss_net_paid_inc_tax       float,
+    ss_net_profit             float
+)
+row format delimited fields terminated by '|'
+POSTHOOK: type: CREATETABLE
+POSTHOOK: Output: database:default
+POSTHOOK: Output: default@store_sales
+PREHOOK: query: create table store
+(
+    s_store_sk                int,
+    s_store_id                string,
+    s_rec_start_date          string,
+    s_rec_end_date            string,
+    s_closed_date_sk          int,
+    s_store_name              string,
+    s_number_employees        int,
+    s_floor_space             int,
+    s_hours                   string,
+    s_manager                 string,
+    s_market_id               int,
+    s_geography_class         string,
+    s_market_desc             string,
+    s_market_manager          string,
+    s_division_id             int,
+    s_division_name           string,
+    s_company_id              int,
+    s_company_name            string,
+    s_street_number           string,
+    s_street_name             string,
+    s_street_type             string,
+    s_suite_number            string,
+    s_city                    string,
+    s_county                  string,
+    s_state                   string,
+    s_zip                     string,
+    s_country                 string,
+    s_gmt_offset              float,
+    s_tax_precentage          float
+)
+row format delimited fields terminated by '|'
+PREHOOK: type: CREATETABLE
+PREHOOK: Output: database:default
+PREHOOK: Output: default@store
+POSTHOOK: query: create table store
+(
+    s_store_sk                int,
+    s_store_id                string,
+    s_rec_start_date          string,
+    s_rec_end_date            string,
+    s_closed_date_sk          int,
+    s_store_name              string,
+    s_number_employees        int,
+    s_floor_space             int,
+    s_hours                   string,
+    s_manager                 string,
+    s_market_id               int,
+    s_geography_class         string,
+    s_market_desc             string,
+    s_market_manager          string,
+    s_division_id             int,
+    s_division_name           string,
+    s_company_id              int,
+    s_company_name            string,
+    s_street_number           string,
+    s_street_name             string,
+    s_street_type             string,
+    s_suite_number            string,
+    s_city                    string,
+    s_county                  string,
+    s_state                   string,
+    s_zip                     string,
+    s_country                 string,
+    s_gmt_offset              float,
+    s_tax_precentage          float
+)
+row format delimited fields terminated by '|'
+POSTHOOK: type: CREATETABLE
+POSTHOOK: Output: database:default
+POSTHOOK: Output: default@store
+PREHOOK: query: create table customer_address
+(
+    ca_address_sk             int,
+    ca_address_id             string,
+    ca_street_number          string,
+    ca_street_name            string,
+    ca_street_type            string,
+    ca_suite_number           string,
+    ca_city                   string,
+    ca_county                 string,
+    ca_state                  string,
+    ca_zip                    string,
+    ca_country                string,
+    ca_gmt_offset             float,
+    ca_location_type          string
+)
+row format delimited fields terminated by '|'
+PREHOOK: type: CREATETABLE
+PREHOOK: Output: database:default
+PREHOOK: Output: default@customer_address
+POSTHOOK: query: create table customer_address
+(
+    ca_address_sk             int,
+    ca_address_id             string,
+    ca_street_number          string,
+    ca_street_name            string,
+    ca_street_type            string,
+    ca_suite_number           string,
+    ca_city                   string,
+    ca_county                 string,
+    ca_state                  string,
+    ca_zip                    string,
+    ca_country                string,
+    ca_gmt_offset             float,
+    ca_location_type          string
+)
+row format delimited fields terminated by '|'
+POSTHOOK: type: CREATETABLE
+POSTHOOK: Output: database:default
+POSTHOOK: Output: default@customer_address
+PREHOOK: query: load data local inpath '../../data/files/store.txt' overwrite into table store
+PREHOOK: type: LOAD
+#### A masked pattern was here ####
+PREHOOK: Output: default@store
+POSTHOOK: query: load data local inpath '../../data/files/store.txt' overwrite into table store
+POSTHOOK: type: LOAD
+#### A masked pattern was here ####
+POSTHOOK: Output: default@store
+PREHOOK: query: load data local inpath '../../data/files/store_sales.txt' overwrite into table store_sales
+PREHOOK: type: LOAD
+#### A masked pattern was here ####
+PREHOOK: Output: default@store_sales
+POSTHOOK: query: load data local inpath '../../data/files/store_sales.txt' overwrite into table store_sales
+POSTHOOK: type: LOAD
+#### A masked pattern was here ####
+POSTHOOK: Output: default@store_sales
+PREHOOK: query: load data local inpath '../../data/files/customer_address.txt' overwrite into table customer_address
+PREHOOK: type: LOAD
+#### A masked pattern was here ####
+PREHOOK: Output: default@customer_address
+POSTHOOK: query: load data local inpath '../../data/files/customer_address.txt' overwrite into table customer_address
+POSTHOOK: type: LOAD
+#### A masked pattern was here ####
+POSTHOOK: Output: default@customer_address
+PREHOOK: query: analyze table store compute statistics
+PREHOOK: type: QUERY
+PREHOOK: Input: default@store
+PREHOOK: Output: default@store
+POSTHOOK: query: analyze table store compute statistics
+POSTHOOK: type: QUERY
+POSTHOOK: Input: default@store
+POSTHOOK: Output: default@store
+PREHOOK: query: analyze table store compute statistics for columns s_store_sk, s_floor_space
+PREHOOK: type: QUERY
+PREHOOK: Input: default@store
+#### A masked pattern was here ####
+POSTHOOK: query: analyze table store compute statistics for columns s_store_sk, s_floor_space
+POSTHOOK: type: QUERY
+POSTHOOK: Input: default@store
+#### A masked pattern was here ####
+PREHOOK: query: analyze table store_sales compute statistics
+PREHOOK: type: QUERY
+PREHOOK: Input: default@store_sales
+PREHOOK: Output: default@store_sales
+POSTHOOK: query: analyze table store_sales compute statistics
+POSTHOOK: type: QUERY
+POSTHOOK: Input: default@store_sales
+POSTHOOK: Output: default@store_sales
+PREHOOK: query: analyze table store_sales compute statistics for columns ss_store_sk, ss_addr_sk, ss_quantity
+PREHOOK: type: QUERY
+PREHOOK: Input: default@store_sales
+#### A masked pattern was here ####
+POSTHOOK: query: analyze table store_sales compute statistics for columns ss_store_sk, ss_addr_sk, ss_quantity
+POSTHOOK: type: QUERY
+POSTHOOK: Input: default@store_sales
+#### A masked pattern was here ####
+PREHOOK: query: analyze table customer_address compute statistics
+PREHOOK: type: QUERY
+PREHOOK: Input: default@customer_address
+PREHOOK: Output: default@customer_address
+POSTHOOK: query: analyze table customer_address compute statistics
+POSTHOOK: type: QUERY
+POSTHOOK: Input: default@customer_address
+POSTHOOK: Output: default@customer_address
+PREHOOK: query: analyze table customer_address compute statistics for columns ca_address_sk
+PREHOOK: type: QUERY
+PREHOOK: Input: default@customer_address
+#### A masked pattern was here ####
+POSTHOOK: query: analyze table customer_address compute statistics for columns ca_address_sk
+POSTHOOK: type: QUERY
+POSTHOOK: Input: default@customer_address
+#### A masked pattern was here ####
+PREHOOK: query: explain select s.s_store_sk from store s join store_sales ss on (s.s_store_sk = ss.ss_store_sk)
+PREHOOK: type: QUERY
+POSTHOOK: query: explain select s.s_store_sk from store s join store_sales ss on (s.s_store_sk = ss.ss_store_sk)
+POSTHOOK: type: QUERY
+STAGE DEPENDENCIES:
+  Stage-1 is a root stage
+  Stage-0 depends on stages: Stage-1
+
+STAGE PLANS:
+  Stage: Stage-1
+    Map Reduce
+      Map Operator Tree:
+          TableScan
+            alias: s
+            Statistics: Num rows: 12 Data size: 3143 Basic stats: COMPLETE Column stats: COMPLETE
+            Filter Operator
+              predicate: s_store_sk is not null (type: boolean)
+              Statistics: Num rows: 12 Data size: 48 Basic stats: COMPLETE Column stats: COMPLETE
+              Reduce Output Operator
+                key expressions: s_store_sk (type: int)
+                sort order: +
+                Map-reduce partition columns: s_store_sk (type: int)
+                Statistics: Num rows: 12 Data size: 48 Basic stats: COMPLETE Column stats: COMPLETE
+          TableScan
+            alias: ss
+            Statistics: Num rows: 1000 Data size: 130523 Basic stats: COMPLETE Column stats: COMPLETE
+            Filter Operator
+              predicate: ss_store_sk is not null (type: boolean)
+              Statistics: Num rows: 964 Data size: 3716 Basic stats: COMPLETE Column stats: COMPLETE
+              Reduce Output Operator
+                key expressions: ss_store_sk (type: int)
+                sort order: +
+                Map-reduce partition columns: ss_store_sk (type: int)
+                Statistics: Num rows: 964 Data size: 3716 Basic stats: COMPLETE Column stats: COMPLETE
+      Reduce Operator Tree:
+        Join Operator
+          condition map:
+               Inner Join 0 to 1
+          condition expressions:
+            0 {KEY.reducesinkkey0}
+            1 
+          outputColumnNames: _col0
+          Statistics: Num rows: 964 Data size: 3856 Basic stats: COMPLETE Column stats: COMPLETE
+          Select Operator
+            expressions: _col0 (type: int)
+            outputColumnNames: _col0
+            Statistics: Num rows: 964 Data size: 3856 Basic stats: COMPLETE Column stats: COMPLETE
+            File Output Operator
+              compressed: false
+              Statistics: Num rows: 964 Data size: 3856 Basic stats: COMPLETE Column stats: COMPLETE
+              table:
+                  input format: org.apache.hadoop.mapred.TextInputFormat
+                  output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
+                  serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
+
+  Stage: Stage-0
+    Fetch Operator
+      limit: -1
+      Processor Tree:
+        ListSink
+
+PREHOOK: query: explain select s.s_store_sk from store s join store_sales ss on (s.s_store_sk = ss.ss_store_sk) where s.s_store_sk > 0
+PREHOOK: type: QUERY
+POSTHOOK: query: explain select s.s_store_sk from store s join store_sales ss on (s.s_store_sk = ss.ss_store_sk) where s.s_store_sk > 0
+POSTHOOK: type: QUERY
+STAGE DEPENDENCIES:
+  Stage-1 is a root stage
+  Stage-0 depends on stages: Stage-1
+
+STAGE PLANS:
+  Stage: Stage-1
+    Map Reduce
+      Map Operator Tree:
+          TableScan
+            alias: s
+            Statistics: Num rows: 12 Data size: 3143 Basic stats: COMPLETE Column stats: COMPLETE
+            Filter Operator
+              predicate: (s_store_sk is not null and (s_store_sk > 0)) (type: boolean)
+              Statistics: Num rows: 4 Data size: 16 Basic stats: COMPLETE Column stats: COMPLETE
+              Reduce Output Operator
+                key expressions: s_store_sk (type: int)
+                sort order: +
+                Map-reduce partition columns: s_store_sk (type: int)
+                Statistics: Num rows: 4 Data size: 16 Basic stats: COMPLETE Column stats: COMPLETE
+          TableScan
+            alias: ss
+            Statistics: Num rows: 1000 Data size: 130523 Basic stats: COMPLETE Column stats: COMPLETE
+            Filter Operator
+              predicate: (ss_store_sk is not null and (ss_store_sk > 0)) (type: boolean)
+              Statistics: Num rows: 321 Data size: 1236 Basic stats: COMPLETE Column stats: COMPLETE
+              Reduce Output Operator
+                key expressions: ss_store_sk (type: int)
+                sort order: +
+                Map-reduce partition columns: ss_store_sk (type: int)
+                Statistics: Num rows: 321 Data size: 1236 Basic stats: COMPLETE Column stats: COMPLETE
+      Reduce Operator Tree:
+        Join Operator
+          condition map:
+               Inner Join 0 to 1
+          condition expressions:
+            0 {KEY.reducesinkkey0}
+            1 
+          outputColumnNames: _col0
+          Statistics: Num rows: 107 Data size: 428 Basic stats: COMPLETE Column stats: COMPLETE
+          Select Operator
+            expressions: _col0 (type: int)
+            outputColumnNames: _col0
+            Statistics: Num rows: 107 Data size: 428 Basic stats: COMPLETE Column stats: COMPLETE
+            File Output Operator
+              compressed: false
+              Statistics: Num rows: 107 Data size: 428 Basic stats: COMPLETE Column stats: COMPLETE
+              table:
+                  input format: org.apache.hadoop.mapred.TextInputFormat
+                  output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
+                  serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
+
+  Stage: Stage-0
+    Fetch Operator
+      limit: -1
+      Processor Tree:
+        ListSink
+
+PREHOOK: query: explain select s.s_store_sk from store s join store_sales ss on (s.s_store_sk = ss.ss_store_sk) where s.s_company_id > 0 and ss.ss_quantity > 10
+PREHOOK: type: QUERY
+POSTHOOK: query: explain select s.s_store_sk from store s join store_sales ss on (s.s_store_sk = ss.ss_store_sk) where s.s_company_id > 0 and ss.ss_quantity > 10
+POSTHOOK: type: QUERY
+STAGE DEPENDENCIES:
+  Stage-1 is a root stage
+  Stage-0 depends on stages: Stage-1
+
+STAGE PLANS:
+  Stage: Stage-1
+    Map Reduce
+      Map Operator Tree:
+          TableScan
+            alias: s
+            Statistics: Num rows: 12 Data size: 3143 Basic stats: COMPLETE Column stats: PARTIAL
+            Filter Operator
+              predicate: (s_store_sk is not null and (s_company_id > 0)) (type: boolean)
+              Statistics: Num rows: 4 Data size: 16 Basic stats: COMPLETE Column stats: PARTIAL
+              Reduce Output Operator
+                key expressions: s_store_sk (type: int)
+                sort order: +
+                Map-reduce partition columns: s_store_sk (type: int)
+                Statistics: Num rows: 4 Data size: 16 Basic stats: COMPLETE Column stats: PARTIAL
+          TableScan
+            alias: ss
+            Statistics: Num rows: 1000 Data size: 130523 Basic stats: COMPLETE Column stats: COMPLETE
+            Filter Operator
+              predicate: (ss_store_sk is not null and (ss_quantity > 10)) (type: boolean)
+              Statistics: Num rows: 321 Data size: 2460 Basic stats: COMPLETE Column stats: COMPLETE
+              Reduce Output Operator
+                key expressions: ss_store_sk (type: int)
+                sort order: +
+                Map-reduce partition columns: ss_store_sk (type: int)
+                Statistics: Num rows: 321 Data size: 2460 Basic stats: COMPLETE Column stats: COMPLETE
+      Reduce Operator Tree:
+        Join Operator
+          condition map:
+               Inner Join 0 to 1
+          condition expressions:
+            0 {KEY.reducesinkkey0}
+            1 
+          outputColumnNames: _col0
+          Statistics: Num rows: 107 Data size: 428 Basic stats: COMPLETE Column stats: PARTIAL
+          Select Operator
+            expressions: _col0 (type: int)
+            outputColumnNames: _col0
+            Statistics: Num rows: 107 Data size: 428 Basic stats: COMPLETE Column stats: PARTIAL
+            File Output Operator
+              compressed: false
+              Statistics: Num rows: 107 Data size: 428 Basic stats: COMPLETE Column stats: PARTIAL
+              table:
+                  input format: org.apache.hadoop.mapred.TextInputFormat
+                  output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
+                  serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
+
+  Stage: Stage-0
+    Fetch Operator
+      limit: -1
+      Processor Tree:
+        ListSink
+
+PREHOOK: query: explain select s.s_store_sk from store s join store_sales ss on (s.s_store_sk = ss.ss_store_sk) where s.s_floor_space > 0
+PREHOOK: type: QUERY
+POSTHOOK: query: explain select s.s_store_sk from store s join store_sales ss on (s.s_store_sk = ss.ss_store_sk) where s.s_floor_space > 0
+POSTHOOK: type: QUERY
+STAGE DEPENDENCIES:
+  Stage-1 is a root stage
+  Stage-0 depends on stages: Stage-1
+
+STAGE PLANS:
+  Stage: Stage-1
+    Map Reduce
+      Map Operator Tree:
+          TableScan
+            alias: s
+            Statistics: Num rows: 12 Data size: 3143 Basic stats: COMPLETE Column stats: COMPLETE
+            Filter Operator
+              predicate: (s_store_sk is not null and (s_floor_space > 0)) (type: boolean)
+              Statistics: Num rows: 4 Data size: 32 Basic stats: COMPLETE Column stats: COMPLETE
+              Reduce Output Operator
+                key expressions: s_store_sk (type: int)
+                sort order: +
+                Map-reduce partition columns: s_store_sk (type: int)
+                Statistics: Num rows: 4 Data size: 32 Basic stats: COMPLETE Column stats: COMPLETE
+          TableScan
+            alias: ss
+            Statistics: Num rows: 1000 Data size: 130523 Basic stats: COMPLETE Column stats: COMPLETE
+            Filter Operator
+              predicate: ss_store_sk is not null (type: boolean)
+              Statistics: Num rows: 964 Data size: 3716 Basic stats: COMPLETE Column stats: COMPLETE
+              Reduce Output Operator
+                key expressions: ss_store_sk (type: int)
+                sort order: +
+                Map-reduce partition columns: ss_store_sk (type: int)
+                Statistics: Num rows: 964 Data size: 3716 Basic stats: COMPLETE Column stats: COMPLETE
+      Reduce Operator Tree:
+        Join Operator
+          condition map:
+               Inner Join 0 to 1
+          condition expressions:
+            0 {KEY.reducesinkkey0}
+            1 
+          outputColumnNames: _col0
+          Statistics: Num rows: 321 Data size: 1284 Basic stats: COMPLETE Column stats: COMPLETE
+          Select Operator
+            expressions: _col0 (type: int)
+            outputColumnNames: _col0
+            Statistics: Num rows: 321 Data size: 1284 Basic stats: COMPLETE Column stats: COMPLETE
+            File Output Operator
+              compressed: false
+              Statistics: Num rows: 321 Data size: 1284 Basic stats: COMPLETE Column stats: COMPLETE
+              table:
+                  input format: org.apache.hadoop.mapred.TextInputFormat
+                  output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
+                  serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
+
+  Stage: Stage-0
+    Fetch Operator
+      limit: -1
+      Processor Tree:
+        ListSink
+
+PREHOOK: query: explain select s.s_store_sk from store s join store_sales ss on (s.s_store_sk = ss.ss_store_sk) where ss.ss_quantity > 10
+PREHOOK: type: QUERY
+POSTHOOK: query: explain select s.s_store_sk from store s join store_sales ss on (s.s_store_sk = ss.ss_store_sk) where ss.ss_quantity > 10
+POSTHOOK: type: QUERY
+STAGE DEPENDENCIES:
+  Stage-1 is a root stage
+  Stage-0 depends on stages: Stage-1
+
+STAGE PLANS:
+  Stage: Stage-1
+    Map Reduce
+      Map Operator Tree:
+          TableScan
+            alias: s
+            Statistics: Num rows: 12 Data size: 3143 Basic stats: COMPLETE Column stats: COMPLETE
+            Filter Operator
+              predicate: s_store_sk is not null (type: boolean)
+              Statistics: Num rows: 12 Data size: 48 Basic stats: COMPLETE Column stats: COMPLETE
+              Reduce Output Operator
+                key expressions: s_store_sk (type: int)
+                sort order: +
+                Map-reduce partition columns: s_store_sk (type: int)
+                Statistics: Num rows: 12 Data size: 48 Basic stats: COMPLETE Column stats: COMPLETE
+          TableScan
+            alias: ss
+            Statistics: Num rows: 1000 Data size: 130523 Basic stats: COMPLETE Column stats: COMPLETE
+            Filter Operator
+              predicate: (ss_store_sk is not null and (ss_quantity > 10)) (type: boolean)
+              Statistics: Num rows: 321 Data size: 2460 Basic stats: COMPLETE Column stats: COMPLETE
+              Reduce Output Operator
+                key expressions: ss_store_sk (type: int)
+                sort order: +
+                Map-reduce partition columns: ss_store_sk (type: int)
+                Statistics: Num rows: 321 Data size: 2460 Basic stats: COMPLETE Column stats: COMPLETE
+      Reduce Operator Tree:
+        Join Operator
+          condition map:
+               Inner Join 0 to 1
+          condition expressions:
+            0 {KEY.reducesinkkey0}
+            1 
+          outputColumnNames: _col0
+          Statistics: Num rows: 321 Data size: 1284 Basic stats: COMPLETE Column stats: COMPLETE
+          Select Operator
+            expressions: _col0 (type: int)
+            outputColumnNames: _col0
+            Statistics: Num rows: 321 Data size: 1284 Basic stats: COMPLETE Column stats: COMPLETE
+            File Output Operator
+              compressed: false
+              Statistics: Num rows: 321 Data size: 1284 Basic stats: COMPLETE Column stats: COMPLETE
+              table:
+                  input format: org.apache.hadoop.mapred.TextInputFormat
+                  output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
+                  serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
+
+  Stage: Stage-0
+    Fetch Operator
+      limit: -1
+      Processor Tree:
+        ListSink
+
+PREHOOK: query: explain select s.s_store_sk from store s join store_sales ss on (s.s_store_sk = ss.ss_store_sk) join store s1 on (s1.s_store_sk = ss.ss_store_sk)
+PREHOOK: type: QUERY
+POSTHOOK: query: explain select s.s_store_sk from store s join store_sales ss on (s.s_store_sk = ss.ss_store_sk) join store s1 on (s1.s_store_sk = ss.ss_store_sk)
+POSTHOOK: type: QUERY
+STAGE DEPENDENCIES:
+  Stage-1 is a root stage
+  Stage-0 depends on stages: Stage-1
+
+STAGE PLANS:
+  Stage: Stage-1
+    Map Reduce
+      Map Operator Tree:
+          TableScan
+            alias: s1
+            Statistics: Num rows: 12 Data size: 3143 Basic stats: COMPLETE Column stats: COMPLETE
+            Filter Operator
+              predicate: s_store_sk is not null (type: boolean)
+              Statistics: Num rows: 12 Data size: 48 Basic stats: COMPLETE Column stats: COMPLETE
+              Reduce Output Operator
+                key expressions: s_store_sk (type: int)
+                sort order: +
+                Map-reduce partition columns: s_store_sk (type: int)
+                Statistics: Num rows: 12 Data size: 48 Basic stats: COMPLETE Column stats: COMPLETE
+          TableScan
+            alias: s
+            Statistics: Num rows: 12 Data size: 3143 Basic stats: COMPLETE Column stats: COMPLETE
+            Filter Operator
+              predicate: s_store_sk is not null (type: boolean)
+              Statistics: Num rows: 12 Data size: 48 Basic stats: COMPLETE Column stats: COMPLETE
+              Reduce Output Operator
+                key expressions: s_store_sk (type: int)
+                sort order: +
+                Map-reduce partition columns: s_store_sk (type: int)
+                Statistics: Num rows: 12 Data size: 48 Basic stats: COMPLETE Column stats: COMPLETE
+          TableScan
+            alias: ss
+            Statistics: Num rows: 1000 Data size: 130523 Basic stats: COMPLETE Column stats: COMPLETE
+            Filter Operator
+              predicate: ss_store_sk is not null (type: boolean)
+              Statistics: Num rows: 964 Data size: 3716 Basic stats: COMPLETE Column stats: COMPLETE
+              Reduce Output Operator
+                key expressions: ss_store_sk (type: int)
+                sort order: +
+                Map-reduce partition columns: ss_store_sk (type: int)
+                Statistics: Num rows: 964 Data size: 3716 Basic stats: COMPLETE Column stats: COMPLETE
+      Reduce Operator Tree:
+        Join Operator
+          condition map:
+               Inner Join 0 to 1
+               Inner Join 1 to 2
+          condition expressions:
+            0 {KEY.reducesinkkey0}
+            1 
+            2 
+          outputColumnNames: _col0
+          Statistics: Num rows: 964 Data size: 3856 Basic stats: COMPLETE Column stats: COMPLETE
+          Select Operator
+            expressions: _col0 (type: int)
+            outputColumnNames: _col0
+            Statistics: Num rows: 964 Data size: 3856 Basic stats: COMPLETE Column stats: COMPLETE
+            File Output Operator
+              compressed: false
+              Statistics: Num rows: 964 Data size: 3856 Basic stats: COMPLETE Column stats: COMPLETE
+              table:
+                  input format: org.apache.hadoop.mapred.TextInputFormat
+                  output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
+                  serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
+
+  Stage: Stage-0
+    Fetch Operator
+      limit: -1
+      Processor Tree:
+        ListSink
+
+PREHOOK: query: explain select s.s_store_sk from store s join store_sales ss on (s.s_store_sk = ss.ss_store_sk) join store s1 on (s1.s_store_sk = ss.ss_store_sk) where s.s_store_sk > 1000
+PREHOOK: type: QUERY
+POSTHOOK: query: explain select s.s_store_sk from store s join store_sales ss on (s.s_store_sk = ss.ss_store_sk) join store s1 on (s1.s_store_sk = ss.ss_store_sk) where s.s_store_sk > 1000
+POSTHOOK: type: QUERY
+STAGE DEPENDENCIES:
+  Stage-1 is a root stage
+  Stage-0 depends on stages: Stage-1
+
+STAGE PLANS:
+  Stage: Stage-1
+    Map Reduce
+      Map Operator Tree:
+          TableScan
+            alias: s1
+            Statistics: Num rows: 12 Data size: 3143 Basic stats: COMPLETE Column stats: COMPLETE
+            Filter Operator
+              predicate: (s_store_sk is not null and (s_store_sk > 1000)) (type: boolean)
+              Statistics: Num rows: 4 Data size: 16 Basic stats: COMPLETE Column stats: COMPLETE
+              Reduce Output Operator
+                key expressions: s_store_sk (type: int)
+                sort order: +
+                Map-reduce partition columns: s_store_sk (type: int)
+                Statistics: Num rows: 4 Data size: 16 Basic stats: COMPLETE Column stats: COMPLETE
+          TableScan
+            alias: s
+            Statistics: Num rows: 12 Data size: 3143 Basic stats: COMPLETE Column stats: COMPLETE
+            Filter Operator
+              predicate: (s_store_sk is not null and (s_store_sk > 1000)) (type: boolean)
+              Statistics: Num rows: 4 Data size: 16 Basic stats: COMPLETE Column stats: COMPLETE
+              Reduce Output Operator
+                key expressions: s_store_sk (type: int)
+                sort order: +
+                Map-reduce partition columns: s_store_sk (type: int)
+                Statistics: Num rows: 4 Data size: 16 Basic stats: COMPLETE Column stats: COMPLETE
+          TableScan
+            alias: ss
+            Statistics: Num rows: 1000 Data size: 130523 Basic stats: COMPLETE Column stats: COMPLETE
+            Filter Operator
+              predicate: (ss_store_sk is not null and (ss_store_sk > 1000)) (type: boolean)
+              Statistics: Num rows: 321 Data size: 1236 Basic stats: COMPLETE Column stats: COMPLETE
+              Reduce Output Operator
+                key expressions: ss_store_sk (type: int)
+                sort order: +
+                Map-reduce partition columns: ss_store_sk (type: int)
+                Statistics: Num rows: 321 Data size: 1236 Basic stats: COMPLETE Column stats: COMPLETE
+      Reduce Operator Tree:
+        Join Operator
+          condition map:
+               Inner Join 0 to 1
+               Inner Join 1 to 2
+          condition expressions:
+            0 {KEY.reducesinkkey0}
+            1 
+            2 
+          outputColumnNames: _col0
+          Statistics: Num rows: 35 Data size: 140 Basic stats: COMPLETE Column stats: COMPLETE
+          Select Operator
+            expressions: _col0 (type: int)
+            outputColumnNames: _col0
+            Statistics: Num rows: 35 Data size: 140 Basic stats: COMPLETE Column stats: COMPLETE
+            File Output Operator
+              compressed: false
+              Statistics: Num rows: 35 Data size: 140 Basic stats: COMPLETE Column stats: COMPLETE
+              table:
+                  input format: org.apache.hadoop.mapred.TextInputFormat
+                  output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
+                  serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
+
+  Stage: Stage-0
+    Fetch Operator
+      limit: -1
+      Processor Tree:
+        ListSink
+
+PREHOOK: query: explain select s.s_store_sk from store s join store_sales ss on (s.s_store_sk = ss.ss_store_sk) join store s1 on (s1.s_store_sk = ss.ss_store_sk) where s.s_floor_space > 1000
+PREHOOK: type: QUERY
+POSTHOOK: query: explain select s.s_store_sk from store s join store_sales ss on (s.s_store_sk = ss.ss_store_sk) join store s1 on (s1.s_store_sk = ss.ss_store_sk) where s.s_floor_space > 1000
+POSTHOOK: type: QUERY
+STAGE DEPENDENCIES:
+  Stage-1 is a root stage
+  Stage-0 depends on stages: Stage-1
+
+STAGE PLANS:
+  Stage: Stage-1
+    Map Reduce
+      Map Operator Tree:
+          TableScan
+            alias: s1
+            Statistics: Num rows: 12 Data size: 3143 Basic stats: COMPLETE Column stats: COMPLETE
+            Filter Operator
+              predicate: s_store_sk is not null (type: boolean)
+              Statistics: Num rows: 12 Data size: 48 Basic stats: COMPLETE Column stats: COMPLETE
+              Reduce Output Operator
+                key expressions: s_store_sk (type: int)
+                sort order: +
+                Map-reduce partition columns: s_store_sk (type: int)
+                Statistics: Num rows: 12 Data size: 48 Basic stats: COMPLETE Column stats: COMPLETE
+          TableScan
+            alias: s
+            Statistics: Num rows: 12 Data size: 3143 Basic stats: COMPLETE Column stats: COMPLETE
+            Filter Operator
+              predicate: (s_store_sk is not null and (s_floor_space > 1000)) (type: boolean)
+              Statistics: Num rows: 4 Data size: 32 Basic stats: COMPLETE Column stats: COMPLETE
+              Reduce Output Operator
+                key expressions: s_store_sk (type: int)
+                sort order: +
+                Map-reduce partition columns: s_store_sk (type: int)
+                Statistics: Num rows: 4 Data size: 32 Basic stats: COMPLETE Column stats: COMPLETE
+          TableScan
+            alias: ss
+            Statistics: Num rows: 1000 Data size: 130523 Basic stats: COMPLETE Column stats: COMPLETE
+            Filter Operator
+              predicate: ss_store_sk is not null (type: boolean)
+              Statistics: Num rows: 964 Data size: 3716 Basic stats: COMPLETE Column stats: COMPLETE
+              Reduce Output Operator
+                key expressions: ss_store_sk (type: int)
+                sort order: +
+                Map-reduce partition columns: ss_store_sk (type: int)
+                Statistics: Num rows: 964 Data size: 3716 Basic stats: COMPLETE Column stats: COMPLETE
+      Reduce Operator Tree:
+        Join Operator
+          condition map:
+               Inner Join 0 to 1
+               Inner Join 1 to 2
+          condition expressions:
+            0 {KEY.reducesinkkey0}
+            1 
+            2 
+          outputColumnNames: _col0
+          Statistics: Num rows: 321 Data size: 1284 Basic stats: COMPLETE Column stats: COMPLETE
+          Select Operator
+            expressions: _col0 (type: int)
+            outputColumnNames: _col0
+            Statistics: Num rows: 321 Data size: 1284 Basic stats: COMPLETE Column stats: COMPLETE
+            File Output Operator
+              compressed: false
+              Statistics: Num rows: 321 Data size: 1284 Basic stats: COMPLETE Column stats: COMPLETE
+              table:
+                  input format: org.apache.hadoop.mapred.TextInputFormat
+                  output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
+                  serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
+
+  Stage: Stage-0
+    Fetch Operator
+      limit: -1
+      Processor Tree:
+        ListSink
+
+PREHOOK: query: explain select s.s_store_sk from store s join store_sales ss on (s.s_store_sk = ss.ss_store_sk) join store s1 on (s1.s_store_sk = ss.ss_store_sk) where ss.ss_quantity > 10
+PREHOOK: type: QUERY
+POSTHOOK: query: explain select s.s_store_sk from store s join store_sales ss on (s.s_store_sk = ss.ss_store_sk) join store s1 on (s1.s_store_sk = ss.ss_store_sk) where ss.ss_quantity > 10
+POSTHOOK: type: QUERY
+STAGE DEPENDENCIES:
+  Stage-1 is a root stage
+  Stage-0 depends on stages: Stage-1
+
+STAGE PLANS:
+  Stage: Stage-1
+    Map Reduce
+      Map Operator Tree:
+          TableScan
+            alias: s1
+            Statistics: Num rows: 12 Data size: 3143 Basic stats: COMPLETE Column stats: COMPLETE
+            Filter Operator
+              predicate: s_store_sk is not null (type: boolean)
+              Statistics: Num rows: 12 Data size: 48 Basic stats: COMPLETE Column stats: COMPLETE
+              Reduce Output Operator
+                key expressions: s_store_sk (type: int)
+                sort order: +
+                Map-reduce partition columns: s_store_sk (type: int)
+                Statistics: Num rows: 12 Data size: 48 Basic stats: COMPLETE Column stats: COMPLETE
+          TableScan
+            alias: s
+            Statistics: Num rows: 12 Data size: 3143 Basic stats: COMPLETE Column stats: COMPLETE
+            Filter Operator
+              predicate: s_store_sk is not null (type: boolean)
+              Statistics: Num rows: 12 Data size: 48 Basic stats: COMPLETE Column stats: COMPLETE
+              Reduce Output Operator
+                key expressions: s_store_sk (type: int)
+                sort order: +
+                Map-reduce partition columns: s_store_sk (type: int)
+                Statistics: Num rows: 12 Data size: 48 Basic stats: COMPLETE Column stats: COMPLETE
+          TableScan
+            alias: ss
+            Statistics: Num rows: 1000 Data size: 130523 Basic stats: COMPLETE Column stats: COMPLETE
+            Filter Operator
+              predicate: (ss_store_sk is not null and (ss_quantity > 10)) (type: boolean)
+              Statistics: Num rows: 321 Data size: 2460 Basic stats: COMPLETE Column stats: COMPLETE
+              Reduce Output Operator
+                key expressions: ss_store_sk (type: int)
+                sort order: +
+                Map-reduce partition columns: ss_store_sk (type: int)
+                Statistics: Num rows: 321 Data size: 2460 Basic stats: COMPLETE Column stats: COMPLETE
+      Reduce Operator Tree:
+        Join Operator
+          condition map:
+               Inner Join 0 to 1
+               Inner Join 1 to 2
+          condition expressions:
+            0 {KEY.reducesinkkey0}
+            1 
+            2 
+          outputColumnNames: _col0
+          Statistics: Num rows: 321 Data size: 1284 Basic stats: COMPLETE Column stats: COMPLETE
+          Select Operator
+            expressions: _col0 (type: int)
+            outputColumnNames: _col0
+            Statistics: Num rows: 321 Data size: 1284 Basic stats: COMPLETE Column stats: COMPLETE
+            File Output Operator
+              compressed: false
+              Statistics: Num rows: 321 Data size: 1284 Basic stats: COMPLETE Column stats: COMPLETE
+              table:
+                  input format: org.apache.hadoop.mapred.TextInputFormat
+                  output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
+                  serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
+
+  Stage: Stage-0
+    Fetch Operator
+      limit: -1
+      Processor Tree:
+        ListSink
+
+PREHOOK: query: explain select s.s_store_sk from store s join store_sales ss on (s.s_store_sk = ss.ss_store_sk) join customer_address ca on (ca.ca_address_sk = ss.ss_addr_sk)
+PREHOOK: type: QUERY
+POSTHOOK: query: explain select s.s_store_sk from store s join store_sales ss on (s.s_store_sk = ss.ss_store_sk) join customer_address ca on (ca.ca_address_sk = ss.ss_addr_sk)
+POSTHOOK: type: QUERY
+STAGE DEPENDENCIES:
+  Stage-2 is a root stage
+  Stage-1 depends on stages: Stage-2
+  Stage-0 depends on stages: Stage-1
+
+STAGE PLANS:
+  Stage: Stage-2
+    Map Reduce
+      Map Operator Tree:
+          TableScan
+            alias: s
+            Statistics: Num rows: 12 Data size: 3143 Basic stats: COMPLETE Column stats: COMPLETE
+            Filter Operator
+              predicate: s_store_sk is not null (type: boolean)
+              Statistics: Num rows: 12 Data size: 48 Basic stats: COMPLETE Column stats: COMPLETE
+              Reduce Output Operator
+                key expressions: s_store_sk (type: int)
+                sort order: +
+                Map-reduce partition columns: s_store_sk (type: int)
+                Statistics: Num rows: 12 Data size: 48 Basic stats: COMPLETE Column stats: COMPLETE
+          TableScan
+            alias: ss
+            Statistics: Num rows: 1000 Data size: 130523 Basic stats: COMPLETE Column stats: COMPLETE
+            Filter Operator
+              predicate: (ss_store_sk is not null and ss_addr_sk is not null) (type: boolean)
+              Statistics: Num rows: 916 Data size: 7012 Basic stats: COMPLETE Column stats: COMPLETE
+              Reduce Output Operator
+                key expressions: ss_store_sk (type: int)
+                sort order: +
+                Map-reduce partition columns: ss_store_sk (type: int)
+                Statistics: Num rows: 916 Data size: 7012 Basic stats: COMPLETE Column stats: COMPLETE
+                value expressions: ss_addr_sk (type: int)
+      Reduce Operator Tree:
+        Join Operator
+          condition map:
+               Inner Join 0 to 1
+          condition expressions:
+            0 {KEY.reducesinkkey0}
+            1 {VALUE._col6}
+          outputColumnNames: _col0, _col38
+          Statistics: Num rows: 916 Data size: 7328 Basic stats: COMPLETE Column stats: COMPLETE
+          File Output Operator
+            compressed: false
+            table:
+                input format: org.apache.hadoop.mapred.SequenceFileInputFormat
+                output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
+                serde: org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe
+
+  Stage: Stage-1
+    Map Reduce
+      Map Operator Tree:
+          TableScan
+            alias: ca
+            Statistics: Num rows: 20 Data size: 2114 Basic stats: COMPLETE Column stats: COMPLETE
+            Filter Operator
+              predicate: ca_address_sk is not null (type: boolean)
+              Statistics: Num rows: 20 Data size: 80 Basic stats: COMPLETE Column stats: COMPLETE
+              Reduce Output Operator
+                key expressions: ca_address_sk (type: int)
+                sort order: +
+                Map-reduce partition columns: ca_address_sk (type: int)
+                Statistics: Num rows: 20 Data size: 80 Basic stats: COMPLETE Column stats: COMPLETE
+          TableScan
+            Reduce Output Operator
+              key expressions: _col38 (type: int)
+              sort order: +
+              Map-reduce partition columns: _col38 (type: int)
+              Statistics: Num rows: 916 Data size: 7328 Basic stats: COMPLETE Column stats: COMPLETE
+              value expressions: _col0 (type: int)
+      Reduce Operator Tree:
+        Join Operator
+          condition map:
+               Inner Join 0 to 1
+          condition expressions:
+            0 {VALUE._col0}
+            1 
+          outputColumnNames: _col0
+          Statistics: Num rows: 916 Data size: 3664 Basic stats: COMPLETE Column stats: COMPLETE
+          Select Operator
+            expressions: _col0 (type: int)
+            outputColumnNames: _col0
+            Statistics: Num rows: 916 Data size: 3664 Basic stats: COMPLETE Column stats: COMPLETE
+            File Output Operator
+              compressed: false
+              Statistics: Num rows: 916 Data size: 3664 Basic stats: COMPLETE Column stats: COMPLETE
+              table:
+                  input format: org.apache.hadoop.mapred.TextInputFormat
+                  output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
+                  serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
+
+  Stage: Stage-0
+    Fetch Operator
+      limit: -1
+      Processor Tree:
+        ListSink
+
+PREHOOK: query: drop table store_sales
+PREHOOK: type: DROPTABLE
+PREHOOK: Input: default@store_sales
+PREHOOK: Output: default@store_sales
+POSTHOOK: query: drop table store_sales
+POSTHOOK: type: DROPTABLE
+POSTHOOK: Input: default@store_sales
+POSTHOOK: Output: default@store_sales
+PREHOOK: query: drop table store
+PREHOOK: type: DROPTABLE
+PREHOOK: Input: default@store
+PREHOOK: Output: default@store
+POSTHOOK: query: drop table store
+POSTHOOK: type: DROPTABLE
+POSTHOOK: Input: default@store
+POSTHOOK: Output: default@store
+PREHOOK: query: drop table customer_address
+PREHOOK: type: DROPTABLE
+PREHOOK: Input: default@customer_address
+PREHOOK: Output: default@customer_address
+POSTHOOK: query: drop table customer_address
+POSTHOOK: type: DROPTABLE
+POSTHOOK: Input: default@customer_address
+POSTHOOK: Output: default@customer_address