Posted to commits@impala.apache.org by ta...@apache.org on 2017/08/24 02:48:06 UTC

[1/6] incubator-impala git commit: IMPALA-5512: [DOCS] Distinguish GROUP BY and ORDER BY in SELECT syntax

Repository: incubator-impala
Updated Branches:
  refs/heads/master cb645b1bc -> c67b198a1


IMPALA-5512: [DOCS] Distinguish GROUP BY and ORDER BY in SELECT syntax

Turned one instance of GROUP BY in the syntax declaration
into ORDER BY. Moved NULLS FIRST and NULLS LAST from the
first instance of GROUP BY to the new ORDER BY line.

Change-Id: I9df24cf5af4f2b0aabdb388b4e46a75e0e5275a7
Reviewed-on: http://gerrit.cloudera.org:8080/7789
Reviewed-by: Alex Behm <al...@cloudera.com>
Tested-by: Impala Public Jenkins


Project: http://git-wip-us.apache.org/repos/asf/incubator-impala/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-impala/commit/91fc5b58
Tree: http://git-wip-us.apache.org/repos/asf/incubator-impala/tree/91fc5b58
Diff: http://git-wip-us.apache.org/repos/asf/incubator-impala/diff/91fc5b58

Branch: refs/heads/master
Commit: 91fc5b587926bca079bfbe52d7471d5460788418
Parents: cb645b1
Author: John Russell <jr...@cloudera.com>
Authored: Wed Aug 23 15:10:18 2017 -0700
Committer: Impala Public Jenkins <im...@gerrit.cloudera.org>
Committed: Wed Aug 23 23:55:04 2017 +0000

----------------------------------------------------------------------
 docs/topics/impala_select.xml | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/91fc5b58/docs/topics/impala_select.xml
----------------------------------------------------------------------
diff --git a/docs/topics/impala_select.xml b/docs/topics/impala_select.xml
index c28890a..52954e7 100644
--- a/docs/topics/impala_select.xml
+++ b/docs/topics/impala_select.xml
@@ -60,9 +60,9 @@ FROM <i>table_reference</i> [, <i>table_reference</i> ...]
   JOIN <i>table_reference</i>
   [ON <i>join_equality_clauses</i> | USING (<varname>col1</varname>[, <varname>col2</varname> ...]] ...
 WHERE <i>conditions</i>
-GROUP BY { <i>column</i> | <i>expression</i> [ASC | DESC] [NULLS FIRST | NULLS LAST] [, ...] }
+GROUP BY { <i>column</i> | <i>expression</i> [, ...] }
 HAVING <codeph>conditions</codeph>
-GROUP BY { <i>column</i> | <i>expression</i> [ASC | DESC] [, ...] }
+ORDER BY { <i>column</i> | <i>expression</i> [ASC | DESC] [NULLS FIRST | NULLS LAST] [, ...] }
 LIMIT <i>expression</i> [OFFSET <i>expression</i>]
 [UNION [ALL] <i>select_statement</i>] ...]
 </codeblock>


[3/6] incubator-impala git commit: IMPALA-5775: (Addendum) Make SSL cluster actually come up in test_client_ssl.py

Posted by ta...@apache.org.
IMPALA-5775: (Addendum) Make SSL cluster actually come up in test_client_ssl.py

The non-wildcard certs in test_client_ssl.py require that the hostname
of the process is 'localhost' for clients to validate them. This wasn't
the case for one test, and so the cluster wouldn't actually
start. Although the test would still pass (because the shell wasn't
actually checking the certificate), it's better hygiene to have the
cluster correctly configured to make sure we're testing what we think we
are.
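
As an illustration only (not part of the change): a minimal Python sketch of the
hostname check a client performs when it does verify the certificate. The
certificate path and port 21000 are assumptions; the point is that a
non-wildcard cert issued for 'localhost' only verifies when the server is
addressed as 'localhost'.

    import socket
    import ssl

    # Trust the self-signed test certificate (path is an assumption) and
    # require that the server's certificate match the hostname we dialed.
    ctx = ssl.create_default_context(cafile="server-cert.pem")
    ctx.check_hostname = True

    # Succeeds only if the server presents a cert valid for 'localhost';
    # dialing the same daemon by any other name fails verification.
    with socket.create_connection(("localhost", 21000)) as sock:
        with ctx.wrap_socket(sock, server_hostname="localhost") as tls:
            print(tls.getpeercert()["subject"])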

Testing: test continues to pass

Change-Id: Idad8bbf3b8be853d3406bcbaed24909501500ea9
Reviewed-on: http://gerrit.cloudera.org:8080/7732
Reviewed-by: Henry Robinson <he...@cloudera.com>
Tested-by: Impala Public Jenkins


Project: http://git-wip-us.apache.org/repos/asf/incubator-impala/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-impala/commit/81c3d883
Tree: http://git-wip-us.apache.org/repos/asf/incubator-impala/tree/81c3d883
Diff: http://git-wip-us.apache.org/repos/asf/incubator-impala/diff/81c3d883

Branch: refs/heads/master
Commit: 81c3d883b9be13f1afe766477f2d056afd9a3a8a
Parents: 74dad17
Author: Henry Robinson <he...@cloudera.com>
Authored: Fri Aug 18 16:22:29 2017 -0700
Committer: Impala Public Jenkins <im...@gerrit.cloudera.org>
Committed: Thu Aug 24 02:23:21 2017 +0000

----------------------------------------------------------------------
 tests/custom_cluster/test_client_ssl.py | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/81c3d883/tests/custom_cluster/test_client_ssl.py
----------------------------------------------------------------------
diff --git a/tests/custom_cluster/test_client_ssl.py b/tests/custom_cluster/test_client_ssl.py
index 4858ce0..487b802 100644
--- a/tests/custom_cluster/test_client_ssl.py
+++ b/tests/custom_cluster/test_client_ssl.py
@@ -61,10 +61,15 @@ class TestClientSsl(CustomClusterTestSuite):
                           "--ssl_private_key=%s/wildcard-san-cert.key"
                           % (CERT_DIR, CERT_DIR, CERT_DIR))
 
+  SSL_ARGS = ("--ssl_client_ca_certificate=%s/server-cert.pem "
+              "--ssl_server_certificate=%s/server-cert.pem "
+              "--ssl_private_key=%s/server-key.pem "
+              "--hostname=localhost " # Required to match hostname in certificate
+              % (CERT_DIR, CERT_DIR, CERT_DIR))
+
   @pytest.mark.execute_serially
-  @CustomClusterTestSuite.with_args("--ssl_server_certificate=%s/server-cert.pem "
-                                    "--ssl_private_key=%s/server-key.pem"
-                                    % (CERT_DIR, CERT_DIR))
+  @CustomClusterTestSuite.with_args(impalad_args=SSL_ARGS, statestored_args=SSL_ARGS,
+                                    catalogd_args=SSL_ARGS)
   def test_ssl(self, vector):
 
     self._verify_negative_cases()


[4/6] incubator-impala git commit: IMPALA-5602: Fix query optimization for kudu and datasource tables

Posted by ta...@apache.org.
IMPALA-5602: Fix query optimization for kudu and datasource tables

Fix a bug where the following queries on Kudu and data source tables
were incorrectly being optimized as a 'small query' and therefore
running on a single node with a single scanner thread:

(A) queries that have all their predicates pushed to the underlying
storage layer and also have a limit
(B) queries as in (A) whose tables are additionally missing table stats
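
For illustration, a rough Python paraphrase of the corrected condition (the
actual planner code is Java, in ScanNode.getInputCardinality() below): a
scan's limit may only stand in for its input cardinality when there are
neither Impala-evaluated conjuncts nor conjuncts pushed to the storage layer.

    # Sketch only: mirrors the intent of ScanNode.getInputCardinality() after
    # this fix. 'default_cardinality' stands in for inputCardinality_.
    def input_cardinality(has_limit, limit, scan_conjuncts, storage_conjuncts,
                          default_cardinality):
        # The limit bounds the rows processed only when no predicate is
        # evaluated anywhere, neither by Impala nor by Kudu/HBase/an
        # external data source.
        if not scan_conjuncts and not storage_conjuncts and has_limit:
            return limit
        return default_cardinality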

Testing:
Added frontend planner tests.

Change-Id: I93822d67ebda41d5d0456095c429e3915a3f40c4
Reviewed-on: http://gerrit.cloudera.org:8080/7560
Reviewed-by: Matthew Jacobs <mj...@cloudera.com>
Tested-by: Impala Public Jenkins


Project: http://git-wip-us.apache.org/repos/asf/incubator-impala/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-impala/commit/6f20df81
Tree: http://git-wip-us.apache.org/repos/asf/incubator-impala/tree/6f20df81
Diff: http://git-wip-us.apache.org/repos/asf/incubator-impala/diff/6f20df81

Branch: refs/heads/master
Commit: 6f20df81f727f89dfb1efef9a86f39cf5a4ef88a
Parents: 81c3d88
Author: Bikramjeet Vig <bi...@cloudera.com>
Authored: Tue Aug 1 16:34:15 2017 -0700
Committer: Impala Public Jenkins <im...@gerrit.cloudera.org>
Committed: Thu Aug 24 02:32:13 2017 +0000

----------------------------------------------------------------------
 .../org/apache/impala/catalog/KuduTable.java    | 44 ++++++++++-------
 .../impala/planner/DataSourceScanNode.java      |  3 ++
 .../apache/impala/planner/HBaseScanNode.java    |  3 ++
 .../org/apache/impala/planner/KuduScanNode.java |  3 ++
 .../org/apache/impala/planner/ScanNode.java     | 16 +++++-
 .../impala/util/MaxRowsProcessedVisitor.java    |  5 +-
 .../apache/impala/common/FrontendTestBase.java  | 52 +++++++++++++-------
 .../org/apache/impala/planner/PlannerTest.java  |  3 ++
 .../queries/PlannerTest/data-source-tables.test | 15 ++++++
 .../queries/PlannerTest/kudu.test               | 27 ++++++++++
 10 files changed, 131 insertions(+), 40 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/6f20df81/fe/src/main/java/org/apache/impala/catalog/KuduTable.java
----------------------------------------------------------------------
diff --git a/fe/src/main/java/org/apache/impala/catalog/KuduTable.java b/fe/src/main/java/org/apache/impala/catalog/KuduTable.java
index cb94503..7e13ac5 100644
--- a/fe/src/main/java/org/apache/impala/catalog/KuduTable.java
+++ b/fe/src/main/java/org/apache/impala/catalog/KuduTable.java
@@ -182,6 +182,30 @@ public class KuduTable extends Table {
   }
 
   /**
+   * Load schema and partitioning schemes directly from Kudu.
+   */
+  public void loadSchemaFromKudu() throws ImpalaRuntimeException {
+    // This is set to 0 for Kudu tables.
+    // TODO: Change this to reflect the number of pk columns and modify all the
+    // places (e.g. insert stmt) that currently make use of this parameter.
+    numClusteringCols_ = 0;
+    org.apache.kudu.client.KuduTable kuduTable = null;
+    // Connect to Kudu to retrieve table metadata
+    KuduClient kuduClient = KuduUtil.getKuduClient(getKuduMasterHosts());
+    try {
+      kuduTable = kuduClient.openTable(kuduTableName_);
+    } catch (KuduException e) {
+      throw new ImpalaRuntimeException(
+          String.format("Error opening Kudu table '%s', Kudu error: %s", kuduTableName_,
+              e.getMessage()));
+    }
+    Preconditions.checkNotNull(kuduTable);
+
+    loadSchema(kuduTable);
+    loadPartitionByParams(kuduTable);
+  }
+
+  /**
    * Loads the metadata of a Kudu table.
    *
    * Schema and partitioning schemes are loaded directly from Kudu whereas column stats
@@ -192,32 +216,14 @@ public class KuduTable extends Table {
   public void load(boolean dummy /* not used */, IMetaStoreClient msClient,
       org.apache.hadoop.hive.metastore.api.Table msTbl) throws TableLoadingException {
     msTable_ = msTbl;
-    // This is set to 0 for Kudu tables.
-    // TODO: Change this to reflect the number of pk columns and modify all the
-    // places (e.g. insert stmt) that currently make use of this parameter.
-    numClusteringCols_ = 0;
     kuduTableName_ = msTable_.getParameters().get(KuduTable.KEY_TABLE_NAME);
     Preconditions.checkNotNull(kuduTableName_);
     kuduMasters_ = msTable_.getParameters().get(KuduTable.KEY_MASTER_HOSTS);
     Preconditions.checkNotNull(kuduMasters_);
-    org.apache.kudu.client.KuduTable kuduTable = null;
     setTableStats(msTable_);
-
-    // Connect to Kudu to retrieve table metadata
-    KuduClient kuduClient = KuduUtil.getKuduClient(getKuduMasterHosts());
-    try {
-      kuduTable = kuduClient.openTable(kuduTableName_);
-    } catch (KuduException e) {
-      throw new TableLoadingException(String.format(
-          "Error opening Kudu table '%s', Kudu error: %s",
-          kuduTableName_, e.getMessage()));
-    }
-    Preconditions.checkNotNull(kuduTable);
-
     // Load metadata from Kudu and HMS
     try {
-      loadSchema(kuduTable);
-      loadPartitionByParams(kuduTable);
+      loadSchemaFromKudu();
       loadAllColumnStats(msClient);
     } catch (ImpalaRuntimeException e) {
       throw new TableLoadingException("Error loading metadata for Kudu table " +

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/6f20df81/fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java
----------------------------------------------------------------------
diff --git a/fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java b/fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java
index cea9b53..e6679da 100644
--- a/fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java
+++ b/fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java
@@ -362,4 +362,7 @@ public class DataSourceScanNode extends ScanNode {
     }
     return output.toString();
   }
+
+  @Override
+  public boolean hasStorageLayerConjuncts() { return !acceptedConjuncts_.isEmpty(); }
 }

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/6f20df81/fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java
----------------------------------------------------------------------
diff --git a/fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java b/fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java
index d56aa98..d2e47ad 100644
--- a/fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java
+++ b/fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java
@@ -508,4 +508,7 @@ public class HBaseScanNode extends ScanNode {
     // TODO: What's a good estimate of memory consumption?
     return 1024L * 1024L * 1024L;
   }
+
+  @Override
+  public boolean hasStorageLayerConjuncts() { return !filters_.isEmpty(); }
 }

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/6f20df81/fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
----------------------------------------------------------------------
diff --git a/fe/src/main/java/org/apache/impala/planner/KuduScanNode.java b/fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
index 37a4e5c..cbc132b 100644
--- a/fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
+++ b/fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
@@ -521,4 +521,7 @@ public class KuduScanNode extends ScanNode {
       default: return null;
     }
   }
+
+  @Override
+  public boolean hasStorageLayerConjuncts() { return !kuduConjuncts_.isEmpty(); }
 }

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/6f20df81/fe/src/main/java/org/apache/impala/planner/ScanNode.java
----------------------------------------------------------------------
diff --git a/fe/src/main/java/org/apache/impala/planner/ScanNode.java b/fe/src/main/java/org/apache/impala/planner/ScanNode.java
index 1373e89..d6b7813 100644
--- a/fe/src/main/java/org/apache/impala/planner/ScanNode.java
+++ b/fe/src/main/java/org/apache/impala/planner/ScanNode.java
@@ -193,7 +193,9 @@ abstract public class ScanNode extends PlanNode {
 
   @Override
   public long getInputCardinality() {
-    if (getConjuncts().isEmpty() && hasLimit()) return getLimit();
+    if (!hasScanConjuncts() && !hasStorageLayerConjuncts() && hasLimit()) {
+      return getLimit();
+    }
     return inputCardinality_;
   }
 
@@ -210,4 +212,16 @@ abstract public class ScanNode extends PlanNode {
       return desc_.getPath().toString();
     }
   }
+
+  /**
+   * Returns true if this node has conjuncts to be evaluated by Impala against the scan
+   * tuple.
+   */
+  public boolean hasScanConjuncts() { return !getConjuncts().isEmpty(); }
+
+  /**
+   * Returns true if this node has conjuncts to be evaluated by the underlying storage
+   * engine.
+   */
+  public boolean hasStorageLayerConjuncts() { return false; }
 }

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/6f20df81/fe/src/main/java/org/apache/impala/util/MaxRowsProcessedVisitor.java
----------------------------------------------------------------------
diff --git a/fe/src/main/java/org/apache/impala/util/MaxRowsProcessedVisitor.java b/fe/src/main/java/org/apache/impala/util/MaxRowsProcessedVisitor.java
index 56cf047..e338ecb 100644
--- a/fe/src/main/java/org/apache/impala/util/MaxRowsProcessedVisitor.java
+++ b/fe/src/main/java/org/apache/impala/util/MaxRowsProcessedVisitor.java
@@ -52,8 +52,9 @@ public class MaxRowsProcessedVisitor implements Visitor<PlanNode> {
       boolean missingStats = scan.isTableMissingStats() || scan.hasCorruptTableStats();
       // In the absence of collection stats, treat scans on collections as if they
       // have no limit.
-      if (scan.isAccessingCollectionType()
-          || (missingStats && !(scan.hasLimit() && scan.getConjuncts().isEmpty()))) {
+      if (scan.isAccessingCollectionType() ||
+          (missingStats && !(scan.hasLimit() && !scan.hasScanConjuncts() &&
+              !scan.hasStorageLayerConjuncts()))) {
         valid_ = false;
         return;
       }

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/6f20df81/fe/src/test/java/org/apache/impala/common/FrontendTestBase.java
----------------------------------------------------------------------
diff --git a/fe/src/test/java/org/apache/impala/common/FrontendTestBase.java b/fe/src/test/java/org/apache/impala/common/FrontendTestBase.java
index 033b0e2..aa96490 100644
--- a/fe/src/test/java/org/apache/impala/common/FrontendTestBase.java
+++ b/fe/src/test/java/org/apache/impala/common/FrontendTestBase.java
@@ -44,6 +44,7 @@ import org.apache.impala.catalog.Db;
 import org.apache.impala.catalog.Function;
 import org.apache.impala.catalog.HdfsTable;
 import org.apache.impala.catalog.ImpaladCatalog;
+import org.apache.impala.catalog.KuduTable;
 import org.apache.impala.catalog.ScalarFunction;
 import org.apache.impala.catalog.ScalarType;
 import org.apache.impala.catalog.Table;
@@ -165,9 +166,9 @@ public class FrontendTestBase {
   }
 
   /**
-   * Add a new dummy table to the catalog based on the given CREATE TABLE sql.
-   * The dummy table only has the column definitions and the metastore table set, but no
-   * other metadata.
+   * Add a new dummy table to the catalog based on the given CREATE TABLE sql. The
+   * returned table only has its metadata partially set, but is capable of being planned.
+   * Only HDFS tables and external Kudu tables are supported.
    * Returns the new dummy table.
    * The test tables are registered in testTables_ and removed in the @After method.
    */
@@ -177,21 +178,36 @@ public class FrontendTestBase {
     Preconditions.checkNotNull(db, "Test tables must be created in an existing db.");
     org.apache.hadoop.hive.metastore.api.Table msTbl =
         CatalogOpExecutor.createMetaStoreTable(createTableStmt.toThrift());
-    HdfsTable dummyTable = new HdfsTable(msTbl, db,
-        createTableStmt.getTbl(), createTableStmt.getOwner());
-    List<ColumnDef> columnDefs = Lists.newArrayList(
-        createTableStmt.getPartitionColumnDefs());
-    dummyTable.setNumClusteringCols(columnDefs.size());
-    columnDefs.addAll(createTableStmt.getColumnDefs());
-    for (int i = 0; i < columnDefs.size(); ++i) {
-      ColumnDef colDef = columnDefs.get(i);
-      dummyTable.addColumn(new Column(colDef.getColName(), colDef.getType(), i));
-    }
-    try {
-    dummyTable.addDefaultPartition(msTbl.getSd());
-    } catch (CatalogException e) {
-      e.printStackTrace();
-      fail("Failed to add test table:\n" + createTableSql);
+    Table dummyTable = Table.fromMetastoreTable(db, msTbl);
+    if (dummyTable instanceof HdfsTable) {
+      List<ColumnDef> columnDefs = Lists.newArrayList(
+          createTableStmt.getPartitionColumnDefs());
+      dummyTable.setNumClusteringCols(columnDefs.size());
+      columnDefs.addAll(createTableStmt.getColumnDefs());
+      for (int i = 0; i < columnDefs.size(); ++i) {
+        ColumnDef colDef = columnDefs.get(i);
+        dummyTable.addColumn(new Column(colDef.getColName(), colDef.getType(), i));
+      }
+      try {
+        HdfsTable hdfsTable = (HdfsTable) dummyTable;
+        hdfsTable.addDefaultPartition(msTbl.getSd());
+      } catch (CatalogException e) {
+        e.printStackTrace();
+        fail("Failed to add test table:\n" + createTableSql);
+      }
+    } else if (dummyTable instanceof KuduTable) {
+      if (!Table.isExternalTable(msTbl)) {
+        fail("Failed to add table, external kudu table expected:\n" + createTableSql);
+      }
+      try {
+        KuduTable kuduTable = (KuduTable) dummyTable;
+        kuduTable.loadSchemaFromKudu();
+      } catch (ImpalaRuntimeException e) {
+        e.printStackTrace();
+        fail("Failed to add test table:\n" + createTableSql);
+      }
+    } else {
+      fail("Test table type not supported:\n" + createTableSql);
     }
     db.addTable(dummyTable);
     testTables_.add(dummyTable);

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/6f20df81/fe/src/test/java/org/apache/impala/planner/PlannerTest.java
----------------------------------------------------------------------
diff --git a/fe/src/test/java/org/apache/impala/planner/PlannerTest.java b/fe/src/test/java/org/apache/impala/planner/PlannerTest.java
index 3bb8083..fc8ceab 100644
--- a/fe/src/test/java/org/apache/impala/planner/PlannerTest.java
+++ b/fe/src/test/java/org/apache/impala/planner/PlannerTest.java
@@ -314,6 +314,9 @@ public class PlannerTest extends PlannerTestBase {
   @Test
   public void testKudu() {
     Assume.assumeTrue(RuntimeEnv.INSTANCE.isKuduSupported());
+    addTestDb("kudu_planner_test", "Test DB for Kudu Planner.");
+    addTestTable("CREATE EXTERNAL TABLE kudu_planner_test.no_stats STORED AS KUDU " +
+        "TBLPROPERTIES ('kudu.table_name' = 'impala::functional_kudu.alltypes');");
     runPlannerTestFile("kudu");
   }
 

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/6f20df81/testdata/workloads/functional-planner/queries/PlannerTest/data-source-tables.test
----------------------------------------------------------------------
diff --git a/testdata/workloads/functional-planner/queries/PlannerTest/data-source-tables.test b/testdata/workloads/functional-planner/queries/PlannerTest/data-source-tables.test
index 7cc5c04..ce4dbd7 100644
--- a/testdata/workloads/functional-planner/queries/PlannerTest/data-source-tables.test
+++ b/testdata/workloads/functional-planner/queries/PlannerTest/data-source-tables.test
@@ -96,3 +96,18 @@ PLAN-ROOT SINK
 |
 00:EMPTYSET
 ====
+---- QUERY
+# IMPALA-5602: If a query contains predicates that are all pushed to the datasource and
+# there is a limit, then the query should not incorrectly run with 'small query'
+# optimization.
+select * from functional.alltypes_datasource where id = 1 limit 15
+---- DISTRIBUTEDPLAN
+PLAN-ROOT SINK
+|
+01:EXCHANGE [UNPARTITIONED]
+|  limit: 15
+|
+00:SCAN DATA SOURCE [functional.alltypes_datasource]
+data source predicates: id = 1
+   limit: 15
+====

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/6f20df81/testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
----------------------------------------------------------------------
diff --git a/testdata/workloads/functional-planner/queries/PlannerTest/kudu.test b/testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
index 079d291..e620ad6 100644
--- a/testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
+++ b/testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
@@ -424,3 +424,30 @@ INSERT INTO KUDU [functional_kudu.alltypes]
 00:SCAN HDFS [functional.alltypes]
    partitions=24/24 files=24 size=478.45KB
 ====
+# IMPALA-5602: If a query contains predicates that are all pushed to kudu and there is a
+# limit, then the query should not incorrectly run with 'small query' optimization.
+select * from functional_kudu.alltypesagg where tinyint_col = 9 limit 10;
+---- DISTRIBUTEDPLAN
+PLAN-ROOT SINK
+|
+01:EXCHANGE [UNPARTITIONED]
+|  limit: 10
+|
+00:SCAN KUDU [functional_kudu.alltypesagg_idx]
+   kudu predicates: functional_kudu.alltypesagg_idx.tinyint_col = 9
+   limit: 10
+====
+# IMPALA-5602: If a query contains predicates that are all pushed to kudu, there is a
+# limit, and no table stats, then the query should not incorrectly run with 'small query'
+# optimization.
+select * from kudu_planner_test.no_stats where tinyint_col = 9 limit 10;
+---- DISTRIBUTEDPLAN
+PLAN-ROOT SINK
+|
+01:EXCHANGE [UNPARTITIONED]
+|  limit: 10
+|
+00:SCAN KUDU [kudu_planner_test.no_stats]
+   kudu predicates: tinyint_col = 9
+   limit: 10
+====


[6/6] incubator-impala git commit: IMPALA-5784: Separate planner and user set query options in profile

Posted by ta...@apache.org.
IMPALA-5784: Separate planner and user set query options in profile

This separation will help the user better understand the query
runtime profile.
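
As a usage sketch (assuming only the two info-string keys below, which match
the strings added in this change), a small Python helper that pulls both
option lists out of a runtime profile's text form:

    # Sketch: extract the two query-option lines this change writes into the
    # summary profile. 'profile_text' is the text form of a runtime profile.
    def split_query_options(profile_text):
        keys = ("Query Options (set by configuration)",
                "Query Options (set by configuration and planner)")
        options = {}
        for line in profile_text.splitlines():
            line = line.strip()
            for key in keys:
                if line.startswith(key + ":"):
                    options[key] = line[len(key) + 1:].strip()
        return options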

Testing:
Modified an existing test case.

Change-Id: Ibfc7832963fa0bd278a45c06a5a54e1bf40d8876
Reviewed-on: http://gerrit.cloudera.org:8080/7721
Reviewed-by: Matthew Jacobs <mj...@cloudera.com>
Reviewed-by: Dan Hecht <dh...@cloudera.com>
Tested-by: Impala Public Jenkins


Project: http://git-wip-us.apache.org/repos/asf/incubator-impala/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-impala/commit/c67b198a
Tree: http://git-wip-us.apache.org/repos/asf/incubator-impala/tree/c67b198a
Diff: http://git-wip-us.apache.org/repos/asf/incubator-impala/diff/c67b198a

Branch: refs/heads/master
Commit: c67b198a19eda12c99905c2b9db30cc86f6e332d
Parents: ff5e9b6
Author: Bikramjeet Vig <bi...@cloudera.com>
Authored: Thu Aug 17 18:28:27 2017 -0700
Committer: Impala Public Jenkins <im...@gerrit.cloudera.org>
Committed: Thu Aug 24 02:42:01 2017 +0000

----------------------------------------------------------------------
 be/src/service/client-request-state.cc            |  4 +++-
 tests/custom_cluster/test_admission_controller.py |  4 ++--
 tests/query_test/test_observability.py            | 16 ++++++++--------
 3 files changed, 13 insertions(+), 11 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/c67b198a/be/src/service/client-request-state.cc
----------------------------------------------------------------------
diff --git a/be/src/service/client-request-state.cc b/be/src/service/client-request-state.cc
index 013e552..60f799c 100644
--- a/be/src/service/client-request-state.cc
+++ b/be/src/service/client-request-state.cc
@@ -146,7 +146,9 @@ Status ClientRequestState::Exec(TExecRequest* exec_request) {
 
   profile_.AddChild(&server_profile_);
   summary_profile_.AddInfoString("Query Type", PrintTStmtType(stmt_type()));
-  summary_profile_.AddInfoString("Query Options (non default)",
+  summary_profile_.AddInfoString("Query Options (set by configuration)",
+      DebugQueryOptions(query_ctx_.client_request.query_options));
+  summary_profile_.AddInfoString("Query Options (set by configuration and planner)",
       DebugQueryOptions(exec_request_.query_options));
 
   switch (exec_request->stmt_type) {

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/c67b198a/tests/custom_cluster/test_admission_controller.py
----------------------------------------------------------------------
diff --git a/tests/custom_cluster/test_admission_controller.py b/tests/custom_cluster/test_admission_controller.py
index ff4f4c6..cf072b9 100644
--- a/tests/custom_cluster/test_admission_controller.py
+++ b/tests/custom_cluster/test_admission_controller.py
@@ -84,7 +84,7 @@ _STATESTORED_ARGS = "-statestore_heartbeat_frequency_ms=%s "\
                     (STATESTORE_RPC_FREQUENCY_MS, STATESTORE_RPC_FREQUENCY_MS)
 
 # Key in the query profile for the query options.
-PROFILE_QUERY_OPTIONS_KEY = "Query Options (non default): "
+PROFILE_QUERY_OPTIONS_KEY = "Query Options (set by configuration): "
 
 def impalad_admission_ctrl_flags(max_requests, max_queued, pool_max_mem,
     proc_mem_limit = None):
@@ -153,7 +153,7 @@ class TestAdmissionController(TestAdmissionControllerBase, HS2TestSuite):
         rhs = re.split(": ", line)[1]
         confs = re.split(",", rhs)
         break
-    assert len(confs) >= len(expected_query_options)
+    assert len(confs) == len(expected_query_options)
     confs = map(str.lower, confs)
     for expected in expected_query_options:
       assert expected.lower() in confs,\

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/c67b198a/tests/query_test/test_observability.py
----------------------------------------------------------------------
diff --git a/tests/query_test/test_observability.py b/tests/query_test/test_observability.py
index 5a82a25..cf0527b 100644
--- a/tests/query_test/test_observability.py
+++ b/tests/query_test/test_observability.py
@@ -98,13 +98,13 @@ class TestObservability(ImpalaTestSuite):
     """Test that the query profile shows expected non-default query options, both set
     explicitly through client and those set by planner"""
     # Set a query option explicitly through client
-    self.execute_query("set mem_limit = 8589934592")
-    # For this query, the planner sets NUM_NODES=1, NUM_SCANNER_THREADS=1 and
-    # RUNTIME_FILTER_MODE=0
-    expected_string = "Query Options (non default): MEM_LIMIT=8589934592,NUM_NODES=1," \
-        "NUM_SCANNER_THREADS=1,RUNTIME_FILTER_MODE=0,MT_DOP=0\n"
-    assert expected_string in self.execute_query("select 1").runtime_profile
-
+    self.execute_query("set MEM_LIMIT = 8589934592")
     # Make sure explicitly set default values are not shown in the profile
     self.execute_query("set MAX_IO_BUFFERS = 0")
-    assert expected_string in self.execute_query("select 1").runtime_profile
+    runtime_profile = self.execute_query("select 1").runtime_profile
+    assert "Query Options (set by configuration): MEM_LIMIT=8589934592" in runtime_profile
+    # For this query, the planner sets NUM_NODES=1, NUM_SCANNER_THREADS=1,
+    # RUNTIME_FILTER_MODE=0 and MT_DOP=0
+    assert "Query Options (set by configuration and planner): MEM_LIMIT=8589934592," \
+        "NUM_NODES=1,NUM_SCANNER_THREADS=1,RUNTIME_FILTER_MODE=0,MT_DOP=0\n" \
+        in runtime_profile


[2/6] incubator-impala git commit: Hide some deprecated flags

Posted by ta...@apache.org.
Hide some deprecated flags

Hidden flags do not show up in /varz or --help.

Change-Id: I948b46cd6853f1d8ebaaadaba7b801dca886c7ad
Reviewed-on: http://gerrit.cloudera.org:8080/7786
Reviewed-by: Lars Volker <lv...@cloudera.com>
Tested-by: Impala Public Jenkins


Project: http://git-wip-us.apache.org/repos/asf/incubator-impala/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-impala/commit/74dad170
Tree: http://git-wip-us.apache.org/repos/asf/incubator-impala/tree/74dad170
Diff: http://git-wip-us.apache.org/repos/asf/incubator-impala/diff/74dad170

Branch: refs/heads/master
Commit: 74dad1702aaa58f22ea98cbde25765190ff452a2
Parents: 91fc5b5
Author: Henry Robinson <he...@cloudera.com>
Authored: Tue Aug 22 15:17:00 2017 -0700
Committer: Impala Public Jenkins <im...@gerrit.cloudera.org>
Committed: Thu Aug 24 02:04:13 2017 +0000

----------------------------------------------------------------------
 be/src/exec/exec-node.cc                  | 4 ++--
 be/src/exec/partitioned-hash-join-node.cc | 2 +-
 be/src/scheduling/query-schedule.cc       | 6 +++---
 be/src/service/impala-server.cc           | 2 +-
 4 files changed, 7 insertions(+), 7 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/74dad170/be/src/exec/exec-node.cc
----------------------------------------------------------------------
diff --git a/be/src/exec/exec-node.cc b/be/src/exec/exec-node.cc
index b326d94..94e9ed1 100644
--- a/be/src/exec/exec-node.cc
+++ b/be/src/exec/exec-node.cc
@@ -71,8 +71,8 @@ using strings::Substitute;
 
 DECLARE_int32(be_port);
 DECLARE_string(hostname);
-DEFINE_bool(enable_partitioned_hash_join, true, "Deprecated - has no effect");
-DEFINE_bool(enable_partitioned_aggregation, true, "Deprecated - has no effect");
+DEFINE_bool_hidden(enable_partitioned_hash_join, true, "Deprecated - has no effect");
+DEFINE_bool_hidden(enable_partitioned_aggregation, true, "Deprecated - has no effect");
 
 namespace impala {
 

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/74dad170/be/src/exec/partitioned-hash-join-node.cc
----------------------------------------------------------------------
diff --git a/be/src/exec/partitioned-hash-join-node.cc b/be/src/exec/partitioned-hash-join-node.cc
index 13d660f..2a1544c 100644
--- a/be/src/exec/partitioned-hash-join-node.cc
+++ b/be/src/exec/partitioned-hash-join-node.cc
@@ -38,7 +38,7 @@
 
 #include "common/names.h"
 
-DEFINE_bool(enable_phj_probe_side_filtering, true, "Deprecated.");
+DEFINE_bool_hidden(enable_phj_probe_side_filtering, true, "Deprecated.");
 
 static const string PREPARE_FOR_READ_FAILED_ERROR_MSG =
     "Failed to acquire initial read buffer for stream in hash join node $0. Reducing "

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/74dad170/be/src/scheduling/query-schedule.cc
----------------------------------------------------------------------
diff --git a/be/src/scheduling/query-schedule.cc b/be/src/scheduling/query-schedule.cc
index 244bef3..e14af78 100644
--- a/be/src/scheduling/query-schedule.cc
+++ b/be/src/scheduling/query-schedule.cc
@@ -36,9 +36,9 @@ using boost::uuids::uuid;
 using namespace impala;
 
 // TODO: Remove for Impala 3.0.
-DEFINE_bool(rm_always_use_defaults, false, "Deprecated");
-DEFINE_string(rm_default_memory, "4G", "Deprecated");
-DEFINE_int32(rm_default_cpu_vcores, 2, "Deprecated");
+DEFINE_bool_hidden(rm_always_use_defaults, false, "Deprecated");
+DEFINE_string_hidden(rm_default_memory, "4G", "Deprecated");
+DEFINE_int32_hidden(rm_default_cpu_vcores, 2, "Deprecated");
 
 namespace impala {
 

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/74dad170/be/src/service/impala-server.cc
----------------------------------------------------------------------
diff --git a/be/src/service/impala-server.cc b/be/src/service/impala-server.cc
index b440845..41f84b1 100644
--- a/be/src/service/impala-server.cc
+++ b/be/src/service/impala-server.cc
@@ -210,7 +210,7 @@ DEFINE_bool(is_executor, true, "If true, this Impala daemon will execute query "
 #endif
 
 // TODO: Remove for Impala 3.0.
-DEFINE_string(local_nodemanager_url, "", "Deprecated");
+DEFINE_string_hidden(local_nodemanager_url, "", "Deprecated");
 
 DECLARE_bool(compact_catalog_topic);
 


[5/6] incubator-impala git commit: IMPALA-5811: Add 'backends' tab to query details pages

Posted by ta...@apache.org.
IMPALA-5811: Add 'backends' tab to query details pages

Add a 'backends' tab to query details pages which shows:

  * host
  * total number of fragment instances for that query on that backend
  * number of still-running fragment instances
  * if the backend is complete (i.e. all instances finished)
  * peak memory consumption
  * the time, in ms, since a status report was received at the
    coordinator from that backend.

The table refreshes itself every second, controllable by a check-box. If
the query has completed, no information is displayed.
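
A minimal usage sketch of the JSON endpoint behind the tab (the query id and
the coordinator address localhost:25000 are assumptions; the field names come
from BackendState::ToJson() below):

    import json
    import requests  # the same library tests/webserver/test_web_pages.py uses

    QUERY_ID = "<id of an in-flight query>"  # placeholder
    resp = requests.get(
        "http://localhost:25000/query_backends?query_id=%s&json" % QUERY_ID)
    resp.raise_for_status()
    doc = json.loads(resp.text)

    # 'backend_states' is omitted once the query has completed.
    for state in doc.get("backend_states", []):
        print(state["host"], state["num_instances"],
              state["num_remaining_instances"], state["done"])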

Testing: Add a new smoketest to test_web_pages.py.

Change-Id: Ib5b3b0fb8f4188da56da593199f41ce6fab99767
Reviewed-on: http://gerrit.cloudera.org:8080/7711
Reviewed-by: Dan Hecht <dh...@cloudera.com>
Tested-by: Impala Public Jenkins


Project: http://git-wip-us.apache.org/repos/asf/incubator-impala/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-impala/commit/ff5e9b6c
Tree: http://git-wip-us.apache.org/repos/asf/incubator-impala/tree/ff5e9b6c
Diff: http://git-wip-us.apache.org/repos/asf/incubator-impala/diff/ff5e9b6c

Branch: refs/heads/master
Commit: ff5e9b6c9a35e2869c0c845d7bbb7beb16b5d45e
Parents: 6f20df8
Author: Henry Robinson <he...@cloudera.com>
Authored: Tue Aug 15 22:21:18 2017 -0700
Committer: Impala Public Jenkins <im...@gerrit.cloudera.org>
Committed: Thu Aug 24 02:40:28 2017 +0000

----------------------------------------------------------------------
 be/src/runtime/coordinator-backend-state.cc | 26 +++++++
 be/src/runtime/coordinator-backend-state.h  |  7 ++
 be/src/runtime/coordinator.cc               | 12 +++
 be/src/runtime/coordinator.h                | 19 +++--
 be/src/service/impala-http-handler.cc       | 22 ++++++
 be/src/service/impala-http-handler.h        |  5 ++
 tests/webserver/test_web_pages.py           | 85 ++++++++++++++-------
 www/query_backends.tmpl                     | 94 ++++++++++++++++++++++++
 www/query_detail_tabs.tmpl                  |  1 +
 9 files changed, 237 insertions(+), 34 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/ff5e9b6c/be/src/runtime/coordinator-backend-state.cc
----------------------------------------------------------------------
diff --git a/be/src/runtime/coordinator-backend-state.cc b/be/src/runtime/coordinator-backend-state.cc
index 88f7f21..34e0671 100644
--- a/be/src/runtime/coordinator-backend-state.cc
+++ b/be/src/runtime/coordinator-backend-state.cc
@@ -47,6 +47,7 @@
 #include "common/names.h"
 
 using namespace impala;
+using namespace rapidjson;
 namespace accumulators = boost::accumulators;
 
 Coordinator::BackendState::BackendState(
@@ -249,6 +250,8 @@ bool Coordinator::BackendState::ApplyExecStatusReport(
     ProgressUpdater* scan_range_progress) {
   lock_guard<SpinLock> l1(exec_summary->lock);
   lock_guard<mutex> l2(lock_);
+  last_report_time_ms_ = MonotonicMillis();
+
   // If this backend completed previously, don't apply the update.
   if (IsDone()) return false;
   for (const TFragmentInstanceExecStatus& instance_exec_status:
@@ -561,3 +564,26 @@ void Coordinator::FragmentStats::AddExecStats() {
   avg_profile_->AddInfoString("execution rates", rates_label.str());
   avg_profile_->AddInfoString("num instances", lexical_cast<string>(num_instances_));
 }
+
+void Coordinator::BackendState::ToJson(Value* value, Document* document) {
+  lock_guard<mutex> l(lock_);
+  value->AddMember("num_instances", fragments_.size(), document->GetAllocator());
+  value->AddMember("done", IsDone(), document->GetAllocator());
+  value->AddMember(
+      "peak_mem_consumption", peak_consumption_, document->GetAllocator());
+
+  string host = TNetworkAddressToString(impalad_address());
+  Value val(host.c_str(), document->GetAllocator());
+  value->AddMember("host", val, document->GetAllocator());
+
+  value->AddMember("rpc_latency", rpc_latency(), document->GetAllocator());
+  value->AddMember("time_since_last_heard_from", MonotonicMillis() - last_report_time_ms_,
+      document->GetAllocator());
+
+  string status_str = status_.ok() ? "OK" : status_.GetDetail();
+  Value status_val(status_str.c_str(), document->GetAllocator());
+  value->AddMember("status", status_val, document->GetAllocator());
+
+  value->AddMember(
+      "num_remaining_instances", num_remaining_instances_, document->GetAllocator());
+}

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/ff5e9b6c/be/src/runtime/coordinator-backend-state.h
----------------------------------------------------------------------
diff --git a/be/src/runtime/coordinator-backend-state.h b/be/src/runtime/coordinator-backend-state.h
index 0846119..ccc3618 100644
--- a/be/src/runtime/coordinator-backend-state.h
+++ b/be/src/runtime/coordinator-backend-state.h
@@ -113,6 +113,10 @@ class Coordinator::BackendState {
   /// debugging aid for backend deadlocks.
   static void LogFirstInProgress(std::vector<BackendState*> backend_states);
 
+  /// Serializes backend state to JSON by adding members to 'value', including total
+  /// number of instances, peak memory consumption, host and status amongst others.
+  void ToJson(rapidjson::Value* value, rapidjson::Document* doc);
+
  private:
   /// Execution stats for a single fragment instance.
   /// Not thread-safe.
@@ -213,6 +217,9 @@ class Coordinator::BackendState {
   /// peak_consumption()
   int64_t peak_consumption_;
 
+  /// Set in ApplyExecStatusReport(). Uses MonotonicMillis().
+  int64_t last_report_time_ms_ = 0;
+
   /// Fill in rpc_params based on state. Uses filter_routing_table to remove filters
   /// that weren't selected during its construction.
   void SetRpcParams(const DebugOptions& debug_options,

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/ff5e9b6c/be/src/runtime/coordinator.cc
----------------------------------------------------------------------
diff --git a/be/src/runtime/coordinator.cc b/be/src/runtime/coordinator.cc
index a9936ad..029e0bc 100644
--- a/be/src/runtime/coordinator.cc
+++ b/be/src/runtime/coordinator.cc
@@ -71,6 +71,7 @@
 #include "common/names.h"
 
 using namespace apache::thrift;
+using namespace rapidjson;
 using namespace strings;
 using boost::algorithm::iequals;
 using boost::algorithm::is_any_of;
@@ -1221,4 +1222,15 @@ void Coordinator::GetTExecSummary(TExecSummary* exec_summary) {
 MemTracker* Coordinator::query_mem_tracker() const {
   return query_state()->query_mem_tracker();
 }
+
+void Coordinator::BackendsToJson(Document* doc) {
+  lock_guard<mutex> l(lock_);
+  Value states(kArrayType);
+  for (BackendState* state : backend_states_) {
+    Value val(kObjectType);
+    state->ToJson(&val, doc);
+    states.PushBack(val, doc->GetAllocator());
+  }
+  doc->AddMember("backend_states", states, doc->GetAllocator());
+}
 }

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/ff5e9b6c/be/src/runtime/coordinator.h
----------------------------------------------------------------------
diff --git a/be/src/runtime/coordinator.h b/be/src/runtime/coordinator.h
index 03d03df..4edef88 100644
--- a/be/src/runtime/coordinator.h
+++ b/be/src/runtime/coordinator.h
@@ -18,20 +18,21 @@
 #ifndef IMPALA_RUNTIME_COORDINATOR_H
 #define IMPALA_RUNTIME_COORDINATOR_H
 
-#include <vector>
 #include <string>
-#include <boost/scoped_ptr.hpp>
+#include <vector>
 #include <boost/accumulators/accumulators.hpp>
-#include <boost/accumulators/statistics/stats.hpp>
-#include <boost/accumulators/statistics/min.hpp>
+#include <boost/accumulators/statistics/max.hpp>
 #include <boost/accumulators/statistics/mean.hpp>
 #include <boost/accumulators/statistics/median.hpp>
-#include <boost/accumulators/statistics/max.hpp>
+#include <boost/accumulators/statistics/min.hpp>
+#include <boost/accumulators/statistics/stats.hpp>
 #include <boost/accumulators/statistics/variance.hpp>
+#include <boost/scoped_ptr.hpp>
+#include <boost/thread/condition_variable.hpp>
+#include <boost/thread/mutex.hpp>
 #include <boost/unordered_map.hpp>
 #include <boost/unordered_set.hpp>
-#include <boost/thread/mutex.hpp>
-#include <boost/thread/condition_variable.hpp>
+#include <rapidjson/document.h>
 
 #include "common/global-types.h"
 #include "common/hdfs.h"
@@ -186,6 +187,10 @@ class Coordinator { // NOLINT: The member variables could be re-ordered to save
   /// filter to fragment instances.
   void UpdateFilter(const TUpdateFilterParams& params);
 
+  /// Adds to 'document' a serialized array of all backends in a member named
+  /// 'backend_states'.
+  void BackendsToJson(rapidjson::Document* document);
+
  private:
   class BackendState;
   struct FilterTarget;

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/ff5e9b6c/be/src/service/impala-http-handler.cc
----------------------------------------------------------------------
diff --git a/be/src/service/impala-http-handler.cc b/be/src/service/impala-http-handler.cc
index 79903b4..e93aacf 100644
--- a/be/src/service/impala-http-handler.cc
+++ b/be/src/service/impala-http-handler.cc
@@ -101,6 +101,9 @@ void ImpalaHttpHandler::RegisterHandlers(Webserver* webserver) {
   webserver->RegisterUrlCallback("/query_memory", "query_memory.tmpl",
       MakeCallback(this, &ImpalaHttpHandler::QueryMemoryHandler), false);
 
+  webserver->RegisterUrlCallback("/query_backends", "query_backends.tmpl",
+      MakeCallback(this, &ImpalaHttpHandler::QueryBackendsHandler), false);
+
   webserver->RegisterUrlCallback("/cancel_query", "common-pre.tmpl",
       MakeCallback(this, &ImpalaHttpHandler::CancelQueryHandler), false);
 
@@ -691,6 +694,25 @@ void PlanToJson(const vector<TPlanFragment>& fragments, const TExecSummary& summ
 
 }
 
+void ImpalaHttpHandler::QueryBackendsHandler(
+    const Webserver::ArgumentMap& args, Document* document) {
+  TUniqueId query_id;
+  Status status = ParseIdFromArguments(args, &query_id, "query_id");
+  Value query_id_val(PrintId(query_id).c_str(), document->GetAllocator());
+  document->AddMember("query_id", query_id_val, document->GetAllocator());
+  if (!status.ok()) {
+    // Redact the error message, it may contain part or all of the query.
+    Value json_error(RedactCopy(status.GetDetail()).c_str(), document->GetAllocator());
+    document->AddMember("error", json_error, document->GetAllocator());
+    return;
+  }
+
+  shared_ptr<ClientRequestState> request_state = server_->GetClientRequestState(query_id);
+  if (request_state.get() == nullptr || request_state->coord() == nullptr) return;
+
+  request_state->coord()->BackendsToJson(document);
+}
+
 void ImpalaHttpHandler::QuerySummaryHandler(bool include_json_plan, bool include_summary,
     const Webserver::ArgumentMap& args, Document* document) {
   TUniqueId query_id;

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/ff5e9b6c/be/src/service/impala-http-handler.h
----------------------------------------------------------------------
diff --git a/be/src/service/impala-http-handler.h b/be/src/service/impala-http-handler.h
index 485f6db..8ad84bd 100644
--- a/be/src/service/impala-http-handler.h
+++ b/be/src/service/impala-http-handler.h
@@ -91,6 +91,11 @@ class ImpalaHttpHandler {
   void QuerySummaryHandler(bool include_plan_json, bool include_summary,
       const Webserver::ArgumentMap& args, rapidjson::Document* document);
 
+  /// If 'args' contains a query id, serializes all backend states for that query to
+  /// 'document'.
+  void QueryBackendsHandler(
+      const Webserver::ArgumentMap& args, rapidjson::Document* document);
+
   /// Cancels an in-flight query and writes the result to 'contents'.
   void CancelQueryHandler(const Webserver::ArgumentMap& args,
       rapidjson::Document* document);

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/ff5e9b6c/tests/webserver/test_web_pages.py
----------------------------------------------------------------------
diff --git a/tests/webserver/test_web_pages.py b/tests/webserver/test_web_pages.py
index 4a2d872..2586399 100644
--- a/tests/webserver/test_web_pages.py
+++ b/tests/webserver/test_web_pages.py
@@ -17,6 +17,7 @@
 
 from tests.common.impala_cluster import ImpalaCluster
 from tests.common.impala_test_suite import ImpalaTestSuite
+import json
 import requests
 
 class TestWebPage(ImpalaTestSuite):
@@ -28,6 +29,7 @@ class TestWebPage(ImpalaTestSuite):
   RESET_GLOG_LOGLEVEL_URL = "http://localhost:{0}/reset_glog_level"
   CATALOG_URL = "http://localhost:{0}/catalog"
   CATALOG_OBJECT_URL = "http://localhost:{0}/catalog_object"
+  QUERY_BACKENDS_URL = "http://localhost:{0}/query_backends"
   # log4j changes do not apply to the statestore since it doesn't
   # have an embedded JVM. So we make two sets of ports to test the
   # log level endpoints, one without the statestore port and the
@@ -54,89 +56,95 @@ class TestWebPage(ImpalaTestSuite):
     result = impalad.service.read_debug_webpage("query_profile_encoded?query_id=123")
     assert result.startswith("Could not obtain runtime profile: Query id")
 
-  def get_and_check_status(self, url, string_to_search = "", without_ss = True):
+  def get_and_check_status(self, url, string_to_search = "", ports_to_test = None):
     """Helper method that polls a given url and asserts the return code is ok and
-    the response contains the input string. 'without_ss', when true, excludes the
-    statestore endpoint of the url. Should be applied only for log4j logging changes."""
-    ports_to_test = self.TEST_PORTS_WITHOUT_SS if without_ss else self.TEST_PORTS_WITH_SS
+    the response contains the input string."""
+    if ports_to_test is None:
+      ports_to_test = self.TEST_PORTS_WITH_SS
     for port in ports_to_test:
       input_url = url.format(port)
       response = requests.get(input_url)
       assert response.status_code == requests.codes.ok\
           and string_to_search in response.text, "Offending url: " + input_url
+    return response.text
+
+  def get_and_check_status_jvm(self, url, string_to_search = ""):
+    """Calls get_and_check_status() for impalad and catalogd only"""
+    return self.get_and_check_status(url, string_to_search,
+                                     ports_to_test=self.TEST_PORTS_WITHOUT_SS)
 
   def test_log_level(self):
     """Test that the /log_level page outputs are as expected and work well on basic and
     malformed inputs. This however does not test that the log level changes are actually
     in effect."""
     # Check that the log_level end points are accessible.
-    self.get_and_check_status(self.GET_JAVA_LOGLEVEL_URL)
-    self.get_and_check_status(self.SET_JAVA_LOGLEVEL_URL)
-    self.get_and_check_status(self.RESET_JAVA_LOGLEVEL_URL)
-    self.get_and_check_status(self.SET_GLOG_LOGLEVEL_URL, without_ss=False)
-    self.get_and_check_status(self.RESET_GLOG_LOGLEVEL_URL, without_ss=False)
+    self.get_and_check_status_jvm(self.GET_JAVA_LOGLEVEL_URL)
+    self.get_and_check_status_jvm(self.SET_JAVA_LOGLEVEL_URL)
+    self.get_and_check_status_jvm(self.RESET_JAVA_LOGLEVEL_URL)
+    self.get_and_check_status(self.SET_GLOG_LOGLEVEL_URL)
+    self.get_and_check_status(self.RESET_GLOG_LOGLEVEL_URL)
     # Try getting log level of a class.
     get_loglevel_url = (self.GET_JAVA_LOGLEVEL_URL + "?class" +
         "=org.apache.impala.catalog.HdfsTable")
-    self.get_and_check_status(get_loglevel_url, "DEBUG")
+    self.get_and_check_status_jvm(get_loglevel_url, "DEBUG")
 
     # Set the log level of a class to TRACE and confirm the setting is in place
     set_loglevel_url = (self.SET_JAVA_LOGLEVEL_URL + "?class" +
         "=org.apache.impala.catalog.HdfsTable&level=trace")
-    self.get_and_check_status(set_loglevel_url, "Effective log level: TRACE")
+    self.get_and_check_status_jvm(set_loglevel_url, "Effective log level: TRACE")
 
     get_loglevel_url = (self.GET_JAVA_LOGLEVEL_URL + "?class" +
         "=org.apache.impala.catalog.HdfsTable")
-    self.get_and_check_status(get_loglevel_url, "TRACE")
+    self.get_and_check_status_jvm(get_loglevel_url, "TRACE")
     # Check the log level of a different class and confirm it is still DEBUG
     get_loglevel_url = (self.GET_JAVA_LOGLEVEL_URL + "?class" +
         "=org.apache.impala.catalog.HdfsPartition")
-    self.get_and_check_status(get_loglevel_url, "DEBUG")
+    self.get_and_check_status_jvm(get_loglevel_url, "DEBUG")
 
     # Reset Java logging levels and check the logging level of the class again
-    self.get_and_check_status(self.RESET_JAVA_LOGLEVEL_URL, "Java log levels reset.")
+    self.get_and_check_status_jvm(self.RESET_JAVA_LOGLEVEL_URL, "Java log levels reset.")
     get_loglevel_url = (self.GET_JAVA_LOGLEVEL_URL + "?class" +
         "=org.apache.impala.catalog.HdfsTable")
-    self.get_and_check_status(get_loglevel_url, "DEBUG")
+    self.get_and_check_status_jvm(get_loglevel_url, "DEBUG")
 
     # Set a new glog level and make sure the setting has been applied.
     set_glog_url = (self.SET_GLOG_LOGLEVEL_URL + "?glog=3")
-    self.get_and_check_status(set_glog_url, "v set to 3", False)
+    self.get_and_check_status(set_glog_url, "v set to 3")
 
     # Try resetting the glog logging defaults again.
-    self.get_and_check_status( self.RESET_GLOG_LOGLEVEL_URL, "v set to ", False)
+    self.get_and_check_status( self.RESET_GLOG_LOGLEVEL_URL, "v set to ")
 
     # Try to get the log level of an empty class input
     get_loglevel_url = (self.GET_JAVA_LOGLEVEL_URL + "?class=")
-    self.get_and_check_status(get_loglevel_url, without_ss=True)
+    self.get_and_check_status_jvm(get_loglevel_url)
 
     # Same as above, for set log level request
     set_loglevel_url = (self.SET_JAVA_LOGLEVEL_URL + "?class=")
-    self.get_and_check_status(get_loglevel_url, without_ss=True)
+    self.get_and_check_status_jvm(get_loglevel_url)
 
     # Empty input for setting a glog level request
     set_glog_url = (self.SET_GLOG_LOGLEVEL_URL + "?glog=")
-    self.get_and_check_status(set_glog_url, without_ss=False)
+    self.get_and_check_status(set_glog_url)
 
     # Try setting a non-existent log level on a valid class. In such cases,
     # log4j automatically sets it as DEBUG. This is the behavior of
     # Level.toLevel() method.
     set_loglevel_url = (self.SET_JAVA_LOGLEVEL_URL + "?class" +
         "=org.apache.impala.catalog.HdfsTable&level=foo&")
-    self.get_and_check_status(set_loglevel_url, "Effective log level: DEBUG")
+    self.get_and_check_status_jvm(set_loglevel_url, "Effective log level: DEBUG")
 
     # Try setting an invalid glog level.
     set_glog_url = self.SET_GLOG_LOGLEVEL_URL + "?glog=foo"
-    self.get_and_check_status(set_glog_url, "Bad glog level input", False)
+    self.get_and_check_status(set_glog_url, "Bad glog level input")
 
     # Try a non-existent endpoint on log_level URL.
     bad_loglevel_url = self.SET_GLOG_LOGLEVEL_URL + "?badurl=foo"
-    self.get_and_check_status(bad_loglevel_url, without_ss=False)
+    self.get_and_check_status(bad_loglevel_url)
 
   def test_catalog(self):
     """Tests the /catalog and /catalog_object endpoints."""
-    self.get_and_check_status(self.CATALOG_URL, "functional", without_ss=True)
-    self.get_and_check_status(self.CATALOG_URL, "alltypes", without_ss=True)
+    self.get_and_check_status_jvm(self.CATALOG_URL, "functional")
+    self.get_and_check_status_jvm(self.CATALOG_URL, "alltypes")
     # IMPALA-5028: Test toThrift() of a partitioned table via the WebUI code path.
     self.__test_catalog_object("functional", "alltypes")
     self.__test_catalog_object("functional_parquet", "alltypes")
@@ -149,8 +157,31 @@ class TestWebPage(ImpalaTestSuite):
     self.client.execute("invalidate metadata %s.%s" % (db_name, tbl_name))
     self.get_and_check_status(self.CATALOG_OBJECT_URL +
       "?object_type=TABLE&object_name=%s.%s" % (db_name, tbl_name), tbl_name,
-      without_ss=True)
+      ports_to_test=self.TEST_PORTS_WITHOUT_SS)
     self.client.execute("select count(*) from %s.%s" % (db_name, tbl_name))
     self.get_and_check_status(self.CATALOG_OBJECT_URL +
       "?object_type=TABLE&object_name=%s.%s" % (db_name, tbl_name), tbl_name,
-      without_ss=True)
+      ports_to_test=self.TEST_PORTS_WITHOUT_SS)
+
+  def test_query_details(self, unique_database):
+    """Test that /query_backends returns the list of backend states for DML or queries;
+    nothing for DDL statements"""
+    CROSS_JOIN = ("select count(*) from functional.alltypes a "
+                  "CROSS JOIN functional.alltypes b CROSS JOIN functional.alltypes c")
+    for q in [CROSS_JOIN,
+              "CREATE TABLE {0}.foo AS {1}".format(unique_database, CROSS_JOIN),
+              "DESCRIBE functional.alltypes"]:
+      query_handle =  self.client.execute_async(q)
+      try:
+        response = self.get_and_check_status(
+          self.QUERY_BACKENDS_URL + "?query_id=%s&json" % query_handle.get_handle().id,
+          ports_to_test=[25000])
+
+        response_json = json.loads(response)
+
+        if "DESCRIBE" not in q:
+          assert len(response_json['backend_states']) > 0
+        else:
+          assert 'backend_states' not in response_json
+      finally:
+        self.client.cancel(query_handle)

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/ff5e9b6c/www/query_backends.tmpl
----------------------------------------------------------------------
diff --git a/www/query_backends.tmpl b/www/query_backends.tmpl
new file mode 100644
index 0000000..07d1b57
--- /dev/null
+++ b/www/query_backends.tmpl
@@ -0,0 +1,94 @@
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+{{> www/common-header.tmpl }}
+{{> www/query_detail_tabs.tmpl }}
+<br/>
+{{?backend_states}}
+<div>
+  <label>
+    <input type="checkbox" checked="true" id="toggle" onClick="toggleRefresh()"/>
+    <span id="refresh_on">Auto-refresh on</span>
+  </label>  Last updated: <span id="last-updated"></span>
+</div>
+
+<br/>
+<table id="backends" class='table table-hover table-bordered'>
+  <thead>
+    <tr>
+      <th>Host</th>
+      <th>Num. instances</th>
+      <th>Num. remaining instances</th>
+      <th>Done</th>
+      <th>Peak mem. consumption</th>
+      <th>Time since last report (ms)</th>
+    </tr>
+  </thead>
+  <tbody>
+
+  </tbody>
+</table>
+
+<script>
+document.getElementById("backends-tab").className = "active";
+
+var intervalId = 0;
+var table = null;
+var refresh = function () {
+    table.ajax.reload();
+    document.getElementById("last-updated").textContent = new Date();
+};
+
+$(document).ready(function() {
+    table = $('#backends').DataTable({
+        ajax: { url: "/query_backends?query_id={{query_id}}&json",
+                dataSrc: "backend_states",
+              },
+        "columns": [ {data: 'host'},
+                     {data: 'num_instances'},
+                     {data: 'num_remaining_instances'},
+                     {data: 'done'},
+                     {data: 'peak_mem_consumption'},
+                     {data: 'time_since_last_heard_from'}],
+        "order": [[ 0, "desc" ]],
+        "pageLength": 100
+    });
+    intervalId = setInterval( refresh, 1000 );
+});
+
+function toggleRefresh() {
+    if (document.getElementById("toggle").checked == true) {
+        intervalId = setInterval(refresh, 1000);
+        document.getElementById("refresh_on").textContent = "Auto-refresh on";
+    } else {
+        clearInterval(intervalId);
+        document.getElementById("refresh_on").textContent = "Auto-refresh off";
+    }
+}
+
+</script>
+{{/backend_states}}
+
+{{^backend_states}}
+<div class="alert alert-info" role="alert">
+Query <strong>{{query_id}}</strong> has completed, or has no backends.
+</div>
+{{/backend_states}}
+
+{{> www/common-footer.tmpl }}

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/ff5e9b6c/www/query_detail_tabs.tmpl
----------------------------------------------------------------------
diff --git a/www/query_detail_tabs.tmpl b/www/query_detail_tabs.tmpl
index 64781f6..0318761 100644
--- a/www/query_detail_tabs.tmpl
+++ b/www/query_detail_tabs.tmpl
@@ -27,4 +27,5 @@ under the License.
   <li id="summary-tab" role="presentation"><a href="/query_summary?query_id={{query_id}}">Summary</a></li>
   <li id="profile-tab" role="presentation"><a href="/query_profile?query_id={{query_id}}">Profile</a></li>
   <li id="memory-tab" role="presentation"><a href="/query_memory?query_id={{query_id}}">Memory</a></li>
+  <li id="backends-tab" role="presentation"><a href="/query_backends?query_id={{query_id}}">Backends</a></li>
 </ul>