Posted to reviews@impala.apache.org by "Sahil Takiar (Code Review)" <ge...@cloudera.org> on 2019/04/09 15:58:02 UTC

[Impala-ASF-CR] IMPALA-6050: Query profiles should indicate storage layer(s) used

Hello Lars Volker, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/12282

to look at the new patch set (#4).

Change subject: IMPALA-6050: Query profiles should indicate storage layer(s) used
......................................................................

IMPALA-6050: Query profiles should indicate storage layer(s) used

This patch updates Impala explain plans so that the Scan Node section clearly
displays which filesystems the Scan Node is reading data from (support
has been added for scans from HDFS, S3, ADLS, and the local filesystem).

Before this patch, if an Impala query scanned a table with partitions
across different storage layers, the explain plan would look like this:

 PLAN-ROOT SINK
 |
 01:EXCHANGE [UNPARTITIONED]
 |
 00:SCAN HDFS [functional.alltypes]
    partitions=24/24 files=24 size=478.45KB

Now the explain plan will look like this:

 PLAN-ROOT SINK
 |
 01:EXCHANGE [UNPARTITIONED]
 |
 00:SCAN S3 [functional.alltypes]
    ADLS partitions=4/24 files=4 size=478.45KB
    HDFS partitions=10/24 files=10 size=478.45KB
    S3 partitions=10/24 files=10 size=478.45KB

The explain plan differentiates "SCAN HDFS" vs "SCAN S3" by using the
root table path. This means that even scans of non-partitioned tables
will see their explain plans change from "SCAN HDFS" to "SCAN
[storage-layer-name]". This change also affects explain plans for tables
that are stored on a single storage layer: 'partitions=...' will become
'HDFS partitions=...'.
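
As a rough illustration of the rule above (not the code in this patch), the
sketch below derives the "SCAN ..." label from the scheme of the table's root
path and counts partitions per storage layer. The scheme-to-label table, the
LOCAL label, and the function names are assumptions made for this example; the
patch's own mapping lives in FsType.java and may differ.

```python
from collections import Counter
from urllib.parse import urlparse

# Hypothetical scheme-to-label table, assumed for illustration only.
SCHEME_TO_LABEL = {
    "hdfs": "HDFS",
    "s3a": "S3",
    "adl": "ADLS",
    "file": "LOCAL",
}


def storage_layer_label(path):
    """Map a fully qualified path such as 's3a://bucket/tbl' to a label."""
    scheme = urlparse(path).scheme.lower()
    try:
        return SCHEME_TO_LABEL[scheme]
    except KeyError:
        raise ValueError("unsupported filesystem scheme: %r" % scheme)


def scan_node_header(node_id, table_name, root_path):
    """Build the scan node's first line from the table's root path."""
    label = storage_layer_label(root_path)
    return "%02d:SCAN %s [%s]" % (node_id, label, table_name)


def partition_lines(partition_paths):
    """Return one 'LABEL partitions=n/total' line per storage layer, sorted."""
    total = len(partition_paths)
    counts = Counter(storage_layer_label(p) for p in partition_paths)
    return ["%s partitions=%d/%d" % (label, counts[label], total)
            for label in sorted(counts)]
```

Under these assumptions, a table rooted on S3 gets a "SCAN S3" header even if
some of its partitions live on HDFS; the per-layer lines then show where each
group of partitions is stored.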

This patch makes several changes to PlannerTest.java so that by default
test files do not validate the value of the storage layer displayed in
the explain plan. This is necessary to support classes such as
S3PlannerTest which run test files against S3. It makes several changes
to impala_test_suite.py as well in order to support validation of
explain plans in test files that run via Python. Specifically, it adds
support for a new substitution variable in test files called
$FILESYSTEM_NAME which is the name of the storage layer the test is
being run against.
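
A minimal sketch of how such a substitution could work, assuming a plain string
replacement applied to the expected output before comparison. The function name
below is hypothetical and is not taken from impala_test_suite.py.

```python
def substitute_filesystem_name(expected_plan, filesystem_name):
    """Replace every $FILESYSTEM_NAME placeholder with the given label."""
    return expected_plan.replace("$FILESYSTEM_NAME", filesystem_name)


# Expected plan text as it might appear in a test file.
EXPECTED = (
    "00:SCAN $FILESYSTEM_NAME [functional.alltypes]\n"
    "   $FILESYSTEM_NAME partitions=24/24 files=24 size=478.45KB"
)

# The same test file can then be validated against any storage layer.
plan_on_s3 = substitute_filesystem_name(EXPECTED, "S3")
plan_on_hdfs = substitute_filesystem_name(EXPECTED, "HDFS")
```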

Testing:
* Ran core tests
* Added new tests to PlannerTest
* Added ExplainTest to allow for more fine-grained testing of explain
plan logic

Change-Id: I4b1b4a1bc1a24e9614e3b4dc5a61dc96d075d1c3
---
M fe/pom.xml
M fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java
M fe/src/main/java/org/apache/impala/catalog/FeFsPartition.java
M fe/src/main/java/org/apache/impala/catalog/FeFsTable.java
A fe/src/main/java/org/apache/impala/catalog/FsType.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/test/java/org/apache/impala/common/FrontendFixture.java
A fe/src/test/java/org/apache/impala/planner/ExplainTest.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java
M fe/src/test/java/org/apache/impala/testutil/TestUtils.java
A testdata/workloads/functional-planner/queries/PlannerTest/scan-node-fs-scheme.test
M testdata/workloads/functional-query/queries/QueryTest/partition-col-types.test
M tests/common/impala_test_suite.py
M tests/util/filesystem_utils.py
19 files changed, 618 insertions(+), 87 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/82/12282/4
-- 
To view, visit http://gerrit.cloudera.org:8080/12282

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I4b1b4a1bc1a24e9614e3b4dc5a61dc96d075d1c3
Gerrit-Change-Number: 12282
Gerrit-PatchSet: 4
Gerrit-Owner: Sahil Takiar <st...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>