You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@impala.apache.org by "Amos Bird (Code Review)" <ge...@cloudera.org> on 2016/09/08 13:59:58 UTC

[Impala-ASF-CR] IMPALA-1654: General partition exprs in DDL operations.

Amos Bird has uploaded a new patch set (#7).

Change subject: IMPALA-1654: General partition exprs in DDL operations.
......................................................................

IMPALA-1654: General partition exprs in DDL operations.

This commit handles partition related DDL in a more general way. We can
now use compound predicates to specify a list of partitions in
statements like ALTER TABLE DROP PARTITION and COMPUTE INCREMENTAL
STATS, etc. It will also make sure some statements only accept one
partition at a time, such as PARTITION SET LOCATION and LOAD DATA. ALTER
TABLE ADD PARTITION remains using the old PartitionKeyValue's logic.

The changed partition related DDLs are as follows,

Table: p (i int) partitioned by (j int, k string)
Partitions:
+-------+---+-------+--------+------+--------------+-------------------+
| j     | k | #Rows | #Files | Size | Bytes Cached | Cache Replication |
+-------+---+-------+--------+------+--------------+-------------------+
| 1     | a | -1    | 0      | 0B   | NOT CACHED   | NOT CACHED        |
| 1     | b | -1    | 0      | 0B   | NOT CACHED   | NOT CACHED        |
| 1     | c | -1    | 0      | 0B   | NOT CACHED   | NOT CACHED        |
| 2     | d | -1    | 0      | 0B   | NOT CACHED   | NOT CACHED        |
| 2     | e | -1    | 0      | 0B   | NOT CACHED   | NOT CACHED        |
| 2     | f | -1    | 0      | 0B   | NOT CACHED   | NOT CACHED        |
| Total |   | -1    | 0      | 0B   | 0B           |                   |
+-------+---+-------+--------+------+--------------+-------------------+

1. show files in p partition (j<2, k='a');
2. alter table p partition (j<2, k in ("b","c") set cached in 'testPool';

// j can appear more than once,
3.1. alter table p partition (j<2, j>0, k<>"d") set uncached;
// it is the same as
3.2. alter table p partition (j<2 and j>0, not k="e") set uncached;
// we can also do 'or'
3.3. alter table p partition (j<2 or j>0, k like "%") set uncached;

// missing 'k' matches all values of k
4. alter table p partition (j<2) set fileformat textfile;
5. alter table p partition (k rlike ".*") set serdeproperties ("k"="v");
6. alter table p partition (j is not null) set tblproperties ("k"="v");
7. alter table p drop partition (j<2);
8. compute incremental stats p partition(j<2);

The remaining old partition related DDLs are as follows,

1. load data inpath '/path/from' into table p partition (j=2, k="d");
2. alter table p add partition (j=2, k="g");
3. alter table p partition (j=2, k="g") set location '/path/to';
4. insert into p partition (j=2, k="g") values (1), (2), (3);

General partition expressions or partially specified partition specs
allows partition predicates to return empty partition set no matter
'IF EXISTS' is specified.

Examples:

[localhost.localdomain:21000] >
alter table p drop partition (j=2, k="f");
Query: alter table p drop partition (j=2, k="f")
+-------------------------+
| summary                 |
+-------------------------+
| Dropped 1 partition(s). |
+-------------------------+
Fetched 1 row(s) in 0.78s
[localhost.localdomain:21000] >
alter table p drop partition (j=2, k<"f");
Query: alter table p drop partition (j=2, k<"f")
+-------------------------+
| summary                 |
+-------------------------+
| Dropped 2 partition(s). |
+-------------------------+
Fetched 1 row(s) in 0.41s
[localhost.localdomain:21000] >
alter table p drop partition (k="a");
Query: alter table p drop partition (k="a")
+-------------------------+
| summary                 |
+-------------------------+
| Dropped 1 partition(s). |
+-------------------------+
Fetched 1 row(s) in 0.25s
[localhost.localdomain:21000] > show partitions p;
Query: show partitions p
+-------+---+-------+--------+------+--------------+-------------------+
| j     | k | #Rows | #Files | Size | Bytes Cached | Cache Replication |
+-------+---+-------+--------+------+--------------+-------------------+
| 1     | b | -1    | 0      | 0B   | NOT CACHED   | NOT CACHED        |
| 1     | c | -1    | 0      | 0B   | NOT CACHED   | NOT CACHED        |
| Total |   | -1    | 0      | 0B   | 0B           |                   |
+-------+---+-------+--------+------+--------------+-------------------+
Fetched 3 row(s) in 0.01s

Change-Id: I2c9162fcf9d227b8daf4c2e761d57bab4e26408f
---
M be/src/service/query-exec-state.cc
M be/src/service/query-exec-state.h
M common/thrift/CatalogService.thrift
M common/thrift/Frontend.thrift
M common/thrift/JniCatalog.thrift
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/com/cloudera/impala/analysis/AlterTableDropPartitionStmt.java
M fe/src/main/java/com/cloudera/impala/analysis/AlterTableSetCachedStmt.java
M fe/src/main/java/com/cloudera/impala/analysis/AlterTableSetFileFormatStmt.java
M fe/src/main/java/com/cloudera/impala/analysis/AlterTableSetLocationStmt.java
M fe/src/main/java/com/cloudera/impala/analysis/AlterTableSetStmt.java
M fe/src/main/java/com/cloudera/impala/analysis/AlterTableSetTblProperties.java
M fe/src/main/java/com/cloudera/impala/analysis/AlterTableStmt.java
M fe/src/main/java/com/cloudera/impala/analysis/ComputeStatsStmt.java
M fe/src/main/java/com/cloudera/impala/analysis/DropStatsStmt.java
A fe/src/main/java/com/cloudera/impala/analysis/PartitionSet.java
M fe/src/main/java/com/cloudera/impala/analysis/PartitionSpec.java
A fe/src/main/java/com/cloudera/impala/analysis/PartitionSpecBase.java
M fe/src/main/java/com/cloudera/impala/analysis/ShowFilesStmt.java
M fe/src/main/java/com/cloudera/impala/analysis/TupleDescriptor.java
M fe/src/main/java/com/cloudera/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/com/cloudera/impala/catalog/HdfsTable.java
M fe/src/main/java/com/cloudera/impala/catalog/Table.java
M fe/src/main/java/com/cloudera/impala/planner/HdfsPartitionPruner.java
M fe/src/main/java/com/cloudera/impala/planner/SingleNodePlanner.java
M fe/src/main/java/com/cloudera/impala/service/CatalogOpExecutor.java
M fe/src/main/java/com/cloudera/impala/service/Frontend.java
M fe/src/test/java/com/cloudera/impala/analysis/AnalyzeDDLTest.java
M fe/src/test/java/com/cloudera/impala/analysis/ParserTest.java
M shell/impala_client.py
M testdata/workloads/functional-query/queries/QueryTest/alter-table.test
A testdata/workloads/functional-query/queries/QueryTest/partition-ddl-predicates.test
M tests/metadata/test_ddl.py
33 files changed, 1,302 insertions(+), 574 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/42/3942/7
-- 
To view, visit http://gerrit.cloudera.org:8080/3942
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2c9162fcf9d227b8daf4c2e761d57bab4e26408f
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Amos Bird <am...@gmail.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: Amos Bird <am...@gmail.com>