Posted to dev@drill.apache.org by "Khurram Faraaz (JIRA)" <ji...@apache.org> on 2017/03/15 19:39:41 UTC
[jira] [Created] (DRILL-5357) Partition pruning information not available in query plan for COUNT aggregate query
Khurram Faraaz created DRILL-5357:
-------------------------------------
Summary: Partition pruning information not available in query plan for COUNT aggregate query
Key: DRILL-5357
URL: https://issues.apache.org/jira/browse/DRILL-5357
Project: Apache Drill
Issue Type: Bug
Components: Query Planning & Optimization
Affects Versions: 1.10.0
Environment: 3 node CentOS cluster
Reporter: Khurram Faraaz
Priority: Critical
Partition pruning information does not appear in the query plan for the COUNT(*) and COUNT(<col-name>) queries below.
Drill 1.10.0-SNAPSHOT
git commit id: b657d44f
The Parquet table has 6 columns.
Total number of rows = 1638640
{noformat}
0: jdbc:drill:schema=dfs.tmp> CREATE TABLE tbl_prtn_prune_01 PARTITION BY (col_state)
AS
SELECT CAST(columns[0] AS DATE) col_date,
CAST(columns[1] AS CHAR(3)) col_state,
CAST(columns[2] AS INTEGER) col_prime,
CAST(columns[3] AS VARCHAR(256)) col_varstr,
CAST(columns[4] AS INTEGER) col_id,
CAST(columns[5] AS VARCHAR(50)) col_name
from `partition_prune_data.csv`;
+-----------+----------------------------+
| Fragment | Number of records written |
+-----------+----------------------------+
| 0_0 | 1638640 |
+-----------+----------------------------+
1 row selected (17.675 seconds)
0: jdbc:drill:schema=dfs.tmp> select COUNT(*) from tbl_prtn_prune_01 where col_state = 'CA';
+---------+
| EXPR$0 |
+---------+
| 35653 |
+---------+
1 row selected (0.471 seconds)
0: jdbc:drill:schema=dfs.tmp> explain plan for select COUNT(*) from tbl_prtn_prune_01 where col_state = 'CA';
+------+------+
| text | json |
+------+------+
| 00-00 Screen
00-01 Project(EXPR$0=[$0])
00-02 Project(EXPR$0=[$0])
00-03 Scan(groupscan=[org.apache.drill.exec.store.pojo.PojoRecordReader@1d4bb67d[columns = null, isStarQuery = false, isSkipQuery = false]])
{noformat}
I then ran REFRESH TABLE METADATA on the Parquet table and repeated the EXPLAIN PLAN queries:
{noformat}
0: jdbc:drill:schema=dfs.tmp> refresh table metadata tbl_prtn_prune_01;
+-------+-------------------------------------------------------------+
| ok | summary |
+-------+-------------------------------------------------------------+
| true | Successfully updated metadata for table tbl_prtn_prune_01. |
+-------+-------------------------------------------------------------+
1 row selected (0.321 seconds)
0: jdbc:drill:schema=dfs.tmp> explain plan for select COUNT(col_state) from tbl_prtn_prune_01 where col_state = 'CA';
+------+------+
| text | json |
+------+------+
| 00-00 Screen
00-01 Project(EXPR$0=[$0])
00-02 Project(EXPR$0=[$0])
00-03 Scan(groupscan=[org.apache.drill.exec.store.pojo.PojoRecordReader@2e0f4be9[columns = null, isStarQuery = false, isSkipQuery = false]])
0: jdbc:drill:schema=dfs.tmp> explain plan for select COUNT(*) from tbl_prtn_prune_01 where col_state = 'CA';
+------+------+
| text | json |
+------+------+
| 00-00 Screen
00-01 Project(EXPR$0=[$0])
00-02 Project(EXPR$0=[$0])
00-03 Scan(groupscan=[org.apache.drill.exec.store.pojo.PojoRecordReader@3fc1f8e7[columns = null, isStarQuery = false, isSkipQuery = false]])
0: jdbc:drill:schema=dfs.tmp> explain plan for select COUNT(col_date) from tbl_prtn_prune_01 where col_state = 'CA';
+------+------+
| text | json |
+------+------+
| 00-00 Screen
00-01 Project(EXPR$0=[$0])
00-02 Project(EXPR$0=[$0])
00-03 Scan(groupscan=[org.apache.drill.exec.store.pojo.PojoRecordReader@7afc851e[columns = null, isStarQuery = false, isSkipQuery = false]])
{noformat}
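For comparison, a non-aggregate query with the same filter could be explained to check whether pruning is reported when no COUNT is involved. This is only a sketch: the query below was not part of the run above, and no output is claimed for it. The column choice (col_id, col_name) is arbitrary. In Drill plans for Parquet tables, pruning is usually visible in the Scan operator's file entries and its numFiles value.
{noformat}
-- Hypothetical comparison query (not run as part of this report):
-- same table and filter as above, but without the COUNT aggregate.
explain plan for select col_id, col_name from tbl_prtn_prune_01 where col_state = 'CA';
{noformat}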