Posted to issues@drill.apache.org by "Arina Ielchiieva (Jira)" <ji...@apache.org> on 2019/10/22 14:33:00 UTC
[jira] [Created] (DRILL-7418) MetadataDirectGroupScan improvements
Arina Ielchiieva created DRILL-7418:
---------------------------------------
Summary: MetadataDirectGroupScan improvements
Key: DRILL-7418
URL: https://issues.apache.org/jira/browse/DRILL-7418
Project: Apache Drill
Issue Type: Improvement
Affects Versions: 1.16.0
Reporter: Arina Ielchiieva
Assignee: Arina Ielchiieva
Fix For: 1.17.0
When a count query is converted to a direct scan (that is, when statistics and table metadata are available and the count operation does not need to be performed), {{MetadataDirectGroupScan}} is used. Proposed {{MetadataDirectGroupScan}} enhancements:
1. Show the table root instead of listing all table files. If a user's table has lots of files, the query plan gets polluted with the file enumeration. Since the files are not used for the calculation (only the metadata is), they are not relevant and can be excluded from the plan.
Before:
{noformat}
| 00-00 Screen
00-01 Project(EXPR$0=[$0], EXPR$1=[$1], EXPR$2=[$2], EXPR$3=[$3])
00-02 DirectScan(groupscan=[files = [/drill/testdata/metadata_cache/store_sales_null_blocks_all/0_0_0.parquet, /drill/testdata/metadata_cache/store_sales_null_blocks_all/0_0_5.parquet, /drill/testdata/metadata_cache/store_sales_null_blocks_all/0_0_4.parquet, /drill/testdata/metadata_cache/store_sales_null_blocks_all/0_0_9.parquet, /drill/testdata/metadata_cache/store_sales_null_blocks_all/0_0_3.parquet, /drill/testdata/metadata_cache/store_sales_null_blocks_all/0_0_6.parquet, /drill/testdata/metadata_cache/store_sales_null_blocks_all/0_0_7.parquet, /drill/testdata/metadata_cache/store_sales_null_blocks_all/0_0_10.parquet, /drill/testdata/metadata_cache/store_sales_null_blocks_all/0_0_2.parquet, /drill/testdata/metadata_cache/store_sales_null_blocks_all/0_0_1.parquet, /drill/testdata/metadata_cache/store_sales_null_blocks_all/0_0_8.parquet], numFiles = 11, usedMetadataSummaryFile = false, DynamicPojoRecordReader{records = [[1560060, 2880404, 2880404, 0]]}])
{noformat}
After:
{noformat}
| 00-00 Screen
00-01 Project(EXPR$0=[$0], EXPR$1=[$1], EXPR$2=[$2], EXPR$3=[$3])
00-02 DirectScan(groupscan=[selectionRoot = /drill/testdata/metadata_cache/store_sales_null_blocks_all, numFiles = 11,
usedMetadataSummaryFile = false, DynamicPojoRecordReader{records = [[1560060, 2880404, 2880404, 0]]}])
{noformat}
2. Submitting a physical plan that contains {{MetadataDirectGroupScan}} fails with deserialization errors; proper serialization / deserialization should be implemented.
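
The plan-text change proposed in item 1 can be sketched roughly as follows. This is only an illustration, not Drill's actual implementation: the class {{DirectScanDigest}} and both of its methods are hypothetical names, and the sketch assumes the plan description is built as a plain string from the table root, the file count, and the summary-file flag.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the group scan's plan description; not Drill's actual class.
public class DirectScanDigest {

    // Current style: every file path is enumerated in the plan string.
    static String withFiles(List<String> files, boolean usedMetadataSummaryFile) {
        return "files = " + files
            + ", numFiles = " + files.size()
            + ", usedMetadataSummaryFile = " + usedMetadataSummaryFile;
    }

    // Proposed style: only the table root and the file count are printed.
    static String withSelectionRoot(String selectionRoot, int numFiles,
                                    boolean usedMetadataSummaryFile) {
        return "selectionRoot = " + selectionRoot
            + ", numFiles = " + numFiles
            + ", usedMetadataSummaryFile = " + usedMetadataSummaryFile;
    }

    public static void main(String[] args) {
        String root = "/drill/testdata/metadata_cache/store_sales_null_blocks_all";
        List<String> files = Arrays.asList(
            root + "/0_0_0.parquet",
            root + "/0_0_1.parquet");

        // The first line grows with the number of files; the second does not.
        System.out.println(withFiles(files, false));
        System.out.println(withSelectionRoot(root, files.size(), false));
    }
}
```

With this shape the plan line stays the same length no matter how many files the table has, while {{numFiles}} still reports how many files the metadata covers.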