Posted to issues@drill.apache.org by "Khurram Faraaz (JIRA)" <ji...@apache.org> on 2017/08/24 15:29:00 UTC
[jira] [Commented] (DRILL-5739) Query reads all files after issuing REFRESH TABLE METADATA command
[ https://issues.apache.org/jira/browse/DRILL-5739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16140172#comment-16140172 ]
Khurram Faraaz commented on DRILL-5739:
---------------------------------------
Can you please share the SQL query here?
> Query reads all files after issuing REFRESH TABLE METADATA command
> -------------------------------------------------------------------
>
> Key: DRILL-5739
> URL: https://issues.apache.org/jira/browse/DRILL-5739
> Project: Apache Drill
> Issue Type: Bug
> Affects Versions: 1.10.0
> Reporter: Divya
>
> Hi,
> The query takes much longer after issuing the refresh metadata command, as it reads all the files.
> ||Value||Before Refresh Metadata||After Refresh Metadata||
> |Fragments|1|13|
> |DURATION|01 min 0.233 sec|18 min 0.744 sec|
> |PLANNING|59.818 sec|33.087 sec|
> |QUEUED|Not Available|Not Available|
> |EXECUTION|0.415 sec|17 min 27.657 sec|
> I can't paste the whole physical plan for the query, so I am pasting only the relevant parts:
> *Physical Plan Before Refresh Metadata*
> numFiles=4, usedMetadataFile=false, columns=
> rowcount = 12.0, cumulative cost = {12.0 rows, 780.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 9395
> *Physical Plan After Refresh Metadata*
> numFiles=102290, usedMetadataFile=true, cacheFileRoot=<Path to file >
> rowcount = 1182008.0, cumulative cost = {1182008.0 rows, 7.683052E7 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 9685
> *Additional Info :*
> file format - Parquet
> table - partitioned by year, month, day, hour
> query format - selects all the columns, using the partition column in the WHERE conditions
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)