Posted to issues-all@impala.apache.org by "Manish Maheshwari (JIRA)" <ji...@apache.org> on 2019/06/26 06:54:00 UTC

[jira] [Commented] (IMPALA-8708) Impala should ignore deleted files

    [ https://issues.apache.org/jira/browse/IMPALA-8708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16873001#comment-16873001 ] 

Manish Maheshwari commented on IMPALA-8708:
-------------------------------------------

This could potentially be set up as an optional config flag that allows dirty reads / incomplete queries. Alternatively, we could run an RT / RTP on demand when we see this situation, as part of zero-touch metadata. However, we would need to figure out how to run only a limited number of RT / RTP operations, and how to avoid running them multiple times in parallel for each query.
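
To make the "not multiple times in parallel" concern concrete, here is a minimal sketch of one way to coalesce on-demand refreshes, assuming RT / RTP refers to table-level and partition-level metadata refreshes. The RefreshCoalescer class and the refresh_fn callback are hypothetical names used for illustration only; they are not Impala APIs, and this is not how the catalog is implemented.

```python
import threading
from concurrent.futures import Future


class RefreshCoalescer:
    """Allow at most one in-flight refresh per table (hypothetical helper).

    The first caller for a table runs the refresh; callers that arrive
    while it is still running wait for that same result instead of
    starting another refresh.
    """

    def __init__(self, refresh_fn):
        # refresh_fn is an assumed callback: refresh_fn(table) -> result.
        self._refresh_fn = refresh_fn
        self._lock = threading.Lock()
        self._in_flight = {}  # table name -> Future shared by all waiters

    def refresh(self, table):
        with self._lock:
            future = self._in_flight.get(table)
            leader = future is None
            if leader:
                future = Future()
                self._in_flight[table] = future

        if leader:
            try:
                future.set_result(self._refresh_fn(table))
            except BaseException as exc:
                # Propagate the failure to every waiter as well.
                future.set_exception(exc)
            finally:
                with self._lock:
                    del self._in_flight[table]

        # Leader and followers all read the shared outcome here.
        return future.result()
```

With something like this, a query that hits a missing file would call coalescer.refresh("db.table") and then retry; ten queries failing at the same moment would trigger one refresh rather than ten. Limiting the total number of refreshes across tables (for example with a bounded worker pool) would be a separate decision that this sketch does not cover.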

> Impala should ignore deleted files
> ----------------------------------
>
>                 Key: IMPALA-8708
>                 URL: https://issues.apache.org/jira/browse/IMPALA-8708
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Backend
>    Affects Versions: Impala 3.2.0
>            Reporter: Gautam Gopalakrishnan
>            Priority: Major
>
> When querying an S3-backed table that is being modified (e.g. by a distcp of content from another cluster), queries still fail with a {{FileNotFound}} exception even when Impala is able to determine that a file in that table has been deleted (e.g. using the S3Guard feature in CDH).
> Performing a metadata refresh after the copy completes does resolve the problem. However, this doesn't help during the copy phase. Requesting an enhancement where Impala can ignore files if it knows that they've been deleted.
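
For illustration, the requested behaviour can be sketched as a scan loop that skips files which have disappeared, guarded by an opt-in flag. This is only a rough model: it uses the local filesystem as a stand-in for S3, and scan_files and ignore_missing are hypothetical names, not Impala functions or query options.

```python
def scan_files(paths, ignore_missing=False):
    """Yield (path, contents) for each file in a cached file listing.

    When ignore_missing is True, files that were deleted after the
    listing was cached are skipped, so results may be incomplete.
    When it is False, the first missing file aborts the scan.
    """
    for path in paths:
        try:
            with open(path, "rb") as f:
                data = f.read()
        except FileNotFoundError:
            if not ignore_missing:
                raise
            continue  # file is gone: skip it and keep scanning
        yield path, data
```

The default keeps today's failure behaviour, so incomplete results are only returned when the caller has explicitly asked for them.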



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
