Posted to issues-all@impala.apache.org by "Todd Lipcon (JIRA)" <ji...@apache.org> on 2019/04/25 23:09:00 UTC

[jira] [Commented] (IMPALA-8454) Recursively list files within transactional tables

    [ https://issues.apache.org/jira/browse/IMPALA-8454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16826527#comment-16826527 ] 

Todd Lipcon commented on IMPALA-8454:
-------------------------------------

After chatting with Gopal, it turns out that Hive-on-Tez always lists directories recursively, including for partitions of external tables. This matters for interop on external tables because Tez writes to subdirectories for an insert if the insert contains a 'UNION ALL' (even in Hive 2 non-ACID).

We should probably consider making this a global flag and enabling it by default (with the ability to roll back in case it breaks someone).
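
For illustration only, here is a rough sketch of the difference between flat and recursive listing of a partition directory. It uses java.nio on a local filesystem rather than the Hadoop FileSystem API, and the class name RecursiveLister, the method listDataFiles, and the boolean "recursive" parameter (standing in for the proposed global flag) are all hypothetical, not existing Impala code or options.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper; not part of Impala's catalog code.
class RecursiveLister {
    /**
     * Lists regular files under partitionDir, returned as sorted paths
     * relative to partitionDir. When 'recursive' is false, only files
     * directly inside the partition directory are returned; when true,
     * files in nested subdirectories are included as well.
     */
    static List<String> listDataFiles(Path partitionDir, boolean recursive)
            throws IOException {
        // Depth 1 visits the directory and its direct children only.
        int maxDepth = recursive ? Integer.MAX_VALUE : 1;
        try (Stream<Path> paths = Files.walk(partitionDir, maxDepth)) {
            return paths
                .filter(Files::isRegularFile)            // skip directories
                .map(p -> partitionDir.relativize(p).toString())
                .sorted()
                .collect(Collectors.toList());
        }
    }
}
```

With recursive set to false, a file written into a subdirectory of the partition (as described above for 'UNION ALL' inserts) would not appear in the listing; with it set to true, it would.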

> Recursively list files within transactional tables
> --------------------------------------------------
>
>                 Key: IMPALA-8454
>                 URL: https://issues.apache.org/jira/browse/IMPALA-8454
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Catalog
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Major
>
> For transactional tables, the data files are not directly within the partition directories, but instead are stored within subdirectories corresponding to writeIds, compactions, etc. To support this, we need to be able to recursively load file lists within partition directories.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)