Posted to issues@hive.apache.org by "Igor Kryvenko (JIRA)" <ji...@apache.org> on 2018/10/03 10:18:00 UTC

[jira] [Assigned] (HIVE-20681) Support custom path filter for ORC tables

     [ https://issues.apache.org/jira/browse/HIVE-20681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Igor Kryvenko reassigned HIVE-20681:
------------------------------------


> Support custom path filter for ORC tables
> -----------------------------------------
>
>                 Key: HIVE-20681
>                 URL: https://issues.apache.org/jira/browse/HIVE-20681
>             Project: Hive
>          Issue Type: Improvement
>          Components: Transactions
>            Reporter: Igor Kryvenko
>            Assignee: Igor Kryvenko
>            Priority: Minor
>
> Currently, the ORC file input format does not honor path filters set in the property "mapreduce.input.pathfilter.class" or "mapred.input.pathfilter.class", so custom filters cannot be used with ORC files.
> The AcidUtils class has a static filter called "hiddenFilters", which ORC uses to filter input paths. If we pass the custom filter classes (set in the properties mentioned above) to AcidUtils and replace hiddenFilters with a filter that performs an "and" operation over hiddenFilters and the custom filters, the custom filters would take effect.
> It would be useful to be able to filter out rows based on paths or file names. Current ORC features such as bloom filters and indexes are not good enough for this purpose, because they do not minimize the number of disk read operations.
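
The "and" composition proposed in the description can be sketched as follows. This is a minimal illustration and not the Hive patch: the nested PathFilter interface is a stand-in for org.apache.hadoop.fs.PathFilter that takes a String so the example runs without Hadoop on the classpath, and the names CombinedPathFilterSketch, HIDDEN_FILTER and and(...) are hypothetical. The hidden-file rule shown (skip names starting with "_" or ".") is an assumption about what the existing filter does.

```java
import java.util.Arrays;
import java.util.List;

public class CombinedPathFilterSketch {

    /** Stand-in for org.apache.hadoop.fs.PathFilter; uses String instead of Path. */
    public interface PathFilter {
        boolean accept(String path);
    }

    /** Assumed hidden-file rule: reject file names starting with '_' or '.'. */
    public static final PathFilter HIDDEN_FILTER = path -> {
        String name = path.substring(path.lastIndexOf('/') + 1);
        return !name.startsWith("_") && !name.startsWith(".");
    };

    /** Returns a filter that accepts a path only if every given filter accepts it. */
    public static PathFilter and(PathFilter... filters) {
        // Copy the array so later changes by the caller do not affect the result.
        List<PathFilter> all = Arrays.asList(filters.clone());
        return path -> {
            for (PathFilter f : all) {
                if (!f.accept(path)) {
                    return false;
                }
            }
            return true;
        };
    }

    public static void main(String[] args) {
        // Hypothetical custom filter: keep only October 2018 partitions.
        PathFilter onlyOctober = path -> path.contains("/dt=2018-10-");
        PathFilter combined = and(HIDDEN_FILTER, onlyOctober);

        System.out.println(combined.accept("/warehouse/t/dt=2018-10-03/bucket_00000")); // true
        System.out.println(combined.accept("/warehouse/t/dt=2018-10-03/_tmp.bucket"));  // false
        System.out.println(combined.accept("/warehouse/t/dt=2018-09-30/bucket_00000")); // false
    }
}
```

With this shape, the hidden-file check always stays in force and custom filters can only narrow the set of accepted paths further.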



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)