Posted to dev@atlas.apache.org by "Adriano (Jira)" <ji...@apache.org> on 2021/12/06 08:56:00 UTC

[jira] [Created] (ATLAS-4501) Table 'Allow list' with a default deny to load only a subset of tables

Adriano created ATLAS-4501:
------------------------------

             Summary: Table 'Allow list' with a default deny to load only a subset of tables
                 Key: ATLAS-4501
                 URL: https://issues.apache.org/jira/browse/ATLAS-4501
             Project: Atlas
          Issue Type: New Feature
          Components:  atlas-core, atlas-intg, hive-integration
            Reporter: Adriano


There are some huge environments where the warehouse has a thousand databases and a hundred thousand tables with many columns, most of which are dropped, created and updated at a fast pace. In these environments Atlas processing can fall behind the rate of change in the warehouse, so the backlog keeps growing, and {{prune.pattern}} and/or {{ignore.pattern}} are not suitable.

It would be nice to have a default-deny behaviour for all tables and then 'allow' the import of only a subset of tables matched by a regex parameter (in order to process only some important tables). Basically, this works in the opposite way to {{prune.pattern}} and {{ignore.pattern}}.
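
To illustrate the semantics requested above, here is a minimal Python sketch. It is not Atlas code: the function name {{build_allow_filter}}, the example patterns and the table names are all hypothetical, and it assumes the regex is matched against the full {{database.table}} name.

```python
import re


def build_allow_filter(allow_patterns):
    """Return a predicate that accepts only names matching an allow pattern.

    Hypothetical helper for illustration; not part of Atlas.
    """
    compiled = [re.compile(p) for p in allow_patterns]

    def is_allowed(qualified_name):
        # Default deny: with no patterns, or with no full match,
        # the table is skipped.
        return any(rx.fullmatch(qualified_name) for rx in compiled)

    return is_allowed


# Hypothetical allow list: every table in 'sales', plus 'finance.ledger'.
is_allowed = build_allow_filter([r"sales\..*", r"finance\.ledger"])

tables = ["sales.orders", "finance.ledger", "finance.tmp_1", "scratch.t42"]
imported = [t for t in tables if is_allowed(t)]
```

With an empty allow list nothing is imported, which is the reverse of an ignore or prune pattern, where an empty pattern means everything is processed.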

As far as I know, there is a similar feature for S3 and ADLS but not for Hive.
If this is the case, it would be nice to get the feature into your backlog.




--
This message was sent by Atlassian Jira
(v8.20.1#820001)