Posted to dev@atlas.apache.org by "Solbi Choi (Jira)" <ji...@apache.org> on 2021/10/22 10:16:00 UTC

[jira] [Updated] (ATLAS-4460) Search API gets deleted partitionKeys(and columns) of Hive table

     [ https://issues.apache.org/jira/browse/ATLAS-4460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Solbi Choi updated ATLAS-4460:
------------------------------
    Summary: Search API gets deleted partitionKeys(and columns) of Hive table  (was: Search API gets deleted partitionKeys of Hive table)

> Search API gets deleted partitionKeys(and columns) of Hive table
> ----------------------------------------------------------------
>
>                 Key: ATLAS-4460
>                 URL: https://issues.apache.org/jira/browse/ATLAS-4460
>             Project: Atlas
>          Issue Type: Bug
>          Components: hive-integration
>    Affects Versions: 2.2.0
>            Reporter: Solbi Choi
>            Priority: Major
>         Attachments: 스크린샷 2021-10-22 오후 5.07.47.png
>
>
> Problems
> When one of the partitionKeys of a Hive table is deleted, the Atlas Search API still returns all partitionKeys, including the deleted one.
> Adding
> {code:java}
> "excludeDeletedEntities": True{code}
> to the request JSON has no effect in this situation.
>  
> Steps to reproduce
>  * Create a Hive table with a partition key and sync it to Atlas using hive-import.
>  * Drop the Hive table and re-create it with the same name, this time without the partition key. Re-sync using hive-import.
>  * The partition key is then shown as deleted in the Atlas web view.
> !스크린샷 2021-10-22 오후 5.07.47.png!
>  * However, when the Search API is used to fetch the hive_table entity with the following request:
> {code:java}
>    request = { "typeName": "hive_table", 
>                "attributes": [ "db", "name", "partitionKeys" ],
>                "excludeDeletedEntities": True,
>                "limit": limit,
>                "offset": offset
>               }
> {code}
>  * The response still includes the deleted partitionKey:
> {code:java}
> 'partitionKeys': [
> {'guid': '****', 'typeName': 'hive_column', 'uniqueAttributes': {'qualifiedName': 'foo.test_partition_drop.ds@primary'}}, 
> {'guid': '****', 'typeName': 'hive_column', 'uniqueAttributes': {'qualifiedName': 'foo.test_partition_drop.ts@primary'}}
> ]
> {code}
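
For reference, the request in the steps above could be built and sent roughly as follows. This is a minimal sketch, assuming the Atlas v2 basic search endpoint (POST /api/atlas/v2/search/basic). The base URL, the Authorization header value, and the helper names build_search_request and post_basic_search are placeholders for illustration and are not part of the original report. Python's True is serialized as JSON true, so the flag reaches the server in valid form.

```python
import json
import urllib.request


def build_search_request(limit, offset):
    """Build the basic-search body used in the report."""
    return {
        "typeName": "hive_table",
        "attributes": ["db", "name", "partitionKeys"],
        "excludeDeletedEntities": True,  # serialized as JSON `true`
        "limit": limit,
        "offset": offset,
    }


def post_basic_search(base_url, payload, auth_header):
    """POST the payload to Atlas and return the decoded JSON response.

    base_url and auth_header are hypothetical placeholders, e.g.
    "http://atlas-host:21000" and a Basic auth header value.
    """
    req = urllib.request.Request(
        url=base_url.rstrip("/") + "/api/atlas/v2/search/basic",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": auth_header,
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


# Serialize the body only; no network call is made here.
body = json.dumps(build_search_request(limit=100, offset=0))
```

Only the payload is built at module level; post_basic_search has to be called explicitly against a running Atlas instance.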
>  
>  
> Additionally, the same problem is reproduced with *Hive columns*.
> After a column is renamed with an ALTER TABLE statement (e.g. foo -> bar),
> the Search API returns both columns (foo and bar) for the Hive table, even with the "excludeDeletedEntities" option.
>  
> !https://media.oss.navercorp.com/user/16858/files/94f0c600-336a-11ec-8388-329a5c8a6323!
>  
>  
> {code:java}
> 'columns': [
> {'guid': '****', 'typeName': 'hive_column', 'uniqueAttributes': {'qualifiedName': 'db_name.test_partition_drop.bar@primary'}}, 
> {'guid': '****', 'typeName': 'hive_column', 'uniqueAttributes': {'qualifiedName': 'db_name.test_partition_drop.foo@primary'}}
> ]
> {code}
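
One possible client-side workaround is to filter the returned references by entity status. The sketch below is only an illustration under stated assumptions: the response shown above carries no status on the referenced columns, so it assumes the caller can obtain each referenced entity's status separately (for example by looking up each GUID). The helper name keep_active_refs, the status lookup, and the GUID values are hypothetical; the real GUIDs are masked in the report. It also assumes foo is the deleted column, since foo was renamed to bar.

```python
def keep_active_refs(refs, status_by_guid):
    """Return only the references whose entity status is ACTIVE.

    status_by_guid is a hypothetical lookup (guid -> "ACTIVE" / "DELETED")
    filled in by the caller; references with unknown status are kept.
    """
    return [
        ref for ref in refs
        if status_by_guid.get(ref["guid"], "ACTIVE") == "ACTIVE"
    ]


# Placeholder GUIDs; the real ones are masked in the report.
columns = [
    {"guid": "guid-bar", "typeName": "hive_column",
     "uniqueAttributes": {
         "qualifiedName": "db_name.test_partition_drop.bar@primary"}},
    {"guid": "guid-foo", "typeName": "hive_column",
     "uniqueAttributes": {
         "qualifiedName": "db_name.test_partition_drop.foo@primary"}},
]
# Assumed statuses: foo was renamed to bar, so foo is treated as deleted.
statuses = {"guid-bar": "ACTIVE", "guid-foo": "DELETED"}

active = keep_active_refs(columns, statuses)
```

Keeping references with unknown status is a deliberate choice here, so that a missing lookup never hides a live column.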
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)