You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@atlas.apache.org by Pinal Shah <pi...@freestoneinfotech.com> on 2020/08/28 13:33:19 UTC
Review Request 72815: ENGESC-3520: Deletion of non existing hive
entities
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72815/
-----------------------------------------------------------
Review request for atlas, Jayendra Parab, Madhan Neethiraj, Nikhil Bonte, Nixon Rodrigues, and Sarath Subramanian.
Repository: atlas
Description
-------
**Problem:** Whenever database or table is dropped in hive, and HiveHook is not enabled, we dont have anyway to get the database and table sync with hive.
**Workaround:** Added support to delete hive entities in Atlas which are dropped in hive.
**Usage:** ./import-hive.sh -deleteNonExisting
Diffs
-----
addons/hive-bridge/src/main/java/org/apache/atlas/hive/bridge/HiveMetaStoreBridge.java e659ca041
client/client-v2/src/main/java/org/apache/atlas/AtlasClientV2.java 6968e8358
Diff: https://reviews.apache.org/r/72815/diff/1/
Testing
-------
Manually tested
Thanks,
Pinal Shah
Re: Review Request 72815: Deletion of non existing hive entities
Posted by Pinal Shah <pi...@freestoneinfotech.com>.
> On Sept. 4, 2020, 4:04 p.m., Madhan Neethiraj wrote:
> > addons/hive-bridge/src/main/java/org/apache/atlas/hive/bridge/HiveMetaStoreBridge.java
> > Lines 959 (patched)
> > <https://reviews.apache.org/r/72815/diff/2/?file=2239187#file2239187line968>
> >
> > Returning after deleting one entity? Please review.
thanks for pointing, it is incorrect
- Pinal
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72815/#review221806
-----------------------------------------------------------
On Sept. 5, 2020, 5:02 a.m., Pinal Shah wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72815/
> -----------------------------------------------------------
>
> (Updated Sept. 5, 2020, 5:02 a.m.)
>
>
> Review request for atlas, Jayendra Parab, Madhan Neethiraj, Nikhil Bonte, Nixon Rodrigues, and Sarath Subramanian.
>
>
> Repository: atlas
>
>
> Description
> -------
>
> **Problem:** Whenever database or table is dropped in hive, and HiveHook is not enabled, we dont have anyway to get the database and table sync with hive.
>
> **Workaround:** Added support to delete hive entities in Atlas which are dropped in hive.
>
> **Usage:** ./import-hive.sh -deleteNonExisting
>
> **NOTE:** **atlas.hook.hive.page.limit** property is added to configure the pageSize/limit while fetching entities from Atlas
>
>
> Diffs
> -----
>
> addons/hive-bridge/src/main/java/org/apache/atlas/hive/bridge/HiveMetaStoreBridge.java e659ca041
> client/client-v2/src/main/java/org/apache/atlas/AtlasClientV2.java 6968e8358
>
>
> Diff: https://reviews.apache.org/r/72815/diff/3/
>
>
> Testing
> -------
>
> Manually tested
>
>
> Thanks,
>
> Pinal Shah
>
>
Re: Review Request 72815: Deletion of non existing hive entities
Posted by Madhan Neethiraj <ma...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72815/#review221806
-----------------------------------------------------------
addons/hive-bridge/src/main/java/org/apache/atlas/hive/bridge/HiveMetaStoreBridge.java
Lines 862 (patched)
<https://reviews.apache.org/r/72815/#comment310888>
Since there is only one criteria in the filter, there is no need for innerFc. Consider the following:
SearchParameters.FilterCriteria fc = new SearchParameters.FilterCriteria();
fc.setAttributeName(ATTRIBUTE_CLUSTER_NAME);
fc.setAttributeValue(clusterName);
fc.setOperator(SearchParameters.Operator.EQ);
Also, the filter criteira is the same across all iteration of the for loop at #858. Consider setting up the criteira before once before entering the for loop.
addons/hive-bridge/src/main/java/org/apache/atlas/hive/bridge/HiveMetaStoreBridge.java
Lines 959 (patched)
<https://reviews.apache.org/r/72815/#comment310890>
Returning after deleting one entity? Please review.
addons/hive-bridge/src/main/java/org/apache/atlas/hive/bridge/HiveMetaStoreBridge.java
Lines 968 (patched)
<https://reviews.apache.org/r/72815/#comment310892>
deleteHiveMetadata() => deleteEntitiesForNonExistingHiveMetadata()
addons/hive-bridge/src/main/java/org/apache/atlas/hive/bridge/HiveMetaStoreBridge.java
Lines 999 (patched)
<https://reviews.apache.org/r/72815/#comment310891>
There is only one parameter to replace, as LOG.error() will print execption details for 'e'. Please review and remove ', {}' at the end of the message.
addons/hive-bridge/src/main/java/org/apache/atlas/hive/bridge/HiveMetaStoreBridge.java
Lines 1006 (patched)
<https://reviews.apache.org/r/72815/#comment310889>
Consider replacing:
Queue => List
LinkedList => ArrayList
- Madhan Neethiraj
On Sept. 3, 2020, 12:21 p.m., Pinal Shah wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72815/
> -----------------------------------------------------------
>
> (Updated Sept. 3, 2020, 12:21 p.m.)
>
>
> Review request for atlas, Jayendra Parab, Madhan Neethiraj, Nikhil Bonte, Nixon Rodrigues, and Sarath Subramanian.
>
>
> Repository: atlas
>
>
> Description
> -------
>
> **Problem:** Whenever database or table is dropped in hive, and HiveHook is not enabled, we dont have anyway to get the database and table sync with hive.
>
> **Workaround:** Added support to delete hive entities in Atlas which are dropped in hive.
>
> **Usage:** ./import-hive.sh -deleteNonExisting
>
> **NOTE:** **atlas.hook.hive.page.limit** property is added to configure the pageSize/limit while fetching entities from Atlas
>
>
> Diffs
> -----
>
> addons/hive-bridge/src/main/java/org/apache/atlas/hive/bridge/HiveMetaStoreBridge.java e659ca041
> client/client-v2/src/main/java/org/apache/atlas/AtlasClientV2.java 6968e8358
>
>
> Diff: https://reviews.apache.org/r/72815/diff/2/
>
>
> Testing
> -------
>
> Manually tested
>
>
> Thanks,
>
> Pinal Shah
>
>
Re: Review Request 72815: Deletion of non existing hive entities
Posted by Pinal Shah <pi...@freestoneinfotech.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72815/
-----------------------------------------------------------
(Updated Sept. 11, 2020, 12:27 p.m.)
Review request for atlas, Jayendra Parab, Madhan Neethiraj, Nikhil Bonte, Nixon Rodrigues, and Sarath Subramanian.
Changes
-------
updating patch for branch-0.8
Repository: atlas
Description
-------
**Problem:** Whenever database or table is dropped in hive, and HiveHook is not enabled, we dont have anyway to get the database and table sync with hive.
**Workaround:** Added support to delete hive entities in Atlas which are dropped in hive.
**Usage:** ./import-hive.sh -deleteNonExisting
**NOTE:** **atlas.hook.hive.page.limit** property is added to configure the pageSize/limit while fetching entities from Atlas
Diffs (updated)
-----
addons/hive-bridge/src/main/java/org/apache/atlas/hive/bridge/HiveMetaStoreBridge.java f18d01b7c
client/client-v2/src/main/java/org/apache/atlas/AtlasClientV2.java 6968e8358
Diff: https://reviews.apache.org/r/72815/diff/4/
Changes: https://reviews.apache.org/r/72815/diff/3-4/
Testing
-------
Manually tested
Thanks,
Pinal Shah
Re: Review Request 72815: Deletion of non existing hive entities
Posted by Madhan Neethiraj <ma...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72815/#review221808
-----------------------------------------------------------
Ship it!
Ship It!
- Madhan Neethiraj
On Sept. 5, 2020, 5:02 a.m., Pinal Shah wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72815/
> -----------------------------------------------------------
>
> (Updated Sept. 5, 2020, 5:02 a.m.)
>
>
> Review request for atlas, Jayendra Parab, Madhan Neethiraj, Nikhil Bonte, Nixon Rodrigues, and Sarath Subramanian.
>
>
> Repository: atlas
>
>
> Description
> -------
>
> **Problem:** Whenever database or table is dropped in hive, and HiveHook is not enabled, we dont have anyway to get the database and table sync with hive.
>
> **Workaround:** Added support to delete hive entities in Atlas which are dropped in hive.
>
> **Usage:** ./import-hive.sh -deleteNonExisting
>
> **NOTE:** **atlas.hook.hive.page.limit** property is added to configure the pageSize/limit while fetching entities from Atlas
>
>
> Diffs
> -----
>
> addons/hive-bridge/src/main/java/org/apache/atlas/hive/bridge/HiveMetaStoreBridge.java e659ca041
> client/client-v2/src/main/java/org/apache/atlas/AtlasClientV2.java 6968e8358
>
>
> Diff: https://reviews.apache.org/r/72815/diff/3/
>
>
> Testing
> -------
>
> Manually tested
>
>
> Thanks,
>
> Pinal Shah
>
>
Re: Review Request 72815: Deletion of non existing hive entities
Posted by Pinal Shah <pi...@freestoneinfotech.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72815/
-----------------------------------------------------------
(Updated Sept. 5, 2020, 5:02 a.m.)
Review request for atlas, Jayendra Parab, Madhan Neethiraj, Nikhil Bonte, Nixon Rodrigues, and Sarath Subramanian.
Changes
-------
addressed comments
Repository: atlas
Description
-------
**Problem:** Whenever database or table is dropped in hive, and HiveHook is not enabled, we dont have anyway to get the database and table sync with hive.
**Workaround:** Added support to delete hive entities in Atlas which are dropped in hive.
**Usage:** ./import-hive.sh -deleteNonExisting
**NOTE:** **atlas.hook.hive.page.limit** property is added to configure the pageSize/limit while fetching entities from Atlas
Diffs (updated)
-----
addons/hive-bridge/src/main/java/org/apache/atlas/hive/bridge/HiveMetaStoreBridge.java e659ca041
client/client-v2/src/main/java/org/apache/atlas/AtlasClientV2.java 6968e8358
Diff: https://reviews.apache.org/r/72815/diff/3/
Changes: https://reviews.apache.org/r/72815/diff/2-3/
Testing
-------
Manually tested
Thanks,
Pinal Shah
Re: Review Request 72815: Deletion of non existing hive entities
Posted by Pinal Shah <pi...@freestoneinfotech.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72815/
-----------------------------------------------------------
(Updated Sept. 3, 2020, 12:21 p.m.)
Review request for atlas, Jayendra Parab, Madhan Neethiraj, Nikhil Bonte, Nixon Rodrigues, and Sarath Subramanian.
Repository: atlas
Description (updated)
-------
**Problem:** Whenever database or table is dropped in hive, and HiveHook is not enabled, we dont have anyway to get the database and table sync with hive.
**Workaround:** Added support to delete hive entities in Atlas which are dropped in hive.
**Usage:** ./import-hive.sh -deleteNonExisting
**NOTE:** **atlas.hook.hive.page.limit** property is added to configure the pageSize/limit while fetching entities from Atlas
Diffs
-----
addons/hive-bridge/src/main/java/org/apache/atlas/hive/bridge/HiveMetaStoreBridge.java e659ca041
client/client-v2/src/main/java/org/apache/atlas/AtlasClientV2.java 6968e8358
Diff: https://reviews.apache.org/r/72815/diff/2/
Testing
-------
Manually tested
Thanks,
Pinal Shah
Re: Review Request 72815: Deletion of non existing hive entities
Posted by Pinal Shah <pi...@freestoneinfotech.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72815/
-----------------------------------------------------------
(Updated Sept. 3, 2020, 10:26 a.m.)
Review request for atlas, Jayendra Parab, Madhan Neethiraj, Nikhil Bonte, Nixon Rodrigues, and Sarath Subramanian.
Changes
-------
Addressed review comments
Repository: atlas
Description
-------
**Problem:** Whenever database or table is dropped in hive, and HiveHook is not enabled, we dont have anyway to get the database and table sync with hive.
**Workaround:** Added support to delete hive entities in Atlas which are dropped in hive.
**Usage:** ./import-hive.sh -deleteNonExisting
Diffs (updated)
-----
addons/hive-bridge/src/main/java/org/apache/atlas/hive/bridge/HiveMetaStoreBridge.java e659ca041
client/client-v2/src/main/java/org/apache/atlas/AtlasClientV2.java 6968e8358
Diff: https://reviews.apache.org/r/72815/diff/2/
Changes: https://reviews.apache.org/r/72815/diff/1-2/
Testing
-------
Manually tested
Thanks,
Pinal Shah
Re: Review Request 72815: Deletion of non existing hive entities
Posted by Pinal Shah <pi...@freestoneinfotech.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72815/
-----------------------------------------------------------
(Updated Aug. 31, 2020, 8:51 a.m.)
Review request for atlas, Jayendra Parab, Madhan Neethiraj, Nikhil Bonte, Nixon Rodrigues, and Sarath Subramanian.
Summary (updated)
-----------------
Deletion of non existing hive entities
Repository: atlas
Description
-------
**Problem:** Whenever database or table is dropped in hive, and HiveHook is not enabled, we dont have anyway to get the database and table sync with hive.
**Workaround:** Added support to delete hive entities in Atlas which are dropped in hive.
**Usage:** ./import-hive.sh -deleteNonExisting
Diffs
-----
addons/hive-bridge/src/main/java/org/apache/atlas/hive/bridge/HiveMetaStoreBridge.java e659ca041
client/client-v2/src/main/java/org/apache/atlas/AtlasClientV2.java 6968e8358
Diff: https://reviews.apache.org/r/72815/diff/1/
Testing
-------
Manually tested
Thanks,
Pinal Shah
Re: Review Request 72815: ENGESC-3520: Deletion of non existing hive
entities
Posted by Madhan Neethiraj <ma...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72815/#review221744
-----------------------------------------------------------
addons/hive-bridge/src/main/java/org/apache/atlas/hive/bridge/HiveMetaStoreBridge.java
Lines 103 (patched)
<https://reviews.apache.org/r/72815/#comment310794>
Delete database and table entities in Atlas if not present in Hive
addons/hive-bridge/src/main/java/org/apache/atlas/hive/bridge/HiveMetaStoreBridge.java
Lines 843 (patched)
<https://reviews.apache.org/r/72815/#comment310799>
getAllDatabaseInCluster(): consider retrieving databases where clusterName == atlas.metdata.namespace, to ensure that databases and tables that belong to another metadata namespace are not deleted.
addons/hive-bridge/src/main/java/org/apache/atlas/hive/bridge/HiveMetaStoreBridge.java
Lines 846 (patched)
<https://reviews.apache.org/r/72815/#comment310796>
Consider reading limit (pageSize) from configuration.
addons/hive-bridge/src/main/java/org/apache/atlas/hive/bridge/HiveMetaStoreBridge.java
Lines 850 (patched)
<https://reviews.apache.org/r/72815/#comment310795>
Consider following for better readability:
final List<AtlasEntityHeader> entities = new ArrayList<>();
final int pageSize = 10000;
for (int i = 0; ; i++) {
int offset = pageSize * i;
LOG.info("retrieving databases: offset={}, pageSize={}", offset, pageSize);
AtlasSearchResult searchResult = atlasClientV2.basicSearch(HIVE_TYPE_DB, null, null, true, pageSize, offset);
List<AtlasEntityHeader> entityHeaders = searchResult == null ? null : searchResult.getEntities();
int dbCount = entityHeaders == null ? 0 : entityHeaders.size();
LOG.info("retrieved {} databases", dbCount);
if (dbCount > 0) {
entities.addAll(entityHeaders);
}
if (dbCount < pageSize) { // last page
break;
}
}
addons/hive-bridge/src/main/java/org/apache/atlas/hive/bridge/HiveMetaStoreBridge.java
Lines 872 (patched)
<https://reviews.apache.org/r/72815/#comment310797>
Consider updating getAllTablesInDb() per comment above for getAllDatabaseInCluster().
addons/hive-bridge/src/main/java/org/apache/atlas/hive/bridge/HiveMetaStoreBridge.java
Lines 939 (patched)
<https://reviews.apache.org/r/72815/#comment310798>
Move #939 inside else block at #936.
addons/hive-bridge/src/main/java/org/apache/atlas/hive/bridge/HiveMetaStoreBridge.java
Lines 966 (patched)
<https://reviews.apache.org/r/72815/#comment310800>
db.getAttribute(ATTRIBUTE_QUALIFIED_NAME).toString() => (String) db.getAttribute(ATTRIBUTE_QUALIFIED_NAME)
- this will handle NULL value from db.getAttribute(ATTRIBUTE_QUALIFIED_NAME)
addons/hive-bridge/src/main/java/org/apache/atlas/hive/bridge/HiveMetaStoreBridge.java
Lines 969 (patched)
<https://reviews.apache.org/r/72815/#comment310801>
Include guid in the log message
addons/hive-bridge/src/main/java/org/apache/atlas/hive/bridge/HiveMetaStoreBridge.java
Lines 978 (patched)
<https://reviews.apache.org/r/72815/#comment310802>
Failed to retrieve table entities for database {} from Atlas.
addons/hive-bridge/src/main/java/org/apache/atlas/hive/bridge/HiveMetaStoreBridge.java
Lines 985 (patched)
<https://reviews.apache.org/r/72815/#comment310803>
Queue<String> guidsToDelete = new LinkedList<>();
if (!hiveClient.databaseExists(hiveDbName)) {
if (CollectionUtils.isNotEmpty(tables)) {
for (AtlasEntityHeader table : tables) {
guidsToDelete.add(table.getGuid());
}
}
guidsToDelete.add(db.getGuid());
} else {
if (CollectionUtils.isNotEmpty(tables)) {
for (AtlasEntityHeader table : tables) {
String hiveTableName = getHiveTableName((String) table.getAttribute(ATTRIBUTE_QUALIFIED_NAME), true);
if (StringUtils.isEmpty(hiveTableName)) {
LOG.error("Cannot get table from qualifiedName {} ", (String) table.getAttribute(ATTRIBUTE_QUALIFIED_NAME));
continue;
}
try {
hiveClient.getTable(hiveDbName, hiveTableName, true);
} catch (InvalidTableException e) { // table doesn't exist
LOG.info("Added table {}.{} to delete", hiveDbName, hiveTableName);
guidsToDelete.add(table.getGuid());
} catch (HiveException e) {
LOG.error("Failed to get table {}.{} from Hive", hiveDbName, hiveTableName, e);
if(failOnError) {
throw e;
}
}
}
}
}
if (CollectionUtils.isNotEmpty(guidsToDelete)) {
try {
deleteByGuid(guidsToDelete);
} catch (AtlasServiceException e) {
LOG.error("Failed to delete Atlas entities for database {} , {}", hiveDbName, e);
if(failOnError) {
throw e;
}
}
}
- Madhan Neethiraj
On Aug. 28, 2020, 1:33 p.m., Pinal Shah wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72815/
> -----------------------------------------------------------
>
> (Updated Aug. 28, 2020, 1:33 p.m.)
>
>
> Review request for atlas, Jayendra Parab, Madhan Neethiraj, Nikhil Bonte, Nixon Rodrigues, and Sarath Subramanian.
>
>
> Repository: atlas
>
>
> Description
> -------
>
> **Problem:** Whenever database or table is dropped in hive, and HiveHook is not enabled, we dont have anyway to get the database and table sync with hive.
>
> **Workaround:** Added support to delete hive entities in Atlas which are dropped in hive.
>
> **Usage:** ./import-hive.sh -deleteNonExisting
>
>
> Diffs
> -----
>
> addons/hive-bridge/src/main/java/org/apache/atlas/hive/bridge/HiveMetaStoreBridge.java e659ca041
> client/client-v2/src/main/java/org/apache/atlas/AtlasClientV2.java 6968e8358
>
>
> Diff: https://reviews.apache.org/r/72815/diff/1/
>
>
> Testing
> -------
>
> Manually tested
>
>
> Thanks,
>
> Pinal Shah
>
>