Posted to dev@atlas.apache.org by "Hemanth Yamijala (JIRA)" <ji...@apache.org> on 2016/01/04 13:33:39 UTC
[jira] [Updated] (ATLAS-415) Hive import fails when importing a table that is already imported without StorageDescriptor information
[ https://issues.apache.org/jira/browse/ATLAS-415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hemanth Yamijala updated ATLAS-415:
-----------------------------------
Attachment: ATLAS-415.patch
Attaching a patch for quick review.
The main fix is to call {{AtlasClient.updateEntity}} when we find that a table is already registered with Atlas. The rest of the changes support a unit test I wrote, {{HiveMetaStoreBridgeTest}}, plus some refactoring.
With this patch, hive-import works for the case I described in the bug and updates the created table properly.
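A minimal sketch of the register-or-update flow described above, assuming a simplified client. This is not the code in the attached patch: {{MetadataClient}}, {{InMemoryClient}} and {{TableImporter}} are hypothetical names, and their methods are not the real {{AtlasClient}} signatures.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the metadata server client; NOT the real AtlasClient API.
interface MetadataClient {
    String findTableId(String qualifiedName); // returns null if the table is not registered
    String createEntity(String qualifiedName, Map<String, Object> attributes);
    void updateEntity(String id, Map<String, Object> attributes);
}

// In-memory implementation so the sketch runs without a server.
class InMemoryClient implements MetadataClient {
    final Map<String, String> idsByName = new HashMap<>();
    final Map<String, Map<String, Object>> attributesById = new HashMap<>();
    int updateCalls = 0;

    public String findTableId(String qualifiedName) {
        return idsByName.get(qualifiedName);
    }

    public String createEntity(String qualifiedName, Map<String, Object> attributes) {
        String id = "id-" + (idsByName.size() + 1);
        idsByName.put(qualifiedName, id);
        attributesById.put(id, new HashMap<>(attributes));
        return id;
    }

    public void updateEntity(String id, Map<String, Object> attributes) {
        updateCalls++;
        attributesById.get(id).putAll(attributes);
    }
}

class TableImporter {
    private final MetadataClient client;

    TableImporter(MetadataClient client) {
        this.client = client;
    }

    // Create the table on first sight; otherwise update the existing entity
    // instead of assuming it already carries every attribute.
    String registerTable(String qualifiedName, Map<String, Object> attributes) {
        String id = client.findTableId(qualifiedName);
        if (id == null) {
            return client.createEntity(qualifiedName, attributes);
        }
        client.updateEntity(id, attributes);
        return id;
    }
}
```

In this sketch, registering the same qualified name twice keeps the entity id and merges the second set of attributes into the stored entity, which mirrors the case where a table first created with only required attributes is later imported with full details.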
A couple of points that I want to call out for discussion in review:
* This might add extra calls to the server even when there is no change at all to the entity. I guess this will have a performance impact, but I am unsure how we can detect on the client whether anything has changed.
* Currently, I am doing the update only for tables. Is this needed for DBs and partitions as well? (I guess yes.)
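On the first point, one possible approach (an assumption for discussion, not something the patch does) is to compare the attributes already held by the server with the ones about to be sent, and skip the update when nothing differs. The sketch below uses plain maps and a hypothetical {{EntityDiff}} helper. Note that it still needs a read from the server to obtain the existing attributes, so it only helps if a read is cheaper than an update.

```java
import java.util.Map;
import java.util.Objects;

// Hypothetical helper; attribute maps stand in for real entity objects.
class EntityDiff {
    // Returns true when any attribute we intend to send is missing on the
    // server or has a different value there. Attributes that exist only on
    // the server are ignored, since the import would not change them.
    static boolean needsUpdate(Map<String, Object> existing, Map<String, Object> desired) {
        for (Map.Entry<String, Object> entry : desired.entrySet()) {
            String key = entry.getKey();
            if (!existing.containsKey(key)
                    || !Objects.equals(existing.get(key), entry.getValue())) {
                return true;
            }
        }
        return false;
    }
}
```

A caller would invoke the update only when {{needsUpdate}} returns true. The comparison relies on {{equals}} for attribute values, so nested references would need a meaningful {{equals}} for this to work.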
> Hive import fails when importing a table that is already imported without StorageDescriptor information
> -------------------------------------------------------------------------------------------------------
>
> Key: ATLAS-415
> URL: https://issues.apache.org/jira/browse/ATLAS-415
> Project: Atlas
> Issue Type: Bug
> Reporter: Hemanth Yamijala
> Assignee: Hemanth Yamijala
> Attachments: ATLAS-415.patch
>
>
> I found this when testing patches that integrate Storm with Atlas, but I guess this may occur in other scenarios as well.
> To reproduce:
> * Run a Storm topology with Atlas Hook enabled that has a HiveBolt (requires patches for ATLAS-181 and friends).
> * Run hive-import following the above.
> The first step creates a Hive DB and table setting just the required attributes. Note that the StorageDescriptor is an optional attribute as per the Hive DataModel now.
> The second step fails with this exception:
> {code}
> Exception in thread "main" java.lang.NullPointerException
> at org.apache.atlas.hive.bridge.HiveMetaStoreBridge.getSDForTable(HiveMetaStoreBridge.java:345)
> at org.apache.atlas.hive.bridge.HiveMetaStoreBridge.importTables(HiveMetaStoreBridge.java:219)
> at org.apache.atlas.hive.bridge.HiveMetaStoreBridge.importDatabases(HiveMetaStoreBridge.java:104)
> at org.apache.atlas.hive.bridge.HiveMetaStoreBridge.importHiveMetadata(HiveMetaStoreBridge.java:96)
> at org.apache.atlas.hive.bridge.HiveMetaStoreBridge.main(HiveMetaStoreBridge.java:503)
> {code}
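The trace above points at reading the StorageDescriptor of a table entity that was registered without one. What line 345 actually does is not shown here, so the following is only a generic sketch of guarding an optional attribute. {{OptionalAttributes}} and the attribute key {{"sd"}} are assumptions for illustration.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical helper; a plain map stands in for the table entity.
class OptionalAttributes {
    // Returns the storage descriptor reference, or empty when the table was
    // registered without one, instead of dereferencing a null value.
    static Optional<String> storageDescriptorId(Map<String, Object> tableAttributes) {
        Object sd = tableAttributes.get("sd"); // "sd" is an assumed attribute name
        return sd == null ? Optional.empty() : Optional.of(sd.toString());
    }
}
```

A caller can then branch on {{isPresent()}} and fall back to creating or updating the descriptor when it is absent.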