Posted to dev@atlas.apache.org by Suma Shivaprasad <su...@gmail.com> on 2016/04/06 01:54:19 UTC
Review Request 45784: Hive Hook - Support tracking lineage for External
Tables (Create/Alter), Load, Import, Export
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/45784/
-----------------------------------------------------------
Review request for atlas.
Bugs: ATLAS-527
https://issues.apache.org/jira/browse/ATLAS-527
Repository: atlas
Description
-------
Added support for tracking lineage between HDFS paths and Hive tables in the following operations:
a. LOAD (at table and partition level) - input is an HDFS path and output is the table. Even though we don't create partition entities, we still track lineage at the table level for partitions. This could be an issue if there is a large number of partition queries, which is not addressed in this JIRA (see https://issues.apache.org/jira/browse/ATLAS-619). Refer to https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML
b. IMPORT, EXPORT to and from HDFS paths - refer to https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML
c. CREATE EXTERNAL TABLE - input is the HDFS path and output is the table.
d. ALTER TABLE LOCATION for an external table - input is the new HDFS path and output is the table.
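The direction of lineage in the four cases above can be summarized in a minimal sketch. This is an illustration only, not the HiveHook code: LineageSketch, Op, Edge and edgeFor are hypothetical names, and the EXPORT direction (table in, HDFS path out) is my reading of item b.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical illustration; none of these types are Atlas or Hive classes.
class LineageSketch {
    enum Op { LOAD, IMPORT, EXPORT, CREATE_EXTERNAL_TABLE, ALTER_TABLE_LOCATION }

    // One lineage edge: data flows from inputs to outputs.
    static final class Edge {
        final List<String> inputs;
        final List<String> outputs;

        Edge(String input, String output) {
            this.inputs = Collections.singletonList(input);
            this.outputs = Collections.singletonList(output);
        }
    }

    static Edge edgeFor(Op op, String hdfsPath, String table) {
        if (op == Op.EXPORT) {
            // EXPORT writes the table's data out to an HDFS path.
            return new Edge(table, hdfsPath);
        }
        // LOAD, IMPORT, CREATE EXTERNAL TABLE and ALTER TABLE LOCATION
        // all take an HDFS path as input and the table as output.
        return new Edge(hdfsPath, table);
    }

    public static void main(String[] args) {
        Edge e = edgeFor(Op.LOAD, "hdfs://nn/data/in", "default.t1");
        System.out.println(e.inputs + " -> " + e.outputs);
    }
}
```

For partitioned loads the table argument would still be the table name, since lineage is tracked at table level.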
Diffs
-----
addons/hdfs-model/src/main/java/org/apache/atlas/fs/model/FSDataModelGenerator.java 555d565
addons/hdfs-model/src/main/scala/org/apache/atlas/fs/model/FSDataModel.scala c964f73
addons/hive-bridge/pom.xml e125f18
addons/hive-bridge/src/main/java/org/apache/atlas/hive/bridge/HiveMetaStoreBridge.java 3a802d7
addons/hive-bridge/src/main/java/org/apache/atlas/hive/hook/HiveHook.java 68e32ff
addons/hive-bridge/src/test/java/org/apache/atlas/hive/hook/HiveHookIT.java e17afb8
addons/storm-bridge/src/main/java/org/apache/atlas/storm/hook/StormAtlasHook.java 5665856
repository/src/main/java/org/apache/atlas/services/DefaultMetadataService.java 0a04c5f
repository/src/main/java/org/apache/atlas/services/ReservedTypesRegistrar.java 430bb6b
Diff: https://reviews.apache.org/r/45784/diff/
Testing
-------
Thanks,
Suma Shivaprasad
Re: Review Request 45784: Hive Hook - Support tracking lineage for
External Tables (Create/Alter), Load, Import, Export
Posted by Shwetha GS <ss...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/45784/#review127565
-----------------------------------------------------------
Ship it!
- Shwetha GS
In reply to the review request as updated by Suma Shivaprasad on April 6, 2016, 7:08 p.m.
Re: Review Request 45784: Hive Hook - Support tracking lineage for
External Tables (Create/Alter), Load, Import, Export
Posted by Suma Shivaprasad <su...@gmail.com>.
(Updated April 6, 2016, 7:08 p.m.)
Review request for atlas.
Changes
-------
Removed clusterName attribute since this may be incorrect
Bugs: ATLAS-527
https://issues.apache.org/jira/browse/ATLAS-527
Repository: atlas
Description
-------
Added support for tracking lineage between HDFS paths and Hive tables in the following operations:
a. LOAD (at table and partition level) - input is an HDFS path and output is the table. Even though we don't create partition entities, we still track lineage at the table level for partitions. This could be an issue if there is a large number of partition queries, which is not addressed in this JIRA (see https://issues.apache.org/jira/browse/ATLAS-619). Refer to https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML
b. IMPORT, EXPORT to and from HDFS paths - refer to https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML
c. CREATE EXTERNAL TABLE - input is the HDFS path and output is the table.
d. ALTER TABLE LOCATION for an external table - input is the new HDFS path and output is the table.
Also changed the ordering of model registration: models are now sorted by modifiedTime to ensure they are registered in the correct order.
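A minimal sketch of what sorting models by modifiedTime before registration could look like. This is not the ReservedTypesRegistrar code: ModelOrderSketch, ModelFile and registrationOrder are hypothetical names, the model names are made up, and ascending order (oldest first) is an assumption.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration; ModelFile is not an Atlas class.
class ModelOrderSketch {
    static final class ModelFile {
        final String name;
        final long modifiedTime; // e.g. epoch millis of the file's last modification

        ModelFile(String name, long modifiedTime) {
            this.name = name;
            this.modifiedTime = modifiedTime;
        }
    }

    // Returns model names in registration order, assumed here to be oldest first.
    static List<String> registrationOrder(List<ModelFile> files) {
        List<ModelFile> sorted = new ArrayList<>(files); // leave the caller's list untouched
        sorted.sort(Comparator.comparingLong((ModelFile f) -> f.modifiedTime));
        List<String> names = new ArrayList<>();
        for (ModelFile f : sorted) {
            names.add(f.name);
        }
        return names;
    }

    public static void main(String[] args) {
        List<ModelFile> files = new ArrayList<>();
        files.add(new ModelFile("hive_model", 200L));
        files.add(new ModelFile("fs_model", 100L));
        System.out.println(registrationOrder(files));
    }
}
```

With this ordering, a model written later (for example one that refers to types from an earlier model) would be registered after the model it depends on.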
Diffs (updated)
-----
addons/hdfs-model/src/main/java/org/apache/atlas/fs/model/FSDataModelGenerator.java 555d565
addons/hdfs-model/src/main/scala/org/apache/atlas/fs/model/FSDataModel.scala c964f73
addons/hive-bridge/src/main/java/org/apache/atlas/hive/bridge/HiveMetaStoreBridge.java 3a802d7
addons/hive-bridge/src/main/java/org/apache/atlas/hive/hook/HiveHook.java 68e32ff
addons/hive-bridge/src/test/java/org/apache/atlas/hive/hook/HiveHookIT.java e17afb8
addons/storm-bridge/src/main/java/org/apache/atlas/storm/hook/StormAtlasHook.java 5665856
client/src/main/java/org/apache/atlas/AtlasClient.java c3b4ba9
repository/src/main/java/org/apache/atlas/services/ReservedTypesRegistrar.java 430bb6b
Diff: https://reviews.apache.org/r/45784/diff/
Testing
-------
Added tests in HiveHookIT
Thanks,
Suma Shivaprasad
Re: Review Request 45784: Hive Hook - Support tracking lineage for
External Tables (Create/Alter), Load, Import, Export
Posted by Suma Shivaprasad <su...@gmail.com>.
(Updated April 6, 2016, 6:10 p.m.)
Review request for atlas.
Bugs: ATLAS-527
https://issues.apache.org/jira/browse/ATLAS-527
Repository: atlas
Description
-------
Added support for tracking lineage between HDFS paths and Hive tables in the following operations:
a. LOAD (at table and partition level) - input is an HDFS path and output is the table. Even though we don't create partition entities, we still track lineage at the table level for partitions. This could be an issue if there is a large number of partition queries, which is not addressed in this JIRA (see https://issues.apache.org/jira/browse/ATLAS-619). Refer to https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML
b. IMPORT, EXPORT to and from HDFS paths - refer to https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML
c. CREATE EXTERNAL TABLE - input is the HDFS path and output is the table.
d. ALTER TABLE LOCATION for an external table - input is the new HDFS path and output is the table.
Also changed the ordering of model registration: models are now sorted by modifiedTime to ensure they are registered in the correct order.
Diffs (updated)
-----
addons/hdfs-model/src/main/java/org/apache/atlas/fs/model/FSDataModelGenerator.java 555d565
addons/hdfs-model/src/main/scala/org/apache/atlas/fs/model/FSDataModel.scala c964f73
addons/hive-bridge/src/main/java/org/apache/atlas/hive/bridge/HiveMetaStoreBridge.java 3a802d7
addons/hive-bridge/src/main/java/org/apache/atlas/hive/hook/HiveHook.java 68e32ff
addons/hive-bridge/src/test/java/org/apache/atlas/hive/hook/HiveHookIT.java e17afb8
addons/storm-bridge/src/main/java/org/apache/atlas/storm/hook/StormAtlasHook.java 5665856
client/src/main/java/org/apache/atlas/AtlasClient.java c3b4ba9
repository/src/main/java/org/apache/atlas/services/ReservedTypesRegistrar.java 430bb6b
Diff: https://reviews.apache.org/r/45784/diff/
Testing
-------
Added tests in HiveHookIT
Thanks,
Suma Shivaprasad
Re: Review Request 45784: Hive Hook - Support tracking lineage for
External Tables (Create/Alter), Load, Import, Export
Posted by Suma Shivaprasad <su...@gmail.com>.
(Updated April 6, 2016, 6:08 p.m.)
Review request for atlas.
Changes
-------
Removed extra constants from DMS
Bugs: ATLAS-527
https://issues.apache.org/jira/browse/ATLAS-527
Repository: atlas
Description
-------
Added support for tracking lineage between HDFS paths and Hive tables in the following operations:
a. LOAD (at table and partition level) - input is an HDFS path and output is the table. Even though we don't create partition entities, we still track lineage at the table level for partitions. This could be an issue if there is a large number of partition queries, which is not addressed in this JIRA (see https://issues.apache.org/jira/browse/ATLAS-619). Refer to https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML
b. IMPORT, EXPORT to and from HDFS paths - refer to https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML
c. CREATE EXTERNAL TABLE - input is the HDFS path and output is the table.
d. ALTER TABLE LOCATION for an external table - input is the new HDFS path and output is the table.
Also changed the ordering of model registration: models are now sorted by modifiedTime to ensure they are registered in the correct order.
Diffs (updated)
-----
addons/hdfs-model/src/main/java/org/apache/atlas/fs/model/FSDataModelGenerator.java 555d565
addons/hdfs-model/src/main/scala/org/apache/atlas/fs/model/FSDataModel.scala c964f73
addons/hive-bridge/src/main/java/org/apache/atlas/hive/bridge/HiveMetaStoreBridge.java 3a802d7
addons/hive-bridge/src/main/java/org/apache/atlas/hive/hook/HiveHook.java 68e32ff
addons/hive-bridge/src/test/java/org/apache/atlas/hive/hook/HiveHookIT.java e17afb8
addons/storm-bridge/src/main/java/org/apache/atlas/storm/hook/StormAtlasHook.java 5665856
client/src/main/java/org/apache/atlas/AtlasClient.java c3b4ba9
repository/src/main/java/org/apache/atlas/services/ReservedTypesRegistrar.java 430bb6b
Diff: https://reviews.apache.org/r/45784/diff/
Testing
-------
Added tests in HiveHookIT
Thanks,
Suma Shivaprasad
Re: Review Request 45784: Hive Hook - Support tracking lineage for
External Tables (Create/Alter), Load, Import, Export
Posted by Suma Shivaprasad <su...@gmail.com>.
(Updated April 6, 2016, 5:44 p.m.)
Review request for atlas.
Changes
-------
Fixed review comments
Bugs: ATLAS-527
https://issues.apache.org/jira/browse/ATLAS-527
Repository: atlas
Description
-------
Added support for tracking lineage between HDFS paths and Hive tables in the following operations:
a. LOAD (at table and partition level) - input is an HDFS path and output is the table. Even though we don't create partition entities, we still track lineage at the table level for partitions. This could be an issue if there is a large number of partition queries, which is not addressed in this JIRA (see https://issues.apache.org/jira/browse/ATLAS-619). Refer to https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML
b. IMPORT, EXPORT to and from HDFS paths - refer to https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML
c. CREATE EXTERNAL TABLE - input is the HDFS path and output is the table.
d. ALTER TABLE LOCATION for an external table - input is the new HDFS path and output is the table.
Also changed the ordering of model registration: models are now sorted by modifiedTime to ensure they are registered in the correct order.
Diffs (updated)
-----
addons/hdfs-model/src/main/java/org/apache/atlas/fs/model/FSDataModelGenerator.java 555d565
addons/hdfs-model/src/main/scala/org/apache/atlas/fs/model/FSDataModel.scala c964f73
addons/hive-bridge/src/main/java/org/apache/atlas/hive/bridge/HiveMetaStoreBridge.java 3a802d7
addons/hive-bridge/src/main/java/org/apache/atlas/hive/hook/HiveHook.java 68e32ff
addons/hive-bridge/src/test/java/org/apache/atlas/hive/hook/HiveHookIT.java e17afb8
addons/storm-bridge/src/main/java/org/apache/atlas/storm/hook/StormAtlasHook.java 5665856
client/src/main/java/org/apache/atlas/AtlasClient.java c3b4ba9
repository/src/main/java/org/apache/atlas/services/DefaultMetadataService.java 0a04c5f
repository/src/main/java/org/apache/atlas/services/ReservedTypesRegistrar.java 430bb6b
Diff: https://reviews.apache.org/r/45784/diff/
Testing
-------
Added tests in HiveHookIT
Thanks,
Suma Shivaprasad
Re: Review Request 45784: Hive Hook - Support tracking lineage for
External Tables (Create/Alter), Load, Import, Export
Posted by Suma Shivaprasad <su...@gmail.com>.
(Updated April 6, 2016, 5:03 p.m.)
Review request for atlas.
Bugs: ATLAS-527
https://issues.apache.org/jira/browse/ATLAS-527
Repository: atlas
Description
-------
Added support for tracking lineage between HDFS paths and Hive tables in the following operations:
a. LOAD (at table and partition level) - input is an HDFS path and output is the table. Even though we don't create partition entities, we still track lineage at the table level for partitions. This could be an issue if there is a large number of partition queries, which is not addressed in this JIRA (see https://issues.apache.org/jira/browse/ATLAS-619). Refer to https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML
b. IMPORT, EXPORT to and from HDFS paths - refer to https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML
c. CREATE EXTERNAL TABLE - input is the HDFS path and output is the table.
d. ALTER TABLE LOCATION for an external table - input is the new HDFS path and output is the table.
Also changed the ordering of model registration: models are now sorted by modifiedTime to ensure they are registered in the correct order.
Diffs (updated)
-----
addons/hdfs-model/src/main/java/org/apache/atlas/fs/model/FSDataModelGenerator.java 555d565
addons/hdfs-model/src/main/scala/org/apache/atlas/fs/model/FSDataModel.scala c964f73
addons/hive-bridge/src/main/java/org/apache/atlas/hive/bridge/HiveMetaStoreBridge.java 3a802d7
addons/hive-bridge/src/main/java/org/apache/atlas/hive/hook/HiveHook.java 68e32ff
addons/hive-bridge/src/test/java/org/apache/atlas/hive/hook/HiveHookIT.java e17afb8
addons/storm-bridge/src/main/java/org/apache/atlas/storm/hook/StormAtlasHook.java 5665856
client/src/main/java/org/apache/atlas/AtlasClient.java c3b4ba9
repository/src/main/java/org/apache/atlas/services/ReservedTypesRegistrar.java 430bb6b
Diff: https://reviews.apache.org/r/45784/diff/
Testing
-------
Added tests in HiveHookIT
Thanks,
Suma Shivaprasad
Re: Review Request 45784: Hive Hook - Support tracking lineage for
External Tables (Create/Alter), Load, Import, Export
Posted by Suma Shivaprasad <su...@gmail.com>.
> On April 6, 2016, 11:21 a.m., Shwetha GS wrote:
> > addons/hive-bridge/src/main/java/org/apache/atlas/hive/hook/HiveHook.java, line 519
> > <https://reviews.apache.org/r/45784/diff/1/?file=1327185#file1327185line519>
> >
> > Aren't there cases where input/output is local fs, for example load from local path?
I am filtering out the LOCAL_DIR cases by checking getType == DFS_DIR, and there are also test cases for LOAD from a local dir and INSERT into a local dir, which confirm that this case is addressed. You are suggesting we ignore local dirs, right?
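A minimal sketch of the kind of check described in this reply. It is not the HiveHook code: the Type enum below is a local stand-in that mirrors a few values of Hive's Entity.Type so the example compiles on its own, and DfsDirFilterSketch, Entity, getLocation and hdfsPaths are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration; Entity and Type here are local stand-ins, not Hive classes.
class DfsDirFilterSketch {
    enum Type { TABLE, PARTITION, DFS_DIR, LOCAL_DIR }

    static final class Entity {
        private final Type type;
        private final String location;

        Entity(Type type, String location) {
            this.type = type;
            this.location = location;
        }

        Type getType() { return type; }
        String getLocation() { return location; }
    }

    // Keeps only HDFS directories; local directories and other entity types are skipped.
    static List<String> hdfsPaths(List<Entity> entities) {
        List<String> paths = new ArrayList<>();
        for (Entity e : entities) {
            if (e.getType() == Type.DFS_DIR) {
                paths.add(e.getLocation());
            }
        }
        return paths;
    }

    public static void main(String[] args) {
        List<Entity> inputs = Arrays.asList(
                new Entity(Type.LOCAL_DIR, "file:/tmp/in"),
                new Entity(Type.DFS_DIR, "hdfs://nn/data/in"));
        System.out.println(hdfsPaths(inputs));
    }
}
```

With a check like this, a LOAD from a local path contributes no path entity to the lineage.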
> On April 6, 2016, 11:21 a.m., Shwetha GS wrote:
> > addons/hive-bridge/src/main/java/org/apache/atlas/hive/hook/HiveHook.java, line 558
> > <https://reviews.apache.org/r/45784/diff/1/?file=1327185#file1327185line558>
> >
> > This should be part of HiveMetaStoreBridge and should be used in import-hive as well?
> >
> > Because this lineage will be created in import-hive, the process name should be just the table name for create table, so that it's created just once.
Initially this was my thought too. However, I am not sure how to get the query for the create table itself. I checked how SHOW CREATE TABLE constructs it: the statement is built on the fly and is not stored in the metadata. Also, if we don't address this, it will look different from the other lineages, where we always have the query in the process. So I did not want to address this now, until we figure out how to construct the query itself. Created a separate issue to track this - https://issues.apache.org/jira/browse/ATLAS-642
- Suma
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/45784/#review127310
-----------------------------------------------------------
In reply to the review request as updated by Suma Shivaprasad on April 5, 2016, 11:58 p.m.
Re: Review Request 45784: Hive Hook - Support tracking lineage for
External Tables (Create/Alter), Load, Import, Export
Posted by Suma Shivaprasad <su...@gmail.com>.
> On April 6, 2016, 11:21 a.m., Shwetha GS wrote:
> > addons/hive-bridge/src/main/java/org/apache/atlas/hive/bridge/HiveMetaStoreBridge.java, line 480
> > <https://reviews.apache.org/r/45784/diff/1/?file=1327184#file1327184line480>
> >
> > We need to fix the clusterName mess later - can't pick up the HDFS clusterName from the Hive conf
Have removed it for now since we don't know the right clusterName.
- Suma
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/45784/#review127310
-----------------------------------------------------------
In reply to the review request as updated by Suma Shivaprasad on April 6, 2016, 7:08 p.m.
Re: Review Request 45784: Hive Hook - Support tracking lineage for
External Tables (Create/Alter), Load, Import, Export
Posted by Shwetha GS <ss...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/45784/#review127310
-----------------------------------------------------------
addons/hive-bridge/src/main/java/org/apache/atlas/hive/bridge/HiveMetaStoreBridge.java (line 472)
<https://reviews.apache.org/r/45784/#comment190571>
We need to fix the clusterName mess later - can't pick up the HDFS clusterName from the Hive conf
addons/hive-bridge/src/main/java/org/apache/atlas/hive/hook/HiveHook.java (line 220)
<https://reviews.apache.org/r/45784/#comment190572>
The earlier one was more readable. Can you use setter methods instead of this long constructor?
addons/hive-bridge/src/main/java/org/apache/atlas/hive/hook/HiveHook.java (line 454)
<https://reviews.apache.org/r/45784/#comment190574>
Aren't there cases where input/output is local fs, for example load from local path?
addons/hive-bridge/src/main/java/org/apache/atlas/hive/hook/HiveHook.java (line 493)
<https://reviews.apache.org/r/45784/#comment190573>
This should be part of HiveMetaStoreBridge and should be used in import-hive as well?
Because this lineage will be created in import-hive, the process name should be just the table name for create table, so that it's created just once.
repository/src/main/java/org/apache/atlas/services/DefaultMetadataService.java (line 165)
<https://reviews.apache.org/r/45784/#comment190575>
Use these in Process type definition.
Actually, these should be in AtlasClient?
- Shwetha GS
In reply to the review request as updated by Suma Shivaprasad on April 5, 2016, 11:58 p.m.
Re: Review Request 45784: Hive Hook - Support tracking lineage for
External Tables (Create/Alter), Load, Import, Export
Posted by Suma Shivaprasad <su...@gmail.com>.
(Updated April 5, 2016, 11:58 p.m.)
Review request for atlas.
Bugs: ATLAS-527
https://issues.apache.org/jira/browse/ATLAS-527
Repository: atlas
Description (updated)
-------
Added support for tracking lineage between HDFS paths and Hive tables in the following operations:
a. LOAD (at table and partition level) - input is an HDFS path and output is the table. Even though we don't create partition entities, we still track lineage at the table level for partitions. This could be an issue if there is a large number of partition queries, which is not addressed in this JIRA (see https://issues.apache.org/jira/browse/ATLAS-619). Refer to https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML
b. IMPORT, EXPORT to and from HDFS paths - refer to https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML
c. CREATE EXTERNAL TABLE - input is the HDFS path and output is the table.
d. ALTER TABLE LOCATION for an external table - input is the new HDFS path and output is the table.
Also changed the ordering of model registration: models are now sorted by modifiedTime to ensure they are registered in the correct order.
Diffs
-----
addons/hdfs-model/src/main/java/org/apache/atlas/fs/model/FSDataModelGenerator.java 555d565
addons/hdfs-model/src/main/scala/org/apache/atlas/fs/model/FSDataModel.scala c964f73
addons/hive-bridge/pom.xml e125f18
addons/hive-bridge/src/main/java/org/apache/atlas/hive/bridge/HiveMetaStoreBridge.java 3a802d7
addons/hive-bridge/src/main/java/org/apache/atlas/hive/hook/HiveHook.java 68e32ff
addons/hive-bridge/src/test/java/org/apache/atlas/hive/hook/HiveHookIT.java e17afb8
addons/storm-bridge/src/main/java/org/apache/atlas/storm/hook/StormAtlasHook.java 5665856
repository/src/main/java/org/apache/atlas/services/DefaultMetadataService.java 0a04c5f
repository/src/main/java/org/apache/atlas/services/ReservedTypesRegistrar.java 430bb6b
Diff: https://reviews.apache.org/r/45784/diff/
Testing
-------
Added tests in HiveHookIT
Thanks,
Suma Shivaprasad
Re: Review Request 45784: Hive Hook - Support tracking lineage for
External Tables (Create/Alter), Load, Import, Export
Posted by Suma Shivaprasad <su...@gmail.com>.
(Updated April 5, 2016, 11:54 p.m.)
Review request for atlas.
Bugs: ATLAS-527
https://issues.apache.org/jira/browse/ATLAS-527
Repository: atlas
Description
-------
Added support for tracking lineage between HDFS paths and Hive tables in the following operations:
a. LOAD (at table and partition level) - input is an HDFS path and output is the table. Even though we don't create partition entities, we still track lineage at the table level for partitions. This could be an issue if there is a large number of partition queries, which is not addressed in this JIRA (see https://issues.apache.org/jira/browse/ATLAS-619). Refer to https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML
b. IMPORT, EXPORT to and from HDFS paths - refer to https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML
c. CREATE EXTERNAL TABLE - input is the HDFS path and output is the table.
d. ALTER TABLE LOCATION for an external table - input is the new HDFS path and output is the table.
Diffs
-----
addons/hdfs-model/src/main/java/org/apache/atlas/fs/model/FSDataModelGenerator.java 555d565
addons/hdfs-model/src/main/scala/org/apache/atlas/fs/model/FSDataModel.scala c964f73
addons/hive-bridge/pom.xml e125f18
addons/hive-bridge/src/main/java/org/apache/atlas/hive/bridge/HiveMetaStoreBridge.java 3a802d7
addons/hive-bridge/src/main/java/org/apache/atlas/hive/hook/HiveHook.java 68e32ff
addons/hive-bridge/src/test/java/org/apache/atlas/hive/hook/HiveHookIT.java e17afb8
addons/storm-bridge/src/main/java/org/apache/atlas/storm/hook/StormAtlasHook.java 5665856
repository/src/main/java/org/apache/atlas/services/DefaultMetadataService.java 0a04c5f
repository/src/main/java/org/apache/atlas/services/ReservedTypesRegistrar.java 430bb6b
Diff: https://reviews.apache.org/r/45784/diff/
Testing (updated)
-------
Added tests in HiveHookIT
Thanks,
Suma Shivaprasad