Posted to dev@atlas.apache.org by "Suma Shivaprasad (JIRA)" <ji...@apache.org> on 2016/04/07 22:21:25 UTC

[jira] [Comment Edited] (ATLAS-528) support drop table, view

    [ https://issues.apache.org/jira/browse/ATLAS-528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15230983#comment-15230983 ] 

Suma Shivaprasad edited comment on ATLAS-528 at 4/7/16 8:21 PM:
----------------------------------------------------------------

[~yhemanth] Thanks for the review.

The handling of truncate table will create a single process with no inputs and a single table as output. We should validate whether this type of lineage view is OK for an end user. I agree, though, that capturing the information is important.

--> Raised ATLAS-653 to track this.

On the same note, is the query string captured correctly for the process? Can we enhance the truncate test to validate this?

--> HiveHookIT.validateProcess already checks this by querying with the exact same query string.

If a table / view being dropped is not present in Atlas, this would generate a 404 and we would lose the information. Maybe we can file a bug and track this for later?

--> Raised ATLAS-652 to track this.

We should also test what happens if a user executes a drop table if exists non_existent_table - do we get called?

--> I observed that the hook is called, and that it ignores the query since there are no outputs for it. Added a test case, HiveHookIT.testDropNonExistingTable, to test this.
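
For illustration, the "no outputs, so ignore" behaviour might look roughly like the sketch below. This is not the actual HiveHook code: the class DropTableGuardSketch and the method tablesToDelete are hypothetical names, and output entities are reduced to plain table-name strings.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical, simplified stand-in for a hook handler; not the real HiveHook.
final class DropTableGuardSketch {

    /** Returns the table names that would be reported as dropped. */
    static List<String> tablesToDelete(List<String> outputs) {
        // "drop table if exists" on a missing table produces no output
        // entities, so there is nothing to report and the call is a no-op.
        if (outputs == null || outputs.isEmpty()) {
            return Collections.emptyList();
        }
        // Otherwise every output entity is treated as a table to delete.
        return new ArrayList<>(outputs);
    }
}
```

With this guard, a call with an empty output list returns an empty result rather than attempting a delete for a table that was never registered.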

There are break statements missing after ALTERTABLE_LOCATION and DROPTABLE/DROPVIEW. While the latter are the last branches of the switch, it is still safer to add them, IMO.

--> Added.
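
To show why the missing break statements matter, here is a minimal sketch of the switch shape under discussion. The enum constants mirror the operation names mentioned in the review, but the class SwitchBreakSketch, the OTHER constant, and the action strings are assumptions made for illustration; this is not the patch itself.

```java
// Hypothetical sketch of the switch discussed in the review; not the patch code.
final class SwitchBreakSketch {

    // Operation names modelled on those in the review; OTHER is an assumption.
    enum HiveOp { ALTERTABLE_LOCATION, DROPTABLE, DROPVIEW, OTHER }

    /** Returns the actions taken for an operation, as a ';'-terminated list. */
    static String handle(HiveOp op) {
        StringBuilder actions = new StringBuilder();
        switch (op) {
            case ALTERTABLE_LOCATION:
                actions.append("updateLocation;");
                break; // without this, execution falls through into the drop branch
            case DROPTABLE:
            case DROPVIEW:
                actions.append("deleteEntity;");
                break; // last branch today; guards against cases appended later
        }
        return actions.toString();
    }
}
```

Without the first break, handle(HiveOp.ALTERTABLE_LOCATION) would return "updateLocation;deleteEntity;", meaning a location change would also trigger the delete path.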



> support drop table, view
> ------------------------
>
>                 Key: ATLAS-528
>                 URL: https://issues.apache.org/jira/browse/ATLAS-528
>             Project: Atlas
>          Issue Type: Sub-task
>    Affects Versions: 0.7-incubating
>            Reporter: Suma Shivaprasad
>            Assignee: Suma Shivaprasad
>             Fix For: 0.7-incubating
>
>         Attachments: ATLAS-528.1.patch, ATLAS-528.patch
>
>
> Drop table and view require soft deletes. The reason is that whenever a table is dropped, it may also have an associated lineage, which consists of a hive_process with N input tables and an output table. If the table is dropped, the lineage edge is also dropped, resulting in incorrect lineage history.
> With soft deletes, the expected behaviour is to change the table status to "deleted" and, when the table is recreated through a create table statement, to create another vertex/entity for that table with the new state. Also, the lineage for this newly recreated table should be a new hive_process and should not reuse the existing entity/vertex, even though the hive statement for that process is the same.
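
The soft-delete behaviour described in the issue can be sketched roughly as follows. This is an in-memory toy model only: SoftDeleteSketch, TableEntity, Status, and the integer ids are hypothetical names that stand in for Atlas entities/vertices, and none of this reflects the actual Atlas API.

```java
import java.util.ArrayList;
import java.util.List;

// Toy in-memory model of soft deletes; all names are hypothetical.
final class SoftDeleteSketch {

    enum Status { ACTIVE, DELETED }

    static final class TableEntity {
        final int id;
        final String qualifiedName;
        Status status = Status.ACTIVE;

        TableEntity(int id, String qualifiedName) {
            this.id = id;
            this.qualifiedName = qualifiedName;
        }
    }

    private final List<TableEntity> entities = new ArrayList<>();

    /** Always creates a fresh entity, even if a deleted one has the same name. */
    TableEntity createTable(String qualifiedName) {
        TableEntity entity = new TableEntity(entities.size() + 1, qualifiedName);
        entities.add(entity);
        return entity;
    }

    /** Marks the active entity as deleted instead of removing it. */
    void dropTable(String qualifiedName) {
        for (TableEntity entity : entities) {
            if (entity.qualifiedName.equals(qualifiedName) && entity.status == Status.ACTIVE) {
                entity.status = Status.DELETED;
            }
        }
    }

    /** Returns every entity ever created for the name, deleted ones included. */
    List<TableEntity> history(String qualifiedName) {
        List<TableEntity> result = new ArrayList<>();
        for (TableEntity entity : entities) {
            if (entity.qualifiedName.equals(qualifiedName)) {
                result.add(entity);
            }
        }
        return result;
    }
}
```

Because the dropped entity is kept with status DELETED, any lineage that pointed at it stays intact, while the recreated table gets its own entity that new lineage can attach to.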


