Posted to issues@hive.apache.org by "Zoltan Haindrich (Jira)" <ji...@apache.org> on 2020/07/06 12:35:00 UTC
[jira] [Commented] (HIVE-22301) Hive lineage is not generated for
insert overwrite queries on partitioned tables
[ https://issues.apache.org/jira/browse/HIVE-22301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17151990#comment-17151990 ]
Zoltan Haindrich commented on HIVE-22301:
-----------------------------------------
The [removal of these entities|https://github.com/apache/hive/blob/ae6976b0b0a4b04ea1b2ee5906a3980ce7e1cddd/ql/src/java/org/apache/hadoop/hive/ql/Executor.java#L480] seems to have been introduced intentionally in HIVE-1781.
It looks like they wanted to avoid changing the "postexec" hooks...
> Hive lineage is not generated for insert overwrite queries on partitioned tables
> --------------------------------------------------------------------------------
>
> Key: HIVE-22301
> URL: https://issues.apache.org/jira/browse/HIVE-22301
> Project: Hive
> Issue Type: Bug
> Components: lineage
> Affects Versions: 3.1.2
> Reporter: Sidharth Kumar Mishra
> Assignee: Zoltan Haindrich
> Priority: Major
> Attachments: ScreenShot HookContext.png, ScreenShot RunPostExecHook.png, ScreenShot runBeforeExecution.png
>
>
> Problem: When I run the queries below, the last query should produce the proper Hive lineage info (through HookContext) from table_b to table_t, but it does not.
> * Create table table_t (id int) partitioned by (dob date);
> * Create table table_b (id int) partitioned by (dob date);
> * from table_b a insert overwrite table table_t select a.id,a.dob;
> Note: this issue is not seen for a CTAS query from a partitioned table. It is seen only for insert queries such as insert into <table> select * from <table>, and for queries like the one above.
>
> Technical Observations:
> The HookContext (passed from hive.ql.Driver to the Atlas Hive hook through the hookRunner.runPostExecHooks call) contains no outputs. See the screenshot from IntelliJ below.
> !ScreenShot RunPostExecHook.png|width=728,height=427!
>
> I found that the PrivateHookContext is initially created with the proper outputs value, as shown below:
> !ScreenShot HookContext.png|width=714,height=541!
> The same is passed properly to runBeforeExecutionHook as shown below:
> !ScreenShot runBeforeExecution.png|width=719,height=620!
>
> Later, when the HookContext is passed to runPostExecHooks, no outputs are populated. Kindly check the reason and let me know if you need any further information from my end.
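
To illustrate how the reported symptom and the removal step mentioned in the comment could fit together, here is a toy model in plain Java. It is not Hive code: ToyOutput, ToyContext, the `complete` flag, and the pruning rule are hypothetical stand-ins, and the assumption is only that some entities are removed from a shared outputs set after the pre-execution hook has run and before the post-execution hook runs.

```java
import java.util.LinkedHashSet;
import java.util.Set;

public class HookOutputsSketch {

    // Hypothetical stand-in for an output entity (not Hive's WriteEntity).
    static final class ToyOutput {
        final String name;
        final boolean complete; // assumed flag that the pruning step filters on

        ToyOutput(String name, boolean complete) {
            this.name = name;
            this.complete = complete;
        }
    }

    // Hypothetical stand-in for a hook context holding one shared, mutable set.
    static final class ToyContext {
        final Set<ToyOutput> outputs = new LinkedHashSet<>();
    }

    // Returns {outputs seen "before execution", outputs seen "after execution"}.
    static int[] run(ToyContext ctx) {
        int seenByPreHook = ctx.outputs.size();       // pre hook observes the full set
        ctx.outputs.removeIf(o -> !o.complete);       // assumed pruning between the hooks
        int seenByPostHook = ctx.outputs.size();      // post hook observes what is left
        return new int[] {seenByPreHook, seenByPostHook};
    }

    public static void main(String[] args) {
        ToyContext ctx = new ToyContext();
        // One output that the assumed pruning rule will drop.
        ctx.outputs.add(new ToyOutput("table_t", false));
        int[] seen = run(ctx);
        System.out.println("pre=" + seen[0] + " post=" + seen[1]);
    }
}
```

In this model the pre hook sees one output and the post hook sees none, which matches the screenshots described above. Whether Hive's actual removal works this way should be checked against the linked source.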
--
This message was sent by Atlassian Jira
(v8.3.4#803005)