Posted to issues@hive.apache.org by "Igor Kryvenko (JIRA)" <ji...@apache.org> on 2018/11/06 21:40:00 UTC
[jira] [Commented] (HIVE-20581) Eliminate rename() from full CRUD
transactional tables
[ https://issues.apache.org/jira/browse/HIVE-20581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16677325#comment-16677325 ]
Igor Kryvenko commented on HIVE-20581:
--------------------------------------
I've been investigating this JIRA, and some of the behavior looks strange to me.
Suppose we have t1, a full ACID table, and t2, an insert-only table.
If I run {{insert overwrite table t1 select * from another_table_1 union all select * from another_table_2}}, the files are moved inside {{MoveTask}}.
But for {{insert overwrite table t2 select * from another_table_1 union all select * from another_table_2}}, all files are moved inside the MR job, and the operations in {{MoveTask}} only touch metadata in the metastore.
Is this the correct behavior?
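For reference, a minimal sketch of how the two cases could be reproduced. The column names and types are hypothetical, the source tables are assumed to be plain non-transactional tables, and the session is assumed to have ACID enabled (for example {{hive.support.concurrency=true}} and {{hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager}}); the actual setup may differ.

```sql
-- Hypothetical schema: the columns (id, val) are assumptions for illustration only.
create table another_table_1 (id int, val string);
create table another_table_2 (id int, val string);

-- t1: full ACID (full CRUD) transactional table; full ACID requires ORC storage.
create table t1 (id int, val string)
stored as orc
tblproperties ('transactional'='true');

-- t2: insert-only transactional table.
create table t2 (id int, val string)
tblproperties ('transactional'='true',
               'transactional_properties'='insert_only');

-- Case 1: writing to the full ACID table.
-- Observed: the files are moved inside MoveTask.
insert overwrite table t1
select * from another_table_1
union all
select * from another_table_2;

-- Case 2: writing to the insert-only table.
-- Observed: the files are moved inside the MR job; MoveTask only updates metastore metadata.
insert overwrite table t2
select * from another_table_1
union all
select * from another_table_2;
```

Running {{explain}} on each insert should show where {{MoveTask}} appears in the plan for the two table types.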
cc [~ekoifman]
> Eliminate rename() from full CRUD transactional tables
> ------------------------------------------------------
>
> Key: HIVE-20581
> URL: https://issues.apache.org/jira/browse/HIVE-20581
> Project: Hive
> Issue Type: Improvement
> Components: Transactions
> Reporter: Eugene Koifman
> Assignee: Emily lozano d1751740 551913586
> Priority: Major
>
> The {{MoveTask}} in a query writing to a full CRUD transactional table still performs a {{FileSystem.rename()}}. Full CRUD should follow the insert-only transactional table implementation and write directly to delta_x_x in the partition dir. If the txn fails, this delta will be marked aborted and will not be read.
> There are several places that rely on this rename. For example, support for {{Insert ... select ... Union All ... Select }}, which creates multiple dirs, one for each leg of the union.
> Others?