Posted to issues@hive.apache.org by "Eugene Koifman (JIRA)" <ji...@apache.org> on 2016/05/19 20:04:12 UTC

[jira] [Resolved] (HIVE-13622) WriteSet tracking optimizations

     [ https://issues.apache.org/jira/browse/HIVE-13622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eugene Koifman resolved HIVE-13622.
-----------------------------------
       Resolution: Fixed
    Fix Version/s: 2.1.0
                   1.3.0

> WriteSet tracking optimizations
> -------------------------------
>
>                 Key: HIVE-13622
>                 URL: https://issues.apache.org/jira/browse/HIVE-13622
>             Project: Hive
>          Issue Type: Bug
>          Components: Transactions
>    Affects Versions: 1.3.0, 2.1.0
>            Reporter: Eugene Koifman
>            Assignee: Eugene Koifman
>            Priority: Critical
>             Fix For: 1.3.0, 2.1.0
>
>         Attachments: HIVE-13622.2.patch, HIVE-13622.3.patch, HIVE-13622.4.patch, HIVE-13622.branch-1.patch
>
>
> HIVE-13395 solves the lost update problem with some inefficiencies.
> 1. TxnHandler.OperationType is currently derived from LockType.  This doesn't distinguish between Update and Delete, but that distinction would be useful.  See comments in TxnHandler.  Should be able to pass in Insert/Update/Delete info from client into TxnHandler.
> 2. TxnHandler.addDynamicPartitions() should know the OperationType as well from the client.  It currently extrapolates it from TXN_COMPONENTS.  This works but requires extra SQL statements and is thus less performant.  It will not work with multi-stmt txns.  See comments in the code.
> 3. TxnHandler.checkLock() - see more comments around "isPartOfDynamicPartitionInsert".  If TxnHandler knew whether it is being called as part of an op running with dynamic partitions, it could be more efficient.  In that case we don't have to write to TXN_COMPONENTS at all during lock acquisition.  Conversely, if not running with DynPart, then we can kill the current txn on lock grant rather than wait until commit time.
> 4. TxnHandler.addDynamicPartitions() - the insert stmt here should combine multiple rows into a single SQL stmt (but with a limit for extreme cases).
> 5. TxnHandler.enqueueLockWithRetry() - this currently adds components that are only being read to TXN_COMPONENTS.  This is useless at best since read ops don't generate anything to compact.  For example, delete from T where t1 in (select c1 from C) - no reason to add C to txn_components but we do.
>  
> All of these require some Thrift changes.
> Once done, re-enable TestDbTxnHandler2.testWriteSetTracking11().
> Also see comments in [here|https://issues.apache.org/jira/browse/HIVE-13395?focusedCommentId=15271712&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15271712]
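
Item 4 above (combining multiple rows into a single SQL statement, with a cap for extreme cases) can be illustrated with a minimal sketch. This is not the code from the attached patches: the class name BatchedInsertSketch, the method buildInserts, the column list in main and the cap of 2 are all hypothetical, and the value tuples are assumed to be already rendered and escaped. Multi-row VALUES syntax is also not accepted by every backing database, so a real implementation may need a per-database variant.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, for illustration only; not part of Hive.
class BatchedInsertSketch {

    /**
     * Groups pre-rendered value tuples such as "(1,'db','t','p=1')" into
     * multi-row INSERT statements, at most maxRowsPerStmt tuples each.
     * The tuples are assumed to be escaped by the caller.
     */
    static List<String> buildInserts(String insertPrefix,
                                     List<String> valueTuples,
                                     int maxRowsPerStmt) {
        if (maxRowsPerStmt < 1) {
            throw new IllegalArgumentException("maxRowsPerStmt must be >= 1");
        }
        List<String> stmts = new ArrayList<>();
        // Walk the tuples in fixed-size windows; the last window may be shorter.
        for (int start = 0; start < valueTuples.size(); start += maxRowsPerStmt) {
            int end = Math.min(start + maxRowsPerStmt, valueTuples.size());
            stmts.add(insertPrefix + " VALUES "
                    + String.join(", ", valueTuples.subList(start, end)));
        }
        return stmts;
    }

    public static void main(String[] args) {
        // Column names here are illustrative assumptions.
        String prefix =
            "INSERT INTO TXN_COMPONENTS (tc_txnid, tc_database, tc_table, tc_partition)";
        List<String> rows = Arrays.asList(
            "(7,'db','t','p=1')", "(7,'db','t','p=2')", "(7,'db','t','p=3')");
        for (String stmt : buildInserts(prefix, rows, 2)) {
            System.out.println(stmt);
        }
    }
}
```

With a cap of 2, the three rows in main are emitted as two statements instead of three, and the cap keeps any single statement from growing without bound when a query touches a very large number of partitions.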



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)