Posted to dev@hive.apache.org by "Eugene Koifman (JIRA)" <ji...@apache.org> on 2017/05/01 20:21:04 UTC

[jira] [Created] (HIVE-16564) StreamingAPI is locking too much?

Eugene Koifman created HIVE-16564:
-------------------------------------

             Summary: StreamingAPI is locking too much?
                 Key: HIVE-16564
                 URL: https://issues.apache.org/jira/browse/HIVE-16564
             Project: Hive
          Issue Type: Bug
          Components: HCatalog, Transactions
    Affects Versions: 1.0.0
            Reporter: Eugene Koifman
            Assignee: Eugene Koifman


Currently _TransactionBatchImpl.beginNextTransactionImpl()_ acquires Shared locks for each Transaction in the batch.  
Especially under high load, this creates pressure on the LockManager (i.e. Metastore) and degrades the performance of Ingest itself.
Because all transactions in a batch write to the same physical file, and because for Acid tables (which are required for Streaming Ingest) shared locks only protect against Exclusive locks (like drop table), acquiring/releasing locks for each txn doesn't achieve much.
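
To make the point about shared vs. exclusive locks concrete, here is a minimal sketch of the compatibility rule described above. It is a simplified two-mode model, not Hive's actual lock manager (which has more lock types); the names LockCompatSketch, LockMode and compatible() are made up for illustration.

```java
// Toy model only: two lock modes, as described in the text above.
// LockCompatSketch, LockMode and compatible() are hypothetical names,
// not part of Hive.
final class LockCompatSketch {

    enum LockMode { SHARED, EXCLUSIVE }

    /**
     * Returns true if a lock of mode 'requested' can be granted while
     * another party already holds a lock of mode 'held' on the same table.
     * Only SHARED + SHARED is compatible in this simplified model.
     */
    static boolean compatible(LockMode held, LockMode requested) {
        return held == LockMode.SHARED && requested == LockMode.SHARED;
    }

    public static void main(String[] args) {
        // Two streaming writers (both SHARED) never block each other.
        System.out.println("writer vs writer compatible: "
                + compatible(LockMode.SHARED, LockMode.SHARED));
        // A drop table (EXCLUSIVE) has to wait for a writer's SHARED lock.
        System.out.println("writer vs drop table compatible: "
                + compatible(LockMode.SHARED, LockMode.EXCLUSIVE));
    }
}
```

In this model the only thing a writer's shared lock ever blocks is an exclusive request such as drop table, which is why taking it once per txn adds load without adding protection.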

One possibility is to acquire all locks (i.e. for all txns) at the time the batch is created (same as is done for openTxn() for all txns in the batch).  Locks for each txn in the batch will be released automatically when commit is called for the respective txn.
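
A rough sketch of why this would help, counting lock-manager round trips for the two approaches. This is a toy model: CountingLockManager and the other names are hypothetical, and it assumes the locks for all txns in a batch can be requested in a single call, which the description above suggests but does not spell out.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Toy model only; none of these names exist in Hive.
final class BatchLockingSketch {

    /** Hypothetical stand-in for the metastore lock manager; it only counts requests. */
    static final class CountingLockManager {
        int lockRequests = 0;

        /** One round trip that locks the table on behalf of every txn id passed in. */
        void lock(List<Long> txnIds) {
            lockRequests++;
        }
    }

    /** Current behaviour: one lock request each time the next txn in the batch begins. */
    static int perTransaction(int batchSize) {
        CountingLockManager lm = new CountingLockManager();
        for (long txn = 1; txn <= batchSize; txn++) {
            lm.lock(Collections.singletonList(txn));
            // ... write, then commit; commit releases this txn's lock
        }
        return lm.lockRequests;
    }

    /** Proposed behaviour (assumption: a single request can cover all txns in the batch). */
    static int atBatchCreation(int batchSize) {
        CountingLockManager lm = new CountingLockManager();
        List<Long> txnIds = new ArrayList<>();
        for (long txn = 1; txn <= batchSize; txn++) {
            txnIds.add(txn);
        }
        lm.lock(txnIds);
        // ... each commit still releases only the lock of the respective txn
        return lm.lockRequests;
    }

    public static void main(String[] args) {
        int batchSize = 10;
        System.out.println("per-txn lock requests:    " + perTransaction(batchSize));
        System.out.println("batch-time lock requests: " + atBatchCreation(batchSize));
    }
}
```

Under these assumptions the number of lock requests per batch drops from the batch size to one, while the release-on-commit behaviour stays the same.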

Alternatively, don't acquire any locks - this means someone may drop a table while it's being written to, but using locks here doesn't buy much.  Say a Drop request is issued when a write is in progress.  It will block until the write releases its lock and execute immediately after that.  Thus none of the data of that write is visible for any meaningful length of time anyway.

A third option is to allow a "meta lock" - a lock not associated with any specific txn, that is held for the duration of the TransactionBatch.  This sort of breaks the model (especially since HIVE-12636).  Perhaps each batch can open one "extra" txn for internal purposes, just to acquire this "meta lock".  No data will ever be tagged with this "extra" txn.
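
A minimal sketch of how the "extra" txn idea could look, as a toy model rather than a proposal for the actual TransactionBatchImpl code. All names here (MetaLockSketch, Batch, metaTxnId, etc.) are hypothetical, and using the first txn id of the batch as the "extra" one is an arbitrary choice for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model only; none of these names exist in Hive.
final class MetaLockSketch {

    static final class Batch {
        final long metaTxnId;                               // the "extra" txn; no data is tagged with it
        final Deque<Long> dataTxnIds = new ArrayDeque<>();  // txns that actually tag written data
        boolean metaLockHeld;
        int lockRequests;

        /** Opens size + 1 txns; the first one (an assumption) exists only to hold the lock. */
        Batch(long firstTxnId, int size) {
            metaTxnId = firstTxnId;
            for (int i = 1; i <= size; i++) {
                dataTxnIds.add(firstTxnId + i);
            }
            metaLockHeld = true;   // models acquiring the shared lock under the extra txn
            lockRequests = 1;      // the only lock request for the whole batch
        }

        /** Returns the txn id to tag the next writes with; makes no lock request. */
        long beginNextTransaction() {
            if (!metaLockHeld) {
                throw new IllegalStateException("batch is closed");
            }
            return dataTxnIds.remove();
        }

        /** Models ending the extra txn, which releases the "meta lock". */
        void close() {
            metaLockHeld = false;
        }
    }

    public static void main(String[] args) {
        Batch batch = new Batch(100, 3);
        System.out.println("meta txn: " + batch.metaTxnId);
        System.out.println("first data txn: " + batch.beginNextTransaction());
        batch.close();
        System.out.println("lock requests for the batch: " + batch.lockRequests);
    }
}
```

The point of the sketch is that the lock lifetime follows the batch, not the individual txns, while the lock is still owned by a real txn so the txn-scoped lock model is not violated.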


--
This message was sent by Atlassian JIRA
(v6.3.15#6346)