Posted to issues@hive.apache.org by "Damien Carol (JIRA)" <ji...@apache.org> on 2015/06/16 16:31:00 UTC

[jira] [Updated] (HIVE-9938) Add retry logic to DbTxnMgr instead of aborting transactions.

     [ https://issues.apache.org/jira/browse/HIVE-9938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Damien Carol updated HIVE-9938:
-------------------------------
    Description: 
Sometimes parallel updates using DbTxnMgr result in the following error trace:
{noformat}
5325 [main] INFO org.apache.hadoop.hive.ql.log.PerfLogger - <PERFLOG method=acquireReadWriteLocks from=org.apache.hadoop.hive.ql.Driver> 
5351 [main] ERROR org.apache.hadoop.hive.ql.Driver - FAILED: Error in acquiring locks: Error communicating with the metastore 
org.apache.hadoop.hive.ql.lockmgr.LockException: Error communicating with the metastore 
at org.apache.hadoop.hive.ql.lockmgr.DbLockManager.lock(DbLockManager.java:100) 
at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.acquireLocks(DbTxnManager.java:194) 
{noformat}

Looking at the logs internally, we see the underlying PostgreSQL error:
{noformat}
2015-02-02 06:36:05,632 ERROR org.apache.hadoop.hive.metastore.RetryingHMSHandler: org.apache.thrift.TException: MetaException(message:Unable to update transaction database org.postgresql.util.PSQLException: ERROR: could not serialize access due to concurrent update 

{noformat}
Ideally, we should add retry logic so that the failed transaction is retried instead of aborted.

  was:
Sometimes parallel updates using DBTxnMgr results in the following error trace

5325 [main] INFO org.apache.hadoop.hive.ql.log.PerfLogger - <PERFLOG method=acquireReadWriteLocks from=org.apache.hadoop.hive.ql.Driver> 
5351 [main] ERROR org.apache.hadoop.hive.ql.Driver - FAILED: Error in acquiring locks: Error communicating with the metastore 
org.apache.hadoop.hive.ql.lockmgr.LockException: Error communicating with the metastore 
at org.apache.hadoop.hive.ql.lockmgr.DbLockManager.lock(DbLockManager.java:100) 
at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.acquireLocks(DbTxnManager.java:194) 


Internally looking at the postgres logs we see 

2015-02-02 06:36:05,632 ERROR org.apache.hadoop.hive.metastore.RetryingHMSHandler: org.apache.thrift.TException: MetaException(message:Unable to update transaction database org.postgresql.util.PSQLException: ERROR: could not serialize access due to concurrent update 


Ideally we should add a retry logic to retry the failed transaction.


> Add retry logic to DbTxnMgr instead of aborting transactions.
> -------------------------------------------------------------
>
>                 Key: HIVE-9938
>                 URL: https://issues.apache.org/jira/browse/HIVE-9938
>             Project: Hive
>          Issue Type: Improvement
>    Affects Versions: 0.14.0
>            Reporter: bharath v
>
> Sometimes parallel updates using DbTxnMgr result in the following error trace:
> {noformat}
> 5325 [main] INFO org.apache.hadoop.hive.ql.log.PerfLogger - <PERFLOG method=acquireReadWriteLocks from=org.apache.hadoop.hive.ql.Driver> 
> 5351 [main] ERROR org.apache.hadoop.hive.ql.Driver - FAILED: Error in acquiring locks: Error communicating with the metastore 
> org.apache.hadoop.hive.ql.lockmgr.LockException: Error communicating with the metastore 
> at org.apache.hadoop.hive.ql.lockmgr.DbLockManager.lock(DbLockManager.java:100) 
> at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.acquireLocks(DbTxnManager.java:194) 
> {noformat}
> Looking at the logs internally, we see the underlying PostgreSQL error:
> {noformat}
> 2015-02-02 06:36:05,632 ERROR org.apache.hadoop.hive.metastore.RetryingHMSHandler: org.apache.thrift.TException: MetaException(message:Unable to update transaction database org.postgresql.util.PSQLException: ERROR: could not serialize access due to concurrent update 
> {noformat}
> Ideally, we should add retry logic so that the failed transaction is retried instead of aborted.
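
For illustration only, the kind of retry the description asks for might look like the sketch below. It is not Hive code and not the fix for this issue: the class SerializationRetry, the method withRetry, and the parameters maxAttempts and initialBackoffMs are hypothetical names chosen for the example. The one grounded assumption is that PostgreSQL reports "could not serialize access due to concurrent update" with SQLSTATE 40001 (serialization_failure), so the sketch retries only on that state and rethrows everything else.

```java
import java.sql.SQLException;
import java.util.concurrent.Callable;

/** Hypothetical helper for illustration; not part of Hive. */
final class SerializationRetry {
    // SQLSTATE 40001 = serialization_failure, the state PostgreSQL uses for
    // "could not serialize access due to concurrent update".
    static final String SERIALIZATION_FAILURE = "40001";

    /**
     * Runs op, retrying with exponential backoff when it fails with a
     * serialization failure. Any other exception, or a serialization failure
     * on the last allowed attempt, is rethrown to the caller.
     */
    static <T> T withRetry(Callable<T> op, int maxAttempts, long initialBackoffMs)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long backoffMs = initialBackoffMs;
        for (int attempt = 1; ; attempt++) {
            try {
                return op.call();
            } catch (SQLException e) {
                boolean retryable = SERIALIZATION_FAILURE.equals(e.getSQLState());
                if (!retryable || attempt >= maxAttempts) {
                    throw e;
                }
                Thread.sleep(backoffMs); // wait before the next attempt
                backoffMs *= 2;          // double the wait each time
            }
        }
    }

    public static void main(String[] args) throws Exception {
        // Simulated transaction: fails twice with SQLSTATE 40001, then succeeds.
        int[] calls = {0};
        String result = withRetry(() -> {
            if (++calls[0] < 3) {
                throw new SQLException(
                        "could not serialize access due to concurrent update",
                        SERIALIZATION_FAILURE);
            }
            return "committed";
        }, 5, 10L);
        System.out.println(result + " after " + calls[0] + " attempts");
        // prints "committed after 3 attempts"
    }
}
```

In a real transaction manager the retried operation would also need to roll back the failed database transaction before each new attempt; the sketch leaves that to the caller-supplied operation.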


