Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2021/04/28 19:07:08 UTC

[GitHub] [iceberg] RussellSpitzer opened a new issue #2540: Hive: Lock Issues with multithreaded commits

RussellSpitzer opened a new issue #2540:
URL: https://github.com/apache/iceberg/issues/2540


   When trying to commit to the same Iceberg table from multiple threads in the same application, the metastore ends up building large, expensive lock chains on the HMS side. This can overload the underlying RDBMS. We have made several fixes (#2263, #1873) to reduce this thrashing, but I would like a better solution if possible.
   
   Since HiveTableOperations knows that only a single commit operation can hold the lock at any one time, it probably makes sense for us to also lock at the JVM level for all incoming HiveTableOperations commit operations. There is never a benefit to allowing a second thread to attempt to acquire a lock while we know a first thread has already acquired it. In fact, we know that the opposite is true: the more simultaneous lock attempts and checks, the greater the pressure on the HMS and the worse the performance of ALL operations.
   
   While we can't stop multiple processes from simultaneously attempting to lock (which is the point of the HMS lock in the first place), we can use a much cheaper JVM-level lock within a single process. For example, take an application that previously had dozens of threads all attempting to commit simultaneously, say the SparkThriftServer or something like that. It would now allow only a single thread to attempt to get the Hive lock, while all others queue up behind it. If we don't care about ordering among threads we could use `synchronized`, and if we do we could use a concurrent queue or the like.
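
A minimal sketch of the JVM-level gate described above, under stated assumptions: it is not Iceberg code, and `JvmCommitGate`, `commit`, and `maxConcurrentCommits` are hypothetical names. Instead of a hand-rolled queue it uses a fair `ReentrantLock` from the JDK, which favors the longest-waiting thread. A `Runnable` stands in for the HMS lock, metadata swap, and unlock.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch, not Iceberg code: every name here is illustrative.
class JvmCommitGate {
  // 'true' asks for a fair lock, so the longest-waiting thread is favored.
  private static final ReentrantLock COMMIT_LOCK = new ReentrantLock(true);

  // Runs the commit body while holding the JVM-level lock. In real code the
  // body would acquire the HMS lock, swap the metadata, and release it.
  static void commit(Runnable hmsCommit) {
    COMMIT_LOCK.lock();
    try {
      hmsCommit.run();
    } finally {
      COMMIT_LOCK.unlock();
    }
  }

  // Fires 'threads' concurrent commits and reports the highest number that
  // were ever inside the commit body at the same moment.
  static int maxConcurrentCommits(int threads) {
    AtomicInteger inFlight = new AtomicInteger();
    AtomicInteger maxInFlight = new AtomicInteger();
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    for (int i = 0; i < threads; i++) {
      pool.execute(() -> commit(() -> {
        maxInFlight.accumulateAndGet(inFlight.incrementAndGet(), Math::max);
        sleepQuietly(5); // stand-in for the round trip to the HMS
        inFlight.decrementAndGet();
      }));
    }
    pool.shutdown();
    try {
      if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
        throw new IllegalStateException("commits did not finish in time");
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return maxInFlight.get();
  }

  private static void sleepQuietly(long millis) {
    try {
      Thread.sleep(millis);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }

  public static void main(String[] args) {
    System.out.println("max concurrent commits: " + maxConcurrentCommits(8));
  }
}
```

With the gate in place, at most one thread at a time should reach the stand-in HMS call, however many threads commit at once. Swapping the fair lock for a plain `synchronized` block would give the same mutual exclusion without any ordering guarantee.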


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] RussellSpitzer commented on issue #2540: Hive: Lock Issues with multithreaded commits

Posted by GitBox <gi...@apache.org>.
RussellSpitzer commented on issue #2540:
URL: https://github.com/apache/iceberg/issues/2540#issuecomment-828708739


   @raptond + @marton-bod Wdyt?




[GitHub] [iceberg] raptond commented on issue #2540: Hive: Lock Issues with multithreaded commits

Posted by GitBox <gi...@apache.org>.
raptond commented on issue #2540:
URL: https://github.com/apache/iceberg/issues/2540#issuecomment-828777965


   The problem we face internally with several customers is that, after a while, no more data can be added to the Iceberg table. When I say "no more data", I mean that data files are created and a snapshot is created, but the commit never succeeds. At this point, manual intervention is required to clear the locks for the given Iceberg table.
   
   Some cases that we investigated had 
   1. several jobs 
   2. and within each job, several threads
   
   trying to commit to the same Iceberg table. 
   
   > We should only be blocking threads which would like to commit to the same table simultaneously, not to any Iceberg table in general (since the HMS locks are also on the table level).
   👍 to this. 
   
   >  we release the HMS lock, in this case the commits will happen in a Round Robin fashion between the JVMs.
   👍 to this too.
   
   The `share-the-hms-lock-within-JVM-for-a-single-iceberg-table` approach proposed by @RussellSpitzer would handle situation (2) very effectively, IMHO with additional considerations like [a] releasing the HMS lock as soon as the last thread has returned it and [b] releasing it after a time limit for fair usage, even if more threads are still queued waiting for the lock.




[GitHub] [iceberg] marton-bod commented on issue #2540: Hive: Lock Issues with multithreaded commits

Posted by GitBox <gi...@apache.org>.
marton-bod commented on issue #2540:
URL: https://github.com/apache/iceberg/issues/2540#issuecomment-828743787


   @RussellSpitzer From the Hive side we haven't noticed this issue as much, but I think this makes a lot of sense. Did you mean adding a JVM level lock per table? We should only be blocking threads which would like to commit to the same table simultaneously, not to any Iceberg table in general (since the HMS locks are also on the table level). I'm happy to take a crack at it tomorrow, but also happy to review if anyone else fancies working on this.




[GitHub] [iceberg] pvary commented on issue #2540: Hive: Lock Issues with multithreaded commits

Posted by GitBox <gi...@apache.org>.
pvary commented on issue #2540:
URL: https://github.com/apache/iceberg/issues/2540#issuecomment-828791623


   > Want to also call out - for issue (1), we will need to find a way to clear up the expired locks which are still blocking the lock chain.
   > 
   > In this problem space of (1), I believe #2263 would solve a majority of cases, leaving just the `driver-crashing-after-acquiring-the-lock` cases. May be a case for HMS cleanup service.
   
   I think you could use the AcidHouseKeeperService: https://github.com/apache/hive/blob/rel/release-3.1.2/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/txn/AcidHouseKeeperService.java
   
   This is an existing service in HMS to remove stale locks. 
   




[GitHub] [iceberg] RussellSpitzer closed issue #2540: Hive: Lock Issues with multithreaded commits

Posted by GitBox <gi...@apache.org>.
RussellSpitzer closed issue #2540:
URL: https://github.com/apache/iceberg/issues/2540


   




[GitHub] [iceberg] raptond commented on issue #2540: Hive: Lock Issues with multithreaded commits

Posted by GitBox <gi...@apache.org>.
raptond commented on issue #2540:
URL: https://github.com/apache/iceberg/issues/2540#issuecomment-828779435


   I also want to call out that, for issue (1), we will need to find a way to clear up the expired locks which are still blocking the lock chain.
   
   In this problem space of (1), I believe https://github.com/apache/iceberg/pull/2263 would solve a majority of cases, leaving just the `driver-crashing-after-acquiring-the-lock` cases. Maybe a case for an HMS cleanup service.




[GitHub] [iceberg] RussellSpitzer commented on issue #2540: Hive: Lock Issues with multithreaded commits

Posted by GitBox <gi...@apache.org>.
RussellSpitzer commented on issue #2540:
URL: https://github.com/apache/iceberg/issues/2540#issuecomment-828781745


   @pvary I wish we could :) But we have tons of users with their own HMS code and I don't think it would be trivial for us to get them all on the same page.
   
   @marton-bod That's exactly what I was thinking about. I wanted to make sure I wasn't missing anything that would make this slower, or cause side effects, compared with everyone trying to acquire the lock at once.
   
   Basically one JVM never allows more than one thread to run doCommit() for the same table at the same time.
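
A minimal sketch of that per-table rule, under stated assumptions: `TableCommitLocks`, `lockFor`, and `withTableLock` are hypothetical names, not Iceberg APIs, and the table is keyed by an assumed fully qualified name string such as `db.table`. A `Supplier` stands in for `doCommit()`.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

// Hypothetical sketch, not Iceberg code: every name here is illustrative.
class TableCommitLocks {
  // One lock per table name, shared by every thread in this JVM.
  private static final ConcurrentMap<String, ReentrantLock> LOCKS =
      new ConcurrentHashMap<>();

  // Returns the lock for a table, creating it atomically on first use.
  static ReentrantLock lockFor(String fullTableName) {
    return LOCKS.computeIfAbsent(fullTableName, name -> new ReentrantLock(true));
  }

  // Runs the commit body while holding only that table's lock, so commits
  // to other tables are not blocked.
  static <T> T withTableLock(String fullTableName, Supplier<T> doCommit) {
    ReentrantLock lock = lockFor(fullTableName);
    lock.lock();
    try {
      return doCommit.get();
    } finally {
      lock.unlock();
    }
  }

  public static void main(String[] args) {
    String result = withTableLock("db.events", () -> "committed");
    System.out.println(result);
  }
}
```

The map in this sketch never evicts entries, so a long-lived process touching many tables would likely want some expiry policy on top of it.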




[GitHub] [iceberg] raptond commented on issue #2540: Hive: Lock Issues with multithreaded commits

Posted by GitBox <gi...@apache.org>.
raptond commented on issue #2540:
URL: https://github.com/apache/iceberg/issues/2540#issuecomment-828798539


   Thanks @pvary. I didn't realize we need this flag to turn it on - `hive.compactor.initiator.on`.
   This should take care of the stale locks.




[GitHub] [iceberg] pvary commented on issue #2540: Hive: Lock Issues with multithreaded commits

Posted by GitBox <gi...@apache.org>.
pvary commented on issue #2540:
URL: https://github.com/apache/iceberg/issues/2540#issuecomment-828761375


   HMS locks are fair. If we have multiple HS2 instances doing the commits, then we lose this fairness. We can use fair locks inside a JVM, but cross-JVM fairness would be lost. Either we share the HMS lock between threads, in which case we risk starving the threads on another JVM, or we release the HMS lock, in which case the commits will happen in a round-robin fashion between the JVMs.
   
   Also, for MR or TezAM, the commits run in separate JVMs, so JVM-level locks do not help there (and do not hurt that much).
   
   @RussellSpitzer: If you can patch your HMS, you might want to try out these two patches:
   - https://issues.apache.org/jira/browse/HIVE-22906
   - https://issues.apache.org/jira/browse/HIVE-22888

