You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@hbase.apache.org by "chenglei (Jira)" <ji...@apache.org> on 2021/08/12 10:21:00 UTC
[jira] [Updated] (HBASE-26026) HBase Write may be stuck forever
when using CompactingMemStore
[ https://issues.apache.org/jira/browse/HBASE-26026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
chenglei updated HBASE-26026:
-----------------------------
Description:
Sometimes I observed that HBase Write might be stuck in my hbase cluster which enabling {{CompactingMemStore}}. I have simulated the problem by unit test in my PR.
The problem is caused by {{CompactingMemStore.checkAndAddToActiveSize}} :
{code:java}
425 private boolean checkAndAddToActiveSize(MutableSegment currActive, Cell cellToAdd,
426 MemStoreSizing memstoreSizing) {
427 if (shouldFlushInMemory(currActive, cellToAdd, memstoreSizing)) {
428 if (currActive.setInMemoryFlushed()) {
429 flushInMemory(currActive);
430 if (setInMemoryCompactionFlag()) {
431 // The thread is dispatched to do in-memory compaction in the background
......
}
{code}
In line 427, {{shouldFlushInMemory}} checking if {{currActive.getDataSize}} adding the size of {{cellToAdd}} exceeds {{CompactingMemStore.inmemoryFlushSize}},if true, then {{currActive}} should be flushed, {{currActive.setInMemoryFlushed()}} is invoked in line 428 :
{code:java}
public boolean setInMemoryFlushed() {
return flushed.compareAndSet(false, true);
}
{code}
After sucessfully set {{currActive.flushed}} to true, in above line 429 {{flushInMemory(currActive)}} invokes {{CompactingMemStore.pushActiveToPipeline}} :
{code:java}
protected void pushActiveToPipeline(MutableSegment currActive) {
if (!currActive.isEmpty()) {
pipeline.pushHead(currActive);
resetActive();
}
}
{code}
In above {{CompactingMemStore.pushActiveToPipeline}} method , if the {{currActive.cellSet}} is empty, then nothing is done. Due to concurrent writes and because we first add cell size to {{currActive.getDataSize}} and then actually add cell to {{currActive.cellSet}}, it is possible that {{currActive.getDataSize}} could not accommodate {{cellToAdd}} but {{currActive.cellSet}} is still empty if pending writes which not yet add cells to {{currActive.cellSet}}.
So if the {{currActive.cellSet}} is empty now, then no {{ActiveSegment}} is created, and new writes still continue target to {{currActive}}, but {{currActive.flushed}} is true, {{currActive}} could not enter {{flushInMemory(currActive)}} again,and new {{ActiveSegment}} could not be created forever ! In the end all writes would be stuck.
In my opinion , once {{currActive.flushed}} is set true, it could not continue use as {{ActiveSegment}} , and because of concurrent pending writes, only after {{currActive.updatesLock.writeLock()}} is acquired(i.e. {{currActive.waitForUpdates}} is called) in {{CompactingMemStore.inMemoryCompaction}} ,we can safely say {{currActive}} is empty or not.
My fix is remove the {{if (!currActive.isEmpty())}} check here and left the check to background {{InMemoryCompactionRunnable}} after {{currActive.waitForUpdates}} is called. An alternative fix is we use synchronization mechanism in {{checkAndAddToActiveSize}} method to prevent all writes , wait for all pending write completed(i.e. currActive.waitForUpdates is called) and if {{currActive}} is still empty ,then we set {{currActive.flushed}} back to false,but I am not inclined to use so heavy synchronization in write path, and I think we would better maintain lockless implementation for {{CompactingMemStore.add}} method just as now and {{currActive.waitForUpdates}} would better be left in background {{InMemoryCompactionRunnable}}.
was:
Sometimes I observed that HBase Write might be stuck in my hbase cluster which enabling {{CompactingMemStore}}. I have simulated the problem by unit test in my PR.
The problem is caused by {{CompactingMemStore.checkAndAddToActiveSize}} :
{code:java}
425 private boolean checkAndAddToActiveSize(MutableSegment currActive, Cell cellToAdd,
426 MemStoreSizing memstoreSizing) {
427 if (shouldFlushInMemory(currActive, cellToAdd, memstoreSizing)) {
428 if (currActive.setInMemoryFlushed()) {
429 flushInMemory(currActive);
430 if (setInMemoryCompactionFlag()) {
431 // The thread is dispatched to do in-memory compaction in the background
......
}
{code}
In line 427, if {{currActive.getDataSize}} adding the size of {{cellToAdd}} exceeds {{CompactingMemStore.inmemoryFlushSize}}, then {{currActive}} should be flushed, {{MutableSegment.setInMemoryFlushed()}} is invoked in above line 428 :
{code:java}
public boolean setInMemoryFlushed() {
return flushed.compareAndSet(false, true);
}
{code}
After set {{currActive.flushed}} to true, in above line 429 {{flushInMemory(currActive)}} invokes {{CompactingMemStore.pushActiveToPipeline}} :
{code:java}
protected void pushActiveToPipeline(MutableSegment currActive) {
if (!currActive.isEmpty()) {
pipeline.pushHead(currActive);
resetActive();
}
}
{code}
In above {{CompactingMemStore.pushActiveToPipeline}} method , if the {{currActive.cellSet}} is empty, then nothing is done. Due to concurrent writes and because we first add cell size to {{currActive.getDataSize}} and then actually add cell to {{currActive.cellSet}}, it is possible that {{currActive.getDataSize}} could not accommodate {{cellToAdd}} but {{currActive.cellSet}} is still empty if pending writes which not yet add cells to {{currActive.cellSet}}.
So if the {{currActive.cellSet}} is empty now, then no {{ActiveSegment}} is created, and new writes still continue target to {{currActive}}, but {{currActive.flushed}} is true, {{currActive}} could not enter {{flushInMemory(currActive)}} again,and new {{ActiveSegment}} could not be created forever ! In the end all writes would be stuck.
In my opinion , once {{currActive.flushed}} is set true, it could not continue use as {{ActiveSegment}} , and because of concurrent pending writes, only after {{currActive.updatesLock.writeLock()}} is acquired(i.e. {{currActive.waitForUpdates}} is called) in {{CompactingMemStore.inMemoryCompaction}} ,we can safely say {{currActive}} is empty or not.
My fix is remove the {{if (!currActive.isEmpty())}} check here and left the check to background {{InMemoryCompactionRunnable}} after {{currActive.waitForUpdates}} is called. An alternative fix is we use synchronization mechanism in {{checkAndAddToActiveSize}} method to prevent all writes , wait for all pending write completed(i.e. currActive.waitForUpdates is called) and if {{currActive}} is still empty ,then we set {{currActive.flushed}} back to false,but I am not inclined to use so heavy synchronization in write path, and I think we would better maintain lockless implementation for {{CompactingMemStore.add}} method just as now and {{currActive.waitForUpdates}} would better be left in background {{InMemoryCompactionRunnable}}.
> HBase Write may be stuck forever when using CompactingMemStore
> --------------------------------------------------------------
>
> Key: HBASE-26026
> URL: https://issues.apache.org/jira/browse/HBASE-26026
> Project: HBase
> Issue Type: Bug
> Components: in-memory-compaction
> Affects Versions: 3.0.0-alpha-1, 2.3.0, 2.4.0
> Reporter: chenglei
> Assignee: chenglei
> Priority: Critical
> Fix For: 2.5.0, 3.0.0-alpha-2, 2.4.6, 2.3.7
>
>
> Sometimes I observed that HBase Write might be stuck in my hbase cluster which enabling {{CompactingMemStore}}. I have simulated the problem by unit test in my PR.
> The problem is caused by {{CompactingMemStore.checkAndAddToActiveSize}} :
> {code:java}
> 425 private boolean checkAndAddToActiveSize(MutableSegment currActive, Cell cellToAdd,
> 426 MemStoreSizing memstoreSizing) {
> 427 if (shouldFlushInMemory(currActive, cellToAdd, memstoreSizing)) {
> 428 if (currActive.setInMemoryFlushed()) {
> 429 flushInMemory(currActive);
> 430 if (setInMemoryCompactionFlag()) {
> 431 // The thread is dispatched to do in-memory compaction in the background
> ......
> }
> {code}
> In line 427, {{shouldFlushInMemory}} checking if {{currActive.getDataSize}} adding the size of {{cellToAdd}} exceeds {{CompactingMemStore.inmemoryFlushSize}},if true, then {{currActive}} should be flushed, {{currActive.setInMemoryFlushed()}} is invoked in line 428 :
> {code:java}
> public boolean setInMemoryFlushed() {
> return flushed.compareAndSet(false, true);
> }
> {code}
> After sucessfully set {{currActive.flushed}} to true, in above line 429 {{flushInMemory(currActive)}} invokes {{CompactingMemStore.pushActiveToPipeline}} :
> {code:java}
> protected void pushActiveToPipeline(MutableSegment currActive) {
> if (!currActive.isEmpty()) {
> pipeline.pushHead(currActive);
> resetActive();
> }
> }
> {code}
> In above {{CompactingMemStore.pushActiveToPipeline}} method , if the {{currActive.cellSet}} is empty, then nothing is done. Due to concurrent writes and because we first add cell size to {{currActive.getDataSize}} and then actually add cell to {{currActive.cellSet}}, it is possible that {{currActive.getDataSize}} could not accommodate {{cellToAdd}} but {{currActive.cellSet}} is still empty if pending writes which not yet add cells to {{currActive.cellSet}}.
> So if the {{currActive.cellSet}} is empty now, then no {{ActiveSegment}} is created, and new writes still continue target to {{currActive}}, but {{currActive.flushed}} is true, {{currActive}} could not enter {{flushInMemory(currActive)}} again,and new {{ActiveSegment}} could not be created forever ! In the end all writes would be stuck.
> In my opinion , once {{currActive.flushed}} is set true, it could not continue use as {{ActiveSegment}} , and because of concurrent pending writes, only after {{currActive.updatesLock.writeLock()}} is acquired(i.e. {{currActive.waitForUpdates}} is called) in {{CompactingMemStore.inMemoryCompaction}} ,we can safely say {{currActive}} is empty or not.
> My fix is remove the {{if (!currActive.isEmpty())}} check here and left the check to background {{InMemoryCompactionRunnable}} after {{currActive.waitForUpdates}} is called. An alternative fix is we use synchronization mechanism in {{checkAndAddToActiveSize}} method to prevent all writes , wait for all pending write completed(i.e. currActive.waitForUpdates is called) and if {{currActive}} is still empty ,then we set {{currActive.flushed}} back to false,but I am not inclined to use so heavy synchronization in write path, and I think we would better maintain lockless implementation for {{CompactingMemStore.add}} method just as now and {{currActive.waitForUpdates}} would better be left in background {{InMemoryCompactionRunnable}}.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)