You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@hbase.apache.org by "Hadoop QA (Commented) (JIRA)" <ji...@apache.org> on 2012/03/23 03:58:22 UTC

[jira] [Commented] (HBASE-5623) Race condition when rolling the HLog and hlogFlush

    [ https://issues.apache.org/jira/browse/HBASE-5623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13236293#comment-13236293 ] 

Hadoop QA commented on HBASE-5623:
----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12519570/HBASE-5623_v0.patch
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    -1 tests included.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    -1 findbugs.  The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1271//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1271//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1271//console

This message is automatically generated.
                
> Race condition when rolling the HLog and hlogFlush
> --------------------------------------------------
>
>                 Key: HBASE-5623
>                 URL: https://issues.apache.org/jira/browse/HBASE-5623
>             Project: HBase
>          Issue Type: Bug
>          Components: wal
>    Affects Versions: 0.94.0
>            Reporter: Enis Soztutar
>            Assignee: Enis Soztutar
>         Attachments: HBASE-5623_v0.patch
>
>
> When doing a ycsb test with a large number of handlers (regionserver.handler.count=60), I get the following exceptions:
> {code}
> Caused by: org.apache.hadoop.ipc.RemoteException: java.io.IOException: java.lang.NullPointerException
> 	at org.apache.hadoop.io.SequenceFile$Writer.getLength(SequenceFile.java:1099)
> 	at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.getLength(SequenceFileLogWriter.java:314)
> 	at org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1291)
> 	at org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1388)
> 	at org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchPut(HRegion.java:2192)
> 	at org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:1985)
> 	at org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3400)
> 	at sun.reflect.GeneratedMethodAccessor17.invoke(Unknown Source)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> 	at java.lang.reflect.Method.invoke(Method.java:597)
> 	at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:366)
> 	at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1351)
> 	at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:920)
> 	at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:152)
> 	at $Proxy1.multi(Unknown Source)
> 	at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1691)
> 	at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1689)
> 	at org.apache.hadoop.hbase.client.ServerCallable.withoutRetries(ServerCallable.java:214)
> {code}
> and 
> {code}
> 	java.lang.NullPointerException
> 		at org.apache.hadoop.io.SequenceFile$Writer.checkAndWriteSync(SequenceFile.java:1026)
> 		at org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1068)
> 		at org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1035)
> 		at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.append(SequenceFileLogWriter.java:279)
> 		at org.apache.hadoop.hbase.regionserver.wal.HLog$LogSyncer.hlogFlush(HLog.java:1237)
> 		at org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1271)
> 		at org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1391)
> 		at org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchPut(HRegion.java:2192)
> 		at org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:1985)
> 		at org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3400)
> 		at sun.reflect.GeneratedMethodAccessor33.invoke(Unknown Source)
> 		at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> 		at java.lang.reflect.Method.invoke(Method.java:597)
> 		at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:366)
> 		at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1351)
> {code}
> It seems the root cause of the issue is that we open a new log writer and close the old one at HLog#rollWriter() holding the updateLock, but the other threads doing syncer() calls
> {code} 
> logSyncerThread.hlogFlush(this.writer);
> {code}
> without holding the updateLock. LogSyncer only synchronizes against concurrent appends and flush(), but not on the passed writer, which can be closed already by rollWriter(). In this case, since SequenceFile#Writer.close() sets it's out field as null, we get the NPE. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira