Posted to issues@hbase.apache.org by "Enis Soztutar (JIRA)" <ji...@apache.org> on 2013/05/07 02:16:18 UTC

[jira] [Updated] (HBASE-5995) Fix and reenable TestLogRolling.testLogRollOnPipelineRestart

     [ https://issues.apache.org/jira/browse/HBASE-5995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Enis Soztutar updated HBASE-5995:
---------------------------------

    Attachment: hbase-5995_v1.patch

Attaching a candidate patch for this. Per my initial analysis, it seems that we have to call recoverLease() before opening the WAL files for reading.

We do not crash the region servers in this test, so normally all log files should be closed and recoverLease() should not be necessary. However, we do restart all the datanodes, and when we then trigger a log roll, DFSOutputStream.close() throws an exception:
{code}
2013-05-03 11:38:28,366 ERROR [RegionServer:1;10.11.3.18,51418,1367606292279.logRoller] wal.FSHLog(691): Failed close of HLog writer
java.io.IOException: All datanodes 127.0.0.1:51404 are bad. Aborting...
  at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:941)
  at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:756)
  at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:425)
{code}

We just ride over the close() exception, so the file may be left without a clean close; we should therefore call recoverLease() afterwards.
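
To illustrate the idea (this is not the code in the attached patch), here is a minimal, self-contained sketch of "recover the lease, then read". The LeaseRecoverer interface is a hypothetical stand-in for HDFS's DistributedFileSystem.recoverLease(Path), which returns true once the file is closed. The retry count, the pause, and the example path are assumptions made up for the sketch.

```java
import java.io.IOException;
import java.util.concurrent.atomic.AtomicInteger;

class WalLeaseRecoverySketch {

    /** Hypothetical stand-in for the single HDFS call discussed above. */
    interface LeaseRecoverer {
        // Same shape as DistributedFileSystem.recoverLease(Path):
        // returns true once the file is closed and safe to open for read.
        boolean recoverLease(String path) throws IOException;
    }

    /**
     * Polls recoverLease() until the file is reported closed or the
     * attempts run out. Returns true if the file ended up closed.
     */
    static boolean recoverLeaseBeforeRead(LeaseRecoverer fs, String walPath,
                                          int maxAttempts, long pauseMillis)
            throws IOException, InterruptedException {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (fs.recoverLease(walPath)) {
                return true;               // lease released, file closed
            }
            if (attempt < maxAttempts) {
                Thread.sleep(pauseMillis); // give the namenode time to recover
            }
        }
        return false;                      // caller decides whether to fail
    }

    public static void main(String[] args) throws Exception {
        AtomicInteger calls = new AtomicInteger();
        // Fake file system for the demo: recovery completes on the third call.
        LeaseRecoverer fake = path -> calls.incrementAndGet() >= 3;
        // The path below is a made-up example, not one taken from the test.
        boolean closed = recoverLeaseBeforeRead(fake, "/hbase/.logs/example-wal", 5, 10L);
        System.out.println("closed=" + closed + " after " + calls.get() + " calls");
    }
}
```

In a real reader, the WAL would only be opened after recoverLeaseBeforeRead() returns true; what to do when it returns false (fail, or keep waiting) is left open here.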

I have yet to check why on earth we are able to run this successfully with Hadoop 1.

The test succeeds with the patch. 
                
> Fix and reenable TestLogRolling.testLogRollOnPipelineRestart
> ------------------------------------------------------------
>
>                 Key: HBASE-5995
>                 URL: https://issues.apache.org/jira/browse/HBASE-5995
>             Project: HBase
>          Issue Type: Sub-task
>          Components: test
>    Affects Versions: 0.95.2
>            Reporter: stack
>            Assignee: Enis Soztutar
>            Priority: Blocker
>             Fix For: 0.95.1
>
>         Attachments: hbase-5995_v1.patch
>
>
> HBASE-5984 disabled this flaky test (see that issue for more).  This issue is about getting it enabled again.  Made a blocker on 0.96.0 so it gets attention.
