Posted to issues@hbase.apache.org by "Ted Yu (JIRA)" <ji...@apache.org> on 2013/09/25 02:22:03 UTC

[jira] [Commented] (HBASE-9651) Backport HBASE-3890 'Scheduled tasks in distributed log splitting not in sync with ZK' to 0.94

    [ https://issues.apache.org/jira/browse/HBASE-9651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13776981#comment-13776981 ] 

Ted Yu commented on HBASE-9651:
-------------------------------

Tests passed:
{code}
Tests run: 1384, Failures: 0, Errors: 0, Skipped: 13

[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 1:12:13.160s
[INFO] Finished at: Wed Sep 25 00:18:38 UTC 2013
[INFO] Final Memory: 30M/319M
{code}
                
> Backport HBASE-3890 'Scheduled tasks in distributed log splitting not in sync with ZK' to 0.94
> ----------------------------------------------------------------------------------------------
>
>                 Key: HBASE-9651
>                 URL: https://issues.apache.org/jira/browse/HBASE-9651
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Ted Yu
>            Assignee: Ted Yu
>             Fix For: 0.94.13
>
>         Attachments: 9651.patch
>
>
> HBASE-3890 was fixed in 0.96 and trunk. This issue is to backport the fix to 0.94.
> Note that something more must be slightly off here. Although the splitlogs znode is now empty, the master is still stuck here:
> {code}
> Doing distributed log split in hdfs://localhost:8020/hbase/.logs/10.0.0.65,60020,1305406356765	
> - Waiting for distributed tasks to finish. scheduled=2 done=1 error=0   4380s
> Master startup	
> - Splitting logs after master startup   4388s
> {code}
> There seems to be a mismatch between what is in ZK and what the TaskBatch holds. In my case it could be related to the fact that the task was already in ZK after many faulty restarts because of the NPE. Maybe the task was added only once (since it is keyed by path, and the path is unique on my machine), but the reference count was incremented twice? Now that the real task is done, the done counter has been increased, but it will never match the scheduled counter.
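
To make the suspected failure mode concrete, here is a minimal Java sketch of the hypothesis quoted above. It is not HBase's actual code: the class SplitCounterSketch, its Batch type, and the methods enqueueNaive, enqueueGuarded, complete and finished are hypothetical names invented for illustration, and the path "wal-1" is a placeholder. The sketch assumes tasks are tracked in a map keyed by path while a separate batch object carries the scheduled and done counters.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical illustration only; names and structure are assumptions, not HBase APIs.
final class SplitCounterSketch {

    // Stand-in for a batch of split tasks with the two counters from the status output.
    static final class Batch {
        int scheduled;
        int done;
    }

    // Tasks are keyed by path, so a given path can be registered only once.
    private final ConcurrentMap<String, Batch> tasks = new ConcurrentHashMap<>();

    // Suspected bug: the counter is bumped even when the path is already registered.
    void enqueueNaive(String path, Batch batch) {
        batch.scheduled++;
        tasks.putIfAbsent(path, batch);
    }

    // Guarded variant: count the task only if this call actually registered it.
    boolean enqueueGuarded(String path, Batch batch) {
        if (tasks.putIfAbsent(path, batch) != null) {
            return false; // already known, leave the counter alone
        }
        batch.scheduled++;
        return true;
    }

    // Completion removes the single registered task and bumps done once.
    void complete(String path) {
        Batch batch = tasks.remove(path);
        if (batch != null) {
            batch.done++;
        }
    }

    // The waiting loop can only exit when the counters agree.
    static boolean finished(Batch batch) {
        return batch.done == batch.scheduled;
    }
}
```

With the naive variant, enqueueing "wal-1" twice and completing it once leaves scheduled=2 and done=1, so finished() stays false, which matches the shape of the stuck status shown above. With the guarded variant the second enqueue is ignored and the counters converge. Whether this is what actually happens in the master is exactly the open question in the description.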
