Posted to issues@hbase.apache.org by "terry zhang (JIRA)" <ji...@apache.org> on 2012/08/31 08:38:07 UTC

[jira] [Created] (HBASE-6700) [replication] replication node will never delete if copy newQueues size is 0

terry zhang created HBASE-6700:
----------------------------------

             Summary: [replication] replication node will never delete if copy newQueues size is 0
                 Key: HBASE-6700
                 URL: https://issues.apache.org/jira/browse/HBASE-6700
             Project: HBase
          Issue Type: Bug
          Components: replication
    Affects Versions: 0.94.1
            Reporter: terry zhang


Please check the code below:

{code:title=ReplicationSourceManager.java|borderStyle=solid}
// NodeFailoverWorker class
public void run() {
    ...

      LOG.info("Moving " + rsZnode + "'s hlogs to my queue");
      SortedMap<String, SortedSet<String>> newQueues =
          zkHelper.copyQueuesFromRS(rsZnode);   // The failover znode is created here
      zkHelper.deleteRsQueues(rsZnode);
      if (newQueues == null || newQueues.size() == 0) {
        return;   // Early return: no recovered source is started
      }
    ...
}


  public void closeRecoveredQueue(ReplicationSourceInterface src) {
    LOG.info("Done with the recovered queue " + src.getPeerClusterZnode());
    this.oldsources.remove(src);
    this.zkHelper.deleteSource(src.getPeerClusterZnode(), false);  // The znode is deleted here
  }
{code}

From the code we can see that if newQueues == null or newQueues.size() == 0, the failover replication source never starts, so closeRecoveredQueue is never called and the failover ZooKeeper znode is never deleted.

For example, the failover node below will never be deleted:

[zk: 10.232.98.77:2181(CONNECTED) 16] ls /hbase-test3-repl/replication/rs/dw93.kgb.sqa.cm4,60020,1346337383956/1-dw93.kgb.sqa.cm4,60020,1346309263932-dw91.kgb.sqa.cm4,60020,1346307150041-dw89.kgb.sqa.cm4,60020,1346307911711-dw93.kgb.sqa.cm4,60020,1346312019213-dw88.kgb.sqa.cm4,60020,1346311774939-dw89.kgb.sqa.cm4,60020,1346312314229-dw93.kgb.sqa.cm4,60020,1346312524307-dw88.kgb.sqa.cm4,60020,1346313203367-dw89.kgb.sqa.cm4,60020,1346313944402-dw88.kgb.sqa.cm4,60020,1346314214286-dw91.kgb.sqa.cm4,60020,1346315119613-dw93.kgb.sqa.cm4,60020,1346314186436-dw88.kgb.sqa.cm4,60020,1346315594396-dw89.kgb.sqa.cm4,60020,1346315909491-dw92.kgb.sqa.cm4,60020,1346315315634-dw89.kgb.sqa.cm4,60020,1346316742242-dw93.kgb.sqa.cm4,60020,1346317604055-dw92.kgb.sqa.cm4,60020,1346318098972-dw91.kgb.sqa.cm4,60020,1346317855650-dw93.kgb.sqa.cm4,60020,1346318532530-dw92.kgb.sqa.cm4,60020,1346318573238-dw89.kgb.sqa.cm4,60020,1346321299040-dw91.kgb.sqa.cm4,60020,1346321304393-dw92.kgb.sqa.cm4,60020,1346325755894-dw89.kgb.sqa.cm4,60020,1346326520895-dw91.kgb.sqa.cm4,60020,1346328246992-dw92.kgb.sqa.cm4,60020,1346327290653-dw93.kgb.sqa.cm4,60020,1346337303018-dw91.kgb.sqa.cm4,60020,1346337318929
[] *<= Empty node *
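
The leak reported above can be reproduced in a small in-memory model. This is only a sketch: FakeZkHelper and FailoverLeakDemo are hypothetical stand-ins, not the real HBase or ZooKeeper API, and the znode naming ("rsB/1-rsA") is simplified. The model assumes, as the report describes, that the copy step creates the failover znode before it knows whether any hlogs were queued.

```java
import java.util.SortedMap;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical in-memory stand-in for the ZooKeeper helper; not the real HBase API.
class FakeZkHelper {
  // znode path -> hlog names stored under it
  final SortedMap<String, SortedSet<String>> znodes = new TreeMap<>();

  // Models the reported behavior: the failover znode is created before
  // we know whether the dead server had any hlogs queued.
  SortedMap<String, SortedSet<String>> copyQueuesFromRS(String deadRs, String myRs) {
    String failoverZnode = myRs + "/1-" + deadRs;
    SortedSet<String> hlogs = znodes.getOrDefault(deadRs + "/1", new TreeSet<>());
    znodes.put(failoverZnode, new TreeSet<>(hlogs));   // node is created here
    SortedMap<String, SortedSet<String>> newQueues = new TreeMap<>();
    if (!hlogs.isEmpty()) {
      newQueues.put(failoverZnode, new TreeSet<>(hlogs));
    }
    return newQueues;
  }

  void deleteRsQueues(String deadRs) { znodes.remove(deadRs + "/1"); }

  void deleteSource(String znode) { znodes.remove(znode); }
}

class FailoverLeakDemo {
  // Runs one failover of deadRs onto myRs and returns the znodes left behind.
  static SortedSet<String> failover(FakeZkHelper zk, String deadRs, String myRs) {
    SortedMap<String, SortedSet<String>> newQueues = zk.copyQueuesFromRS(deadRs, myRs);
    zk.deleteRsQueues(deadRs);
    if (newQueues == null || newQueues.size() == 0) {
      return new TreeSet<>(zk.znodes.keySet());   // early return: nothing cleans up
    }
    // A recovered source would replicate the hlogs and then close its queue,
    // which is the only place the failover znode gets deleted.
    for (String znode : newQueues.keySet()) {
      zk.deleteSource(znode);
    }
    return new TreeSet<>(zk.znodes.keySet());
  }

  public static void main(String[] args) {
    FakeZkHelper zk = new FakeZkHelper();
    zk.znodes.put("rsA/1", new TreeSet<>());   // dead server with an empty queue
    System.out.println(failover(zk, "rsA", "rsB"));   // prints [rsB/1-rsA]
  }
}
```

With a non-empty queue under "rsA/1", the same call returns an empty set, because the recovered-queue path deletes the znode it was handed.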
       



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HBASE-6700) [replication] replication node will never delete if copy newQueues size is 0

Posted by "Lars Hofhansl (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-6700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl updated HBASE-6700:
---------------------------------

    Fix Version/s: 0.96.0
                   0.94.3

Patch looks good. [~jdcryans] wanna have a look?
                

[jira] [Resolved] (HBASE-6700) [replication] empty znodes created during queue failovers aren't deleted

Posted by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-6700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jean-Daniel Cryans resolved HBASE-6700.
---------------------------------------

      Resolution: Fixed
        Assignee: terry zhang
    Hadoop Flags: Reviewed

Committed to 0.94 and trunk. Thanks for the patch Terry.
                

[jira] [Commented] (HBASE-6700) [replication] replication node will never delete if copy newQueues size is 0

Posted by "terry zhang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13445734#comment-13445734 ] 

terry zhang commented on HBASE-6700:
------------------------------------

We can let the NodeFailoverWorker create newClusterZnode only after checking the hlog size.
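
A rough sketch of that ordering, for illustration only. The attached HBASE-6700.patch is not shown in this thread, so this is not the committed change: CopyQueuesSketch, the peerIds parameter, and the simplified znode names are all assumptions, and the in-memory map stands in for ZooKeeper.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical in-memory model of the queue copy; names are illustrative only.
class CopyQueuesSketch {
  // znode path -> hlog names stored under it
  final Map<String, SortedSet<String>> znodes = new TreeMap<>();

  SortedMap<String, SortedSet<String>> copyQueuesFromRS(String deadRs, String myRs,
      List<String> peerIds) {
    SortedMap<String, SortedSet<String>> queues = new TreeMap<>();
    for (String peerId : peerIds) {
      SortedSet<String> hlogs = znodes.get(deadRs + "/" + peerId);
      // Check the hlog count first: nothing to replicate means no znode is created.
      if (hlogs == null || hlogs.isEmpty()) {
        continue;
      }
      String newClusterZnode = myRs + "/" + peerId + "-" + deadRs;
      znodes.put(newClusterZnode, new TreeSet<>(hlogs));   // created only when needed
      queues.put(newClusterZnode, new TreeSet<>(hlogs));
    }
    return queues;
  }

  public static void main(String[] args) {
    CopyQueuesSketch zk = new CopyQueuesSketch();
    zk.znodes.put("rsA/1", new TreeSet<>());                           // empty queue
    zk.znodes.put("rsA/2", new TreeSet<>(Arrays.asList("hlog.1")));    // one pending hlog
    // Only the peer with pending hlogs gets a failover znode.
    System.out.println(zk.copyQueuesFromRS("rsA", "rsB", Arrays.asList("1", "2")).keySet());
  }
}
```

Because the empty queue is skipped before any znode is created, the early return in NodeFailoverWorker.run() no longer leaves anything behind to clean up.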
                

[jira] [Commented] (HBASE-6700) [replication] empty znodes created during queue failovers aren't deleted

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13490366#comment-13490366 ] 

Hudson commented on HBASE-6700:
-------------------------------

Integrated in HBase-0.94-security-on-Hadoop-23 #9 (See [https://builds.apache.org/job/HBase-0.94-security-on-Hadoop-23/9/])
    HBASE-6700  [replication] empty znodes created during queue failovers aren't
            deleted (Terry Zhang via JD) (Revision 1403582)

     Result = FAILURE
jdcryans : 
Files : 
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/replication/ReplicationZookeeper.java

                

[jira] [Commented] (HBASE-6700) [replication] empty znodes created during queue failovers aren't deleted

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13486621#comment-13486621 ] 

Hudson commented on HBASE-6700:
-------------------------------

Integrated in HBase-TRUNK #3497 (See [https://builds.apache.org/job/HBase-TRUNK/3497/])
    HBASE-6700  [replication] empty znodes created during queue failovers aren't
            deleted (Terry Zhang via JD) (Revision 1403581)

     Result = FAILURE
jdcryans : 
Files : 
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/replication/ReplicationZookeeper.java

                

[jira] [Commented] (HBASE-6700) [replication] empty znodes created during queue failovers aren't deleted

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13486580#comment-13486580 ] 

Hudson commented on HBASE-6700:
-------------------------------

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #244 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/244/])
    HBASE-6700  [replication] empty znodes created during queue failovers aren't
            deleted (Terry Zhang via JD) (Revision 1403581)

     Result = FAILURE
jdcryans : 
Files : 
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/replication/ReplicationZookeeper.java

                

[jira] [Updated] (HBASE-6700) [replication] replication node will never delete if copy newQueues size is 0

Posted by "terry zhang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-6700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

terry zhang updated HBASE-6700:
-------------------------------

    Attachment: HBASE-6700.patch
    

[jira] [Updated] (HBASE-6700) [replication] empty znodes created during queue failovers aren't deleted

Posted by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-6700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jean-Daniel Cryans updated HBASE-6700:
--------------------------------------

    Summary: [replication] empty znodes created during queue failovers aren't deleted  (was: [replication] replication node will never delete if copy newQueues size is 0)

+1, I'm going to commit. I'm also fixing the title.
                

[jira] [Updated] (HBASE-6700) [replication] replication node will never delete if copy newQueues size is 0

Posted by "terry zhang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-6700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

terry zhang updated HBASE-6700:
-------------------------------

    Description: 
Please check code below

{code:title=ReplicationSourceManager.java|borderStyle=solid}
// NodeFailoverWorker class
public void run() {
    ...

      LOG.info("Moving " + rsZnode + "'s hlogs to my queue");
      // copyQueuesFromRS() creates the failover znode
      SortedMap<String, SortedSet<String>> newQueues =
          zkHelper.copyQueuesFromRS(rsZnode);
      zkHelper.deleteRsQueues(rsZnode);
      if (newQueues == null || newQueues.size() == 0) {
        // Early return: no recovered source is started for the znode created above
        return;
      }
    ...
}


  public void closeRecoveredQueue(ReplicationSourceInterface src) {
    LOG.info("Done with the recovered queue " + src.getPeerClusterZnode());
    this.oldsources.remove(src);
    // The failover znode is deleted only here
    this.zkHelper.deleteSource(src.getPeerClusterZnode(), false);
  }
{code} 

From the code we can see that if newQueues == null or newQueues.size() == 0, the failover replication source never starts, so closeRecoveredQueue() is never called and the failover ZK node is never deleted.


For example, the failover node below will never be deleted:

[zk: 10.232.98.77:2181(CONNECTED) 16] ls /hbase-test3-repl/replication/rs/dw93.kgb.sqa.cm4,60020,1346337383956/1-dw93.kgb.sqa.cm4,60020,1346309263932-dw91.kgb.sqa.cm4,60020,1346307150041-dw89.kgb.sqa.cm4,60020,1346307911711-dw93.kgb.sqa.cm4,60020,1346312019213-dw88.kgb.sqa.cm4,60020,1346311774939-dw89.kgb.sqa.cm4,60020,1346312314229-dw93.kgb.sqa.cm4,60020,1346312524307-dw88.kgb.sqa.cm4,60020,1346313203367-dw89.kgb.sqa.cm4,60020,1346313944402-dw88.kgb.sqa.cm4,60020,1346314214286-dw91.kgb.sqa.cm4,60020,1346315119613-dw93.kgb.sqa.cm4,60020,1346314186436-dw88.kgb.sqa.cm4,60020,1346315594396-dw89.kgb.sqa.cm4,60020,1346315909491-dw92.kgb.sqa.cm4,60020,1346315315634-dw89.kgb.sqa.cm4,60020,1346316742242-dw93.kgb.sqa.cm4,60020,1346317604055-dw92.kgb.sqa.cm4,60020,1346318098972-dw91.kgb.sqa.cm4,60020,1346317855650-dw93.kgb.sqa.cm4,60020,1346318532530-dw92.kgb.sqa.cm4,60020,1346318573238-dw89.kgb.sqa.cm4,60020,1346321299040-dw91.kgb.sqa.cm4,60020,1346321304393-dw92.kgb.sqa.cm4,60020,1346325755894-dw89.kgb.sqa.cm4,60020,1346326520895-dw91.kgb.sqa.cm4,60020,1346328246992-dw92.kgb.sqa.cm4,60020,1346327290653-dw93.kgb.sqa.cm4,60020,1346337303018-dw91.kgb.sqa.cm4,60020,1346337318929
[] // empty node will never be deleted
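
The sequence described above can be sketched with a small in-memory model. This is only an illustration under stated assumptions: the class EmptyFailoverZnodeSketch, its methods copyQueues and runFailover, the cleanUpEmpty flag and the "rsA"/"rsB" names are all hypothetical and are not HBase APIs. The clean-up shown is one possible way to avoid the leftover znode and is not necessarily what the attached patch does.

```java
import java.util.Map;
import java.util.SortedMap;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical in-memory model; none of these names are HBase APIs.
class EmptyFailoverZnodeSketch {
    // Queue znode path -> hlog names stored under it.
    static final Map<String, SortedSet<String>> znodes = new TreeMap<>();

    // Models the reported behaviour of the copy step: the failover znode is
    // created first, whether or not the dead server's queue holds any hlogs.
    static SortedMap<String, SortedSet<String>> copyQueues(
            String myRs, String deadRs, String peerId) {
        String failoverZnode = myRs + "/" + peerId + "-" + deadRs;
        znodes.put(failoverZnode, new TreeSet<>());
        // Take the dead server's queue and drop its znode in one step.
        SortedSet<String> hlogs = znodes.remove(deadRs + "/" + peerId);
        SortedMap<String, SortedSet<String>> newQueues = new TreeMap<>();
        if (hlogs != null && !hlogs.isEmpty()) {
            znodes.get(failoverZnode).addAll(hlogs);
            newQueues.put(failoverZnode, hlogs);
        }
        return newQueues;
    }

    // Returns true if a recovered source would be started. With cleanUpEmpty
    // set, the early-return path removes the empty failover znode first.
    static boolean runFailover(
            String myRs, String deadRs, String peerId, boolean cleanUpEmpty) {
        SortedMap<String, SortedSet<String>> newQueues =
            copyQueues(myRs, deadRs, peerId);
        if (newQueues.isEmpty()) {
            if (cleanUpEmpty) {
                znodes.remove(myRs + "/" + peerId + "-" + deadRs);
            }
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        znodes.put("rsB/1", new TreeSet<>());   // dead server with an empty queue
        runFailover("rsA", "rsB", "1", false);
        System.out.println(znodes.keySet());    // the empty failover znode is left behind

        znodes.clear();
        znodes.put("rsB/1", new TreeSet<>());
        runFailover("rsA", "rsB", "1", true);
        System.out.println(znodes.keySet());    // nothing is left behind
    }
}
```

In this model the early return without clean-up leaves an empty znode that nothing ever deletes, which mirrors the empty listing below the ls command above.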
       



  was:
Please check code below

{code:title=ReplicationSourceManager.java|borderStyle=solid}
// NodeFailoverWorker class
public void run() {
    ...

      LOG.info("Moving " + rsZnode + "'s hlogs to my queue");
      // copyQueuesFromRS() creates the failover znode
      SortedMap<String, SortedSet<String>> newQueues =
          zkHelper.copyQueuesFromRS(rsZnode);
      zkHelper.deleteRsQueues(rsZnode);
      if (newQueues == null || newQueues.size() == 0) {
        // Early return: no recovered source is started for the znode created above
        return;
      }
    ...
}


  public void closeRecoveredQueue(ReplicationSourceInterface src) {
    LOG.info("Done with the recovered queue " + src.getPeerClusterZnode());
    this.oldsources.remove(src);
    // The failover znode is deleted only here
    this.zkHelper.deleteSource(src.getPeerClusterZnode(), false);
  }
{code} 

From the code we can see that if newQueues == null or newQueues.size() == 0, the failover replication source never starts, so closeRecoveredQueue() is never called and the failover ZK node is never deleted.


For example, the failover node below will never be deleted:

[zk: 10.232.98.77:2181(CONNECTED) 16] ls /hbase-test3-repl/replication/rs/dw93.kgb.sqa.cm4,60020,1346337383956/1-dw93.kgb.sqa.cm4,60020,1346309263932-dw91.kgb.sqa.cm4,60020,1346307150041-dw89.kgb.sqa.cm4,60020,1346307911711-dw93.kgb.sqa.cm4,60020,1346312019213-dw88.kgb.sqa.cm4,60020,1346311774939-dw89.kgb.sqa.cm4,60020,1346312314229-dw93.kgb.sqa.cm4,60020,1346312524307-dw88.kgb.sqa.cm4,60020,1346313203367-dw89.kgb.sqa.cm4,60020,1346313944402-dw88.kgb.sqa.cm4,60020,1346314214286-dw91.kgb.sqa.cm4,60020,1346315119613-dw93.kgb.sqa.cm4,60020,1346314186436-dw88.kgb.sqa.cm4,60020,1346315594396-dw89.kgb.sqa.cm4,60020,1346315909491-dw92.kgb.sqa.cm4,60020,1346315315634-dw89.kgb.sqa.cm4,60020,1346316742242-dw93.kgb.sqa.cm4,60020,1346317604055-dw92.kgb.sqa.cm4,60020,1346318098972-dw91.kgb.sqa.cm4,60020,1346317855650-dw93.kgb.sqa.cm4,60020,1346318532530-dw92.kgb.sqa.cm4,60020,1346318573238-dw89.kgb.sqa.cm4,60020,1346321299040-dw91.kgb.sqa.cm4,60020,1346321304393-dw92.kgb.sqa.cm4,60020,1346325755894-dw89.kgb.sqa.cm4,60020,1346326520895-dw91.kgb.sqa.cm4,60020,1346328246992-dw92.kgb.sqa.cm4,60020,1346327290653-dw93.kgb.sqa.cm4,60020,1346337303018-dw91.kgb.sqa.cm4,60020,1346337318929
[] 
       



    
> [replication] replication node will never delete if copy newQueues size is 0
> ----------------------------------------------------------------------------
>
>                 Key: HBASE-6700
>                 URL: https://issues.apache.org/jira/browse/HBASE-6700
>             Project: HBase
>          Issue Type: Bug
>          Components: replication
>    Affects Versions: 0.94.1
>            Reporter: terry zhang
>


[jira] [Updated] (HBASE-6700) [replication] replication node will never delete if copy newQueues size is 0

Posted by "terry zhang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-6700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

terry zhang updated HBASE-6700:
-------------------------------

    Description: 
Please check code below

{code:title=ReplicationSourceManager.java|borderStyle=solid}
// NodeFailoverWorker class
public void run() {
    ...

      LOG.info("Moving " + rsZnode + "'s hlogs to my queue");
      // copyQueuesFromRS() creates the failover znode
      SortedMap<String, SortedSet<String>> newQueues =
          zkHelper.copyQueuesFromRS(rsZnode);
      zkHelper.deleteRsQueues(rsZnode);
      if (newQueues == null || newQueues.size() == 0) {
        // Early return: no recovered source is started for the znode created above
        return;
      }
    ...
}


  public void closeRecoveredQueue(ReplicationSourceInterface src) {
    LOG.info("Done with the recovered queue " + src.getPeerClusterZnode());
    this.oldsources.remove(src);
    // The failover znode is deleted only here
    this.zkHelper.deleteSource(src.getPeerClusterZnode(), false);
  }
{code} 

From the code we can see that if newQueues == null or newQueues.size() == 0, the failover replication source never starts, so closeRecoveredQueue() is never called and the failover ZK node is never deleted.


For example, the failover node below will never be deleted:

[zk: 10.232.98.77:2181(CONNECTED) 16] ls /hbase-test3-repl/replication/rs/dw93.kgb.sqa.cm4,60020,1346337383956/1-dw93.kgb.sqa.cm4,60020,1346309263932-dw91.kgb.sqa.cm4,60020,1346307150041-dw89.kgb.sqa.cm4,60020,1346307911711-dw93.kgb.sqa.cm4,60020,1346312019213-dw88.kgb.sqa.cm4,60020,1346311774939-dw89.kgb.sqa.cm4,60020,1346312314229-dw93.kgb.sqa.cm4,60020,1346312524307-dw88.kgb.sqa.cm4,60020,1346313203367-dw89.kgb.sqa.cm4,60020,1346313944402-dw88.kgb.sqa.cm4,60020,1346314214286-dw91.kgb.sqa.cm4,60020,1346315119613-dw93.kgb.sqa.cm4,60020,1346314186436-dw88.kgb.sqa.cm4,60020,1346315594396-dw89.kgb.sqa.cm4,60020,1346315909491-dw92.kgb.sqa.cm4,60020,1346315315634-dw89.kgb.sqa.cm4,60020,1346316742242-dw93.kgb.sqa.cm4,60020,1346317604055-dw92.kgb.sqa.cm4,60020,1346318098972-dw91.kgb.sqa.cm4,60020,1346317855650-dw93.kgb.sqa.cm4,60020,1346318532530-dw92.kgb.sqa.cm4,60020,1346318573238-dw89.kgb.sqa.cm4,60020,1346321299040-dw91.kgb.sqa.cm4,60020,1346321304393-dw92.kgb.sqa.cm4,60020,1346325755894-dw89.kgb.sqa.cm4,60020,1346326520895-dw91.kgb.sqa.cm4,60020,1346328246992-dw92.kgb.sqa.cm4,60020,1346327290653-dw93.kgb.sqa.cm4,60020,1346337303018-dw91.kgb.sqa.cm4,60020,1346337318929
[] 
       



  was:
Please check code below

{code:title=ReplicationSourceManager.java|borderStyle=solid}
// NodeFailoverWorker class
public void run() {
    ...

      LOG.info("Moving " + rsZnode + "'s hlogs to my queue");
      // copyQueuesFromRS() creates the failover znode
      SortedMap<String, SortedSet<String>> newQueues =
          zkHelper.copyQueuesFromRS(rsZnode);
      zkHelper.deleteRsQueues(rsZnode);
      if (newQueues == null || newQueues.size() == 0) {
        // Early return: no recovered source is started for the znode created above
        return;
      }
    ...
}


  public void closeRecoveredQueue(ReplicationSourceInterface src) {
    LOG.info("Done with the recovered queue " + src.getPeerClusterZnode());
    this.oldsources.remove(src);
    // The failover znode is deleted only here
    this.zkHelper.deleteSource(src.getPeerClusterZnode(), false);
  }
{code} 

From the code we can see that if newQueues == null or newQueues.size() == 0, the failover replication source never starts, so closeRecoveredQueue() is never called and the failover ZK node is never deleted.


For example, the failover node below will never be deleted:

[zk: 10.232.98.77:2181(CONNECTED) 16] ls /hbase-test3-repl/replication/rs/dw93.kgb.sqa.cm4,60020,1346337383956/1-dw93.kgb.sqa.cm4,60
020,1346309263932-dw91.kgb.sqa.cm4,60020,1346307150041-dw89.kgb.sqa.cm4,60020,1346307911711-dw93.kgb.sqa.cm4,60020,1346312019213-dw8
8.kgb.sqa.cm4,60020,1346311774939-dw89.kgb.sqa.cm4,60020,1346312314229-dw93.kgb.sqa.cm4,60020,1346312524307-dw88.kgb.sqa.cm4,60020,1
346313203367-dw89.kgb.sqa.cm4,60020,1346313944402-dw88.kgb.sqa.cm4,60020,1346314214286-dw91.kgb.sqa.cm4,60020,1346315119613-dw93.kgb
.sqa.cm4,60020,1346314186436-dw88.kgb.sqa.cm4,60020,1346315594396-dw89.kgb.sqa.cm4,60020,1346315909491-dw92.kgb.sqa.cm4,60020,134631
5315634-dw89.kgb.sqa.cm4,60020,1346316742242-dw93.kgb.sqa.cm4,60020,1346317604055-dw92.kgb.sqa.cm4,60020,1346318098972-dw91.kgb.sqa.
cm4,60020,1346317855650-dw93.kgb.sqa.cm4,60020,1346318532530-dw92.kgb.sqa.cm4,60020,1346318573238-dw89.kgb.sqa.cm4,60020,13463212990
40-dw91.kgb.sqa.cm4,60020,1346321304393-dw92.kgb.sqa.cm4,60020,1346325755894-dw89.kgb.sqa.cm4,60020,1346326520895-dw91.kgb.sqa.cm4,6
0020,1346328246992-dw92.kgb.sqa.cm4,60020,1346327290653-dw93.kgb.sqa.cm4,60020,1346337303018-dw91.kgb.sqa.cm4,60020,1346337318929
[] *<= Empty node *
       



    
> [replication] replication node will never delete if copy newQueues size is 0
> ----------------------------------------------------------------------------
>
>                 Key: HBASE-6700
>                 URL: https://issues.apache.org/jira/browse/HBASE-6700
>             Project: HBase
>          Issue Type: Bug
>          Components: replication
>    Affects Versions: 0.94.1
>            Reporter: terry zhang
>


[jira] [Commented] (HBASE-6700) [replication] empty znodes created during queue failovers aren't deleted

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13486630#comment-13486630 ] 

Hudson commented on HBASE-6700:
-------------------------------

Integrated in HBase-0.94 #559 (See [https://builds.apache.org/job/HBase-0.94/559/])
    HBASE-6700  [replication] empty znodes created during queue failovers aren't
            deleted (Terry Zhang via JD) (Revision 1403582)

     Result = FAILURE
jdcryans : 
Files : 
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/replication/ReplicationZookeeper.java

                
> [replication] empty znodes created during queue failovers aren't deleted
> ------------------------------------------------------------------------
>
>                 Key: HBASE-6700
>                 URL: https://issues.apache.org/jira/browse/HBASE-6700
>             Project: HBase
>          Issue Type: Bug
>          Components: Replication
>    Affects Versions: 0.94.1
>            Reporter: terry zhang
>            Assignee: terry zhang
>             Fix For: 0.94.3, 0.96.0
>
>         Attachments: HBASE-6700.patch
>
>
