Posted to dev@bookkeeper.apache.org by "Sijie Guo (JIRA)" <ji...@apache.org> on 2012/11/12 13:41:12 UTC

[jira] [Created] (BOOKKEEPER-464) Provide an improved GC algorithm

Sijie Guo created BOOKKEEPER-464:
------------------------------------

             Summary: Provide an improved GC algorithm
                 Key: BOOKKEEPER-464
                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-464
             Project: Bookkeeper
          Issue Type: Sub-task
            Reporter: Sijie Guo




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (BOOKKEEPER-464) Provide an improved GC algorithm

Posted by "Flavio Junqueira (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/BOOKKEEPER-464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13505487#comment-13505487 ] 

Flavio Junqueira commented on BOOKKEEPER-464:
---------------------------------------------

Unless I'm misunderstanding the meaning of zombie entries, the operations related to a ledger fragment in any given bookie need to be atomic. If a bookie starts replicating a ledger fragment because it has been added to the ensemble of ledger but it doesn't complete the replication process because, e.g., its disk becomes full, then we remove the partially replicated ledger fragment. If a bookie is notified that a ledger has been marked for deletion while it is replicating the ledger fragment, then it shouldn't delete the ledger fragment in the middle of the replication process.

Perhaps we should have a sequencer thread in each bookie that enforces atomicity in the way I'm mentioning.
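
One way to picture the sequencer idea above is a single-threaded executor through which every ledger-fragment operation on a bookie is funneled. The sketch below is only an illustration under assumptions: the class `FragmentSequencerSketch`, its methods, and the "capacity in entries" stand-in for a full disk are hypothetical and do not come from the BookKeeper code base or the attached patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: none of these names exist in BookKeeper.
class FragmentSequencerSketch {
    // A single thread runs every fragment operation, so operations never interleave.
    private final ExecutorService sequencer = Executors.newSingleThreadExecutor();
    // Ledger id -> replicated entry ids; only touched from the sequencer thread.
    private final Map<Long, List<Long>> store = new HashMap<>();
    // Simulated remaining disk capacity, counted in entries (an assumption).
    private int freeEntries;

    FragmentSequencerSketch(int capacityInEntries) {
        this.freeEntries = capacityInEntries;
    }

    /** Replicates a fragment all-or-nothing: a partial copy is removed on failure. */
    public Future<Boolean> replicate(long ledgerId, List<Long> entryIds) {
        return sequencer.submit(() -> {
            List<Long> fragment = new ArrayList<>();
            store.put(ledgerId, fragment);
            for (long entryId : entryIds) {
                if (freeEntries == 0) {
                    // "Disk full": undo the partial replication.
                    freeEntries += fragment.size();
                    store.remove(ledgerId);
                    return false;
                }
                fragment.add(entryId);
                freeEntries--;
            }
            return true;
        });
    }

    /** Deletes a fragment; queued behind any replication submitted earlier. */
    public Future<Boolean> delete(long ledgerId) {
        return sequencer.submit(() -> {
            List<Long> removed = store.remove(ledgerId);
            if (removed == null) {
                return false;
            }
            freeEntries += removed.size();
            return true;
        });
    }

    /** Reports whether a fragment of the ledger is currently stored. */
    public Future<Boolean> holds(long ledgerId) {
        return sequencer.submit(() -> {
            return store.containsKey(ledgerId);
        });
    }

    public void shutdown() {
        sequencer.shutdown();
    }

    /** Helper that waits for a result and wraps checked exceptions. */
    public static boolean await(Future<Boolean> result) {
        try {
            return result.get();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        FragmentSequencerSketch bookie = new FragmentSequencerSketch(3);
        Future<Boolean> replicated = bookie.replicate(7L, Arrays.asList(0L, 1L, 2L));
        Future<Boolean> deleted = bookie.delete(7L); // runs only after the replication
        System.out.println("replicated=" + await(replicated) + " deleted=" + await(deleted));
        bookie.shutdown();
    }
}
```

Because both operations go through the same queue, a deletion that arrives during a replication is applied only after the replication has either completed or been rolled back. A real bookie would probably need a finer-grained scheme, since one thread for all ledgers would serialize unrelated work.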
                
> Provide an improved GC algorithm
> --------------------------------
>
>                 Key: BOOKKEEPER-464
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-464
>             Project: Bookkeeper
>          Issue Type: Sub-task
>          Components: bookkeeper-server
>            Reporter: Sijie Guo
>            Assignee: Fangmin Lv
>             Fix For: 4.3.0
>
>         Attachments: BOOKKEEPER-464.patch
>
>



[jira] [Commented] (BOOKKEEPER-464) Provide an improved GC algorithm

Posted by "Fangmin Lv (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/BOOKKEEPER-464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13503463#comment-13503463 ] 

Fangmin Lv commented on BOOKKEEPER-464:
---------------------------------------

Actually, this is the implementation of the revisited garbage collection algorithm proposed in BOOKKEEPER-249. There is a detailed design doc, gc_revisit.pdf, on BOOKKEEPER-249 which explains the different garbage collection algorithms and compares their performance. I'm afraid it would be superfluous to rewrite the doc for this feature, so here I simply describe the algorithm I used in the improved garbage collector:

1. When a ledger is deleted, we write the deleted ledger to the /ledgers/deleted/Bi/ node of each bookie Bi in its ensembles. The bookie's garbage collector thread then reads its own /ledgers/deleted/Bi/ node to get the list of deleted ledgers. Finally, the bookie deletes those ledgers and removes the ledger metadata from the metadata store according to the list. This is the same as Detail Design 2 in section 4.3 of gc_revisit.pdf.
2. To avoid zombie entries, we trigger the polling-based garbage collector when the bookie's disk runs out of space.
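
Step 1 above can be sketched with in-memory maps standing in for the metadata store and for each bookie's local storage. This is a minimal illustration, not the code in the attached patch: the class `DeletedListGcSketch` and all of its method names are hypothetical, and the polling-based fallback from step 2 is not modeled.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: none of these names exist in BookKeeper.
class DeletedListGcSketch {
    // Stand-in for the metadata store: bookie id -> ledger ids listed under
    // that bookie's /ledgers/deleted/Bi/ node.
    private final Map<String, Set<Long>> deletedNodes = new HashMap<>();
    // Stand-in for local storage: bookie id -> ledger ids the bookie holds.
    private final Map<String, Set<Long>> localLedgers = new HashMap<>();

    /** Records that a bookie stores entries of the given ledger. */
    public void store(String bookie, long ledgerId) {
        localLedgers.computeIfAbsent(bookie, b -> new TreeSet<>()).add(ledgerId);
    }

    /** On delete, list the ledger under every bookie that appears in its ensembles. */
    public void deleteLedger(long ledgerId, List<String> ensembleBookies) {
        for (String bookie : ensembleBookies) {
            deletedNodes.computeIfAbsent(bookie, b -> new TreeSet<>()).add(ledgerId);
        }
    }

    /**
     * One GC pass for a bookie: read its own deleted list, drop those ledgers
     * locally, and clear the list. Returns the ledger ids that were listed.
     */
    public Set<Long> runGc(String bookie) {
        Set<Long> pending = deletedNodes.remove(bookie);
        if (pending == null) {
            return new TreeSet<>();
        }
        Set<Long> local = localLedgers.get(bookie);
        if (local != null) {
            local.removeAll(pending);
        }
        return pending;
    }

    /** Returns a copy of the ledger ids currently stored on a bookie. */
    public Set<Long> ledgersOn(String bookie) {
        return new TreeSet<>(localLedgers.getOrDefault(bookie, Collections.emptySet()));
    }

    public static void main(String[] args) {
        DeletedListGcSketch gc = new DeletedListGcSketch();
        gc.store("B1", 1L);
        gc.store("B1", 2L);
        gc.store("B2", 1L);
        gc.deleteLedger(1L, Arrays.asList("B1", "B2"));
        System.out.println("B1 collected " + gc.runGc("B1"));
        System.out.println("B1 still holds " + gc.ledgersOn("B1"));
    }
}
```

The point of the per-bookie list is that a GC pass only has to read the ledgers deleted since the last pass, instead of comparing everything the bookie stores against the full set of active ledgers.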

                
> Provide an improved GC algorithm
> --------------------------------
>
>                 Key: BOOKKEEPER-464
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-464
>             Project: Bookkeeper
>          Issue Type: Sub-task
>          Components: bookkeeper-server
>            Reporter: Sijie Guo
>            Assignee: Fangmin Lv
>             Fix For: 4.2.0
>
>         Attachments: BOOKKEEPER-464.patch
>
>



[jira] [Updated] (BOOKKEEPER-464) Provide an improved GC algorithm

Posted by "Fangmin Lv (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/BOOKKEEPER-464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Fangmin Lv updated BOOKKEEPER-464:
----------------------------------

    Assignee: Fangmin Lv
    
> Provide an improved GC algorithm
> --------------------------------
>
>                 Key: BOOKKEEPER-464
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-464
>             Project: Bookkeeper
>          Issue Type: Sub-task
>          Components: bookkeeper-server
>            Reporter: Sijie Guo
>            Assignee: Fangmin Lv
>             Fix For: 4.2.0
>
>



[jira] [Commented] (BOOKKEEPER-464) Provide an improved GC algorithm

Posted by "Flavio Junqueira (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/BOOKKEEPER-464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13505906#comment-13505906 ] 

Flavio Junqueira commented on BOOKKEEPER-464:
---------------------------------------------

Thanks, Sijie. I added a comment to BOOKKEEPER-249.
                
> Provide an improved GC algorithm
> --------------------------------
>
>                 Key: BOOKKEEPER-464
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-464
>             Project: Bookkeeper
>          Issue Type: Sub-task
>          Components: bookkeeper-server
>            Reporter: Sijie Guo
>            Assignee: Fangmin Lv
>             Fix For: 4.3.0
>
>         Attachments: BOOKKEEPER-464.patch
>
>



[jira] [Commented] (BOOKKEEPER-464) Provide an improved GC algorithm

Posted by "Sijie Guo (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/BOOKKEEPER-464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13505739#comment-13505739 ] 

Sijie Guo commented on BOOKKEEPER-464:
--------------------------------------

{quote}
 If a bookie starts replicating a ledger fragment because it has been added to the ensemble of ledger but it doesn't complete the replication process because, e.g., its disk becomes full, then we remove the partially replicated ledger fragment.
{quote}

For replication, the sequence is: 1) we pick a bookie and replicate entries to it, and then 2) we update the ledger ensemble only after those entries have been replicated. If a failure happens between replicating the entries and updating the ledger ensemble, the replicated entries become zombies (not referenced by any ledger ensemble).

I think I wrote a comment in BOOKKEEPER-249 ( https://issues.apache.org/jira/browse/BOOKKEEPER-249?focusedCommentId=13497139&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13497139 ) discussing all the possible cases that cause zombie entries (I also gave my understanding of zombie entries in that comment).
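
The replication sequence described above, and how a failure between its two steps leaves a zombie behind, can be modeled roughly as follows. This is a simplified sketch under assumptions: it tracks whole ledgers rather than individual entries, and the class `ZombieFragmentSketch`, its methods, and the `failBeforeUpdate` flag are hypothetical names used only for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: none of these names exist in BookKeeper.
class ZombieFragmentSketch {
    // Stand-in for ledger metadata: ledger id -> bookies in its ensemble.
    private final Map<Long, Set<String>> ensembles = new HashMap<>();
    // Stand-in for local storage: bookie id -> ledger ids with stored entries.
    private final Map<String, Set<Long>> stored = new HashMap<>();

    public void createLedger(long ledgerId, String... bookies) {
        ensembles.put(ledgerId, new TreeSet<>(Arrays.asList(bookies)));
        for (String bookie : bookies) {
            stored.computeIfAbsent(bookie, b -> new TreeSet<>()).add(ledgerId);
        }
    }

    /**
     * Step 1 copies the entries to the target bookie; step 2 updates the
     * ensemble. failBeforeUpdate simulates a failure between the two steps.
     */
    public void replicate(long ledgerId, String replaced, String target,
                          boolean failBeforeUpdate) {
        Set<String> ensemble = ensembles.get(ledgerId);
        if (ensemble == null) {
            return; // unknown ledger, nothing to replicate
        }
        stored.computeIfAbsent(target, b -> new TreeSet<>()).add(ledgerId); // step 1
        if (failBeforeUpdate) {
            return; // the ensemble never learns about the target bookie
        }
        ensemble.remove(replaced); // step 2
        ensemble.add(target);
    }

    /** Ledgers stored on a bookie that no ensemble references for that bookie. */
    public Set<Long> zombiesOn(String bookie) {
        Set<Long> zombies = new TreeSet<>();
        for (long ledgerId : stored.getOrDefault(bookie, Collections.emptySet())) {
            Set<String> ensemble = ensembles.get(ledgerId);
            if (ensemble == null || !ensemble.contains(bookie)) {
                zombies.add(ledgerId);
            }
        }
        return zombies;
    }

    public static void main(String[] args) {
        ZombieFragmentSketch cluster = new ZombieFragmentSketch();
        cluster.createLedger(1L, "B1", "B2");
        cluster.replicate(1L, "B2", "B3", true); // fails before the ensemble update
        System.out.println("zombies on B3: " + cluster.zombiesOn("B3"));
    }
}
```

In this model a zombie can only be found by comparing what a bookie stores against the ledger metadata, because the bookie never appears in any deleted-ledger list for data that no ensemble references.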
                
> Provide an improved GC algorithm
> --------------------------------
>
>                 Key: BOOKKEEPER-464
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-464
>             Project: Bookkeeper
>          Issue Type: Sub-task
>          Components: bookkeeper-server
>            Reporter: Sijie Guo
>            Assignee: Fangmin Lv
>             Fix For: 4.3.0
>
>         Attachments: BOOKKEEPER-464.patch
>
>



[jira] [Updated] (BOOKKEEPER-464) Provide an improved GC algorithm

Posted by "Fangmin Lv (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/BOOKKEEPER-464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Fangmin Lv updated BOOKKEEPER-464:
----------------------------------

    Attachment: BOOKKEEPER-464.patch

Attached a patch that implements the improved GC, based on BOOKKEEPER-463.
                
> Provide an improved GC algorithm
> --------------------------------
>
>                 Key: BOOKKEEPER-464
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-464
>             Project: Bookkeeper
>          Issue Type: Sub-task
>          Components: bookkeeper-server
>            Reporter: Sijie Guo
>            Assignee: Fangmin Lv
>             Fix For: 4.2.0
>
>         Attachments: BOOKKEEPER-464.patch
>
>



[jira] [Updated] (BOOKKEEPER-464) Provide an improved GC algorithm

Posted by "Fangmin Lv (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/BOOKKEEPER-464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Fangmin Lv updated BOOKKEEPER-464:
----------------------------------

    Fix Version/s:     (was: 4.2.0)
                   4.3.0
    
> Provide an improved GC algorithm
> --------------------------------
>
>                 Key: BOOKKEEPER-464
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-464
>             Project: Bookkeeper
>          Issue Type: Sub-task
>          Components: bookkeeper-server
>            Reporter: Sijie Guo
>            Assignee: Fangmin Lv
>             Fix For: 4.3.0
>
>         Attachments: BOOKKEEPER-464.patch
>
>



[jira] [Commented] (BOOKKEEPER-464) Provide an improved GC algorithm

Posted by "Fangmin Lv (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/BOOKKEEPER-464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13501909#comment-13501909 ] 

Fangmin Lv commented on BOOKKEEPER-464:
---------------------------------------

Good suggestion, I will upload the document ASAP, thanks for your proposal.
                
> Provide an improved GC algorithm
> --------------------------------
>
>                 Key: BOOKKEEPER-464
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-464
>             Project: Bookkeeper
>          Issue Type: Sub-task
>          Components: bookkeeper-server
>            Reporter: Sijie Guo
>            Assignee: Fangmin Lv
>             Fix For: 4.2.0
>
>



[jira] [Commented] (BOOKKEEPER-464) Provide an improved GC algorithm

Posted by "Flavio Junqueira (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/BOOKKEEPER-464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13501806#comment-13501806 ] 

Flavio Junqueira commented on BOOKKEEPER-464:
---------------------------------------------

Hi Fangmin, Could you upload a design document for this feature, please, just so that anyone interested can check your proposal?
                
> Provide an improved GC algorithm
> --------------------------------
>
>                 Key: BOOKKEEPER-464
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-464
>             Project: Bookkeeper
>          Issue Type: Sub-task
>          Components: bookkeeper-server
>            Reporter: Sijie Guo
>            Assignee: Fangmin Lv
>             Fix For: 4.2.0
>
>



[jira] [Commented] (BOOKKEEPER-464) Provide an improved GC algorithm

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/BOOKKEEPER-464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13504535#comment-13504535 ] 

Hadoop QA commented on BOOKKEEPER-464:
--------------------------------------

Testing JIRA BOOKKEEPER-464

WARNING: Running test-patch on a dirty local svn workspace

Patch /jira/secure/attachment/12554685/BOOKKEEPER-464.patch downloaded at Tue Nov 27 11:04:36 UTC 2012

----------------------------

{color:red}-1{color} Patch failed to apply to head of branch

----------------------------
                
> Provide an improved GC algorithm
> --------------------------------
>
>                 Key: BOOKKEEPER-464
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-464
>             Project: Bookkeeper
>          Issue Type: Sub-task
>          Components: bookkeeper-server
>            Reporter: Sijie Guo
>            Assignee: Fangmin Lv
>             Fix For: 4.3.0
>
>         Attachments: BOOKKEEPER-464.patch
>
>

