You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@zookeeper.apache.org by "Flavio Paiva Junqueira (JIRA)" <ji...@apache.org> on 2009/07/13 22:21:14 UTC

[jira] Created: (ZOOKEEPER-464) Need procedure to garbage collect ledgers

Need procedure to garbage collect ledgers
-----------------------------------------

                 Key: ZOOKEEPER-464
                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-464
             Project: Zookeeper
          Issue Type: New Feature
          Components: contrib-bookkeeper
            Reporter: Flavio Paiva Junqueira


An application using BookKeeper is likely to use a large number of ledgers over time. Such an application might not need all ledgers created over time and might want to delete some of these ledgers to free up some space on bookies. The idea of this jira is to implement a procedure that enables an application to garbage-collect unwanted ledgers.

To garbage-collect a ledger, we need to delete the ledger metadata on ZooKeeper, and delete the ledger data on corresponding bookies. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (ZOOKEEPER-464) Need procedure to garbage collect ledgers

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12861135#action_12861135 ] 

Hadoop QA commented on ZOOKEEPER-464:
-------------------------------------

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12442901/ZOOKEEPER-464.patch
  against trunk revision 938212.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 6 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/74/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/74/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/74/console

This message is automatically generated.

> Need procedure to garbage collect ledgers
> -----------------------------------------
>
>                 Key: ZOOKEEPER-464
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-464
>             Project: Zookeeper
>          Issue Type: New Feature
>          Components: contrib-bookkeeper
>            Reporter: Flavio Paiva Junqueira
>            Assignee: Erwin Tam
>             Fix For: 3.4.0
>
>         Attachments: zookeeper-464-log.txt, ZOOKEEPER-464.patch, ZOOKEEPER-464.patch, ZOOKEEPER-464.patch, ZOOKEEPER-464.patch
>
>
> An application using BookKeeper is likely to use a large number of ledgers over time. Such an application might not need all ledgers created over time and might want to delete some of these ledgers to free up some space on bookies. The idea of this jira is to implement a procedure that enables an application to garbage-collect unwanted ledgers.
> To garbage-collect a ledger, we need to delete the ledger metadata on ZooKeeper, and delete the ledger data on corresponding bookies. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (ZOOKEEPER-464) Need procedure to garbage collect ledgers

Posted by "Flavio Paiva Junqueira (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Flavio Paiva Junqueira updated ZOOKEEPER-464:
---------------------------------------------

    Attachment: zookeeper-464-log.txt

Hi Erwin, I still see several comments preceding new methods that we need to put in javadoc format. A more serious issue is that LedgerDeleteTest failed for me. I'm attaching the output log. It essentially crashed because it ran out of heap space.

> Need procedure to garbage collect ledgers
> -----------------------------------------
>
>                 Key: ZOOKEEPER-464
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-464
>             Project: Zookeeper
>          Issue Type: New Feature
>          Components: contrib-bookkeeper
>            Reporter: Flavio Paiva Junqueira
>            Assignee: Erwin Tam
>             Fix For: 3.4.0
>
>         Attachments: zookeeper-464-log.txt, ZOOKEEPER-464.patch, ZOOKEEPER-464.patch
>
>
> An application using BookKeeper is likely to use a large number of ledgers over time. Such an application might not need all ledgers created over time and might want to delete some of these ledgers to free up some space on bookies. The idea of this jira is to implement a procedure that enables an application to garbage-collect unwanted ledgers.
> To garbage-collect a ledger, we need to delete the ledger metadata on ZooKeeper, and delete the ledger data on corresponding bookies. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (ZOOKEEPER-464) Need procedure to garbage collect ledgers

Posted by "Erwin Tam (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12854123#action_12854123 ] 

Erwin Tam commented on ZOOKEEPER-464:
-------------------------------------

Thanks Flavio.  I noticed that heap space problem intermittently too when running the junit tests.  It comes about  when we're reading through the entry logs but the format is off so the 4 byte entry length field is way too big.  We then try to allocate a huge buffer to read the entry from but something was mismatched already.  I noticed at times when running tests with small entry logs that the "BKLO" header doesn't get appended to it at times.  That would definitely screw things up when reading through it.  I'll see if I can reproduce this problem with the current configs in the tests and maybe exit more gracefully instead of allocating a huge amount of heap space for the buffers to read.  Obviously something must have been wrong when reading the entry log files.  I'm not sure why this problem only shows up a small percentage of the time.

Could you let me know which files still have the comments preceding new methods?  I thought I took this out already in the modified files but I must have missed some.

With regards to the documentation changes, do you generally modify the xml content document files, use forest to generate the html and pdf files and then check in all 3 changes in the jira attachment?  Or is it just the xml changes you submit via the jira?  Can I submit another file attachment to this same jira for the documentation changes?  Can you merge the changes there with the code changes in the already submitted patch?  Thanks!

> Need procedure to garbage collect ledgers
> -----------------------------------------
>
>                 Key: ZOOKEEPER-464
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-464
>             Project: Zookeeper
>          Issue Type: New Feature
>          Components: contrib-bookkeeper
>            Reporter: Flavio Paiva Junqueira
>            Assignee: Erwin Tam
>             Fix For: 3.4.0
>
>         Attachments: zookeeper-464-log.txt, ZOOKEEPER-464.patch, ZOOKEEPER-464.patch
>
>
> An application using BookKeeper is likely to use a large number of ledgers over time. Such an application might not need all ledgers created over time and might want to delete some of these ledgers to free up some space on bookies. The idea of this jira is to implement a procedure that enables an application to garbage-collect unwanted ledgers.
> To garbage-collect a ledger, we need to delete the ledger metadata on ZooKeeper, and delete the ledger data on corresponding bookies. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (ZOOKEEPER-464) Need procedure to garbage collect ledgers

Posted by "Erwin Tam (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erwin Tam updated ZOOKEEPER-464:
--------------------------------

    Status: Open  (was: Patch Available)

> Need procedure to garbage collect ledgers
> -----------------------------------------
>
>                 Key: ZOOKEEPER-464
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-464
>             Project: Zookeeper
>          Issue Type: New Feature
>          Components: contrib-bookkeeper
>            Reporter: Flavio Paiva Junqueira
>            Assignee: Erwin Tam
>             Fix For: 3.4.0
>
>         Attachments: zookeeper-464-log.txt, ZOOKEEPER-464.patch, ZOOKEEPER-464.patch, ZOOKEEPER-464.patch, ZOOKEEPER-464.patch
>
>
> An application using BookKeeper is likely to use a large number of ledgers over time. Such an application might not need all ledgers created over time and might want to delete some of these ledgers to free up some space on bookies. The idea of this jira is to implement a procedure that enables an application to garbage-collect unwanted ledgers.
> To garbage-collect a ledger, we need to delete the ledger metadata on ZooKeeper, and delete the ledger data on corresponding bookies. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (ZOOKEEPER-464) Need procedure to garbage collect ledgers

Posted by "Flavio Paiva Junqueira (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12860970#action_12860970 ] 

Flavio Paiva Junqueira commented on ZOOKEEPER-464:
--------------------------------------------------

Hi Erwin, Sorry about the delay to review it. I think this patch is pretty much ready to get in. I have just a couple of tiny requests:

# I noticed that the two commands you modified in the documentation (bookkeeperStated.xml and bookkeeperConfig.xml) are going over the paper margin for the generated pdfs. Could you please break up the lines?
# In LedgerDeleteTest, you're using "//" for multi-line comments. I believe we have been using "/**/" instead. This is just to keep it consistent with comments in other parts of the code.

Otherwise, it looks good.  If you fix these two things, then I think we will be able to get it in shortly after.

> Need procedure to garbage collect ledgers
> -----------------------------------------
>
>                 Key: ZOOKEEPER-464
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-464
>             Project: Zookeeper
>          Issue Type: New Feature
>          Components: contrib-bookkeeper
>            Reporter: Flavio Paiva Junqueira
>            Assignee: Erwin Tam
>             Fix For: 3.4.0
>
>         Attachments: zookeeper-464-log.txt, ZOOKEEPER-464.patch, ZOOKEEPER-464.patch, ZOOKEEPER-464.patch
>
>
> An application using BookKeeper is likely to use a large number of ledgers over time. Such an application might not need all ledgers created over time and might want to delete some of these ledgers to free up some space on bookies. The idea of this jira is to implement a procedure that enables an application to garbage-collect unwanted ledgers.
> To garbage-collect a ledger, we need to delete the ledger metadata on ZooKeeper, and delete the ledger data on corresponding bookies. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (ZOOKEEPER-464) Need procedure to garbage collect ledgers

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12848950#action_12848950 ] 

Hadoop QA commented on ZOOKEEPER-464:
-------------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12439615/ZOOKEEPER-464.patch
  against trunk revision 925362.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 6 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    -1 release audit.  The applied patch generated 26 release audit warnings (more than the trunk's current 24 warnings).

    +1 core tests.  The patch passed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/37/testReport/
Release audit warnings: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/37/artifact/trunk/patchprocess/releaseAuditDiffWarnings.txt
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/37/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/37/console

This message is automatically generated.

> Need procedure to garbage collect ledgers
> -----------------------------------------
>
>                 Key: ZOOKEEPER-464
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-464
>             Project: Zookeeper
>          Issue Type: New Feature
>          Components: contrib-bookkeeper
>            Reporter: Flavio Paiva Junqueira
>            Assignee: Erwin Tam
>         Attachments: ZOOKEEPER-464.patch
>
>
> An application using BookKeeper is likely to use a large number of ledgers over time. Such an application might not need all ledgers created over time and might want to delete some of these ledgers to free up some space on bookies. The idea of this jira is to implement a procedure that enables an application to garbage-collect unwanted ledgers.
> To garbage-collect a ledger, we need to delete the ledger metadata on ZooKeeper, and delete the ledger data on corresponding bookies. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (ZOOKEEPER-464) Need procedure to garbage collect ledgers

Posted by "Erwin Tam (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erwin Tam updated ZOOKEEPER-464:
--------------------------------

    Status: Patch Available  (was: Open)

> Need procedure to garbage collect ledgers
> -----------------------------------------
>
>                 Key: ZOOKEEPER-464
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-464
>             Project: Zookeeper
>          Issue Type: New Feature
>          Components: contrib-bookkeeper
>            Reporter: Flavio Paiva Junqueira
>            Assignee: Erwin Tam
>         Attachments: ZOOKEEPER-464.patch
>
>
> An application using BookKeeper is likely to use a large number of ledgers over time. Such an application might not need all ledgers created over time and might want to delete some of these ledgers to free up some space on bookies. The idea of this jira is to implement a procedure that enables an application to garbage-collect unwanted ledgers.
> To garbage-collect a ledger, we need to delete the ledger metadata on ZooKeeper, and delete the ledger data on corresponding bookies. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (ZOOKEEPER-464) Need procedure to garbage collect ledgers

Posted by "Erwin Tam (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erwin Tam updated ZOOKEEPER-464:
--------------------------------

    Status: Patch Available  (was: Open)

Uploaded a new patch with the formatting changes.  Note that the generated pdf's previously already had the issue with a line going beyond the paper margin.  I modified this but let me know if it isn't done correctly.

> Need procedure to garbage collect ledgers
> -----------------------------------------
>
>                 Key: ZOOKEEPER-464
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-464
>             Project: Zookeeper
>          Issue Type: New Feature
>          Components: contrib-bookkeeper
>            Reporter: Flavio Paiva Junqueira
>            Assignee: Erwin Tam
>             Fix For: 3.4.0
>
>         Attachments: zookeeper-464-log.txt, ZOOKEEPER-464.patch, ZOOKEEPER-464.patch, ZOOKEEPER-464.patch, ZOOKEEPER-464.patch
>
>
> An application using BookKeeper is likely to use a large number of ledgers over time. Such an application might not need all ledgers created over time and might want to delete some of these ledgers to free up some space on bookies. The idea of this jira is to implement a procedure that enables an application to garbage-collect unwanted ledgers.
> To garbage-collect a ledger, we need to delete the ledger metadata on ZooKeeper, and delete the ledger data on corresponding bookies. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (ZOOKEEPER-464) Need procedure to garbage collect ledgers

Posted by "Flavio Paiva Junqueira (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Flavio Paiva Junqueira updated ZOOKEEPER-464:
---------------------------------------------

        Status: Resolved  (was: Patch Available)
    Resolution: Fixed

Committed revision 944003. Thanks, Erwin!

> Need procedure to garbage collect ledgers
> -----------------------------------------
>
>                 Key: ZOOKEEPER-464
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-464
>             Project: Zookeeper
>          Issue Type: New Feature
>          Components: contrib-bookkeeper
>            Reporter: Flavio Paiva Junqueira
>            Assignee: Erwin Tam
>             Fix For: 3.4.0
>
>         Attachments: zookeeper-464-log.txt, ZOOKEEPER-464.patch, ZOOKEEPER-464.patch, ZOOKEEPER-464.patch, ZOOKEEPER-464.patch
>
>
> An application using BookKeeper is likely to use a large number of ledgers over time. Such an application might not need all ledgers created over time and might want to delete some of these ledgers to free up some space on bookies. The idea of this jira is to implement a procedure that enables an application to garbage-collect unwanted ledgers.
> To garbage-collect a ledger, we need to delete the ledger metadata on ZooKeeper, and delete the ledger data on corresponding bookies. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (ZOOKEEPER-464) Need procedure to garbage collect ledgers

Posted by "Erwin Tam (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erwin Tam updated ZOOKEEPER-464:
--------------------------------

    Attachment: ZOOKEEPER-464.patch

Updated patch with documentation changes and fix so the unit tests for ledger deletes won't error out intermittently. The problem was a bug in the EntryLogger where the "BKLO" ByteBuffer header that is prepended to each entry log was defined as a static object.  If multiple Bookies within the JVM are running, a race condition can occur if they are rolling over entry logs and accessing this header at the same time.  The end result is, some entry logs are created without the header and that screws the entry log ledger extraction.

All new methods should have proper javadoc comments before them now too.

> Need procedure to garbage collect ledgers
> -----------------------------------------
>
>                 Key: ZOOKEEPER-464
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-464
>             Project: Zookeeper
>          Issue Type: New Feature
>          Components: contrib-bookkeeper
>            Reporter: Flavio Paiva Junqueira
>            Assignee: Erwin Tam
>             Fix For: 3.4.0
>
>         Attachments: zookeeper-464-log.txt, ZOOKEEPER-464.patch, ZOOKEEPER-464.patch, ZOOKEEPER-464.patch
>
>
> An application using BookKeeper is likely to use a large number of ledgers over time. Such an application might not need all ledgers created over time and might want to delete some of these ledgers to free up some space on bookies. The idea of this jira is to implement a procedure that enables an application to garbage-collect unwanted ledgers.
> To garbage-collect a ledger, we need to delete the ledger metadata on ZooKeeper, and delete the ledger data on corresponding bookies. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (ZOOKEEPER-464) Need procedure to garbage collect ledgers

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12853547#action_12853547 ] 

Hadoop QA commented on ZOOKEEPER-464:
-------------------------------------

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12440653/ZOOKEEPER-464.patch
  against trunk revision 929564.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 6 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/47/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/47/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/47/console

This message is automatically generated.

> Need procedure to garbage collect ledgers
> -----------------------------------------
>
>                 Key: ZOOKEEPER-464
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-464
>             Project: Zookeeper
>          Issue Type: New Feature
>          Components: contrib-bookkeeper
>            Reporter: Flavio Paiva Junqueira
>            Assignee: Erwin Tam
>             Fix For: 3.4.0
>
>         Attachments: ZOOKEEPER-464.patch, ZOOKEEPER-464.patch
>
>
> An application using BookKeeper is likely to use a large number of ledgers over time. Such an application might not need all ledgers created over time and might want to delete some of these ledgers to free up some space on bookies. The idea of this jira is to implement a procedure that enables an application to garbage-collect unwanted ledgers.
> To garbage-collect a ledger, we need to delete the ledger metadata on ZooKeeper, and delete the ledger data on corresponding bookies. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (ZOOKEEPER-464) Need procedure to garbage collect ledgers

Posted by "Erwin Tam (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erwin Tam updated ZOOKEEPER-464:
--------------------------------

    Status: Patch Available  (was: Open)

Latest version of the patch with documentation changes and fixes for some intermittent junit test errors.

> Need procedure to garbage collect ledgers
> -----------------------------------------
>
>                 Key: ZOOKEEPER-464
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-464
>             Project: Zookeeper
>          Issue Type: New Feature
>          Components: contrib-bookkeeper
>            Reporter: Flavio Paiva Junqueira
>            Assignee: Erwin Tam
>             Fix For: 3.4.0
>
>         Attachments: zookeeper-464-log.txt, ZOOKEEPER-464.patch, ZOOKEEPER-464.patch, ZOOKEEPER-464.patch
>
>
> An application using BookKeeper is likely to use a large number of ledgers over time. Such an application might not need all ledgers created over time and might want to delete some of these ledgers to free up some space on bookies. The idea of this jira is to implement a procedure that enables an application to garbage-collect unwanted ledgers.
> To garbage-collect a ledger, we need to delete the ledger metadata on ZooKeeper, and delete the ledger data on corresponding bookies. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (ZOOKEEPER-464) Need procedure to garbage collect ledgers

Posted by "Patrick Hunt (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Patrick Hunt updated ZOOKEEPER-464:
-----------------------------------

    Fix Version/s: 3.4.0

> Need procedure to garbage collect ledgers
> -----------------------------------------
>
>                 Key: ZOOKEEPER-464
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-464
>             Project: Zookeeper
>          Issue Type: New Feature
>          Components: contrib-bookkeeper
>            Reporter: Flavio Paiva Junqueira
>            Assignee: Erwin Tam
>             Fix For: 3.4.0
>
>         Attachments: ZOOKEEPER-464.patch
>
>
> An application using BookKeeper is likely to use a large number of ledgers over time. Such an application might not need all ledgers created over time and might want to delete some of these ledgers to free up some space on bookies. The idea of this jira is to implement a procedure that enables an application to garbage-collect unwanted ledgers.
> To garbage-collect a ledger, we need to delete the ledger metadata on ZooKeeper, and delete the ledger data on corresponding bookies. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (ZOOKEEPER-464) Need procedure to garbage collect ledgers

Posted by "Erwin Tam (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12855183#action_12855183 ] 

Erwin Tam commented on ZOOKEEPER-464:
-------------------------------------

There was an intermittent problem with the ledger delete junit tests prior to the last patch uploaded (which resolved it).  I'll document the bug and the fix for that here.

The ledger delete junit tests were failing intermittently and it was related to an issue I saw earlier when I was running unit tests with a very small entry log limit size (2K).  When the entry logs roll over, we create a new one by first writing the "BKLO" 1024 byte header to the beginning of the file.  The problem is, this byte buffer object is statically defined.  In our junit tests, we have multiple Bookie servers (and thus EntryLogger instances) in the same jvm.  If more than one EntryLogger is rolling over its current log and writing the next one, they are accessing the same entryLog file header buffer.  This creates problems since the static header isn't accessed in a synchronized way.  

This header byte buffer is cleared first before writing it to the log file. Since it is static, one thread could clear it first, then another thread (from a second Bookie server) clears it at the same time. The first thread writes the header but when it is done, the header's byte buffer's internal pointers have it pointing to the end and aren't reset.  The second thread will then be reading the header buffer that has not been cleared/reset.  What ends up happening is the entry logs in the second Bookie are created without the header.  When we're reading through those files later on to figure out which ledgers make it up, it'll read incorrect values and try to allocate byte buffers based on an incorrect length segment (basically reading in junk random bytes).  This creates the java heap space error.

The fix is simple and is to just make this logfile header a non-static variable, initializing it in the EntryLogger constructor.  In practice, we shouldn't be running multiple Bookies within the same jvm so we wouldn't run into this problem.

> Need procedure to garbage collect ledgers
> -----------------------------------------
>
>                 Key: ZOOKEEPER-464
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-464
>             Project: Zookeeper
>          Issue Type: New Feature
>          Components: contrib-bookkeeper
>            Reporter: Flavio Paiva Junqueira
>            Assignee: Erwin Tam
>             Fix For: 3.4.0
>
>         Attachments: zookeeper-464-log.txt, ZOOKEEPER-464.patch, ZOOKEEPER-464.patch, ZOOKEEPER-464.patch
>
>
> An application using BookKeeper is likely to use a large number of ledgers over time. Such an application might not need all ledgers created over time and might want to delete some of these ledgers to free up some space on bookies. The idea of this jira is to implement a procedure that enables an application to garbage-collect unwanted ledgers.
> To garbage-collect a ledger, we need to delete the ledger metadata on ZooKeeper, and delete the ledger data on corresponding bookies. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (ZOOKEEPER-464) Need procedure to garbage collect ledgers

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12855005#action_12855005 ] 

Hadoop QA commented on ZOOKEEPER-464:
-------------------------------------

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12441169/ZOOKEEPER-464.patch
  against trunk revision 929564.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 6 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/49/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/49/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: http://hudson.zones.apache.org/hudson/job/Zookeeper-Patch-h1.grid.sp2.yahoo.net/49/console

This message is automatically generated.

> Need procedure to garbage collect ledgers
> -----------------------------------------
>
>                 Key: ZOOKEEPER-464
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-464
>             Project: Zookeeper
>          Issue Type: New Feature
>          Components: contrib-bookkeeper
>            Reporter: Flavio Paiva Junqueira
>            Assignee: Erwin Tam
>             Fix For: 3.4.0
>
>         Attachments: zookeeper-464-log.txt, ZOOKEEPER-464.patch, ZOOKEEPER-464.patch, ZOOKEEPER-464.patch
>
>
> An application using BookKeeper is likely to use a large number of ledgers over time. Such an application might not need all ledgers created over time and might want to delete some of these ledgers to free up some space on bookies. The idea of this jira is to implement a procedure that enables an application to garbage-collect unwanted ledgers.
> To garbage-collect a ledger, we need to delete the ledger metadata on ZooKeeper, and delete the ledger data on corresponding bookies. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (ZOOKEEPER-464) Need procedure to garbage collect ledgers

Posted by "Erwin Tam (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erwin Tam updated ZOOKEEPER-464:
--------------------------------

    Attachment: ZOOKEEPER-464.patch

Revised patch for the entry log garbage collecting feature.  Description on a high level for how this works is included here. Only the major files modified will be listed.

1. Client changes:
BookKeeper.java
LedgerDeleteOp.java
AsyncCallback.java

Added BK client methods to delete a ledger, both synchronously and asynchronously.  Deleting a ledger from the client side is to just remove the ZK ledger metadata node for it (in the /ledgers/L<ledger id> path).

2. Bookie Server changes:
EntryLogger.java
LedgerCache.java
Bookie.java

BK servers now are also ZK clients so they can query ZK for the existing ledger nodes to see what the current active set of ledgers are.  Bookies when initialized will create the ZK client instance and register an ephemeral node in ZK for the server.  

EntryLogger when initialized will try to extract all of the ledgers that make up the already existing entry logs (if any). We do not extract ledgers from the current active entry log that we are writing into since this will never be deleted.  When entry logs roll over, that's when we will read through old entry logs and extract the set of ledgers that make up all of the entries in the entry log (if it hasn't been done already).  This data is stored in memory as a mapping from entry log ID's to the set of ledger ID's that comprise them.  

LedgerCache when initialized will read through all of the existing ledger index files (if any) and store them in memory as the set of active ledgers that the Bookie Server knows about.  When a new ledger index file is created (new ledger), we will add that to this in memory set mapping.

The EntryLogger contains the Garbage Collector thread which runs periodically.  This first syncs with ZK, then reads the set of current active ledgers.  It compares this to the Bookie Server's LedgerCache's set of active ledgers that the Bookie Server knows about.  If there are any in that set but not in ZK, these are removed.  Then we loop though all of the older entry logs (other than the one we're writing into), see which ledgers make the entry log.  If any of those ledgers are deleted, we remove it from the entry log's ledgers set.  If any of these entry logs have no more active ledgers associated with them, then we delete the entry log.

3. Unit tests:
LedgerDeleteTest.java
New unit test to test out entry logs being garbage collected. This test will write enough ledger entries to roll over the first entry log.  Then the ledgers are deleted from the client side and the BK server's garbage collecting thread will pick up these changes and delete the entry log file(s).  Two variations of this test for the initial case when the BK servers start up with no existing entry logs and a second where entry logs already exist.

> Need procedure to garbage collect ledgers
> -----------------------------------------
>
>                 Key: ZOOKEEPER-464
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-464
>             Project: Zookeeper
>          Issue Type: New Feature
>          Components: contrib-bookkeeper
>            Reporter: Flavio Paiva Junqueira
>            Assignee: Erwin Tam
>             Fix For: 3.4.0
>
>         Attachments: ZOOKEEPER-464.patch
>
>
> An application using BookKeeper is likely to use a large number of ledgers over time. Such an application might not need all ledgers created over time and might want to delete some of these ledgers to free up some space on bookies. The idea of this jira is to implement a procedure that enables an application to garbage-collect unwanted ledgers.
> To garbage-collect a ledger, we need to delete the ledger metadata on ZooKeeper, and delete the ledger data on corresponding bookies. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (ZOOKEEPER-464) Need procedure to garbage collect ledgers

Posted by "Erwin Tam (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erwin Tam updated ZOOKEEPER-464:
--------------------------------

    Status: Patch Available  (was: Open)

> Need procedure to garbage collect ledgers
> -----------------------------------------
>
>                 Key: ZOOKEEPER-464
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-464
>             Project: Zookeeper
>          Issue Type: New Feature
>          Components: contrib-bookkeeper
>            Reporter: Flavio Paiva Junqueira
>            Assignee: Erwin Tam
>             Fix For: 3.4.0
>
>         Attachments: ZOOKEEPER-464.patch, ZOOKEEPER-464.patch
>
>
> An application using BookKeeper is likely to use a large number of ledgers over time. Such an application might not need all ledgers created over time and might want to delete some of these ledgers to free up some space on bookies. The idea of this jira is to implement a procedure that enables an application to garbage-collect unwanted ledgers.
> To garbage-collect a ledger, we need to delete the ledger metadata on ZooKeeper, and delete the ledger data on corresponding bookies. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (ZOOKEEPER-464) Need procedure to garbage collect ledgers

Posted by "Erwin Tam (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erwin Tam updated ZOOKEEPER-464:
--------------------------------

    Attachment: ZOOKEEPER-464.patch

Uploaded a patch for BK ledger deletes.  Could use more unit tests perhaps.  Known issues are with using the BufferedChannel currently when reading in the ledgers that make up the entry log files.  If the BufferedChannel is created for an already existing entry log file we want to read from (but not write to), it results in an infinite loop.

> Need procedure to garbage collect ledgers
> -----------------------------------------
>
>                 Key: ZOOKEEPER-464
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-464
>             Project: Zookeeper
>          Issue Type: New Feature
>          Components: contrib-bookkeeper
>            Reporter: Flavio Paiva Junqueira
>            Assignee: Erwin Tam
>         Attachments: ZOOKEEPER-464.patch
>
>
> An application using BookKeeper is likely to use a large number of ledgers over time. Such an application might not need all ledgers created over time and might want to delete some of these ledgers to free up some space on bookies. The idea of this jira is to implement a procedure that enables an application to garbage-collect unwanted ledgers.
> To garbage-collect a ledger, we need to delete the ledger metadata on ZooKeeper, and delete the ledger data on corresponding bookies. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (ZOOKEEPER-464) Need procedure to garbage collect ledgers

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12874194#action_12874194 ] 

Hudson commented on ZOOKEEPER-464:
----------------------------------

Integrated in ZooKeeper-trunk #831 (See [http://hudson.zones.apache.org/hudson/job/ZooKeeper-trunk/831/])
    

> Need procedure to garbage collect ledgers
> -----------------------------------------
>
>                 Key: ZOOKEEPER-464
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-464
>             Project: Zookeeper
>          Issue Type: New Feature
>          Components: contrib-bookkeeper
>            Reporter: Flavio Paiva Junqueira
>            Assignee: Erwin Tam
>             Fix For: 3.4.0
>
>         Attachments: zookeeper-464-log.txt, ZOOKEEPER-464.patch, ZOOKEEPER-464.patch, ZOOKEEPER-464.patch, ZOOKEEPER-464.patch
>
>
> An application using BookKeeper is likely to use a large number of ledgers over time. Such an application might not need all ledgers created over time and might want to delete some of these ledgers to free up some space on bookies. The idea of this jira is to implement a procedure that enables an application to garbage-collect unwanted ledgers.
> To garbage-collect a ledger, we need to delete the ledger metadata on ZooKeeper, and delete the ledger data on corresponding bookies. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (ZOOKEEPER-464) Need procedure to garbage collect ledgers

Posted by "Flavio Paiva Junqueira (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12867106#action_12867106 ] 

Flavio Paiva Junqueira commented on ZOOKEEPER-464:
--------------------------------------------------

+1, this patch is ready to get in. Great job, Erwin! I'll be committing it shortly.

> Need procedure to garbage collect ledgers
> -----------------------------------------
>
>                 Key: ZOOKEEPER-464
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-464
>             Project: Zookeeper
>          Issue Type: New Feature
>          Components: contrib-bookkeeper
>            Reporter: Flavio Paiva Junqueira
>            Assignee: Erwin Tam
>             Fix For: 3.4.0
>
>         Attachments: zookeeper-464-log.txt, ZOOKEEPER-464.patch, ZOOKEEPER-464.patch, ZOOKEEPER-464.patch, ZOOKEEPER-464.patch
>
>
> An application using BookKeeper is likely to use a large number of ledgers over time. Such an application might not need all ledgers created over time and might want to delete some of these ledgers to free up some space on bookies. The idea of this jira is to implement a procedure that enables an application to garbage-collect unwanted ledgers.
> To garbage-collect a ledger, we need to delete the ledger metadata on ZooKeeper, and delete the ledger data on corresponding bookies. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (ZOOKEEPER-464) Need procedure to garbage collect ledgers

Posted by "Erwin Tam (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erwin Tam updated ZOOKEEPER-464:
--------------------------------

    Attachment:     (was: ZOOKEEPER-464.patch)

> Need procedure to garbage collect ledgers
> -----------------------------------------
>
>                 Key: ZOOKEEPER-464
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-464
>             Project: Zookeeper
>          Issue Type: New Feature
>          Components: contrib-bookkeeper
>            Reporter: Flavio Paiva Junqueira
>            Assignee: Erwin Tam
>             Fix For: 3.4.0
>
>
> An application using BookKeeper is likely to use a large number of ledgers over time. Such an application might not need all ledgers created over time and might want to delete some of these ledgers to free up some space on bookies. The idea of this jira is to implement a procedure that enables an application to garbage-collect unwanted ledgers.
> To garbage-collect a ledger, we need to delete the ledger metadata on ZooKeeper, and delete the ledger data on corresponding bookies. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (ZOOKEEPER-464) Need procedure to garbage collect ledgers

Posted by "Erwin Tam (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12855158#action_12855158 ] 

Erwin Tam commented on ZOOKEEPER-464:
-------------------------------------

Flavio, my comments a couple of entries above talks about the EntryLogger's "BKLO" header that was causing the problem.  That was creating the issue in the junit test making it fail intermittently.  I sent you and Ben an email about this fix too in more detail but we could talk about it more if you want. Thanks!

> Need procedure to garbage collect ledgers
> -----------------------------------------
>
>                 Key: ZOOKEEPER-464
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-464
>             Project: Zookeeper
>          Issue Type: New Feature
>          Components: contrib-bookkeeper
>            Reporter: Flavio Paiva Junqueira
>            Assignee: Erwin Tam
>             Fix For: 3.4.0
>
>         Attachments: zookeeper-464-log.txt, ZOOKEEPER-464.patch, ZOOKEEPER-464.patch, ZOOKEEPER-464.patch
>
>
> An application using BookKeeper is likely to use a large number of ledgers over time. Such an application might not need all ledgers created over time and might want to delete some of these ledgers to free up some space on bookies. The idea of this jira is to implement a procedure that enables an application to garbage-collect unwanted ledgers.
> To garbage-collect a ledger, we need to delete the ledger metadata on ZooKeeper, and delete the ledger data on corresponding bookies. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (ZOOKEEPER-464) Need procedure to garbage collect ledgers

Posted by "Flavio Paiva Junqueira (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12867431#action_12867431 ] 

Flavio Paiva Junqueira commented on ZOOKEEPER-464:
--------------------------------------------------

Erwin pointed out that I forgot to add two new files, so here they are: Committed revision 944140.

> Need procedure to garbage collect ledgers
> -----------------------------------------
>
>                 Key: ZOOKEEPER-464
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-464
>             Project: Zookeeper
>          Issue Type: New Feature
>          Components: contrib-bookkeeper
>            Reporter: Flavio Paiva Junqueira
>            Assignee: Erwin Tam
>             Fix For: 3.4.0
>
>         Attachments: zookeeper-464-log.txt, ZOOKEEPER-464.patch, ZOOKEEPER-464.patch, ZOOKEEPER-464.patch, ZOOKEEPER-464.patch
>
>
> An application using BookKeeper is likely to use a large number of ledgers over time. Such an application might not need all ledgers created over time and might want to delete some of these ledgers to free up some space on bookies. The idea of this jira is to implement a procedure that enables an application to garbage-collect unwanted ledgers.
> To garbage-collect a ledger, we need to delete the ledger metadata on ZooKeeper, and delete the ledger data on corresponding bookies. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (ZOOKEEPER-464) Need procedure to garbage collect ledgers

Posted by "Erwin Tam (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erwin Tam updated ZOOKEEPER-464:
--------------------------------

    Attachment: ZOOKEEPER-464.patch

Updated patch incorporating the feedback from Flavio.  Note that the documentation changes haven't been done yet here.  I might open a separate Jira for that once I figure out the forrest xml stuff.

> Need procedure to garbage collect ledgers
> -----------------------------------------
>
>                 Key: ZOOKEEPER-464
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-464
>             Project: Zookeeper
>          Issue Type: New Feature
>          Components: contrib-bookkeeper
>            Reporter: Flavio Paiva Junqueira
>            Assignee: Erwin Tam
>             Fix For: 3.4.0
>
>         Attachments: ZOOKEEPER-464.patch, ZOOKEEPER-464.patch
>
>
> An application using BookKeeper is likely to use a large number of ledgers over time. Such an application might not need all ledgers created over time and might want to delete some of these ledgers to free up some space on bookies. The idea of this jira is to implement a procedure that enables an application to garbage-collect unwanted ledgers.
> To garbage-collect a ledger, we need to delete the ledger metadata on ZooKeeper, and delete the ledger data on corresponding bookies. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (ZOOKEEPER-464) Need procedure to garbage collect ledgers

Posted by "Erwin Tam (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erwin Tam updated ZOOKEEPER-464:
--------------------------------

    Attachment: ZOOKEEPER-464.patch

Updated patch with formatting changes requested.

> Need procedure to garbage collect ledgers
> -----------------------------------------
>
>                 Key: ZOOKEEPER-464
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-464
>             Project: Zookeeper
>          Issue Type: New Feature
>          Components: contrib-bookkeeper
>            Reporter: Flavio Paiva Junqueira
>            Assignee: Erwin Tam
>             Fix For: 3.4.0
>
>         Attachments: zookeeper-464-log.txt, ZOOKEEPER-464.patch, ZOOKEEPER-464.patch, ZOOKEEPER-464.patch, ZOOKEEPER-464.patch
>
>
> An application using BookKeeper is likely to use a large number of ledgers over time. Such an application might not need all ledgers created over time and might want to delete some of these ledgers to free up some space on bookies. The idea of this jira is to implement a procedure that enables an application to garbage-collect unwanted ledgers.
> To garbage-collect a ledger, we need to delete the ledger metadata on ZooKeeper, and delete the ledger data on corresponding bookies. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (ZOOKEEPER-464) Need procedure to garbage collect ledgers

Posted by "Flavio Paiva Junqueira (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12855117#action_12855117 ] 

Flavio Paiva Junqueira commented on ZOOKEEPER-464:
--------------------------------------------------

Thanks for the new patch, Erwin. Could you please tell me how you've fixed the issue with the unit test?

> Need procedure to garbage collect ledgers
> -----------------------------------------
>
>                 Key: ZOOKEEPER-464
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-464
>             Project: Zookeeper
>          Issue Type: New Feature
>          Components: contrib-bookkeeper
>            Reporter: Flavio Paiva Junqueira
>            Assignee: Erwin Tam
>             Fix For: 3.4.0
>
>         Attachments: zookeeper-464-log.txt, ZOOKEEPER-464.patch, ZOOKEEPER-464.patch, ZOOKEEPER-464.patch
>
>
> An application using BookKeeper is likely to use a large number of ledgers over time. Such an application might not need all ledgers created over time and might want to delete some of these ledgers to free up some space on bookies. The idea of this jira is to implement a procedure that enables an application to garbage-collect unwanted ledgers.
> To garbage-collect a ledger, we need to delete the ledger metadata on ZooKeeper, and delete the ledger data on corresponding bookies. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (ZOOKEEPER-464) Need procedure to garbage collect ledgers

Posted by "Erwin Tam (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12852527#action_12852527 ] 

Erwin Tam commented on ZOOKEEPER-464:
-------------------------------------

Thanks Flavio for the feedback!  I'll make the changes you suggested and will upload a new patch (without deleting the old one).  I'll bug Ben about the forrest xml documentation if I can't figure it out.

> Need procedure to garbage collect ledgers
> -----------------------------------------
>
>                 Key: ZOOKEEPER-464
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-464
>             Project: Zookeeper
>          Issue Type: New Feature
>          Components: contrib-bookkeeper
>            Reporter: Flavio Paiva Junqueira
>            Assignee: Erwin Tam
>             Fix For: 3.4.0
>
>         Attachments: ZOOKEEPER-464.patch
>
>
> An application using BookKeeper is likely to use a large number of ledgers over time. Such an application might not need all ledgers created over time and might want to delete some of these ledgers to free up some space on bookies. The idea of this jira is to implement a procedure that enables an application to garbage-collect unwanted ledgers.
> To garbage-collect a ledger, we need to delete the ledger metadata on ZooKeeper, and delete the ledger data on corresponding bookies. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (ZOOKEEPER-464) Need procedure to garbage collect ledgers

Posted by "Flavio Paiva Junqueira (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Flavio Paiva Junqueira updated ZOOKEEPER-464:
---------------------------------------------

    Status: Open  (was: Patch Available)

Hi Erwin, It looks mostly fine. Here are a few suggestions and requests:

* When you upload a new patch, please don't delete the previous one. Also, I can't remember if we cancelled and re-summitted, but every time we submit a new version of the patch, we need to cancel the previous patch in the workflow, upload the new patch, and then submit again. This is to make sure that Hudson tests the patch.

* For documentation, the source files are under src/docs/src/documentation/content/xdocs. They are xml files that we compile with forrest to generate the documentation we have on the Web. If you can't figure out how to change the doc files and compile, then please open a jira and list the necessary modifications. It took me a while to get used to forrest, so don't worry too much if you can get it to work.
 
* In LedgerDeleteTest, I would suggest to replace this excerpt of code:

{noformat}
if (f.isFile() && f.getName().equals("0.log")) {
    LOG.error("Found the entry log file (0.log) that should have been deleted in ledgerDirectory: "
                            + ledgerDirectory);
    // This test has failed since we have found the entry Log
    // file that should have been deleted.
    assertTrue(false);
}
{noformat}

with:

{noformat}
assertFalse(f.isFile() && f.getName().equals("0.log"), "Found the entry log file (0.log) that should have been deleted in ledgerDirectory: " + ledgerDirectory);
{noformat}

I believe it is more compact and does the same, doesn't it?

* I'd like to ask you to add javadocs to the new methods. I've seen several new methods that have comments right before the method declaration using "//". Also, please review the text of the methods as you change the syntax to javadoc as some of them do not parse correctly.  
* Doesn't access to the following data structures in LedgerCache.deleteLedger() need to be synchronized:
{noformat}	
fileInfoCache
cleanLedgers
dirtyLedgers
openLedgers
{noformat}

> Need procedure to garbage collect ledgers
> -----------------------------------------
>
>                 Key: ZOOKEEPER-464
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-464
>             Project: Zookeeper
>          Issue Type: New Feature
>          Components: contrib-bookkeeper
>            Reporter: Flavio Paiva Junqueira
>            Assignee: Erwin Tam
>             Fix For: 3.4.0
>
>         Attachments: ZOOKEEPER-464.patch
>
>
> An application using BookKeeper is likely to use a large number of ledgers over time. Such an application might not need all ledgers created over time and might want to delete some of these ledgers to free up some space on bookies. The idea of this jira is to implement a procedure that enables an application to garbage-collect unwanted ledgers.
> To garbage-collect a ledger, we need to delete the ledger metadata on ZooKeeper, and delete the ledger data on corresponding bookies. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (ZOOKEEPER-464) Need procedure to garbage collect ledgers

Posted by "Erwin Tam (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erwin Tam updated ZOOKEEPER-464:
--------------------------------

    Status: Open  (was: Patch Available)

> Need procedure to garbage collect ledgers
> -----------------------------------------
>
>                 Key: ZOOKEEPER-464
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-464
>             Project: Zookeeper
>          Issue Type: New Feature
>          Components: contrib-bookkeeper
>            Reporter: Flavio Paiva Junqueira
>            Assignee: Erwin Tam
>             Fix For: 3.4.0
>
>         Attachments: zookeeper-464-log.txt, ZOOKEEPER-464.patch, ZOOKEEPER-464.patch
>
>
> An application using BookKeeper is likely to use a large number of ledgers over time. Such an application might not need all ledgers created over time and might want to delete some of these ledgers to free up some space on bookies. The idea of this jira is to implement a procedure that enables an application to garbage-collect unwanted ledgers.
> To garbage-collect a ledger, we need to delete the ledger metadata on ZooKeeper, and delete the ledger data on corresponding bookies. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (ZOOKEEPER-464) Need procedure to garbage collect ledgers

Posted by "Flavio Paiva Junqueira (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12849113#action_12849113 ] 

Flavio Paiva Junqueira commented on ZOOKEEPER-464:
--------------------------------------------------

This is great, Erwin! A few high-level comments:

# Could you summarize the changes of this patch? For example, it would be nice to summarize how the garbage collector behaves with respect to removing entry logs;
# The license text is missing in LedgerDeleteTest.java. That's why your patch got a -1 in release audits;
#  Could you detail the issue with BufferedChannel? Where is the problem exactly? Why can't we fix it?
# We need documentation for these features. If we don't do it in this patch, then we need to open another jira and make sure that the documentation makes it to the same release as this patch. Right now it sounds like this feature will go into 3.4.0.

> Need procedure to garbage collect ledgers
> -----------------------------------------
>
>                 Key: ZOOKEEPER-464
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-464
>             Project: Zookeeper
>          Issue Type: New Feature
>          Components: contrib-bookkeeper
>            Reporter: Flavio Paiva Junqueira
>            Assignee: Erwin Tam
>             Fix For: 3.4.0
>
>         Attachments: ZOOKEEPER-464.patch
>
>
> An application using BookKeeper is likely to use a large number of ledgers over time. Such an application might not need all ledgers created over time and might want to delete some of these ledgers to free up some space on bookies. The idea of this jira is to implement a procedure that enables an application to garbage-collect unwanted ledgers.
> To garbage-collect a ledger, we need to delete the ledger metadata on ZooKeeper, and delete the ledger data on corresponding bookies. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (ZOOKEEPER-464) Need procedure to garbage collect ledgers

Posted by "Erwin Tam (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12849421#action_12849421 ] 

Erwin Tam commented on ZOOKEEPER-464:
-------------------------------------

Thanks for the feedback Flavio!
1. Where should I put the summary of the behavior?  What about documentation of this feature?  It does require changes to how you set up Bookie Servers (need to pass in the ZK servers list).  Should I put that in comments for this Jira?  Should I upload a text documentation too to describe the feature functionality?
2. I'll add the missing license text.  Forgot that I need that for new files.
3. I have a fix for the BufferedChannel issue and it can be done either in the BufferedChannel code or outside it from the callers.  The problem is when it is created on an existing File.  The FileChannels passed to the constructor need to set their position to the end of the file.  Otherwise, the write buffer's start position will default to 0, the start of the file.  That screws things up when you try to read from it.  This bug should already exist if you start up the Bookies with already existing entry logs and you try to read entries from there.  I don't think there are any unit tests that test that specifically.

> Need procedure to garbage collect ledgers
> -----------------------------------------
>
>                 Key: ZOOKEEPER-464
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-464
>             Project: Zookeeper
>          Issue Type: New Feature
>          Components: contrib-bookkeeper
>            Reporter: Flavio Paiva Junqueira
>            Assignee: Erwin Tam
>             Fix For: 3.4.0
>
>         Attachments: ZOOKEEPER-464.patch
>
>
> An application using BookKeeper is likely to use a large number of ledgers over time. Such an application might not need all ledgers created over time and might want to delete some of these ledgers to free up some space on bookies. The idea of this jira is to implement a procedure that enables an application to garbage-collect unwanted ledgers.
> To garbage-collect a ledger, we need to delete the ledger metadata on ZooKeeper, and delete the ledger data on corresponding bookies. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (ZOOKEEPER-464) Need procedure to garbage collect ledgers

Posted by "Erwin Tam (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erwin Tam updated ZOOKEEPER-464:
--------------------------------

    Assignee: Erwin Tam

I'm working on this enhancement now and just about ready to submit this as a patch.  Will see if Ben wants to review my changes first beforehand.

> Need procedure to garbage collect ledgers
> -----------------------------------------
>
>                 Key: ZOOKEEPER-464
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-464
>             Project: Zookeeper
>          Issue Type: New Feature
>          Components: contrib-bookkeeper
>            Reporter: Flavio Paiva Junqueira
>            Assignee: Erwin Tam
>
> An application using BookKeeper is likely to use a large number of ledgers over time. Such an application might not need all ledgers created over time and might want to delete some of these ledgers to free up some space on bookies. The idea of this jira is to implement a procedure that enables an application to garbage-collect unwanted ledgers.
> To garbage-collect a ledger, we need to delete the ledger metadata on ZooKeeper, and delete the ledger data on corresponding bookies. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.