Posted to issues@bookkeeper.apache.org by GitBox <gi...@apache.org> on 2018/03/01 00:08:39 UTC
[GitHub] reddycharan commented on issue #570: Multiple active entrylogs

reddycharan commented on issue #570: Multiple active entrylogs
URL: https://github.com/apache/bookkeeper/issues/570#issuecomment-369416511
 
 
   Hey @ivankelly,
   
   To answer the "when/where was this discussed" part, I can point you to the meeting notes, mail exchanges, Slack messages and git links that I shared with the community throughout this period. This work item was initially created by @jvrao in April 2017: https://issues.apache.org/jira/browse/BOOKKEEPER-1041 . By May 2017 I had come up with a design overview and implementation choices, and I presented them to the community at the May 18th community meeting. You can find the minutes of what I presented that day here - https://cwiki.apache.org/confluence/display/BOOKKEEPER/2017-5-18+Meeting+Notes - and I explained the design overview in the same Jira issue. And yes, initially the design was to have a configured number of entry logs per ledgerdir. While implementing it, I noticed a few issues with how checkpointing is done, regarding the synchronization logic and the preallocation of entry logs, and I raised those concerns with the community by starting mail threads and bringing them up in the bi-weekly community meetings.
   
   In one such mail conversation, @sijie asked the same set of questions about where the abstraction/composition should happen. I explained it in detail in the following mail: http://mail-archives.apache.org/mod_mbox/bookkeeper-dev/201707.mbox/%3CCAAFz1KNJbPnhLGwp27kKKGgOtLiDPR7FHpSngdcN-C-Njxt7eQ%40mail.gmail.com%3E . In that email I also provided a git link to the initial implementation of the code: https://github.com/reddycharan/bookkeeper/tree/multipleentrylogs . Following this mail conversation, @sijie, @jvrao and I had a lengthy conversation about where the composition should happen, and convinced @sijie that, given the way the code is structured, the right thing is to do the abstraction/composition at the lowest level - the Entrylogger level.
   
   But in the follow-up, I was questioned about the intricacies of the multithreading/synchronization aspects of slot map management in the "configured number of entry logs per ledgerdir" approach, and about the benefits of an explicit entry log per ledger for the compaction story (a binary decision of whether or not to keep the entry log during compaction). So I changed the design to entry log per ledger. That changed the logic for choosing the entry log for a given ledger, but the underlying changes to the entry logger abstraction remained the same. The community was formally informed of this in several community meeting calls, and Sijie in particular (multiple times), before I proceeded: https://cwiki.apache.org/confluence/display/BOOKKEEPER/2017-06-01+Meeting+Notes https://cwiki.apache.org/confluence/display/BOOKKEEPER/2017-10-19+Meeting+Notes https://cwiki.apache.org/confluence/display/BOOKKEEPER/2017-11-30+Meeting+Notes https://cwiki.apache.org/confluence/display/BOOKKEEPER/2017-12-14+Meeting+Notes . As I requested in the 2017/12/14 meeting, I shared a preview version of my code in the community Slack https://apachebookkeeper.slack.com/archives/C6G5104SF/p1513212259000262 and the code is available in my GitHub repo - https://github.com/reddycharan/bookkeeper/tree/entrylogperledger . @eolivelli and @sijie were kind enough to do an early code review, provide some initial feedback and give their approval. Finally, as I had envisioned, discussed and shared, after rebasing my changes on the recent community code I created a formal pull request - https://github.com/apache/bookkeeper/pull/1201 - and also updated this issue with a formal description of the entry log per ledger design that I've been discussing all along.
   
   Agreed, I could have updated this issue description with the new design back then. But I'm not sure it would have made a difference, since this was communicated and explained multiple times in the community meetings, and in particular to the people who were interested. For people coming across this work item for the first time, the description is needed at the point where the formal pull request is created, and hence I made sure I updated the issue with the full description (new design) before sending the pull request.
   
   I can say the community changed a lot during the last year (2017) in the amount of activity, the processes and systems used, and the list of participants. This work item spanned that transition phase. It took quite a long time because of a change in priorities on our side, because I was dealing with multiple repos (the internal Salesforce one and the community one), and because of the deluge of commits in the second half of 2017 and the need to rebase my commits on top of them every single time I wanted to push this forward. So I see why you might have a different opinion, considering you weren't present in any of the above-mentioned conversations/communications. I do see that you are the one who migrated the Jira work item to GitHub issues, though I'm not sure whether that was an automated task or a carefully curated manual one.
   
   Hope this answers the "when/where was this discussed" part.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services