Posted to notifications@couchdb.apache.org by gi...@git.apache.org on 2017/04/27 15:02:36 UTC

[GitHub] davisp opened a new pull request #496: Couchdb 3287 pluggable storage engines

URL: https://github.com/apache/couchdb/pull/496
 
 
   ## Overview
   
   Pluggable Storage Engines (Oh my!)
   
   I've finally covered enough bases to start asking for reviews on the pluggable storage engine work. I know that this is a fairly large change, so I don't expect it to actually be merged for a number of weeks. If you want to review any part of this, feel free to take your time and be thorough.
   
   I highly suggest reviewing this work a commit at a time as there are a few fairly large commits. A few signposts for the intrepid reader:
   
   1. Add couch_db_engine module
   
   This is a single file addition that just introduces the new pluggable storage engine API. It's fairly thoroughly documented, and hopefully the level this API is aimed at makes sense. When I initially started this work I spent a lot of time trying to figure out which level of the API would be the best place to separate the storage engine from database logic. Go too high and every engine is reimplementing core bits of logic. Go too low and there's not enough room for interesting changes to the actual storage engine algorithms. I believe this is a happy middle ground that gives storage engines room to play and invent alternative implementations without requiring them to reimplement large amounts of required behavior.
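
To make the "happy middle ground" concrete, here is a minimal, language-neutral sketch in Python. It is not the couch_db_engine API: every name in it (`StorageEngine`, `DictEngine`, `Database`, `read_doc`, `write_doc`) is hypothetical, and the revision counter is a toy. It only illustrates the split described above, where the engine supplies storage primitives and the engine-independent logic lives in one shared place.

```python
from abc import ABC, abstractmethod


class StorageEngine(ABC):
    """Hypothetical engine interface: storage primitives only."""

    @abstractmethod
    def read_doc(self, doc_id):
        """Return the stored record for doc_id, or None if absent."""

    @abstractmethod
    def write_doc(self, doc_id, record):
        """Persist record under doc_id."""


class DictEngine(StorageEngine):
    """Toy in-memory engine standing in for an on-disk implementation."""

    def __init__(self):
        self._docs = {}

    def read_doc(self, doc_id):
        return self._docs.get(doc_id)

    def write_doc(self, doc_id, record):
        self._docs[doc_id] = record


class Database:
    """Engine-independent logic, written once and shared by all engines."""

    def __init__(self, engine):
        self._engine = engine

    def update(self, doc_id, body):
        # Toy revision counter: shared logic that no engine has to reimplement.
        current = self._engine.read_doc(doc_id)
        rev = 1 if current is None else current["rev"] + 1
        self._engine.write_doc(doc_id, {"rev": rev, "body": body})
        return rev
```

In this sketch, swapping `DictEngine` for another `StorageEngine` subclass changes how records are stored without touching `Database.update`.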
   
   2. Add legacy storage engine implementation
   
   This commit is quite big, but it's important to note that all it's doing is copying the existing parts of couch_db.erl and couch_db_updater.erl and moving them to a new set of modules named couch_bt_engine_*.erl. Nothing uses this code yet. I have this in its own commit since it's fairly large. However, if you mostly just want to check it, you should be able to see that the implementation is just pulled from existing functions. The behavior of this engine is identical to the "pre-PSE" engine because that's what it is. It's just been reformatted a bit and had its name changed.
   
   Also, I've kept this and couch_db_engine.erl in the main couch application since we'll always want to have at least one storage engine. That decision was made pre-monorepo, though, when adding another git repo to our builds seemed awfully heavy. Now that we're monorepo, we're just creating a folder, which is easy enough. Reviewers, please keep this in mind: one thing I expect to discuss is whether we should split this out into its own application.
   
   3. Implement pluggable storage engines
   
   This is the doozy. There are two main bits of work going on in this commit. First, the removal of all the original code that the previous commit copied, and second, the addition of all the code to have couch_db and couch_db_updater start using the couch_db_engine APIs. It's long and big, but really, if you take your time there's nothing magical going on here. For the core bits to study, I'd recommend spending some extra time on couch_server to see how engines are configured and chosen at runtime.
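
As a rough illustration of "configured and chosen at runtime", the following Python sketch uses a lookup table from a configured name to an engine implementation. This is an assumption-laden sketch, not couch_server's actual logic: `BTreeEngine`, `OtherEngine`, `CONFIGURED_ENGINES`, `DEFAULT_ENGINE`, and `open_db` are all hypothetical names.

```python
class BTreeEngine:
    """Stand-in for the legacy engine (hypothetical)."""

    def __init__(self, path):
        self.path = path


class OtherEngine:
    """Stand-in for an alternative engine (hypothetical)."""

    def __init__(self, path):
        self.path = path


# Hypothetical configuration: engine name -> implementation.
CONFIGURED_ENGINES = {"btree": BTreeEngine, "other": OtherEngine}
DEFAULT_ENGINE = "btree"


def open_db(path, engine_name=None):
    """Instantiate the engine selected at runtime, else the default one."""
    name = engine_name or DEFAULT_ENGINE
    try:
        engine_cls = CONFIGURED_ENGINES[name]
    except KeyError:
        raise ValueError(f"unknown storage engine: {name!r}") from None
    return engine_cls(path)
```

The point of the table is that adding an engine means adding a configuration entry rather than editing the code that opens databases.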
   
   4. Add storage engine test suite
   
   This is a fairly complete test suite for the entire storage engine API. I don't remember what the coverage of couch_bt_engine was, but I know it's fairly high. The useful bit is that devs creating new storage engines can just pull this in with a single eunit case to test their implementations. I'll show a few examples of this below.
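
The reuse pattern described here, one shared suite that each engine opts into with a single test case, can be sketched as follows. This uses Python's `unittest` rather than eunit, and it is not the suite from this PR: `make_engine_tests`, `DictEngine`, and the two checks are hypothetical and exist only to show the shape of the idea.

```python
import unittest


def make_engine_tests(engine_factory):
    """Build a TestCase that runs the shared checks against one engine."""

    class EngineTests(unittest.TestCase):
        def setUp(self):
            # Each test gets a fresh engine instance.
            self.engine = engine_factory()

        def test_missing_doc_reads_as_none(self):
            self.assertIsNone(self.engine.read_doc("missing"))

        def test_write_then_read(self):
            self.engine.write_doc("a", {"x": 1})
            self.assertEqual(self.engine.read_doc("a"), {"x": 1})

    return EngineTests


class DictEngine:
    """Toy engine under test (hypothetical)."""

    def __init__(self):
        self._docs = {}

    def read_doc(self, doc_id):
        return self._docs.get(doc_id)

    def write_doc(self, doc_id, record):
        self._docs[doc_id] = record


# The single line a new engine adds to opt in to the whole shared suite.
DictEngineTests = make_engine_tests(DictEngine)
```

A new engine author would write only the last line, passing their own factory, and the test runner would pick up every shared check.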
   
   5. Ensure deterministic revisions for attachments
   
   I may end up squashing this into the implementation commit. This was a fix I made while developing PSE that I ended up refixing slightly differently after the fact. The original goal was to make sure everything compiled and all test suites ran for each commit. I think I've just convinced myself to squash this. So if it's missing, it's because I did that and then forgot to edit this description to remove this paragraph...
   
   ## Testing recommendations
   
   $ make check
   
   ## JIRA issue number
   
   https://issues.apache.org/jira/browse/COUCHDB-3287
   
   ## Related Pull Requests
   
   This PR is based on the previous mixed cluster upgrade PR. However, when merge time comes around, we'll merge #495 first and then merge this into master so that users can deploy the first changes before these changes. (Uptime is fun time).
   
   https://github.com/apache/couchdb/pull/495
   
   ## Checklist
   
   - [ ] Code is written and works correctly;
   - [ ] Changes are covered by tests;
   - [ ] Documentation reflects the changes;
   
   (We should probably remove the todo item for updating rebar.config since we're doing almost all work on the monorepo now)
   
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services