You are viewing a plain text version of this content. The canonical link for it is here.
Posted to notifications@couchdb.apache.org by iilyak <gi...@git.apache.org> on 2016/05/17 18:55:24 UTC

[GitHub] couchdb-couch-index pull request: 65501 disable recompaction

GitHub user iilyak opened a pull request:

    https://github.com/apache/couchdb-couch-index/pull/16

    65501 disable recompaction

    In some cases it is a beneficial to be able to temporary disable recompaction. 
    This PR introduces a way to do so globally or individually per database or specific index.
    
    The configuration of the feature is as follows:
    
    ```
    [view_compaction.recompaction]
    enable = true / false %% to set globaly
    <db_name> = enable / disable %% to set per db
    <db_name>:<idx_name> = enable / disable %% to set per index
    ```
    
    Depends on: https://github.com/apache/couchdb-erlang-tests/pull/2

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/cloudant/couchdb-couch-index 65501-disable-recompaction

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/couchdb-couch-index/pull/16.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #16
    
----

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] couchdb-couch-index pull request: disable recompaction

Posted by iilyak <gi...@git.apache.org>.
Github user iilyak commented on the pull request:

    https://github.com/apache/couchdb-couch-index/pull/16#issuecomment-219868691
  
    @rnewson: This change is for view_compaction (not db compaction). Here is my understanding of how it would work. 
    
    - The entry point to trigger compaction is [`couch_index:compact`](https://github.com/cloudant/couchdb-couch-index/blob/a3af68f86ee21a9e072507b9d1d7da11e001b299/src/couch_index.erl#L76). 
    - It will do [gen_server:call(CPid, compact)](https://github.com/apache/couchdb-couch-index/blob/master/src/couch_index.erl#L77) to compactor Pid. 
    - `couch_index_compactor` would [spawn_link(fun() -> compact/3..](https://github.com/apache/couchdb-couch-index/blob/master/src/couch_index_compactor.erl#L64). 
    - `couch_index_compactor:compact/3` would call [Mod:compact/3](https://github.com/apache/couchdb-couch-index/blob/master/src/couch_index_compactor.erl#L104). 
    In our case `Mod` would be `couch_mrview_index`.
    - `couch_mrview_index:compact/3` would dispatch the call to [couch_mrview_compactor:compact/3](https://github.com/apache/couchdb-couch-mrview/blob/master/src/couch_mrview_index.erl#L193). 
    - When compaction is finished we get control back into [couch_index_compactor:compact](https://github.com/apache/couchdb-couch-index/blob/master/src/couch_index_compactor.erl#L106). 
    - Where right after `couch_mrview_compactor` we do `Mod:commit(NewIdxState)`. We notify couch_index by doing [gen_server:call(Idx, {compacted, NewIdxState})](https://github.com/apache/couchdb-couch-index/blob/master/src/couch_index_compactor.erl#L107). 
    - I.e. we get [here](https://github.com/cloudant/couchdb-couch-index/blob/a3af68f86ee21a9e072507b9d1d7da11e001b299/src/couch_index.erl#L191). Where we can check if re-compaction is enabled. 
    - If re-compaction is not enabled we swap files. 
    Otherwise we reply `{reply, recompact, State}`. So  we can trigger [couch_mrview_index:compact(_, _, [recompact])](https://github.com/cloudant/couchdb-couch-index/blob/a3af68f86ee21a9e072507b9d1d7da11e001b299/src/couch_index_compactor.erl#L108:L110).
    
    Please let me know what I'm missing.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] couchdb-couch-index pull request: disable recompaction

Posted by kxepal <gi...@git.apache.org>.
Github user kxepal commented on the pull request:

    https://github.com/apache/couchdb-couch-index/pull/16#issuecomment-219820258
  
    Recompaction? What does it gives?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] couchdb-couch-index pull request: disable recompaction

Posted by davisp <gi...@git.apache.org>.
Github user davisp commented on a diff in the pull request:

    https://github.com/apache/couchdb-couch-index/pull/16#discussion_r63764684
  
    --- Diff: src/couch_index.erl ---
    @@ -407,3 +394,160 @@ assert_signature_match(Mod, OldIdxState, NewIdxState) ->
             {Sig, Sig} -> ok;
             _ -> erlang:error(signature_mismatch)
         end.
    +
    +commit_compacted(NewIdxState, State) ->
    +    #st{
    +        mod=Mod,
    +        idx_state=OldIdxState,
    +        updater=Updater,
    +        commit_delay=Delay
    +    } = State,
    +    {ok, NewIdxState1} = Mod:swap_compacted(OldIdxState, NewIdxState),
    +    % Restart the indexer if it's running.
    +    case couch_index_updater:is_running(Updater) of
    +        true -> ok = couch_index_updater:restart(Updater, NewIdxState1);
    +        false -> ok
    +    end,
    +    case State#st.committed of
    +        true -> erlang:send_after(Delay, self(), commit);
    +        false -> ok
    +    end,
    +    State#st{
    +        idx_state=NewIdxState1,
    +        committed=false
    +     }.
    +
    +is_recompaction_enabled(IdxState, #st{mod = Mod}) ->
    +    DbName = binary_to_list(Mod:get(db_name, IdxState)),
    +    IdxName = binary_to_list(Mod:get(idx_name, IdxState)),
    +    IdxKey = DbName ++ ":" ++ IdxName,
    +    Global = config:get("view_compaction.recompaction", "enable"),
    +    PerIndex = config:get("view_compaction.recompaction", IdxKey),
    +    PerDb = config:get("view_compaction.recompaction", DbName),
    +    case {Global, PerDb, PerIndex} of
    --- End diff --
    
    This truth table is confusing to me. I'd disable to on and then have a trigger off with a simple orelse combo. Something along the lines of:
    
        Global = config:get("view_compaction.recompaction", "enabled", "true") == "false",
        PerDb = config:get("view_compaction.recompaction", DbName, "enabled") == "disabled",
        PerIdx = config:get("view_compaction.recompaction", IdxKey, "enabled") == "disabled",
        Global orelse PerDb orelse PerIdx.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] couchdb-couch-index pull request: disable recompaction

Posted by iilyak <gi...@git.apache.org>.
Github user iilyak commented on a diff in the pull request:

    https://github.com/apache/couchdb-couch-index/pull/16#discussion_r63967414
  
    --- Diff: src/couch_index.erl ---
    @@ -407,3 +394,160 @@ assert_signature_match(Mod, OldIdxState, NewIdxState) ->
             {Sig, Sig} -> ok;
             _ -> erlang:error(signature_mismatch)
         end.
    +
    +commit_compacted(NewIdxState, State) ->
    +    #st{
    +        mod=Mod,
    +        idx_state=OldIdxState,
    +        updater=Updater,
    +        commit_delay=Delay
    +    } = State,
    +    {ok, NewIdxState1} = Mod:swap_compacted(OldIdxState, NewIdxState),
    +    % Restart the indexer if it's running.
    +    case couch_index_updater:is_running(Updater) of
    +        true -> ok = couch_index_updater:restart(Updater, NewIdxState1);
    +        false -> ok
    +    end,
    +    case State#st.committed of
    +        true -> erlang:send_after(Delay, self(), commit);
    +        false -> ok
    +    end,
    +    State#st{
    +        idx_state=NewIdxState1,
    +        committed=false
    +     }.
    +
    +is_recompaction_enabled(IdxState, #st{mod = Mod}) ->
    +    DbName = binary_to_list(Mod:get(db_name, IdxState)),
    +    IdxName = binary_to_list(Mod:get(idx_name, IdxState)),
    +    IdxKey = DbName ++ ":" ++ IdxName,
    +    Global = config:get("view_compaction.recompaction", "enable"),
    +    PerIndex = config:get("view_compaction.recompaction", IdxKey),
    +    PerDb = config:get("view_compaction.recompaction", DbName),
    +    case {Global, PerDb, PerIndex} of
    --- End diff --
    
    It is about defaults. By default recompaction should be enabled. We also need to distinguish between default and user provided value. There are two major modes:
    
    1. when `view_compaction.enable_recompaction = true`
       In this case we can set `view_recompaction.DbName:IdxName.enable = false` to ***disable*** `recompact`.
    
    1. when `view_compaction.enable_recompaction = false`
       In this case we can set `view_recompaction.DbName:IdxName.enable = true` to ***enable*** `recompact`.
    
    Although I agree the case statement is hard to follow. I'll simplify the condition.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] couchdb-couch-index pull request: disable recompaction

Posted by davisp <gi...@git.apache.org>.
Github user davisp commented on the pull request:

    https://github.com/apache/couchdb-couch-index/pull/16#issuecomment-221310546
  
    @iilyak +1


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] couchdb-couch-index pull request: disable recompaction

Posted by davisp <gi...@git.apache.org>.
Github user davisp commented on a diff in the pull request:

    https://github.com/apache/couchdb-couch-index/pull/16#discussion_r64073496
  
    --- Diff: src/couch_index.erl ---
    @@ -407,3 +394,160 @@ assert_signature_match(Mod, OldIdxState, NewIdxState) ->
             {Sig, Sig} -> ok;
             _ -> erlang:error(signature_mismatch)
         end.
    +
    +commit_compacted(NewIdxState, State) ->
    +    #st{
    +        mod=Mod,
    +        idx_state=OldIdxState,
    +        updater=Updater,
    +        commit_delay=Delay
    +    } = State,
    +    {ok, NewIdxState1} = Mod:swap_compacted(OldIdxState, NewIdxState),
    +    % Restart the indexer if it's running.
    +    case couch_index_updater:is_running(Updater) of
    +        true -> ok = couch_index_updater:restart(Updater, NewIdxState1);
    +        false -> ok
    +    end,
    +    case State#st.committed of
    +        true -> erlang:send_after(Delay, self(), commit);
    +        false -> ok
    +    end,
    +    State#st{
    +        idx_state=NewIdxState1,
    +        committed=false
    +     }.
    +
    +is_recompaction_enabled(IdxState, #st{mod = Mod}) ->
    +    DbName = binary_to_list(Mod:get(db_name, IdxState)),
    +    IdxName = binary_to_list(Mod:get(idx_name, IdxState)),
    +    IdxKey = DbName ++ ":" ++ IdxName,
    +    Global = config:get("view_compaction.recompaction", "enable"),
    +    PerIndex = config:get("view_compaction.recompaction", IdxKey),
    +    PerDb = config:get("view_compaction.recompaction", DbName),
    +    case {Global, PerDb, PerIndex} of
    --- End diff --
    
    Fair enough. I didn't contemplate an operator wanting to selectively re-enable recompact at a narrower scope when its been disabled in a wider scope. I guess that could be a possibility.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] couchdb-couch-index pull request: disable recompaction

Posted by kxepal <gi...@git.apache.org>.
Github user kxepal commented on a diff in the pull request:

    https://github.com/apache/couchdb-couch-index/pull/16#discussion_r63833802
  
    --- Diff: src/couch_index.erl ---
    @@ -407,3 +394,160 @@ assert_signature_match(Mod, OldIdxState, NewIdxState) ->
             {Sig, Sig} -> ok;
             _ -> erlang:error(signature_mismatch)
         end.
    +
    +commit_compacted(NewIdxState, State) ->
    +    #st{
    +        mod=Mod,
    +        idx_state=OldIdxState,
    +        updater=Updater,
    +        commit_delay=Delay
    +    } = State,
    +    {ok, NewIdxState1} = Mod:swap_compacted(OldIdxState, NewIdxState),
    +    % Restart the indexer if it's running.
    +    case couch_index_updater:is_running(Updater) of
    +        true -> ok = couch_index_updater:restart(Updater, NewIdxState1);
    +        false -> ok
    +    end,
    +    case State#st.committed of
    +        true -> erlang:send_after(Delay, self(), commit);
    +        false -> ok
    +    end,
    +    State#st{
    +        idx_state=NewIdxState1,
    +        committed=false
    +     }.
    +
    +is_recompaction_enabled(IdxState, #st{mod = Mod}) ->
    +    DbName = binary_to_list(Mod:get(db_name, IdxState)),
    +    IdxName = binary_to_list(Mod:get(idx_name, IdxState)),
    +    IdxKey = DbName ++ ":" ++ IdxName,
    +    Global = config:get("view_compaction.recompaction", "enable"),
    --- End diff --
    
    I would prefer
    ```
    [view_compaction]
    enable_recompaction = true | false
    
    [view_compaction.DbName]
    enable = true | false
    [view_compaction.DbName:IdxName]
    enable = true | false
    ```
    
    Because:
    - This is boolean logic without else state;
    - For others switches we use booleans as well;
    - If we would like to add few more options for recompaction (still don't understand what is it and why), there wouldn't be any issues with;
    - It's more error-prone solution (enable or enabled?)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] couchdb-couch-index pull request: disable recompaction

Posted by rnewson <gi...@git.apache.org>.
Github user rnewson commented on the pull request:

    https://github.com/apache/couchdb-couch-index/pull/16#issuecomment-219819020
  
    compaction in couchdb is initiated by the user or the compaction daemon, this seems to be in the wrong place.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] couchdb-couch-index pull request: disable recompaction

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/couchdb-couch-index/pull/16


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] couchdb-couch-index pull request: disable recompaction

Posted by davisp <gi...@git.apache.org>.
Github user davisp commented on a diff in the pull request:

    https://github.com/apache/couchdb-couch-index/pull/16#discussion_r63764967
  
    --- Diff: src/couch_index.erl ---
    @@ -407,3 +394,160 @@ assert_signature_match(Mod, OldIdxState, NewIdxState) ->
             {Sig, Sig} -> ok;
             _ -> erlang:error(signature_mismatch)
         end.
    +
    +commit_compacted(NewIdxState, State) ->
    +    #st{
    +        mod=Mod,
    +        idx_state=OldIdxState,
    +        updater=Updater,
    +        commit_delay=Delay
    +    } = State,
    +    {ok, NewIdxState1} = Mod:swap_compacted(OldIdxState, NewIdxState),
    +    % Restart the indexer if it's running.
    +    case couch_index_updater:is_running(Updater) of
    +        true -> ok = couch_index_updater:restart(Updater, NewIdxState1);
    +        false -> ok
    +    end,
    +    case State#st.committed of
    +        true -> erlang:send_after(Delay, self(), commit);
    +        false -> ok
    +    end,
    +    State#st{
    +        idx_state=NewIdxState1,
    +        committed=false
    +     }.
    +
    +is_recompaction_enabled(IdxState, #st{mod = Mod}) ->
    +    DbName = binary_to_list(Mod:get(db_name, IdxState)),
    +    IdxName = binary_to_list(Mod:get(idx_name, IdxState)),
    +    IdxKey = DbName ++ ":" ++ IdxName,
    +    Global = config:get("view_compaction.recompaction", "enable"),
    --- End diff --
    
    Also true vs enabled for settings seems odd. Seems like you'd be better off using something like "status = enabled" as a setting. Or putting it in a different section for the global settings although I dunno if we have a better section. Though you could just use
    
        [view_compaction]
        recompaction = enabled
    
        [view_compaction.recompaction]
        DbName = enabled | disabled
        DbName:IdxName = enabled | disabled


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] couchdb-couch-index pull request: disable recompaction

Posted by davisp <gi...@git.apache.org>.
Github user davisp commented on a diff in the pull request:

    https://github.com/apache/couchdb-couch-index/pull/16#discussion_r63764248
  
    --- Diff: src/couch_index.erl ---
    @@ -407,3 +394,160 @@ assert_signature_match(Mod, OldIdxState, NewIdxState) ->
             {Sig, Sig} -> ok;
             _ -> erlang:error(signature_mismatch)
         end.
    +
    +commit_compacted(NewIdxState, State) ->
    +    #st{
    +        mod=Mod,
    +        idx_state=OldIdxState,
    +        updater=Updater,
    +        commit_delay=Delay
    +    } = State,
    +    {ok, NewIdxState1} = Mod:swap_compacted(OldIdxState, NewIdxState),
    +    % Restart the indexer if it's running.
    +    case couch_index_updater:is_running(Updater) of
    +        true -> ok = couch_index_updater:restart(Updater, NewIdxState1);
    +        false -> ok
    +    end,
    +    case State#st.committed of
    +        true -> erlang:send_after(Delay, self(), commit);
    +        false -> ok
    +    end,
    +    State#st{
    +        idx_state=NewIdxState1,
    +        committed=false
    +     }.
    +
    +is_recompaction_enabled(IdxState, #st{mod = Mod}) ->
    +    DbName = binary_to_list(Mod:get(db_name, IdxState)),
    +    IdxName = binary_to_list(Mod:get(idx_name, IdxState)),
    +    IdxKey = DbName ++ ":" ++ IdxName,
    +    Global = config:get("view_compaction.recompaction", "enable"),
    --- End diff --
    
    I think you should be using "enabled" and "disabled" here. The noun version I think is the more common use for a setting value.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] couchdb-couch-index pull request: disable recompaction

Posted by iilyak <gi...@git.apache.org>.
Github user iilyak commented on the pull request:

    https://github.com/apache/couchdb-couch-index/pull/16#issuecomment-219870207
  
    @kxepal: If recompaction is enabled it would through away "almost" compacted view if db update occurs. Sometimes though you want to let compaction finish and then do regular index update to catch up. By temporary disabling recompaction you would be able to do so. Above is just one example when it could be useful. There are other use cases as well. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] couchdb-couch-index pull request: disable recompaction

Posted by davisp <gi...@git.apache.org>.
Github user davisp commented on the pull request:

    https://github.com/apache/couchdb-couch-index/pull/16#issuecomment-220387961
  
    For background on this, "recompaction" is just an internal name for the step during view compaction when we spawn an index updater to top off a compacted view. This has been in CouchDB since almost forever but was never used by Cloudant.
    
    The reason we have the view updater is that (unlike DB compactions) we can't resume a view compaction. So we have to make sure that when we swap out the compaction that there's no need to retry compaction.
    
    The downside to not running the recompact step is that a view will appear to roll backwards when a view compaction finishes. In a cluster this isn't a big deal because there are two other copies that will cover while the index updater catches back up.
    
    The downside to running a recompact is that if the update frequency on a database is too high we'll never converge on view compactions. Which makes obvious sense if you think about it. Any updates that happen during the initial compaction stage need to be re-processed by the index updater. If you have two index updaters running over the same data (ie, one updating the uncompacted index, one trying to finish updates on the compacted index) they should be running at the same rate and will thus never converge.
    
    To @rnewson's comment, this is the appropriate place. Its not a question of whether/when compaction runs, but whether we perform the index updater step which we have long known falls down on large views and/or high update rates.
    
    For the config layout, I'd propose a combined approach between mine and @kxepal's suggestions. I do like the booleans but I think the `[view_compaction.recompact] with `$dbname` and `$dbname:$view` is more appropriate. The specific sections with a single pre-defined key seem like a misuse of the `section/key/value` namespace config hierarchy.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] couchdb-couch-index pull request: disable recompaction

Posted by iilyak <gi...@git.apache.org>.
Github user iilyak commented on the pull request:

    https://github.com/apache/couchdb-couch-index/pull/16#issuecomment-220619707
  
    @davisp: The hard to follow case statement was removed. I'll fix the commit message to document the new names of the settings when squash.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] couchdb-couch-index pull request: disable recompaction

Posted by kxepal <gi...@git.apache.org>.
Github user kxepal commented on the pull request:

    https://github.com/apache/couchdb-couch-index/pull/16#issuecomment-220406689
  
    @davisp 
    Thanks you for the great (as usual) explanation! +1 for combined config approach, also looks good.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---