Posted to notifications@couchdb.apache.org by GitBox <gi...@apache.org> on 2020/09/10 18:38:29 UTC

[GitHub] [couchdb] tonysun83 opened a new pull request #3144: move remonitor code into DOWN message

tonysun83 opened a new pull request #3144:
URL: https://github.com/apache/couchdb/pull/3144


   Smoosh monitors the compactor pid to determine when a compaction job
   finishes, and uses this to track channel concurrency. However, this isn't
   accurate when the compaction job has to re-spawn to catch up on
   intervening changes: the same logical compaction job continues with
   another pid, and smoosh is not aware of it. In such cases, a smoosh channel
   with concurrency one can start arbitrarily many additional database
   compaction jobs.
   
   To solve this problem, we previously added a check in `start_compact` to see
   whether a compaction PID already exists for a db. But this is the wrong place
   for the check, because `start_compact` runs for the next shard that comes off
   the queue, not for the shard whose compactor just exited. So the following
   can occur:
   
   1. Enqueue a bunch of stuff into channel with concurrency 1
   2. Begin highest priority job, Shard1, in channel
   3. Compaction finishes, discovers compaction file is behind main file
   4. Smoosh-monitored PID for Shard1 exits, a new one starts to finish the job
   5. Smoosh receives the 'DOWN' message, begins the next highest priority job,
   Shard2
   6. Channel concurrency is now 2, not 1
   
   This change moves the check into the handling of the 'DOWN' message so that
   we can check for that specific shard. If the compaction PID exists, it means
   a new process was spawned, so we just monitor that one and add it back to
   the queue. The length of the queue does not change, and therefore we won’t
   spawn new compaction jobs.
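   
   The idea can be sketched with a small, self-contained model. This is only
   an illustration under stated assumptions, not the code in this PR: the
   `#chan{}` record, `handle_down/3`, `maybe_start/1` and the `LookupCPid` fun
   are hypothetical stand-ins for the smoosh channel state, the 'DOWN' handler
   and the lookup of a shard's current compactor pid.
   
   ```erlang
   -module(remonitor_sketch).
   -export([handle_down/3, demo/0]).
   
   %% Hypothetical, simplified channel state (not the real smoosh #state{}).
   -record(chan, {active = [], waiting = [], concurrency = 1}).
   
   %% Called when the monitored pid Pid has exited. LookupCPid(Key) stands in
   %% for asking the database for its current compactor pid; it returns nil
   %% when no compaction is running for that shard.
   handle_down(Pid, LookupCPid, #chan{active = Active0} = Chan) ->
       case lists:keytake(Pid, 2, Active0) of
           {value, {Key, Pid}, Active1} ->
               Chan1 = case LookupCPid(Key) of
                   nil ->
                       %% The job really ended: free its slot.
                       Chan#chan{active = Active1};
                   NewPid ->
                       %% Same logical job, new pid: keep the slot occupied.
                       Chan#chan{active = [{Key, NewPid} | Active1]}
               end,
               maybe_start(Chan1);
           false ->
               Chan
       end.
   
   %% Start the next waiting job only if a slot is free.
   maybe_start(#chan{active = Active, waiting = [Next | Rest],
                     concurrency = C} = Chan) when length(Active) < C ->
       %% make_ref() stands in for the pid of a newly started compactor.
       Chan#chan{active = [{Next, make_ref()} | Active], waiting = Rest};
   maybe_start(Chan) ->
       Chan.
   
   demo() ->
       Chan0 = #chan{active = [{shard1, pid_a}], waiting = [shard2]},
       %% pid_a exits, but shard1's compaction respawned as pid_b.
       Respawned = fun(shard1) -> pid_b; (_) -> nil end,
       Chan1 = handle_down(pid_a, Respawned, Chan0),
       %% Later pid_b exits and no compactor remains for shard1.
       Finished = fun(_) -> nil end,
       Chan2 = handle_down(pid_b, Finished, Chan1),
       {Chan1#chan.active, Chan2#chan.waiting}.
   ```
   
   In this model, `demo/0` keeps shard1 as the only active job after the first
   'DOWN', and shard2 leaves the waiting list only after the respawned pid
   exits, so the concurrency of 1 is respected.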
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [couchdb] tonysun83 commented on a change in pull request #3144: move remonitor code into DOWN message

Posted by GitBox <gi...@apache.org>.
tonysun83 commented on a change in pull request #3144:
URL: https://github.com/apache/couchdb/pull/3144#discussion_r486574946



##########
File path: src/smoosh/src/smoosh_channel.erl
##########
@@ -275,24 +276,34 @@ start_compact(State, Db) ->
     case smoosh_utils:ignore_db(Db) of
     false ->
         DbPid = couch_db:get_pid(Db),
-        Key = couch_db:name(Db),
-        case couch_db:get_compactor_pid(Db) of
-            nil ->
-                Ref = erlang:monitor(process, DbPid),
-                DbPid ! {'$gen_call', {self(), Ref}, start_compact},
-                State#state{starting=[{Ref, Key}|State#state.starting]};
-            % database is still compacting so we can just monitor the existing
-            % compaction pid
-            CPid ->
-                couch_log:notice("Db ~s continuing compaction",
-                    [smoosh_utils:stringify(Key)]),
-                erlang:monitor(process, CPid),
-                State#state{active=[{Key, CPid}|State#state.active]}
-        end;
+        Ref = erlang:monitor(process, DbPid),
+        DbPid ! {'$gen_call', {self(), Ref}, start_compact},
+        State#state{starting=[{Ref, couch_db:name(Db)}|State#state.starting]};

Review comment:
       Also, this section reverts to what we originally had: https://github.com/cloudant/smoosh/pull/54/files#diff-7ff50b91998e5bf2a1f4cf4a8250f607L236-L238







[GitHub] [couchdb] davisp commented on a change in pull request #3144: move remonitor code into DOWN message

Posted by GitBox <gi...@apache.org>.
davisp commented on a change in pull request #3144:
URL: https://github.com/apache/couchdb/pull/3144#discussion_r486562310



##########
File path: src/smoosh/src/smoosh_channel.erl
##########
@@ -122,17 +122,18 @@ handle_info({'DOWN', Ref, _, Job, Reason}, State0) ->
     #state{active=Active0, starting=Starting0} = State,
     case lists:keytake(Job, 2, Active0) of
         {value, {Key, _Pid}, Active1} ->
-            couch_log:warning("exit for compaction of ~p: ~p", [
-                smoosh_utils:stringify(Key), Reason]),
-            {ok, _} = timer:apply_after(5000, smoosh_server, enqueue, [Key]),
-            {noreply, maybe_start_compaction(State#state{active=Active1})};
+            State1 = maybe_remonitor_cpid(State#state{active=Active1}, Key,
+                Reason),
+            {noreply, maybe_start_compaction(State1)};
         false ->
             case lists:keytake(Ref, 1, Starting0) of
                 {value, {_, Key}, Starting1} ->
-                    couch_log:warning("failed to start compaction of ~p: ~p", [
-                        smoosh_utils:stringify(Key), Reason]),
-                    {ok, _} = timer:apply_after(5000, smoosh_server, enqueue, [Key]),
-                    {noreply, maybe_start_compaction(State#state{starting=Starting1})};
+                    couch_log:warning("failed to start compaction of ~p: ~p",
+                        [smoosh_utils:stringify(Key), Reason]),
+                    {ok, _} = timer:apply_after(5000, smoosh_server, enqueue,
+                        [Key]),
+                    {noreply,
+                        maybe_start_compaction(State#state{starting=Starting1})};

Review comment:
       This hunk appears to be an unintended style change. Unless I'm missing a change that's hard to see because of the different wrapping, let's remove this bit.

##########
File path: src/smoosh/src/smoosh_channel.erl
##########
@@ -275,24 +276,34 @@ start_compact(State, Db) ->
     case smoosh_utils:ignore_db(Db) of
     false ->
         DbPid = couch_db:get_pid(Db),
-        Key = couch_db:name(Db),
-        case couch_db:get_compactor_pid(Db) of
-            nil ->
-                Ref = erlang:monitor(process, DbPid),
-                DbPid ! {'$gen_call', {self(), Ref}, start_compact},
-                State#state{starting=[{Ref, Key}|State#state.starting]};
-            % database is still compacting so we can just monitor the existing
-            % compaction pid
-            CPid ->
-                couch_log:notice("Db ~s continuing compaction",
-                    [smoosh_utils:stringify(Key)]),
-                erlang:monitor(process, CPid),
-                State#state{active=[{Key, CPid}|State#state.active]}
-        end;
+        Ref = erlang:monitor(process, DbPid),
+        DbPid ! {'$gen_call', {self(), Ref}, start_compact},
+        State#state{starting=[{Ref, couch_db:name(Db)}|State#state.starting]};

Review comment:
       I don't think this is correct. A compaction could be running due to manual intervention or perhaps if smoosh crashed and left a compaction running. I'd just change the comment to be something like "Compaction is already running, so monitor existing compaction pid".







[GitHub] [couchdb] davisp commented on a change in pull request #3144: move remonitor code into DOWN message

Posted by GitBox <gi...@apache.org>.
davisp commented on a change in pull request #3144:
URL: https://github.com/apache/couchdb/pull/3144#discussion_r486581411



##########
File path: src/smoosh/src/smoosh_channel.erl
##########
@@ -275,24 +276,34 @@ start_compact(State, Db) ->
     case smoosh_utils:ignore_db(Db) of
     false ->
         DbPid = couch_db:get_pid(Db),
-        Key = couch_db:name(Db),
-        case couch_db:get_compactor_pid(Db) of
-            nil ->
-                Ref = erlang:monitor(process, DbPid),
-                DbPid ! {'$gen_call', {self(), Ref}, start_compact},
-                State#state{starting=[{Ref, Key}|State#state.starting]};
-            % database is still compacting so we can just monitor the existing
-            % compaction pid
-            CPid ->
-                couch_log:notice("Db ~s continuing compaction",
-                    [smoosh_utils:stringify(Key)]),
-                erlang:monitor(process, CPid),
-                State#state{active=[{Key, CPid}|State#state.active]}
-        end;
+        Ref = erlang:monitor(process, DbPid),
+        DbPid ! {'$gen_call', {self(), Ref}, start_compact},
+        State#state{starting=[{Ref, couch_db:name(Db)}|State#state.starting]};

Review comment:
       No, I'm saying that lines 284/285 from before the patch are a comment about "database is still compacting...". I think it'd be better to just change that comment rather than changing the whole thing back.







[GitHub] [couchdb] tonysun83 commented on a change in pull request #3144: move remonitor code into DOWN message

Posted by GitBox <gi...@apache.org>.
tonysun83 commented on a change in pull request #3144:
URL: https://github.com/apache/couchdb/pull/3144#discussion_r486567890



##########
File path: src/smoosh/src/smoosh_channel.erl
##########
@@ -275,24 +276,34 @@ start_compact(State, Db) ->
     case smoosh_utils:ignore_db(Db) of
     false ->
         DbPid = couch_db:get_pid(Db),
-        Key = couch_db:name(Db),
-        case couch_db:get_compactor_pid(Db) of
-            nil ->
-                Ref = erlang:monitor(process, DbPid),
-                DbPid ! {'$gen_call', {self(), Ref}, start_compact},
-                State#state{starting=[{Ref, Key}|State#state.starting]};
-            % database is still compacting so we can just monitor the existing
-            % compaction pid
-            CPid ->
-                couch_log:notice("Db ~s continuing compaction",
-                    [smoosh_utils:stringify(Key)]),
-                erlang:monitor(process, CPid),
-                State#state{active=[{Key, CPid}|State#state.active]}
-        end;
+        Ref = erlang:monitor(process, DbPid),
+        DbPid ! {'$gen_call', {self(), Ref}, start_compact},
+        State#state{starting=[{Ref, couch_db:name(Db)}|State#state.starting]};

Review comment:
       There isn't a comment here. Are you talking about changing line 295 from
   
   ```
   couch_log:notice("Db ~s continuing compaction",
                   [smoosh_utils:stringify(DbName)])
   ```
   to
   
   ```
   couch_log:notice("Compaction is already running for ~p, so monitor existing compaction pid ~p",
                   [smoosh_utils:stringify(DbName), CPID])
   ```







[GitHub] [couchdb] tonysun83 merged pull request #3144: move remonitor code into DOWN message

Posted by GitBox <gi...@apache.org>.
tonysun83 merged pull request #3144:
URL: https://github.com/apache/couchdb/pull/3144


   





[GitHub] [couchdb] tonysun83 commented on a change in pull request #3144: move remonitor code into DOWN message

Posted by GitBox <gi...@apache.org>.
tonysun83 commented on a change in pull request #3144:
URL: https://github.com/apache/couchdb/pull/3144#discussion_r486590441



##########
File path: src/smoosh/src/smoosh_channel.erl
##########
@@ -275,24 +276,34 @@ start_compact(State, Db) ->
     case smoosh_utils:ignore_db(Db) of
     false ->
         DbPid = couch_db:get_pid(Db),
-        Key = couch_db:name(Db),
-        case couch_db:get_compactor_pid(Db) of
-            nil ->
-                Ref = erlang:monitor(process, DbPid),
-                DbPid ! {'$gen_call', {self(), Ref}, start_compact},
-                State#state{starting=[{Ref, Key}|State#state.starting]};
-            % database is still compacting so we can just monitor the existing
-            % compaction pid
-            CPid ->
-                couch_log:notice("Db ~s continuing compaction",
-                    [smoosh_utils:stringify(Key)]),
-                erlang:monitor(process, CPid),
-                State#state{active=[{Key, CPid}|State#state.active]}
-        end;
+        Ref = erlang:monitor(process, DbPid),
+        DbPid ! {'$gen_call', {self(), Ref}, start_compact},
+        State#state{starting=[{Ref, couch_db:name(Db)}|State#state.starting]};

Review comment:
       Oh, I see what you mean: basically, leave the initial check in as well.







[GitHub] [couchdb] tonysun83 commented on a change in pull request #3144: move remonitor code into DOWN message

Posted by GitBox <gi...@apache.org>.
tonysun83 commented on a change in pull request #3144:
URL: https://github.com/apache/couchdb/pull/3144#discussion_r486572134



##########
File path: src/smoosh/src/smoosh_channel.erl
##########
@@ -275,24 +276,34 @@ start_compact(State, Db) ->
     case smoosh_utils:ignore_db(Db) of
     false ->
         DbPid = couch_db:get_pid(Db),
-        Key = couch_db:name(Db),
-        case couch_db:get_compactor_pid(Db) of
-            nil ->
-                Ref = erlang:monitor(process, DbPid),
-                DbPid ! {'$gen_call', {self(), Ref}, start_compact},
-                State#state{starting=[{Ref, Key}|State#state.starting]};
-            % database is still compacting so we can just monitor the existing
-            % compaction pid
-            CPid ->
-                couch_log:notice("Db ~s continuing compaction",
-                    [smoosh_utils:stringify(Key)]),
-                erlang:monitor(process, CPid),
-                State#state{active=[{Key, CPid}|State#state.active]}
-        end;
+        Ref = erlang:monitor(process, DbPid),
+        DbPid ! {'$gen_call', {self(), Ref}, start_compact},
+        State#state{starting=[{Ref, couch_db:name(Db)}|State#state.starting]};

Review comment:
       I think I changed this below
   

##########
File path: src/smoosh/src/smoosh_channel.erl
##########
@@ -275,24 +276,34 @@ start_compact(State, Db) ->
     case smoosh_utils:ignore_db(Db) of
     false ->
         DbPid = couch_db:get_pid(Db),
-        Key = couch_db:name(Db),
-        case couch_db:get_compactor_pid(Db) of
-            nil ->
-                Ref = erlang:monitor(process, DbPid),
-                DbPid ! {'$gen_call', {self(), Ref}, start_compact},
-                State#state{starting=[{Ref, Key}|State#state.starting]};
-            % database is still compacting so we can just monitor the existing
-            % compaction pid
-            CPid ->
-                couch_log:notice("Db ~s continuing compaction",
-                    [smoosh_utils:stringify(Key)]),
-                erlang:monitor(process, CPid),
-                State#state{active=[{Key, CPid}|State#state.active]}
-        end;
+        Ref = erlang:monitor(process, DbPid),
+        DbPid ! {'$gen_call', {self(), Ref}, start_compact},
+        State#state{starting=[{Ref, couch_db:name(Db)}|State#state.starting]};

Review comment:
        I changed this below
   



