You are viewing a plain text version of this content. The canonical link for it is here.
Posted to notifications@couchdb.apache.org by GitBox <gi...@apache.org> on 2020/08/06 11:26:31 UTC

[GitHub] [couchdb] rnewson opened a new issue #3061: couch_index_server crashes too easily

rnewson opened a new issue #3061:
URL: https://github.com/apache/couchdb/issues/3061


   We have a problem with couch_index_server.
   
   handle_info({'EXIT', Pid, Reason}, Server) ->
       case ets:lookup(?BY_PID, Pid) of
           [{Pid, {DbName, Sig}}] ->
               DDocIds = [DDocId || {_, {DDocId, _}}
                   <- ets:match_object(?BY_DB, {DbName, {'$1', Sig}})],
               rem_from_ets(DbName, Sig, DDocIds, Pid);
           [] when Reason /= normal ->
               exit(Reason); <--
   
   whenever an index process crashes abnormally couch_index_server also terminates, taking all other index processes with it.
   
   The cause of the couch_index crash I saw this with was from couch_mrview_index;
   
   close(State) ->
       erlang:demonitor(State#mrst.fd_monitor, [flush]),
       couch_file:close(State#mrst.fd).
   
   where the "fd" was not a live process.
   
   couch_index_server should _not_ terminate if there is an abnormal exit by a single index.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [couchdb] davisp commented on issue #3061: couch_index_server crashes too easily

Posted by GitBox <gi...@apache.org>.
davisp commented on issue #3061:
URL: https://github.com/apache/couchdb/issues/3061#issuecomment-670037960


   Could also add a `catch` on this line: https://github.com/apache/couchdb/blob/master/src/couch_mrview/src/couch_mrview_index.erl#L156
   
   Or we could change things so that close/shutdown/delete can return a new state so that they would know whether their couch_file pid is alive.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [couchdb] davisp commented on issue #3061: couch_index_server crashes too easily

Posted by GitBox <gi...@apache.org>.
davisp commented on issue #3061:
URL: https://github.com/apache/couchdb/issues/3061#issuecomment-671416718


   Roughly but not quite. It'd be something like:
   
   ```
   unlink(Pid),
   gen_server:cast(Pid, delete),
   receive
       {'EXIT', Pid, _Reason} -> ok
       after 0 -> ok
   end,
   rem_from_ets(DbName, Sig, DDocIds, Pid)
   ```


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [couchdb] rnewson commented on issue #3061: couch_index_server crashes too easily

Posted by GitBox <gi...@apache.org>.
rnewson commented on issue #3061:
URL: https://github.com/apache/couchdb/issues/3061#issuecomment-671255684


   I don't much like the idea of couch_index_server having to wait for that clean shutdown elsewhere.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [couchdb] rnewson commented on issue #3061: couch_index_server crashes too easily

Posted by GitBox <gi...@apache.org>.
rnewson commented on issue #3061:
URL: https://github.com/apache/couchdb/issues/3061#issuecomment-671451184


   ```
   handle_cast(delete, State) ->
       #st{mod=Mod, idx_state=IdxState} = State,
       ok = Mod:delete(IdxState),
       {stop, normal, State};
   ```


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [couchdb] rnewson commented on issue #3061: couch_index_server crashes too easily

Posted by GitBox <gi...@apache.org>.
rnewson commented on issue #3061:
URL: https://github.com/apache/couchdb/issues/3061#issuecomment-670030960


   in the original logs, which I can't share here, there's a design document deletion while that index is building. the delete removes the underlying .view files and kills the couch_file pids, so when the index_server process itself terminates, it crashes -in- terminate when trying to close the now dead pid
   
   I can't say that this is the only root to killing couch_index_server, though, so I would rather focus on removing the
   
   ```
   [] when Reason /= normal ->
       exit(Reason); 
   ```
   
   bit, which seems like an over-reaction now.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [couchdb] davisp commented on issue #3061: couch_index_server crashes too easily

Posted by GitBox <gi...@apache.org>.
davisp commented on issue #3061:
URL: https://github.com/apache/couchdb/issues/3061#issuecomment-670027261


   The odd thing is that this is actually an unknown process dying which suggests there's something else afoot. Perhaps we remove an indexer from ets before getting its `EXIT` signal?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [couchdb] davisp commented on issue #3061: couch_index_server crashes too easily

Posted by GitBox <gi...@apache.org>.
davisp commented on issue #3061:
URL: https://github.com/apache/couchdb/issues/3061#issuecomment-670031871


   Which sounds like we've removed it from ets before we should/without dropping a link or flushing a message.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [couchdb] davisp commented on issue #3061: couch_index_server crashes too easily

Posted by GitBox <gi...@apache.org>.
davisp commented on issue #3061:
URL: https://github.com/apache/couchdb/issues/3061#issuecomment-670032090


   That is to say, freaking out when we get an unknown `EXIT` message seems like a pretty reasonable reaction.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [couchdb] rnewson commented on issue #3061: couch_index_server crashes too easily

Posted by GitBox <gi...@apache.org>.
rnewson commented on issue #3061:
URL: https://github.com/apache/couchdb/issues/3061#issuecomment-671255305


   ```
   diff --git a/src/couch_index/src/couch_index_server.erl b/src/couch_index/src/couch_index_server.erl
   index 49d1e61b7..f1828c665 100644
   --- a/src/couch_index/src/couch_index_server.erl
   +++ b/src/couch_index/src/couch_index_server.erl
   @@ -243,9 +243,8 @@ reset_indexes(DbName, Root) ->
        end, dict:new(), ets:lookup(?BY_DB, DbName)),
        Fun = fun({Sig, DDocIds}) ->
            [{_, Pid}] = ets:lookup(?BY_SIG, {DbName, Sig}),
   -        MRef = erlang:monitor(process, Pid),
            gen_server:cast(Pid, delete),
   -        receive {'DOWN', MRef, _, _, _} -> ok end,
   +        receive {'EXIT', Pid, _Reason} -> ok end,
            rem_from_ets(DbName, Sig, DDocIds, Pid)
        end,
        lists:foreach(Fun, dict:to_list(SigDDocIds)),
   ```
   
   Is this what you mean, @davisp?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [couchdb] rnewson commented on issue #3061: couch_index_server crashes too easily

Posted by GitBox <gi...@apache.org>.
rnewson commented on issue #3061:
URL: https://github.com/apache/couchdb/issues/3061#issuecomment-671254847


   Original introduction of this problem is here https://github.com/apache/couchdb/commit/d977eb7ec52bc13dca7b4f530cb51f42df0f6224. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [couchdb] rnewson closed issue #3061: couch_index_server crashes too easily

Posted by GitBox <gi...@apache.org>.
rnewson closed issue #3061:
URL: https://github.com/apache/couchdb/issues/3061


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [couchdb] davisp commented on issue #3061: couch_index_server crashes too easily

Posted by GitBox <gi...@apache.org>.
davisp commented on issue #3061:
URL: https://github.com/apache/couchdb/issues/3061#issuecomment-670029585


   And by unknown, I mean its a process that's linked cause we're getting the `EXIT` but the process isn't in ets. Which means we're either accidentally linking something to `couch_index_server` or we're removing a linked process from ets without unlinking/flushing any pending `EXIT` messages.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [couchdb] rnewson commented on issue #3061: couch_index_server crashes too easily

Posted by GitBox <gi...@apache.org>.
rnewson commented on issue #3061:
URL: https://github.com/apache/couchdb/issues/3061#issuecomment-671418656


   that looks better to me
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [couchdb] davisp commented on issue #3061: couch_index_server crashes too easily

Posted by GitBox <gi...@apache.org>.
davisp commented on issue #3061:
URL: https://github.com/apache/couchdb/issues/3061#issuecomment-670035773


   https://github.com/apache/couchdb/blob/master/src/couch_index/src/couch_index_server.erl#L246-L248
   
   We do a monitor dance as well. We should add a second receive to flush the `EXIT` message.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [couchdb] davisp commented on issue #3061: couch_index_server crashes too easily

Posted by GitBox <gi...@apache.org>.
davisp commented on issue #3061:
URL: https://github.com/apache/couchdb/issues/3061#issuecomment-670036006


   And/or replace the monitor dance by waiting for the 'EXIT' instead would likely be better.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [couchdb] davisp commented on issue #3061: couch_index_server crashes too easily

Posted by GitBox <gi...@apache.org>.
davisp commented on issue #3061:
URL: https://github.com/apache/couchdb/issues/3061#issuecomment-671422205


   Might want to double check that the `delete` message causes the indexer to exit. I think it does but didn't double check.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org