You are viewing a plain text version of this content. The canonical link for it is here.
Posted to notifications@couchdb.apache.org by GitBox <gi...@apache.org> on 2020/08/06 11:26:31 UTC
[GitHub] [couchdb] rnewson opened a new issue #3061: couch_index_server crashes too easily
rnewson opened a new issue #3061:
URL: https://github.com/apache/couchdb/issues/3061
We have a problem with couch_index_server.
handle_info({'EXIT', Pid, Reason}, Server) ->
case ets:lookup(?BY_PID, Pid) of
[{Pid, {DbName, Sig}}] ->
DDocIds = [DDocId || {_, {DDocId, _}}
<- ets:match_object(?BY_DB, {DbName, {'$1', Sig}})],
rem_from_ets(DbName, Sig, DDocIds, Pid);
[] when Reason /= normal ->
exit(Reason); <--
whenever an index process crashes abnormally couch_index_server also terminates, taking all other index processes with it.
The cause of the couch_index crash I saw this with was from couch_mrview_index;
close(State) ->
erlang:demonitor(State#mrst.fd_monitor, [flush]),
couch_file:close(State#mrst.fd).
where the "fd" was not a live process.
couch_index_server should _not_ terminate if there is an abnormal exit by a single index.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [couchdb] davisp commented on issue #3061: couch_index_server crashes too easily
Posted by GitBox <gi...@apache.org>.
davisp commented on issue #3061:
URL: https://github.com/apache/couchdb/issues/3061#issuecomment-670037960
Could also add a `catch` on this line: https://github.com/apache/couchdb/blob/master/src/couch_mrview/src/couch_mrview_index.erl#L156
Or we could change things so that close/shutdown/delete can return a new state so that they would know whether their couch_file pid is alive.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [couchdb] davisp commented on issue #3061: couch_index_server crashes too easily
Posted by GitBox <gi...@apache.org>.
davisp commented on issue #3061:
URL: https://github.com/apache/couchdb/issues/3061#issuecomment-671416718
Roughly but not quite. It'd be something like:
```
unlink(Pid),
gen_server:cast(Pid, delete),
receive
{'EXIT', Pid, _Reason} -> ok
after 0 -> ok
end,
rem_from_ets(DbName, Sig, DDocIds, Pid)
```
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [couchdb] rnewson commented on issue #3061: couch_index_server crashes too easily
Posted by GitBox <gi...@apache.org>.
rnewson commented on issue #3061:
URL: https://github.com/apache/couchdb/issues/3061#issuecomment-671255684
I don't much like the idea of couch_index_server having to wait for that clean shutdown elsewhere.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [couchdb] rnewson commented on issue #3061: couch_index_server crashes too easily
Posted by GitBox <gi...@apache.org>.
rnewson commented on issue #3061:
URL: https://github.com/apache/couchdb/issues/3061#issuecomment-671451184
```
handle_cast(delete, State) ->
#st{mod=Mod, idx_state=IdxState} = State,
ok = Mod:delete(IdxState),
{stop, normal, State};
```
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [couchdb] rnewson commented on issue #3061: couch_index_server crashes too easily
Posted by GitBox <gi...@apache.org>.
rnewson commented on issue #3061:
URL: https://github.com/apache/couchdb/issues/3061#issuecomment-670030960
in the original logs, which I can't share here, there's a design document deletion while that index is building. the delete removes the underlying .view files and kills the couch_file pids, so when the index_server process itself terminates, it crashes -in- terminate when trying to close the now dead pid
I can't say that this is the only root to killing couch_index_server, though, so I would rather focus on removing the
```
[] when Reason /= normal ->
exit(Reason);
```
bit, which seems like an over-reaction now.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [couchdb] davisp commented on issue #3061: couch_index_server crashes too easily
Posted by GitBox <gi...@apache.org>.
davisp commented on issue #3061:
URL: https://github.com/apache/couchdb/issues/3061#issuecomment-670027261
The odd thing is that this is actually an unknown process dying which suggests there's something else afoot. Perhaps we remove an indexer from ets before getting its `EXIT` signal?
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [couchdb] davisp commented on issue #3061: couch_index_server crashes too easily
Posted by GitBox <gi...@apache.org>.
davisp commented on issue #3061:
URL: https://github.com/apache/couchdb/issues/3061#issuecomment-670031871
Which sounds like we've removed it from ets before we should/without dropping a link or flushing a message.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [couchdb] davisp commented on issue #3061: couch_index_server crashes too easily
Posted by GitBox <gi...@apache.org>.
davisp commented on issue #3061:
URL: https://github.com/apache/couchdb/issues/3061#issuecomment-670032090
That is to say, freaking out when we get an unknown `EXIT` message seems like a pretty reasonable reaction.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [couchdb] rnewson commented on issue #3061: couch_index_server crashes too easily
Posted by GitBox <gi...@apache.org>.
rnewson commented on issue #3061:
URL: https://github.com/apache/couchdb/issues/3061#issuecomment-671255305
```
diff --git a/src/couch_index/src/couch_index_server.erl b/src/couch_index/src/couch_index_server.erl
index 49d1e61b7..f1828c665 100644
--- a/src/couch_index/src/couch_index_server.erl
+++ b/src/couch_index/src/couch_index_server.erl
@@ -243,9 +243,8 @@ reset_indexes(DbName, Root) ->
end, dict:new(), ets:lookup(?BY_DB, DbName)),
Fun = fun({Sig, DDocIds}) ->
[{_, Pid}] = ets:lookup(?BY_SIG, {DbName, Sig}),
- MRef = erlang:monitor(process, Pid),
gen_server:cast(Pid, delete),
- receive {'DOWN', MRef, _, _, _} -> ok end,
+ receive {'EXIT', Pid, _Reason} -> ok end,
rem_from_ets(DbName, Sig, DDocIds, Pid)
end,
lists:foreach(Fun, dict:to_list(SigDDocIds)),
```
Is this what you mean, @davisp?
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [couchdb] rnewson commented on issue #3061: couch_index_server crashes too easily
Posted by GitBox <gi...@apache.org>.
rnewson commented on issue #3061:
URL: https://github.com/apache/couchdb/issues/3061#issuecomment-671254847
Original introduction of this problem is here https://github.com/apache/couchdb/commit/d977eb7ec52bc13dca7b4f530cb51f42df0f6224.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [couchdb] rnewson closed issue #3061: couch_index_server crashes too easily
Posted by GitBox <gi...@apache.org>.
rnewson closed issue #3061:
URL: https://github.com/apache/couchdb/issues/3061
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [couchdb] davisp commented on issue #3061: couch_index_server crashes too easily
Posted by GitBox <gi...@apache.org>.
davisp commented on issue #3061:
URL: https://github.com/apache/couchdb/issues/3061#issuecomment-670029585
And by unknown, I mean its a process that's linked cause we're getting the `EXIT` but the process isn't in ets. Which means we're either accidentally linking something to `couch_index_server` or we're removing a linked process from ets without unlinking/flushing any pending `EXIT` messages.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [couchdb] rnewson commented on issue #3061: couch_index_server crashes too easily
Posted by GitBox <gi...@apache.org>.
rnewson commented on issue #3061:
URL: https://github.com/apache/couchdb/issues/3061#issuecomment-671418656
that looks better to me
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [couchdb] davisp commented on issue #3061: couch_index_server crashes too easily
Posted by GitBox <gi...@apache.org>.
davisp commented on issue #3061:
URL: https://github.com/apache/couchdb/issues/3061#issuecomment-670035773
https://github.com/apache/couchdb/blob/master/src/couch_index/src/couch_index_server.erl#L246-L248
We do a monitor dance as well. We should add a second receive to flush the `EXIT` message.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [couchdb] davisp commented on issue #3061: couch_index_server crashes too easily
Posted by GitBox <gi...@apache.org>.
davisp commented on issue #3061:
URL: https://github.com/apache/couchdb/issues/3061#issuecomment-670036006
And/or replace the monitor dance by waiting for the 'EXIT' instead would likely be better.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [couchdb] davisp commented on issue #3061: couch_index_server crashes too easily
Posted by GitBox <gi...@apache.org>.
davisp commented on issue #3061:
URL: https://github.com/apache/couchdb/issues/3061#issuecomment-671422205
Might want to double check that the `delete` message causes the indexer to exit. I think it does but didn't double check.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org