Posted to commits@couchdb.apache.org by va...@apache.org on 2021/04/06 15:44:31 UTC
[couchdb] branch main updated: Retryable error fixes in couch_jobs_type_monitor
This is an automated email from the ASF dual-hosted git repository.
vatamane pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/couchdb.git
The following commit(s) were added to refs/heads/main by this push:
new 7a6ea65 Retryable error fixes in couch_jobs_type_monitor
7a6ea65 is described below
commit 7a6ea6545338f942ecf9fb590d5372b73867e0b9
Author: Nick Vatamaniuc <va...@gmail.com>
AuthorDate: Tue Apr 6 10:04:17 2021 -0400
Retryable error fixes in couch_jobs_type_monitor
This continues improvements to retryable error handling started in
https://github.com/apache/couchdb/pull/3460. Here we add the same logic we
already have for the `erlfdb:wait/2` call in
https://github.com/apache/couchdb/blob/main/src/couch_jobs/src/couch_jobs_type_monitor.erl#L55-L57
to the `get_vs_and_watch/1` section.
couch_jobs_type_monitor is meant to be linked to and to run in a continuous loop
for as long as the parent process is alive. If FDB becomes unavailable, the main
process we are linked to, or another main component (the whole application),
should crash and fail rather than the type monitor itself. Still, to avoid
running in a tight loop, we sleep for the holdoff interval before recursing.
Typical holdoff values are around 50-100 msec.
---
src/couch_jobs/src/couch_jobs_type_monitor.erl | 21 +++++++++++++++++----
1 file changed, 17 insertions(+), 4 deletions(-)
diff --git a/src/couch_jobs/src/couch_jobs_type_monitor.erl b/src/couch_jobs/src/couch_jobs_type_monitor.erl
index a62eb62..b58f34e 100644
--- a/src/couch_jobs/src/couch_jobs_type_monitor.erl
+++ b/src/couch_jobs/src/couch_jobs_type_monitor.erl
@@ -81,7 +81,20 @@ notify(#st{} = St) ->
     St#st{timestamp = Now}.
-get_vs_and_watch(#st{jtx = JTx, type = Type}) ->
-    couch_jobs_fdb:tx(JTx, fun(JTx1) ->
-        couch_jobs_fdb:get_activity_vs_and_watch(JTx1, Type)
-    end).
+get_vs_and_watch(#st{} = St) ->
+    #st{jtx = JTx, type = Type, holdoff = HoldOff} = St,
+    try
+        couch_jobs_fdb:tx(JTx, fun(JTx1) ->
+            couch_jobs_fdb:get_activity_vs_and_watch(JTx1, Type)
+        end)
+    catch
+        error:{erlfdb_error, ?ERLFDB_TRANSACTION_TIMED_OUT} ->
+            timer:sleep(HoldOff),
+            get_vs_and_watch(St);
+        error:{erlfdb_error, Code} when ?ERLFDB_IS_RETRYABLE(Code) ->
+            timer:sleep(HoldOff),
+            get_vs_and_watch(St);
+        error:{timeout, _} ->
+            timer:sleep(HoldOff),
+            get_vs_and_watch(St)
+    end.
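
The retry-with-holdoff pattern from this commit can be illustrated in isolation
with a minimal sketch. It does not depend on erlfdb or couch_jobs. The module
name, the `with_holdoff/2` and `demo/0` functions, and the `IS_RETRYABLE` macro
with its two numeric codes are all hypothetical stand-ins chosen for the
example; they are not the real `?ERLFDB_IS_RETRYABLE` definition. The sketch
also folds the commit's separate timed-out clause into the generic retryable
clause.

```erlang
-module(holdoff_retry_sketch).
-export([with_holdoff/2, demo/0]).

%% Hypothetical stand-in for the ?ERLFDB_IS_RETRYABLE(Code) guard macro.
%% The two codes are placeholders for illustration only.
-define(IS_RETRYABLE(Code), (Code =:= 1007 orelse Code =:= 1020)).

%% Run Fun(). On a retryable error or a timeout, sleep HoldOff msec and
%% try again. Any other exception propagates to the caller unchanged.
with_holdoff(Fun, HoldOff) when is_function(Fun, 0), is_integer(HoldOff), HoldOff >= 0 ->
    try
        Fun()
    catch
        error:{erlfdb_error, Code} when ?IS_RETRYABLE(Code) ->
            timer:sleep(HoldOff),
            with_holdoff(Fun, HoldOff);
        error:{timeout, _} ->
            timer:sleep(HoldOff),
            with_holdoff(Fun, HoldOff)
    end.

%% Simulate an operation that fails twice with a retryable error and
%% succeeds on the third attempt.
demo() ->
    Attempts = counters:new(1, []),
    Flaky = fun() ->
        counters:add(Attempts, 1, 1),
        case counters:get(Attempts, 1) of
            N when N < 3 -> erlang:error({erlfdb_error, 1007});
            N -> {ok, N}
        end
    end,
    with_holdoff(Flaky, 50).
```

Calling `holdoff_retry_sketch:demo()` returns `{ok, 3}` after two holdoff
sleeps. Because only the listed error shapes are caught, anything else still
crashes the calling process, which matches the commit's intent that
non-retryable failures surface through the linked parent.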