You are viewing a plain text version of this content. The canonical link for it is here.
Posted to notifications@couchdb.apache.org by GitBox <gi...@apache.org> on 2021/04/26 20:46:01 UTC

[GitHub] [couchdb] nickva opened a new pull request #3530: Improve retry handling in _all_dbs, _dbs_info and _deleted_dbs

nickva opened a new pull request #3530:
URL: https://github.com/apache/couchdb/pull/3530


   There are 2 commits:
   
   1. After recent improvements to how retries are handled in `fabric2_fdb`, Elixir
    integration tests can often pass when running under "buggify" mode. The chance 
   of re-sending response data during retries is now much lower, too. However,
   there are still some instances of that type of failure when running `_all_dbs`
   tests. To trigger it would have to run the all_dbs test from basics_tests.exs a
   few thousands times in a row. The reason for the failure is that retryable
   errors might be still thrown during the `LayerPrefix = get_dir(Tx)` call, and
   the whole transaction closure would be retried in [2]. When that happens,
   user's callback is called twice with `meta` and it sends `[` twice in a row to
   the client [3], which is incorrect.
      A simple fix is to not call `meta` or `complete` from the transaction context.
   That works because we don't pass the transaction object into user's callback
   and the user won't be able to run another transaction in the same process
   anyway.
     There are already tests which test retryable errors in _all_dbs and _dbs_info,
   but they had been updated to only throw retryable errors when rows are emitted
   to match the new behavior.
   
   2. When running integration tests with a "buggified" client, sometimes
   `fold_range_not_progressing` is triggered since it's possible retriable errors
   might be thrown 3 times in a row. Instead of bumping it arbitrarily, since we
   already have a retry limit in fabric2_server, start using that.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [couchdb] nickva edited a comment on pull request #3530: Improve retry handling in _all_dbs, _dbs_info and _deleted_dbs

Posted by GitBox <gi...@apache.org>.
nickva edited a comment on pull request #3530:
URL: https://github.com/apache/couchdb/pull/3530#issuecomment-827777653


   @iilyak restart_tx logic ensures that once rows start to get processed restarts due to retryable errors will not re-send the same row twice. There are tests for this fabric https://github.com/apache/couchdb/blob/main/src/fabric/test/fabric2_db_crud_tests.erl#L484-L541


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [couchdb] iilyak edited a comment on pull request #3530: Improve retry handling in _all_dbs, _dbs_info and _deleted_dbs

Posted by GitBox <gi...@apache.org>.
iilyak edited a comment on pull request #3530:
URL: https://github.com/apache/couchdb/pull/3530#issuecomment-827761831


   I could be wrong, however after reading the code I came to conclusion that in case of a re-try we would be calling `UserCallback({row, Row}, Acc)` multiple times for the same row. This means that the response would have duplicates. Because we don't do anything with rows other than calling [`{ok, Resp1} = chttpd:send_delayed_chunk(Resp0, [Prepend, ?JSON_ENCODE(DbName)]),`](https://github.com/apache/couchdb/blob/main/src/chttpd/src/chttpd_misc.erl#L142)
   
   I agree that this PR should fix multiple '[' characters on the fist line. However we still have changed/broken semantics of `_all_dbs` endpoint.
   
   Having in mind recent discussions about introducing limits on each response. It could be a good time to get away from streaming API and instead de-duplicate rows in memory before sending the result to client.
   
   This would fix the long due problem when client receives invalid JSON when erlang process dies while streaming.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [couchdb] nickva edited a comment on pull request #3530: Improve retry handling in _all_dbs, _dbs_info and _deleted_dbs

Posted by GitBox <gi...@apache.org>.
nickva edited a comment on pull request #3530:
URL: https://github.com/apache/couchdb/pull/3530#issuecomment-827777653


   @iilyak restart_tx logic ensures that once rows start to get processed restarts due to retryable errors will not re-send the same row twice. There are tests for this in fabric https://github.com/apache/couchdb/blob/main/src/fabric/test/fabric2_db_crud_tests.erl#L484-L541


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [couchdb] nickva commented on pull request #3530: Improve retry handling in _all_dbs, _dbs_info and _deleted_dbs

Posted by GitBox <gi...@apache.org>.
nickva commented on pull request #3530:
URL: https://github.com/apache/couchdb/pull/3530#issuecomment-827777653


   @iilyak restart_tx logic ensures that once rows start to get processes restarts due to retryable errors will not re-send the same row twice. There are tests for this fabric/test https://github.com/apache/couchdb/blob/main/src/fabric/test/fabric2_db_crud_tests.erl#L484-L541


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [couchdb] nickva merged pull request #3530: Improve retry handling in _all_dbs, _dbs_info and _deleted_dbs

Posted by GitBox <gi...@apache.org>.
nickva merged pull request #3530:
URL: https://github.com/apache/couchdb/pull/3530


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [couchdb] iilyak commented on a change in pull request #3530: Improve retry handling in _all_dbs, _dbs_info and _deleted_dbs

Posted by GitBox <gi...@apache.org>.
iilyak commented on a change in pull request #3530:
URL: https://github.com/apache/couchdb/pull/3530#discussion_r621388460



##########
File path: src/fabric/test/fabric2_db_crud_tests.erl
##########
@@ -449,9 +449,12 @@ list_dbs_tx_too_old(_) ->
     ?assertMatch({ok, _}, fabric2_db:create(DbName1, [])),
     ?assertMatch({ok, _}, fabric2_db:create(DbName2, [])),
 
-    UserFun = fun(Row, Acc) ->
-        fabric2_test_util:tx_too_old_raise_in_user_fun(),
-        {ok, [Row | Acc]}
+    UserFun = fun
+        ({row, _} = Row, Acc) ->
+            fabric2_test_util:tx_too_old_raise_in_user_fun(),
+                {ok, [Row | Acc]};

Review comment:
       strange indentation




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [couchdb] nickva commented on a change in pull request #3530: Improve retry handling in _all_dbs, _dbs_info and _deleted_dbs

Posted by GitBox <gi...@apache.org>.
nickva commented on a change in pull request #3530:
URL: https://github.com/apache/couchdb/pull/3530#discussion_r621412234



##########
File path: src/fabric/test/fabric2_db_crud_tests.erl
##########
@@ -449,9 +449,12 @@ list_dbs_tx_too_old(_) ->
     ?assertMatch({ok, _}, fabric2_db:create(DbName1, [])),
     ?assertMatch({ok, _}, fabric2_db:create(DbName2, [])),
 
-    UserFun = fun(Row, Acc) ->
-        fabric2_test_util:tx_too_old_raise_in_user_fun(),
-        {ok, [Row | Acc]}
+    UserFun = fun
+        ({row, _} = Row, Acc) ->
+            fabric2_test_util:tx_too_old_raise_in_user_fun(),
+                {ok, [Row | Acc]};

Review comment:
       Good catch, I'll fix it




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [couchdb] iilyak commented on pull request #3530: Improve retry handling in _all_dbs, _dbs_info and _deleted_dbs

Posted by GitBox <gi...@apache.org>.
iilyak commented on pull request #3530:
URL: https://github.com/apache/couchdb/pull/3530#issuecomment-827761831


   I could be wrong, however after reading the code I came to conclusion that in case of a re-try we would be calling `UserCallback({row, Row}, Acc)` multiple times for the same row. This means that the response would have duplicates. Because we don't do anything with rows other than calling [`{ok, Resp1} = chttpd:send_delayed_chunk(Resp0, [Prepend, ?JSON_ENCODE(DbName)]),`](https://github.com/apache/couchdb/blob/main/src/chttpd/src/chttpd_misc.erl#L142
   
   I agree that this PR should fix multiple '[' characters on the fist line. However we still have changed/broken semantics of `_all_dbs` endpoint.
   
   Having in mind recent discussions about introducing limits on each response. It could be a good time to get away from streaming API and instead de-duplicate rows in memory before sending the result to client.
   
   This would fix the long due problem when client receives invalid JSON when erlang process dies while streaming.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [couchdb] nickva commented on a change in pull request #3530: Improve retry handling in _all_dbs, _dbs_info and _deleted_dbs

Posted by GitBox <gi...@apache.org>.
nickva commented on a change in pull request #3530:
URL: https://github.com/apache/couchdb/pull/3530#discussion_r621414380



##########
File path: src/fabric/test/fabric2_db_crud_tests.erl
##########
@@ -449,9 +449,12 @@ list_dbs_tx_too_old(_) ->
     ?assertMatch({ok, _}, fabric2_db:create(DbName1, [])),
     ?assertMatch({ok, _}, fabric2_db:create(DbName2, [])),
 
-    UserFun = fun(Row, Acc) ->
-        fabric2_test_util:tx_too_old_raise_in_user_fun(),
-        {ok, [Row | Acc]}
+    UserFun = fun
+        ({row, _} = Row, Acc) ->
+            fabric2_test_util:tx_too_old_raise_in_user_fun(),
+                {ok, [Row | Acc]};
+        (Row, Acc) ->
+                {ok, [Row | Acc]}

Review comment:
       It is silly. Will fix it.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [couchdb] iilyak commented on a change in pull request #3530: Improve retry handling in _all_dbs, _dbs_info and _deleted_dbs

Posted by GitBox <gi...@apache.org>.
iilyak commented on a change in pull request #3530:
URL: https://github.com/apache/couchdb/pull/3530#discussion_r621388887



##########
File path: src/fabric/test/fabric2_db_crud_tests.erl
##########
@@ -449,9 +449,12 @@ list_dbs_tx_too_old(_) ->
     ?assertMatch({ok, _}, fabric2_db:create(DbName1, [])),
     ?assertMatch({ok, _}, fabric2_db:create(DbName2, [])),
 
-    UserFun = fun(Row, Acc) ->
-        fabric2_test_util:tx_too_old_raise_in_user_fun(),
-        {ok, [Row | Acc]}
+    UserFun = fun
+        ({row, _} = Row, Acc) ->
+            fabric2_test_util:tx_too_old_raise_in_user_fun(),
+                {ok, [Row | Acc]};
+        (Row, Acc) ->
+                {ok, [Row | Acc]}

Review comment:
       strange indentation




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org