Posted to dev@couchdb.apache.org by "Adam Kocoloski (JIRA)" <ji...@apache.org> on 2011/08/22 16:20:29 UTC

[jira] [Created] (COUCHDB-1256) Incremental requests to _changes can skip revisions

Incremental requests to _changes can skip revisions
---------------------------------------------------

                 Key: COUCHDB-1256
                 URL: https://issues.apache.org/jira/browse/COUCHDB-1256
             Project: CouchDB
          Issue Type: Bug
          Components: Replication
         Environment: confirmed on Apache CouchDB 1.1.0, bug appears to be present in 1.0.3 and trunk
            Reporter: Adam Kocoloski
            Assignee: Adam Kocoloski
            Priority: Blocker
             Fix For: 1.0.4, 1.1.1, 1.2


Requests to _changes with style=all_docs&since=N (requests made by the replicator) are liable to suppress revisions of a document.  The following sequence of curl commands demonstrates the bug:

curl -X PUT localhost:5985/revseq 
{"ok":true}

curl -X PUT -Hcontent-type:application/json localhost:5985/revseq/foo -d '{"a":123}'
{"ok":true,"id":"foo","rev":"1-0dc33db52a43872b6f3371cef7de0277"}

curl -X PUT -Hcontent-type:application/json localhost:5985/revseq/bar -d '{"a":456}'
{"ok":true,"id":"bar","rev":"1-cc609831f0ca66e8cd3d4c1e0d98108a"}

% stick a conflict revision in foo
curl -X PUT -Hcontent-type:application/json localhost:5985/revseq/foo?new_edits=false -d '{"_rev":"1-cc609831f0ca66e8cd3d4c1e0d98108a", "a":123}'
{"ok":true,"id":"foo","rev":"1-cc609831f0ca66e8cd3d4c1e0d98108a"}

% request without since= gives the expected result
curl -Hcontent-type:application/json localhost:5985/revseq/_changes?style=all_docs
{"results":[
{"seq":2,"id":"bar","changes":[{"rev":"1-cc609831f0ca66e8cd3d4c1e0d98108a"}]},
{"seq":3,"id":"foo","changes":[{"rev":"1-cc609831f0ca66e8cd3d4c1e0d98108a"},{"rev":"1-0dc33db52a43872b6f3371cef7de0277"}]}
],
"last_seq":3}

% request starting from since=2 suppresses revision 1-0dc33db52a43872b6f3371cef7de0277 of foo
curl localhost:5985/revseq/_changes?style=all_docs\&since=2
{"results":[
{"seq":3,"id":"foo","changes":[{"rev":"1-cc609831f0ca66e8cd3d4c1e0d98108a"}]}
],
"last_seq":3}

I believe the fix is something like this (though we could refactor further, because Style would then be unused):

diff --git a/src/couchdb/couch_db.erl b/src/couchdb/couch_db.erl
index e8705be..65aeca3 100644
--- a/src/couchdb/couch_db.erl
+++ b/src/couchdb/couch_db.erl
@@ -1029,19 +1029,7 @@ changes_since(Db, Style, StartSeq, Fun, Acc) ->
     changes_since(Db, Style, StartSeq, Fun, [], Acc).
     
 changes_since(Db, Style, StartSeq, Fun, Options, Acc) ->
-    Wrapper = fun(DocInfo, _Offset, Acc2) ->
-            #doc_info{revs=Revs} = DocInfo,
-            DocInfo2 =
-            case Style of
-            main_only ->
-                DocInfo;
-            all_docs ->
-                % remove revs before the seq
-                DocInfo#doc_info{revs=[RevInfo ||
-                    #rev_info{seq=RevSeq}=RevInfo <- Revs, StartSeq < RevSeq]}
-            end,
-            Fun(DocInfo2, Acc2)
-        end,
+    Wrapper = fun(DocInfo, _Offset, Acc2) -> Fun(DocInfo, Acc2) end,
     {ok, _LastReduction, AccOut} = couch_btree:fold(by_seq_btree(Db),
         Wrapper, Acc, [{start_key, StartSeq + 1}] ++ Options),
     {ok, AccOut}.
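
A small Python model may help show why the removed wrapper drops the revision. It is only a sketch: the RevInfo and DocInfo classes loosely mirror the #rev_info and #doc_info records in the diff, BY_SEQ is an assumed stand-in for the by-seq btree after the PUTs above, and the function names are made up for illustration.

```python
from dataclasses import dataclass
from typing import List

REV_A = "1-0dc33db52a43872b6f3371cef7de0277"  # foo's first revision (seq 1)
REV_B = "1-cc609831f0ca66e8cd3d4c1e0d98108a"  # bar's revision, also foo's conflict


@dataclass(frozen=True)
class RevInfo:
    rev: str
    seq: int  # update sequence at which this leaf revision was written


@dataclass(frozen=True)
class DocInfo:
    id: str
    high_seq: int        # key of the document in the by-seq index
    revs: List[RevInfo]  # leaf revisions, winner first


# Assumed index state after the conflict is written: foo has moved from
# seq 1 to seq 3, so nothing is left at seq 1.
BY_SEQ = [
    DocInfo("bar", 2, [RevInfo(REV_B, 2)]),
    DocInfo("foo", 3, [RevInfo(REV_B, 3), RevInfo(REV_A, 1)]),
]


def changes_since_old(by_seq, start_seq):
    """Model of the wrapper the patch removes (style=all_docs)."""
    out = []
    for info in by_seq:
        if info.high_seq <= start_seq:
            continue  # the fold starts at start_key = StartSeq + 1
        # "remove revs before the seq"
        revs = [r.rev for r in info.revs if start_seq < r.seq]
        out.append((info.high_seq, info.id, revs))
    return out


def changes_since_fixed(by_seq, start_seq):
    """Model of the patched wrapper: every leaf revision is passed through."""
    return [
        (info.high_seq, info.id, [r.rev for r in info.revs])
        for info in by_seq
        if info.high_seq > start_seq
    ]


if __name__ == "__main__":
    print(changes_since_old(BY_SEQ, 2))    # foo listed with REV_B only
    print(changes_since_fixed(BY_SEQ, 2))  # foo listed with REV_B and REV_A
```

With since=2 the old filter keeps only revisions whose own seq is above 2, which matches the second curl response above; the patched version returns both leaf revisions of foo.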



--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (COUCHDB-1256) Incremental requests to _changes can skip revisions

Posted by "Filipe Manana (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/COUCHDB-1256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13088882#comment-13088882 ] 

Filipe Manana commented on COUCHDB-1256:
----------------------------------------

I think skipping foo's revision 1-0dc33db52a43872b6f3371cef7de0277 when asking with since=2 is correct, since that revision corresponds to sequence 1.
This way the replicator does not receive revisions it has already received.
Am I missing something?


        

[jira] [Commented] (COUCHDB-1256) Incremental requests to _changes can skip revisions

Posted by "Adam Kocoloski (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/COUCHDB-1256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13088807#comment-13088807 ] 

Adam Kocoloski commented on COUCHDB-1256:
-----------------------------------------

Does any kind soul out there want to translate those curl commands into a JS test?  I'm not sure I'll have more time to devote to this in the near future.


        

[jira] [Commented] (COUCHDB-1256) Incremental requests to _changes can skip revisions

Posted by "Adam Kocoloski (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/COUCHDB-1256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13088927#comment-13088927 ] 

Adam Kocoloski commented on COUCHDB-1256:
-----------------------------------------

To expand on that a little, the replicator could have avoided this bug if it only ever saved checkpoints when it had consumed a full MVCC snapshot of the database and verified that everything transferred correctly.  I think that's a pretty onerous requirement that would make it much more difficult to replicate large databases over less reliable links.
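
One way to read the failure, using only the state shown in the curl session of the report: once foo is rewritten at seq 3, the changes feed has no entry at seq 1, so a replicator that starts afterwards sees bar at seq 2 first and can checkpoint there without ever having seen foo. The Python sketch below simulates that under stated assumptions; BY_SEQ, changes() and replicate() are hypothetical names, and the "crash" is simply a break after the checkpoint.

```python
REV_A = "1-0dc33db52a43872b6f3371cef7de0277"  # foo, written at seq 1
REV_B = "1-cc609831f0ca66e8cd3d4c1e0d98108a"  # bar at seq 2, foo's conflict at seq 3

# (seq, doc id, [(rev, seq of that rev), ...]) as in the full _changes response
BY_SEQ = [
    (2, "bar", [(REV_B, 2)]),
    (3, "foo", [(REV_B, 3), (REV_A, 1)]),
]


def changes(since, filter_old_revs):
    """Yield feed rows after `since`; optionally drop revs at or below it."""
    for seq, doc_id, revs in BY_SEQ:
        if seq > since:
            kept = [r for r, s in revs if not filter_old_revs or s > since]
            yield seq, doc_id, kept


def replicate(filter_old_revs, crash_after_seq):
    """Run once, stop right after checkpointing, then restart from there."""
    target = set()
    checkpoint = 0
    for seq, doc_id, revs in changes(checkpoint, filter_old_revs):
        target.update((doc_id, r) for r in revs)
        checkpoint = seq
        if seq == crash_after_seq:
            break  # simulated crash after the checkpoint was saved
    for seq, doc_id, revs in changes(checkpoint, filter_old_revs):
        target.update((doc_id, r) for r in revs)
        checkpoint = seq
    return target


if __name__ == "__main__":
    print(("foo", REV_A) in replicate(True, 2))   # False: revision never arrives
    print(("foo", REV_A) in replicate(False, 2))  # True: unfiltered feed delivers it
```

In this model the checkpoint at 2 is legitimate, yet revision 1-0dc33db52a43872b6f3371cef7de0277 never reaches the target when the feed filters by revision seq.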


        

[jira] [Commented] (COUCHDB-1256) Incremental requests to _changes can skip revisions

Posted by "Filipe Manana (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/COUCHDB-1256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13088933#comment-13088933 ] 

Filipe Manana commented on COUCHDB-1256:
----------------------------------------

Adam, if the replicator checkpointed sequence 2, it means it received and processed revision 1-0dc33db52a43872b6f3371cef7de0277 of foo. Otherwise it would be a bug in the replicator.

If it crashes before processing seq 3 (which lists revision 1-cc609831f0ca66e8cd3d4c1e0d98108a of foo), then on restart, when requesting _changes?since=2&style=all_docs, it will receive the revision of foo which it hasn't yet processed (1-cc609831f0ca66e8cd3d4c1e0d98108a).

> Attachments: jira-1256-test.diff

        

[jira] [Commented] (COUCHDB-1256) Incremental requests to _changes can skip revisions

Posted by "Randall Leeds (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/COUCHDB-1256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089050#comment-13089050 ] 

Randall Leeds commented on COUCHDB-1256:
----------------------------------------

Another possible option is to change couch_db_updater to preserve seq numbers belonging to conflict revisions. Rather than sorting on the high_seq of the doc info, we could store the low_seq of the lowest conflict and compare using that to determine whether we need to include something. Half-thought-out, but wanted to get that said in case it triggers some different thinking on this. Otherwise, Adam's fix looks reasonable.


        

[jira] [Commented] (COUCHDB-1256) Incremental requests to _changes can skip revisions

Posted by "Damien Katz (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/COUCHDB-1256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089755#comment-13089755 ] 

Damien Katz commented on COUCHDB-1256:
--------------------------------------

I agree with the fix Adam proposes. The code in question is an optimization to prevent the sending/checking of documents we've already examined, but with checkpointing it breaks. Removal of the code is the right fix for now.

In the future, we can add the optimization back if the checkpointing can distinguish completed replications from checkpointed ones. Checkpoint records would keep a "high water mark" of the last completed replication, and both the seq num and that high water mark would be sent to the _changes handler. The _changes handler would not send docs with a seq below the checkpoint value. When the replication checkpoints, it saves the current seq and the high water mark of the last completed replication. When replication completes, it sets the last seq and the high water mark to the same seq, and that gets sent for the next replication.

Also, continuous replication would need a way to signal when a replication is "complete", so that the high water mark can be set there as well.
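
The high water mark idea can be sketched in Python as follows. This is one possible reading of the proposal, not an existing CouchDB API: the two-argument feed, the (seq, mark) checkpoint tuple, the function names and the "2-hypothetical" revision are all assumptions made for illustration.

```python
REV_A = "1-0dc33db52a43872b6f3371cef7de0277"
REV_B = "1-cc609831f0ca66e8cd3d4c1e0d98108a"

# (seq, doc id, [(rev, seq of that rev), ...])
BY_SEQ = [
    (2, "bar", [(REV_B, 2)]),
    (3, "foo", [(REV_B, 3), (REV_A, 1)]),
]
# Assumed later state: foo's winning branch gets a made-up child at seq 4.
BY_SEQ_LATER = [
    (2, "bar", [(REV_B, 2)]),
    (4, "foo", [("2-hypothetical", 4), (REV_A, 1)]),
]


def changes(by_seq, since, high_water_mark):
    """Hypothetical feed: rows after `since`, minus revs at or below the mark."""
    for seq, doc_id, revs in by_seq:
        if seq > since:
            yield seq, doc_id, [r for r, s in revs if s > high_water_mark]


def replicate(by_seq, checkpoint, crash_after_seq=None):
    """Return (revisions seen, new checkpoint) for one replication run."""
    since, mark = checkpoint
    seen = []
    for seq, doc_id, revs in changes(by_seq, since, mark):
        seen.extend((doc_id, r) for r in revs)
        since = seq  # intermediate checkpoint: seq advances, mark stays put
        if seq == crash_after_seq:
            return seen, (since, mark)
    return seen, (since, since)  # completed: seq and mark become equal


if __name__ == "__main__":
    _, cp1 = replicate(BY_SEQ, (0, 0), crash_after_seq=2)  # interrupted run
    seen2, cp2 = replicate(BY_SEQ, cp1)                    # restart
    seen3, _ = replicate(BY_SEQ_LATER, cp2)                # later incremental run
    print(cp1, seen2, cp2, seen3)
```

In this model the interrupted run leaves the mark at 0, so the restart still receives both revisions of foo; only after a completed run is the mark raised, at which point older revisions are filtered again.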


        

[jira] [Updated] (COUCHDB-1256) Incremental requests to _changes can skip revisions

Posted by "Robert Newson (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/COUCHDB-1256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Newson updated COUCHDB-1256:
-----------------------------------

    Affects Version/s: 0.10
                       0.10.1
                       0.10.2
                       0.11.1
                       0.11.2
                       1.0
                       1.0.1
                       1.0.2
                       1.1
                       1.0.3

marking all affected releases.


        

[jira] [Commented] (COUCHDB-1256) Incremental requests to _changes can skip revisions

Posted by "Filipe Manana (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/COUCHDB-1256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13088982#comment-13088982 ] 

Filipe Manana commented on COUCHDB-1256:
----------------------------------------

Adam, with this later explanation, it's clear to me now.
I was thinking the first replication request got a seq 1 changes row.
Thanks for pointing it out.


        

[jira] [Commented] (COUCHDB-1256) Incremental requests to _changes can skip revisions

Posted by "Adam Kocoloski (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/COUCHDB-1256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13088921#comment-13088921 ] 

Adam Kocoloski commented on COUCHDB-1256:
-----------------------------------------

Yes, I'm afraid you are.  In the example I posted imagine that the replicator starts at 0, saves a checkpoint at sequence 2 and crashes before processing sequence 3.  It hasn't seen 1-0dc33db52a43872b6f3371cef7de0277 of "foo" yet.  If you try the replication again the replicator will issue the since=2 request and will skip revision 1-0dc33 altogether.
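
A minimal Python sketch of the crash-and-resume scenario just described. It is a toy model under stated assumptions, not CouchDB code: `BY_SEQ`, `changes` and `replicate` are hypothetical names, and `filter_revs=True` stands in for the pre-patch behaviour of hiding revisions whose own sequence is not greater than `since`.

```python
# Toy model of the scenario; all names here are illustrative, not CouchDB APIs.
REV_FOO_1 = "1-0dc33db52a43872b6f3371cef7de0277"     # foo, written at seq 1
REV_CONFLICT = "1-cc609831f0ca66e8cd3d4c1e0d98108a"  # bar at seq 2, foo conflict at seq 3

# One row per document, keyed by the document's latest update sequence.
# Each revision carries the sequence at which it was written.
BY_SEQ = [
    (2, "bar", [(REV_CONFLICT, 2)]),
    (3, "foo", [(REV_CONFLICT, 3), (REV_FOO_1, 1)]),
]


def changes(since, filter_revs):
    """Yield (seq, doc_id, [rev, ...]) for rows with seq > since."""
    for seq, doc_id, revs in BY_SEQ:
        if seq <= since:
            continue
        if filter_revs:
            # Pre-patch behaviour: hide revs written at or before `since`.
            revs = [(rev, rev_seq) for rev, rev_seq in revs if rev_seq > since]
        yield seq, doc_id, [rev for rev, _ in revs]


def replicate(since, filter_revs, crash_after=None):
    """Copy revisions from the feed; return (checkpoint, copied revisions)."""
    copied, checkpoint = set(), since
    for seq, doc_id, revs in changes(since, filter_revs):
        copied.update((doc_id, rev) for rev in revs)
        checkpoint = seq
        if crash_after is not None and seq >= crash_after:
            break  # simulate a crash right after checkpointing `seq`
    return checkpoint, copied


checkpoint, first_run = replicate(0, filter_revs=True, crash_after=2)
_, resumed_old = replicate(checkpoint, filter_revs=True)
_, resumed_new = replicate(checkpoint, filter_revs=False)

print(("foo", REV_FOO_1) in first_run | resumed_old)  # False: revision lost
print(("foo", REV_FOO_1) in first_run | resumed_new)  # True: revision replicated
```

In this model the first run checkpoints at 2 having copied only "bar". The resumed run sees the "foo" row at sequence 3 either way, but only the unfiltered feed still lists the revision written at sequence 1.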


        

[jira] [Commented] (COUCHDB-1256) Incremental requests to _changes can skip revisions

Posted by "Adam Kocoloski (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/COUCHDB-1256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089213#comment-13089213 ] 

Adam Kocoloski commented on COUCHDB-1256:
-----------------------------------------

Hi Randall, I'm not totally sure I understand your suggestion.  The sequence of each revision of a document is stored in the #rev_info record and is accessible in the #doc_info.  If we changed the key under which the #doc_info is indexed in the sequence tree to be the low_seq I think we'd break a lot of things -- adding a new revision wouldn't cause a document to show in an incremental view update, for instance.

In short, we can already compute the lowest sequence associated with a document from its #doc_info.  But I can't figure out how that helps us here.
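
As a rough illustration of both points, here is a small Python model. The `RevInfo` and `DocInfo` classes are simplified stand-ins for the #rev_info and #doc_info records, and `high_seq`, `low_seq` and `updated_since` are hypothetical helpers written for this sketch, not functions in CouchDB. The two dictionaries compare an index keyed by each document's highest sequence with one keyed by its lowest.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class RevInfo:  # simplified stand-in for #rev_info
    rev: str
    seq: int


@dataclass
class DocInfo:  # simplified stand-in for #doc_info
    id: str
    revs: List[RevInfo]


def high_seq(doc: DocInfo) -> int:
    """Highest sequence among the document's revisions."""
    return max(r.seq for r in doc.revs)


def low_seq(doc: DocInfo) -> int:
    """Lowest sequence among the document's revisions."""
    return min(r.seq for r in doc.revs)


def updated_since(index: Dict[int, DocInfo], since: int) -> List[str]:
    """Ids of documents whose index key is greater than `since`."""
    return [index[key].id for key in sorted(index) if key > since]


foo = DocInfo("foo", [RevInfo("1-cc609831f0ca66e8cd3d4c1e0d98108a", 3),
                      RevInfo("1-0dc33db52a43872b6f3371cef7de0277", 1)])
bar = DocInfo("bar", [RevInfo("1-cc609831f0ca66e8cd3d4c1e0d98108a", 2)])

by_high = {high_seq(d): d for d in (foo, bar)}  # keys 3 (foo) and 2 (bar)
by_low = {low_seq(d): d for d in (foo, bar)}    # keys 1 (foo) and 2 (bar)

print(low_seq(foo))               # 1
print(updated_since(by_high, 2))  # ['foo']
print(updated_since(by_low, 2))   # []
```

With the low-sequence keying, a reader that has already consumed everything up to sequence 2 never learns that "foo" gained a revision at sequence 3, which is the breakage described above.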


        

[jira] [Commented] (COUCHDB-1256) Incremental requests to _changes can skip revisions

Posted by "Filipe Manana (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/COUCHDB-1256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13088883#comment-13088883 ] 

Filipe Manana commented on COUCHDB-1256:
----------------------------------------

I think skipping foo's revision 1-0dc33db52a43872b6f3371cef7de0277 when asking with since=2 is correct, since that revision corresponds to sequence 1.
That way the replicator does not receive revisions it has already received.
Am I missing something?


        

[jira] [Commented] (COUCHDB-1256) Incremental requests to _changes can skip revisions

Posted by "Adam Kocoloski (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/COUCHDB-1256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13088952#comment-13088952 ] 

Adam Kocoloski commented on COUCHDB-1256:
-----------------------------------------

Hi Filipe, it is definitely not the case that "if the replicator checkpointed sequence 2, it means it received and processed revision 1-0dc33db52a43872b6f3371cef7de0277 of foo".  Please take a closer look at the full _changes response in the initial report:

curl -Hcontent-type:application/json localhost:5985/revseq/_changes?style=all_docs 
{"results":[ 
{"seq":2,"id":"bar","changes":[{"rev":"1-cc609831f0ca66e8cd3d4c1e0d98108a"}]}, 
{"seq":3,"id":"foo","changes":[{"rev":"1-cc609831f0ca66e8cd3d4c1e0d98108a"},{"rev":"1-0dc33db52a43872b6f3371cef7de0277"}]} 
], 
"last_seq":3}

Revision 1-0dc33d of "foo" shows up in sequence 3, which is after sequence 2.  The replicator should checkpoint sequence 2 iff it has replicated all the entries in the _changes feed up to and including sequence 2.  Once again, that does not include 1-0dc33d.  If I resume the replication from a checkpoint at sequence 2 I will not receive 1-0dc33d.
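
The two responses from the report can be reproduced with a small Python model of the by-sequence index. This is only a sketch: `BY_SEQ`, `changes_since` and the `patched` flag are made-up names, and the unpatched branch mirrors the `StartSeq < RevSeq` filter shown in the diff from the report.

```python
import json

# seq -> (doc id, [(rev, seq at which that rev was written), ...])
# Illustrative model of the by-sequence index; not CouchDB's data structure.
BY_SEQ = {
    2: ("bar", [("1-cc609831f0ca66e8cd3d4c1e0d98108a", 2)]),
    3: ("foo", [("1-cc609831f0ca66e8cd3d4c1e0d98108a", 3),
                ("1-0dc33db52a43872b6f3371cef7de0277", 1)]),
}


def changes_since(start_seq, patched):
    """Build style=all_docs rows for every index entry after start_seq."""
    rows = []
    for seq in sorted(BY_SEQ):
        if seq <= start_seq:  # like starting the fold at StartSeq + 1
            continue
        doc_id, revs = BY_SEQ[seq]
        if not patched:
            # Old wrapper: remove revs written at or before start_seq.
            revs = [(rev, rev_seq) for rev, rev_seq in revs if start_seq < rev_seq]
        rows.append({"seq": seq, "id": doc_id,
                     "changes": [{"rev": rev} for rev, _ in revs]})
    return rows


for row in changes_since(2, patched=False):
    print(json.dumps(row))  # foo row with only the 1-cc6098... revision
for row in changes_since(2, patched=True):
    print(json.dumps(row))  # foo row with both leaf revisions
```

The row for "foo" sits at sequence 3 in both cases. What changes is whether the revision written at sequence 1 is still listed inside that row.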


        

[jira] [Commented] (COUCHDB-1256) Incremental requests to _changes can skip revisions

Posted by "Adam Kocoloski (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/COUCHDB-1256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13088741#comment-13088741 ] 

Adam Kocoloski commented on COUCHDB-1256:
-----------------------------------------

Tossed a couple of commits up on Paul's test git deployment that fix the bug on trunk:

http://git-wip-us.apache.org/repos/asf?p=couchdb.git;a=log;h=refs/heads/1256-changes-skips-revisions


        

[jira] [Resolved] (COUCHDB-1256) Incremental requests to _changes can skip revisions

Posted by "Adam Kocoloski (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/COUCHDB-1256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adam Kocoloski resolved COUCHDB-1256.
-------------------------------------

    Resolution: Fixed

Applied the patch and Bob's test code (with a small tweak) to trunk, 1.1.x, and 1.0.x.


        

[jira] [Updated] (COUCHDB-1256) Incremental requests to _changes can skip revisions

Posted by "Bob Dionne (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/COUCHDB-1256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bob Dionne updated COUCHDB-1256:
--------------------------------

    Attachment: jira-1256-test.diff

Here's a failing JS test; the suggested patch fixes it.

> Incremental requests to _changes can skip revisions
> ---------------------------------------------------
>
>                 Key: COUCHDB-1256
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-1256
>             Project: CouchDB
>          Issue Type: Bug
>          Components: Replication
>    Affects Versions: 0.10, 0.10.1, 0.10.2, 0.11.1, 0.11.2, 1.0, 1.0.1, 1.0.2, 1.1, 1.0.3
>         Environment: confirmed on Apache CouchDB 1.1.0, bug appears to be present in 1.0.3 and trunk
>            Reporter: Adam Kocoloski
>            Assignee: Adam Kocoloski
>            Priority: Blocker
>             Fix For: 1.0.4, 1.1.1, 1.2
>
>         Attachments: jira-1256-test.diff
>
>
> Requests to _changes with style=all_docs&since=N (requests made by the replicator) are liable to suppress revisions of a document.  The following sequence of curl commands demonstrates the bug:
> curl -X PUT localhost:5985/revseq 
> {"ok":true}
> curl -X PUT -Hcontent-type:application/json localhost:5985/revseq/foo -d '{"a":123}'
> {"ok":true,"id":"foo","rev":"1-0dc33db52a43872b6f3371cef7de0277"}
> curl -X PUT -Hcontent-type:application/json localhost:5985/revseq/bar -d '{"a":456}'
> {"ok":true,"id":"bar","rev":"1-cc609831f0ca66e8cd3d4c1e0d98108a"}
> % stick a conflict revision in foo
> curl -X PUT -Hcontent-type:application/json localhost:5985/revseq/foo?new_edits=false -d '{"_rev":"1-cc609831f0ca66e8cd3d4c1e0d98108a", "a":123}'
> {"ok":true,"id":"foo","rev":"1-cc609831f0ca66e8cd3d4c1e0d98108a"}
> % request without since= gives the expected result
> curl -Hcontent-type:application/json localhost:5985/revseq/_changes?style=all_docs
> {"results":[
> {"seq":2,"id":"bar","changes":[{"rev":"1-cc609831f0ca66e8cd3d4c1e0d98108a"}]},
> {"seq":3,"id":"foo","changes":[{"rev":"1-cc609831f0ca66e8cd3d4c1e0d98108a"},{"rev":"1-0dc33db52a43872b6f3371cef7de0277"}]}
> ],
> "last_seq":3}
> % request starting from since=2 suppresses revision 1-0dc33db52a43872b6f3371cef7de0277 of foo
> macbook:~ (master) $ curl localhost:5985/revseq/_changes?style=all_docs\&since=2
> {"results":[
> {"seq":3,"id":"foo","changes":[{"rev":"1-cc609831f0ca66e8cd3d4c1e0d98108a"}]}
> ],
> "last_seq":3}
> I believe the fix is something like this (though we could refactor further because Style is unused):
> diff --git a/src/couchdb/couch_db.erl b/src/couchdb/couch_db.erl
> index e8705be..65aeca3 100644
> --- a/src/couchdb/couch_db.erl
> +++ b/src/couchdb/couch_db.erl
> @@ -1029,19 +1029,7 @@ changes_since(Db, Style, StartSeq, Fun, Acc) ->
>      changes_since(Db, Style, StartSeq, Fun, [], Acc).
>      
>  changes_since(Db, Style, StartSeq, Fun, Options, Acc) ->
> -    Wrapper = fun(DocInfo, _Offset, Acc2) ->
> -            #doc_info{revs=Revs} = DocInfo,
> -            DocInfo2 =
> -            case Style of
> -            main_only ->
> -                DocInfo;
> -            all_docs ->
> -                % remove revs before the seq
> -                DocInfo#doc_info{revs=[RevInfo ||
> -                    #rev_info{seq=RevSeq}=RevInfo <- Revs, StartSeq < RevSeq]}
> -            end,
> -            Fun(DocInfo2, Acc2)
> -        end,
> +    Wrapper = fun(DocInfo, _Offset, Acc2) -> Fun(DocInfo, Acc2) end,
>      {ok, _LastReduction, AccOut} = couch_btree:fold(by_seq_btree(Db),
>          Wrapper, Acc, [{start_key, StartSeq + 1}] ++ Options),
>      {ok, AccOut}.


[jira] [Commented] (COUCHDB-1256) Incremental requests to _changes can skip revisions

Posted by "Randall Leeds (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/COUCHDB-1256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089336#comment-13089336 ] 

Randall Leeds commented on COUCHDB-1256:
----------------------------------------

Adam: I was referring to the fact that couch_db_updater:merge_rev_trees appears to remove all of a document's older seq entries whenever the document is updated. If we didn't remove the seq entries for conflicts, the example you gave would yield {"id":"foo"} at both seq 1 and seq 3. So while the original version wouldn't be emitted at seq 3 with ?style=all_docs, a checkpoint at 2 would necessarily mean that the seq 1 update had already been seen.
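
As a rough illustration of the two index behaviours being compared, the following Python sketch builds a toy by-seq index from the three updates in Adam's example. It is not CouchDB code: apply_updates and keep_old_entries are hypothetical names, and the "remove older entries" branch only models what merge_rev_trees appears to do.

```python
REV_0DC = "1-0dc33db52a43872b6f3371cef7de0277"
REV_CC6 = "1-cc609831f0ca66e8cd3d4c1e0d98108a"


def apply_updates(updates, keep_old_entries):
    """Build a toy by-seq index from (doc_id, rev) updates applied in order."""
    by_seq = {}    # seq -> (doc_id, rev written at that seq)
    last_seq = {}  # doc_id -> seq of the document's newest index entry
    for seq, (doc_id, rev) in enumerate(updates, start=1):
        if not keep_old_entries and doc_id in last_seq:
            # Assumed current behaviour: the older entry is dropped on update.
            del by_seq[last_seq[doc_id]]
        by_seq[seq] = (doc_id, rev)
        last_seq[doc_id] = seq
    return by_seq


updates = [("foo", REV_0DC), ("bar", REV_CC6), ("foo", REV_CC6)]

current = apply_updates(updates, keep_old_entries=False)
proposed = apply_updates(updates, keep_old_entries=True)

print(sorted(current))   # seqs with an entry when older entries are removed
print(sorted(proposed))  # seqs with an entry when conflict entries are kept
```

With older entries kept, foo has an entry at seq 1 as well as seq 3, so any reader that has checkpointed at 2 must already have walked past the seq 1 entry.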


[jira] [Commented] (COUCHDB-1256) Incremental requests to _changes can skip revisions

Posted by "Adam Kocoloski (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/COUCHDB-1256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089953#comment-13089953 ] 

Adam Kocoloski commented on COUCHDB-1256:
-----------------------------------------

@Randall: ah, now I see what you mean. I'd have to think a bit about the consequences of having a document show up multiple times in the sequence tree. In general, I'm in favor of changes that make conflict revisions more visible.

@Damien yep, we should start distinguishing between interim checkpoints and completed replications of an MVCC snapshot.  That being said, I doubt we're losing much by dropping this particular optimization for now.  It's not like the target needs to do any extra IO to check for the presence of those revisions.
