Posted to notifications@couchdb.apache.org by GitBox <gi...@apache.org> on 2019/12/17 20:57:05 UTC

[GitHub] [couchdb-fauxton] davisp commented on issue #1069: Want to view previous revision content of a document

davisp commented on issue #1069: Want to view previous revision content of a document
URL: https://github.com/apache/couchdb-fauxton/issues/1069#issuecomment-566745633
 
 
   I just stumbled across this and figured I'd give a bit more background on the `_rev` field and why it's not suitable as a revision control system.
   
   First, it's been a very common misconception for basically the entire history of CouchDB that the `_rev` field and the associated revision tree are useful as a revision control system. There have been numerous discussions over the years about completely renaming these concepts to avoid precisely this confusion. Unfortunately, the names are baked directly into the replication protocol, so renaming them would require a fairly massive investment. That's definitely on us as a project and we should have started years ago to make that transition. Unfortunately, that doesn't help the current situation.
   
   The `_rev` field is not useful as a revision control system because of the nature of replication itself. The replicator and replication protocol never transfer document data related to internal nodes of the revision tree. In any situation where replication is involved, the contents of older revisions may or may not exist. Thus, depending on them to exist will inevitably lead to confusion and assertions that CouchDB is broken when they've gone missing.
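   
   To make that concrete, here is a minimal toy sketch in Python. It is not CouchDB's implementation or API: the `ToyDoc` class, the `replicate` function, and the revision ids are all hypothetical, and the revision tree is simplified to a single linear path. It only illustrates the point that the revision ids of the whole path travel with replication while only the leaf's body does.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ToyDoc:
    """Toy model of one document: a linear revision path plus stored bodies."""
    path: List[str] = field(default_factory=list)          # revision ids, oldest first
    bodies: Dict[str, dict] = field(default_factory=dict)  # rev id -> body, if still stored

    def update(self, rev: str, body: dict) -> None:
        """Append a new leaf revision and store its body."""
        self.path.append(rev)
        self.bodies[rev] = body

    def get(self, rev: str) -> Optional[dict]:
        """Return the body for a revision, or None if no body is stored."""
        return self.bodies.get(rev)


def replicate(source: ToyDoc) -> ToyDoc:
    """Copy the revision ids of the whole path, but only the leaf's body."""
    leaf = source.path[-1]
    return ToyDoc(path=list(source.path), bodies={leaf: source.bodies[leaf]})


source = ToyDoc()
source.update("1-a", {"title": "draft"})
source.update("2-b", {"title": "edited"})
source.update("3-c", {"title": "final"})

target = replicate(source)

print(source.get("1-a"))  # the source still happens to hold this old body
print(target.get("1-a"))  # the target knows the rev id but never received a body
print(target.get("3-c"))  # the leaf body was transferred
```

   In this sketch the two sides agree on the revision history but disagree on which old bodies exist, which is exactly the situation that makes old revisions unreliable as a history.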
   
   We made the UI change for CouchDB 2.0 because replication is fundamental to the design of CouchDB's clustering logic. If you were to attempt to implement this feature, you would end up in a situation where it was roughly random whether the links worked correctly, even to the point of document bodies appearing and disappearing with a page refresh. A number of factors contribute to this, and they would not be easily overcome given the eventual consistency design of CouchDB clustering: uncoordinated compaction events (i.e., all nodes in a cluster may compact any shard at any time); parallel updates causing some nodes to miss a revision between internal replication updates (at which point a node will have never seen an internal revision and replication will never transfer it); and finally replication from remote nodes, which never transfers old revision bodies to begin with, leading to confusion when an old revision is available on one side of the replication but not the other.
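   
   As a rough illustration of the compaction case only, here is a toy Python sketch. Everything in it is an assumption made for illustration: the `compact` and `read_old_rev` helpers are hypothetical, replicas are plain dictionaries, and a read is modeled as picking one replica at random, which is a deliberate simplification of how a clustered read is actually served.

```python
import random
from typing import Dict, List, Optional

Bodies = Dict[str, dict]  # rev id -> stored body


def compact(bodies: Bodies, leaf: str) -> Bodies:
    """Toy compaction: drop every stored body except the leaf's."""
    return {rev: body for rev, body in bodies.items() if rev == leaf}


def read_old_rev(replicas: List[Bodies], rev: str, rng: random.Random) -> Optional[dict]:
    """Toy read: ask one arbitrarily chosen replica for the revision body."""
    return rng.choice(replicas).get(rev)


full = {"1-a": {"v": 1}, "2-b": {"v": 2}}

# Three copies of the same shard; only the second one has compacted so far.
replicas = [dict(full), compact(full, leaf="2-b"), dict(full)]

rng = random.Random(0)
results = [read_old_rev(replicas, "1-a", rng) for _ in range(20)]

# Each entry is either the old body or None, depending on which copy answered.
print(results)
```

   Every copy agrees on the leaf `2-b`, so normal reads are unaffected; only requests for the old revision `1-a` depend on which copy happens to answer.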
   
   And preemptively, to answer the question "Well, why not just enhance the replicator to transfer some number of previous revision bodies?": there are two main reasons. The first, and the easiest to explain, is that the replicator has never supported this, which means updating the algorithmic aspects of the replicator would be even harder than just changing the concept names to something less easily confused. Secondly, and more importantly, it doesn't actually fix anything. Having a configurable number of revisions just means that our hard coded value of 1 document body per leaf would be some larger number. Revisions could still go missing, so none of the fundamental issues in attempting to rely on previous revision data would change.
   
   Hopefully that clarifies why adding this sort of feature isn't possible. I do understand that the `_rev` token can be easily misunderstood as a useful approach for revision history. However, there are fundamental aspects to CouchDB that prevent it from being used as such.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services