Posted to user@couchdb.apache.org by Jens Rantil <je...@gmail.com> on 2013/08/22 11:26:17 UTC

Changes API - purged?

Hi,

I have a use case where I'd like to build up an external state by following
the changes of a database. Obviously, /db/_changes will be a great
place to start for this. My question is, can I always be sure that simply
following /db/_changes from seqnum=0 will bring me to a consistent state of
the current database? Are /db/_changes purged (on /db/_purge?) in any way
throughout the lifetime of a database?

I've tried finding this information in the documentation, but have failed so
far.

Thanks,
Jens

Re: Changes API - purged?

Posted by Robert Newson <rn...@apache.org>.
Compaction does not affect the changes feed.

On 23 August 2013 09:01, Jens Rantil <je...@gmail.com> wrote:
> Robert,
>
> Thank you for your e-mail. This was great information!
>
> Just to be clear, does compaction influence the changes feed in any way?
>
> Thanks,
> Jens

Re: Changes API - purged?

Posted by Robert Newson <rn...@apache.org>.
"The changes feed is just a listing of documents ordered by their
current sequence numbers."

Succinctly put.

B.


On 23 August 2013 16:15, Jens Alfke <je...@couchbase.com> wrote:
>
> On Aug 23, 2013, at 1:01 AM, Jens Rantil <je...@gmail.com> wrote:
>
>> Just to be clear, does compaction influence the changes feed in any way?
>
> It may help to think of the changes feed this way:
>
> Every database has a last-sequence counter (similar to a SQL table’s autoincrement counter.)
> Every document has a sequence number*.
> Whenever a document is updated (i.e. a revision is added) its sequence number is changed to the next available sequence count.
> The changes feed is just a listing of documents ordered by their current sequence numbers.
> (Under the hood, the database has a separate b-tree index that maps sequence numbers to document IDs.)
>
> Thus the effect is that updating a document moves it to the end of the changes feed, with a new sequence number.
>
> Compaction doesn’t have any effect on this at all; all it does is prune intra-document revision data.
>
> —Other Jens ;)
>
> * This gets more complex with BigCouch/Cloudant, because it’s clustered. The opaque sequence IDs it shows clients are actually aggregates of the sequence numbers of all the nodes in the cluster.

Re: Changes API - purged?

Posted by Robert Newson <rn...@apache.org>.
Nope. I don't think you should take "each document has a seq no" too
seriously; the number changes and has no particular meaning beyond
providing a means of synchronizing databases and indexes.

B.


On 23 August 2013 17:08, James Hayton <th...@purplebulldog.com> wrote:
> So something interesting here to me is that each document had a seq no... Is there any way to figure out what that is from the id/rev?

Re: Changes API - purged?

Posted by James Hayton <th...@purplebulldog.com>.
So something interesting here to me is that each document had a seq no... Is there any way to figure out what that is from the id/rev?  

On Aug 23, 2013, at 8:15 AM, Jens Alfke <je...@couchbase.com> wrote:

> 
> On Aug 23, 2013, at 1:01 AM, Jens Rantil <je...@gmail.com> wrote:
> 
>> Just to be clear, does compaction influence the changes feed in any way?
> 
> It may help to think of the changes feed this way:
> 
> Every database has a last-sequence counter (similar to a SQL table’s autoincrement counter.)
> Every document has a sequence number*.
> Whenever a document is updated (i.e. a revision is added) its sequence number is changed to the next available sequence count.
> The changes feed is just a listing of documents ordered by their current sequence numbers.
> (Under the hood, the database has a separate b-tree index that maps sequence numbers to document IDs.)
> 
> Thus the effect is that updating a document moves it to the end of the changes feed, with a new sequence number.
> 
> Compaction doesn’t have any effect on this at all; all it does is prune intra-document revision data.
> 
> —Other Jens ;)
> 
> * This gets more complex with BigCouch/Cloudant, because it’s clustered. The opaque sequence IDs it shows clients are actually aggregates of the sequence numbers of all the nodes in the cluster.

Re: Changes API - purged?

Posted by Jens Alfke <je...@couchbase.com>.
On Aug 23, 2013, at 1:01 AM, Jens Rantil <je...@gmail.com> wrote:

> Just to be clear, does compaction influence the changes feed in any way?

It may help to think of the changes feed this way:

Every database has a last-sequence counter (similar to a SQL table’s autoincrement counter.)
Every document has a sequence number*.
Whenever a document is updated (i.e. a revision is added) its sequence number is changed to the next available sequence count.
The changes feed is just a listing of documents ordered by their current sequence numbers.
(Under the hood, the database has a separate b-tree index that maps sequence numbers to document IDs.)

Thus the effect is that updating a document moves it to the end of the changes feed, with a new sequence number.

Compaction doesn’t have any effect on this at all; all it does is prune intra-document revision data.

—Other Jens ;)

* This gets more complex with BigCouch/Cloudant, because it’s clustered. The opaque sequence IDs it shows clients are actually aggregates of the sequence numbers of all the nodes in the cluster.
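
The model described above can be sketched as a toy in-memory structure. This is only an illustration of the bookkeeping, not CouchDB's implementation; the class name `ToyDatabase` and its methods are hypothetical, and a plain dict stands in for the by-sequence b-tree index.

```python
class ToyDatabase:
    """Toy in-memory model of the sequence bookkeeping (not CouchDB code)."""

    def __init__(self):
        self.last_seq = 0   # database-wide last-sequence counter
        self.doc_seq = {}   # doc id -> current sequence number
        self.by_seq = {}    # sequence number -> doc id (stand-in for the b-tree index)

    def update(self, doc_id):
        """Record a new revision of doc_id (create, update, or delete)."""
        old_seq = self.doc_seq.get(doc_id)
        if old_seq is not None:
            del self.by_seq[old_seq]  # the document's old feed entry disappears
        self.last_seq += 1
        self.doc_seq[doc_id] = self.last_seq
        self.by_seq[self.last_seq] = doc_id
        return self.last_seq

    def changes(self, since=0):
        """List (seq, doc_id) pairs with seq > since, ordered by sequence."""
        return [(seq, self.by_seq[seq]) for seq in sorted(self.by_seq) if seq > since]


db = ToyDatabase()
db.update("a")  # seq 1
db.update("b")  # seq 2
db.update("a")  # seq 3; the entry at seq 1 is gone
print(db.changes())         # [(2, 'b'), (3, 'a')]
print(db.changes(since=2))  # [(3, 'a')]
```

Updating "a" a second time moves it to the end of the listing, so the feed holds one entry per document rather than one entry per change.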

Re: Changes API - purged?

Posted by Jens Rantil <je...@gmail.com>.
Robert,

Thank you for your e-mail. This was great information!

Just to be clear, does compaction influence the changes feed in any way?

Thanks,
Jens

On Thursday, 22 August 2013, Robert Newson wrote:

> If you use _purge then, yes, the information is purged, which is one
> of the many reasons you should not use _purge. :)
>
> The replicator uses changes for the same purpose as you intend, where
> the 'external state' is also a CouchDB database on another server. You
> can rely on it to contain everything you need to synchronize an
> external stateful system with your database.
>
> For the avoidance of doubt, the changes feed does *not* preserve every
> change made to your database. It has one entry for every doc id that
> has ever been present in your database, in the order of their most
> recent update. Starting from an empty database, the first document
> added will have update sequence 1. If you update or delete that
> document, then the changes feed will have no entry for update sequence
> 1 but will have an entry for update sequence 2. If you apply every
> update in the order you receive it from _changes to your target
> system, you will end up in the correct state.
>
> B.

Re: Changes API - purged?

Posted by Robert Newson <rn...@apache.org>.
If you use _purge then, yes, the information is purged, which is one
of the many reasons you should not use _purge. :)

The replicator uses changes for the same purpose as you intend, where
the 'external state' is also a CouchDB database on another server. You
can rely on it to contain everything you need to synchronize an
external stateful system with your database.

For the avoidance of doubt, the changes feed does *not* preserve every
change made to your database. It has one entry for every doc id that
has ever been present in your database, in the order of their most
recent update. Starting from an empty database, the first document
added will have update sequence 1. If you update or delete that
document, then the changes feed will have no entry for update sequence
1 but will have an entry for update sequence 2. If you apply every
update in the order you receive it from _changes to your target
system, you will end up in the correct state.

B.


On 22 August 2013 10:26, Jens Rantil <je...@gmail.com> wrote:
> Hi,
>
> I have a use case where I'd like to build up an external state by following
> the changes of a database. Obviously, /db/_changes will be a great
> place to start for this. My question is, can I always be sure that simply
> following /db/_changes from seqnum=0 will bring me to a consistent state of
> the current database? Are /db/_changes purged (on /db/_purge?) in any way
> throughout the lifetime of a database?
>
> I've tried finding this information in the documentation, but have failed so
> far.
>
> Thanks,
> Jens
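
As a rough illustration of the "apply every update in the order you receive it" advice in the reply above, here is a minimal Python sketch. It assumes the rows were fetched from /db/_changes with include_docs=true (the starting point is given with the `since` query parameter rather than "seqnum"). The function name `apply_changes`, the dict used as the external state, and the sample rows and revision strings are all made up for the example; no network access is involved.

```python
def apply_changes(state, rows):
    """Apply _changes rows, in the order received, to a dict keyed by doc id.

    Returns the seq of the last row applied (None if rows is empty), which
    can be stored as an opaque checkpoint for the next request.
    """
    checkpoint = None
    for row in rows:
        if row.get("deleted"):
            state.pop(row["id"], None)  # a deletion removes the doc from the target
        else:
            state[row["id"]] = row["doc"]
        checkpoint = row["seq"]
    return checkpoint


# Hand-written sample rows shaped like the "results" array of a _changes response.
rows = [
    {"seq": 2, "id": "b", "changes": [{"rev": "1-abc"}], "doc": {"_id": "b", "n": 1}},
    {"seq": 4, "id": "a", "changes": [{"rev": "2-def"}], "doc": {"_id": "a", "n": 2}},
    {"seq": 5, "id": "c", "changes": [{"rev": "2-123"}], "deleted": True},
]

state = {}
checkpoint = apply_changes(state, rows)
print(sorted(state))  # ['a', 'b']
print(checkpoint)     # 5
```

The gaps in the sample sequence numbers (1 and 3 are missing) reflect the point made above: superseded entries are not kept, yet applying what remains in order still yields the current state. The checkpoint is treated as an opaque value, since clustered deployments may return non-numeric sequence IDs.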