Posted to user@couchdb.apache.org by Sean Clark Hess <se...@gmail.com> on 2010/01/26 22:56:38 UTC

Do document versions ever expire?

I'm wondering if old versions of documents ever expire. On servers that are
disk-bound (like my tiny VPS slices will be) this could be something I have
to design around.

For example, when importing data (millions of rows) from a relational
database, I want to be able to build a document a piece at a time. The
relational schema is wacked - it has information about a given document in
like 10 different tables, and I don't want to have to try to hold everything
in memory just so I only have to write the document once.

Any way to control it, or turn versioning off?  Is it even a concern?
Thanks!

Re: Do document versions ever expire?

Posted by Sean Clark Hess <se...@gmail.com>.
Thanks Guys

Re: Do document versions ever expire?

Posted by Jan Lehnardt <ja...@apache.org>.
On 26 Jan 2010, at 15:15, Markus Jelsma wrote:

> Hello Paul and others,
> 
> 
> Now that we're on the subject of compaction, let me ask a question. I have
> an importer that fills a clean db with about 3500 records, and Futon now
> tells me its size is 4.2 MB. However, if I compact this fresh and clean
> database (presumably without extraneous information such as old
> revisions), it is suddenly just 2.4 MB!
> 
> Can you, or someone, explain this? It smells like an unwanted feature,
> but I could be wrong :)
> 
> To be clear, this doesn't happen with just two documents that contain
> only a UUID ID and a revision number.

Besides pruning old revisions, compaction also rebuilds the underlying
B-tree structure into a more compact form than the one that single
inserts create in the original database.

Cheers
Jan
--
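
Markus's before/after comparison can be reproduced by reading the
database info document (GET /{db}) before and after compaction. The
sketch below only parses such a document and does not talk to a server.
It assumes the file size is reported either as "disk_size" (older
releases) or as "sizes.file" (newer releases); check which field your
CouchDB version returns. The helper names and the sample values are
hypothetical, chosen to match the 4.2 MB and 2.4 MB figures above.

```python
def file_size_bytes(db_info: dict) -> int:
    """Return the on-disk file size from a database info document.

    Assumption: newer CouchDB releases report it under sizes.file,
    older ones under disk_size.
    """
    sizes = db_info.get("sizes")
    if isinstance(sizes, dict) and "file" in sizes:
        return int(sizes["file"])
    return int(db_info["disk_size"])


def fraction_reclaimed(before: int, after: int) -> float:
    """Fraction of the original file size freed by compaction."""
    if before <= 0:
        raise ValueError("before must be a positive size in bytes")
    return (before - after) / before


if __name__ == "__main__":
    # Hypothetical info documents mirroring the sizes quoted in this thread.
    before_info = {"db_name": "import_db", "disk_size": 4_200_000}
    after_info = {"db_name": "import_db", "disk_size": 2_400_000}

    saved = fraction_reclaimed(
        file_size_bytes(before_info), file_size_bytes(after_info)
    )
    print(f"compaction reclaimed {saved:.0%} of the file")
```

With the sizes from this thread, that works out to roughly 43% of the
file, even though no old revisions were involved.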



Re: Do document versions ever expire?

Posted by Markus Jelsma <ma...@buyways.nl>.
Hello Paul and others,


Now that we're on the subject of compaction, let me ask a question. I have
an importer that fills a clean db with about 3500 records, and Futon now
tells me its size is 4.2 MB. However, if I compact this fresh and clean
database (presumably without extraneous information such as old
revisions), it is suddenly just 2.4 MB!

Can you, or someone, explain this? It smells like an unwanted feature,
but I could be wrong :)

To be clear, this doesn't happen with just two documents that contain
only a UUID ID and a revision number.


Cheers,



Re: Do document versions ever expire?

Posted by Paul Davis <pa...@gmail.com>.
On Tue, Jan 26, 2010 at 4:56 PM, Sean Clark Hess <se...@gmail.com> wrote:
> I'm wondering if old versions of documents ever expire. On servers that are
> disk-bound (like my tiny VPS slices will be) this could be something I had
> to design around.
>
> For example, when importing data (millions of rows) from a relational
> database, I want to be able to build a document a piece at a time. The
> relational schema is wacked - it has information about a given document in
> like 10 different tables, and I don't want to have to try to hold everything
> in memory just so I only have to write the document once.
>
> Any way to control it, or turn versioning off?  Is it even a concern?
> Thanks!
>

Sean,

Compaction removes the bodies of old documents. The only information
that remains is some historical information to allow for proper
merging during replication. The number of historical descriptions is
configurable so that even this information can be pruned during
compaction.

The closest you could get to purging all historical information is to
set the rev_stemming parameter low and then compact to get rid of the
extra data. I personally wouldn't worry too much about the
rev_stemming parameter and would instead just compact as often as
needed during the import.

HTH,
Paul Davis
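
The two steps Paul describes can be sketched as plain HTTP requests.
This is a minimal illustration, not the only way to do it. It assumes
the revision stemming limit is exposed as the per-database _revs_limit
resource and that compaction is triggered with POST /{db}/_compact; the
base URL, the database name "import_db", and the helper names are
hypothetical. The functions only build the requests. Sending them (for
example with urllib.request.urlopen) typically needs admin credentials.

```python
import json
from urllib.request import Request


def compaction_request(base_url: str, db: str) -> Request:
    """Build (but do not send) the request that starts compaction."""
    return Request(
        url=f"{base_url.rstrip('/')}/{db}/_compact",
        data=b"",
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def revs_limit_request(base_url: str, db: str, limit: int) -> Request:
    """Build (but do not send) the request that lowers the number of
    revision descriptions kept per document."""
    if limit < 1:
        raise ValueError("limit must be at least 1")
    return Request(
        url=f"{base_url.rstrip('/')}/{db}/_revs_limit",
        data=json.dumps(limit).encode("utf-8"),  # body is a bare JSON integer
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


if __name__ == "__main__":
    # Hypothetical server and database name; nothing is sent here.
    for req in (
        revs_limit_request("http://localhost:5984", "import_db", 10),
        compaction_request("http://localhost:5984", "import_db"),
    ):
        print(req.get_method(), req.full_url)
```

An importer that updates each document several times could issue the
compaction request periodically, as suggested above, and leave the
revision limit at its default unless disk space stays tight.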

Re: Do document versions ever expire?

Posted by Metin Akat <ak...@gmail.com>.
One of the functions of database compaction is just that.
