Posted to user@couchdb.apache.org by Matthew Sinclair-Day <ms...@gmail.com> on 2010/03/15 20:09:07 UTC
Replicated database size
Hi folks,
I've been putting couch 10.1 on Solaris 10/x86 through its paces
lately trying to understand its replication performance and
behavior, and have noticed the size of pre-compacted replicas
can vary from one host to another.
In one test, the origin has roughly 1.2 million documents taking
up 263MB of storage, but replicated size varies from one server
to another:
origin : 263MB
replica 1: 0.6GB
replica 2: 0.7GB
replica 3: 1.0GB
As expected the replicas are larger than the compacted origin
database, but I didn't expect such size differences from replica
to replica.
After compacting the origin (again) and the replicas, their
sizes settle down to:
origin : 262.4MB
replica 1: 262.4MB
replica 2: 262.5MB
replica 3: 262.4MB
I'm trying to understand what the reason could be for the
variance in pre-compacted database sizes. All replicas are
running the same build of CouchDB on the same version of
Solaris, though replica 3 is running on newer hardware in a
VMware container.
Matt
Re: Replicated database size
Posted by Adam Kocoloski <ko...@apache.org>.
On Mar 15, 2010, at 3:09 PM, Matthew Sinclair-Day wrote:
> Hi folks,
>
> I've been putting couch 10.1 on Solaris 10/x86 through its paces lately trying to understand its replication performance and behavior, and have noticed the size of pre-compacted replicas can vary from one host to another.
>
> In one test, the origin has roughly 1.2 million documents taking up 263MB of storage, but replicated size varies from one server to another:
>
> origin : 263MB
> replica 1: 0.6GB
> replica 2: 0.7GB
> replica 3: 1.0GB
>
> As expected the replicas are larger than the compacted origin database, but I didn't expect such size differences from replica to replica.
>
> After compacting the origin (again) and the replicas, their sizes settle down to:
>
> origin : 262.4MB
> replica 1: 262.4MB
> replica 2: 262.5MB
> replica 3: 262.4MB
>
> I'm trying to understand what the reason could be for the variance in pre-compacted database sizes. All replicas are running the same build of CouchDB on the same version of Solaris, though replica 3 is running on newer hardware in a VMware container.
>
> Matt
Hi Matt, the variation in target DB file sizes is due to variation in the number and size of the _bulk_docs calls made by the replicator. The DB size is inversely correlated with the size of an average _bulk_docs POST, and the size of a POST is governed by the relative speed of the source and the target. If the target is fast and the replication is limited by source throughput, you'll see lots of very small calls to _bulk_docs. Conversely, if the target is slow, the replicator will batch writes together in blocks of 1000 and send them over.
In short, the faster your target server is, the larger the un-compacted target DB will be. Looks like that VMware container isn't slowing you down much at all :)
Best,
Adam
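A rough way to picture the inverse relationship Adam describes is a toy model of an append-only file in which every _bulk_docs call appends its documents plus some fixed bookkeeping overhead. This is only a sketch under made-up assumptions: the function name, the 200-byte document size and the 4096-byte per-call overhead are all hypothetical and are not taken from CouchDB's actual storage format.

```python
def simulated_file_size(num_docs, batch_size, doc_bytes=200,
                        overhead_per_call_bytes=4096):
    """Toy model of an append-only file written in batches.

    Each batch (standing in for one _bulk_docs call) appends its
    documents plus a fixed amount of bookkeeping data. The byte
    figures are illustrative assumptions, not CouchDB measurements.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    size = 0
    remaining = num_docs
    while remaining > 0:
        n = min(batch_size, remaining)
        # Documents in this batch, plus the per-call overhead.
        size += n * doc_bytes + overhead_per_call_bytes
        remaining -= n
    return size


if __name__ == "__main__":
    # Smaller batches mean more calls, so more overhead is appended.
    for batch in (1, 10, 100, 1000):
        print(batch, simulated_file_size(10_000, batch))
```

In this model the document payload is the same for every batch size; only the number of calls changes, so a fast target that receives many small batches ends up with the largest pre-compaction file. Compaction drops the superseded bookkeeping, which fits the replicas converging to about the same size afterwards.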