Posted to user@couchdb.apache.org by Panny Wang <pa...@gmail.com> on 2014/09/25 10:43:56 UTC

Is there a general rule to estimate the maximum disk space for couchdb database?

Hi,

We are using CouchDB in a new project, and a question has come up: how
much disk space should we provision for a CouchDB database?

My simple example is:
1000 documents of 1 KB each are added to the database at the beginning
of every day, and each document is updated once an hour (i.e. 24
revisions are kept per document per day).
So: how much disk space, at most, will be used in this case?

I understand that disk space is not as expensive as it used to be, and
that compaction can be run every day to reclaim space. But without a
basis for estimating how much disk space will be used, the whole system
may run out of space, and it is unpredictable when that will happen.

It would be appreciated if you could shed some light on how to estimate
the maximum disk space for a CouchDB database.

Thank you!

Regards,
Panny

Re: Is there a general rule to estimate the maximum disk space for couchdb database?

Posted by Alexander Shorin <kx...@gmail.com>.
Let's see.

1000 documents of 1 KB each, added at the beginning of every day and
updated every hour. So for a single day we have 1 MB of initial data
plus 23 MB from revisions. 24 MB per day gives us 720 MB per average
month and about 8.7 GB per year. However, each day you accumulate 23 MB
of overhead from previous revisions, which can be cleaned up during
compaction. So if you run compaction at the end of each day, 1 MB per
day will be the growth rate.
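The estimate above can be sketched as a quick back-of-the-envelope
calculation (using the round decimal units from this thread; the
constants describe the example workload, not CouchDB internals):

```python
# Example workload from the question: 1000 x 1 KB docs, 24 revisions/day.
DOC_SIZE_KB = 1        # each document is ~1 KB
DOCS_PER_DAY = 1000    # appended at the start of each day
EXTRA_REVISIONS = 23   # hourly updates after the initial write

initial_mb = DOCS_PER_DAY * DOC_SIZE_KB / 1000                     # ~1 MB of live data
revision_mb = DOCS_PER_DAY * EXTRA_REVISIONS * DOC_SIZE_KB / 1000  # ~23 MB of old revisions

daily_mb = daily = initial_mb + revision_mb  # 24 MB/day without compaction
monthly_mb = daily_mb * 30                   # 720 MB per average month
yearly_gb = daily_mb * 365 / 1000            # ~8.8 GB per year

# With compaction at the end of each day, only live data survives:
growth_after_compaction_mb = initial_mb      # ~1 MB/day
```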

It's also worth taking into account that while CouchDB cleans up old
revisions on compaction, it doesn't remove their IDs from the
document's revision history: this adds a small size overhead on top,
but nothing significant.

Additionally, it's worth knowing that CouchDB grows the database file
in 4 KiB chunks, even when a stored document is smaller than that. Also
take into account that the file system may preallocate more space for a
file than it actually contains.
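As a rough illustration of that allocation granularity (a sketch of
ceiling-rounding to 4 KiB, not a description of CouchDB's actual file
layout):

```python
CHUNK = 4096  # CouchDB grows the database file in 4 KiB chunks

def allocated_bytes(doc_bytes: int) -> int:
    """Round an on-disk footprint up to the next 4 KiB boundary."""
    return -(-doc_bytes // CHUNK) * CHUNK  # ceiling division

print(allocated_bytes(1024))  # a 1 KB document still occupies 4096 bytes
print(allocated_bytes(4097))  # one byte over a chunk costs a whole new one: 8192
```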

--
,,,^..^,,,



Re: Is there a general rule to estimate the maximum disk space for couchdb database?

Posted by Stefan Klein <st...@gmail.com>.
Hi

2014-09-25 10:43 GMT+02:00 Panny Wang <pa...@gmail.com>:

> Hi,
>
> We are using CouchDB in a new project, and a question has come up: how
> much disk space should we provision for a CouchDB database?
>
> My simple example is:
> 1000 documents of 1 KB each are added to the database at the beginning
> of every day, and each document is updated once an hour (i.e. 24
> revisions are kept per document per day).
> So: how much disk space, at most, will be used in this case?
>

That's extremely hard to tell.
If you update each document individually, you need more disk space than
if you bulk-update multiple documents at once (until compaction runs).
You also need to account for the views and their compaction.
I have seen a view go from 6.6 GB down to 187 MB after compaction.
I think, though I'm not sure about this, it should also depend on how
well your data compresses (if you use snappy).

So I would say you have to try it out to get a rough idea of how to
dimension your file systems.
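For example, the difference between individual updates and bulk updates
is just how many requests carry the same documents. `_bulk_docs` is a
standard CouchDB endpoint, though the document fields below are made up
for illustration:

```python
import json

# Hypothetical docs to update in one request instead of three separate PUTs.
# The _rev values are placeholders; real ones come from the server.
docs = [
    {"_id": f"sensor-{i}", "_rev": "1-placeholder", "value": i}
    for i in range(3)
]

# Request body for POST /{db}/_bulk_docs. Writing documents together means
# fewer b-tree index rewrites than one PUT per document, which is why it
# uses less disk space than individual updates (until compaction runs).
payload = json.dumps({"docs": docs})
```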

regards,
Stefan

Re: Is there a general rule to estimate the maximum disk space for couchdb database?

Posted by Diego Garcia Vieira <di...@gmail.com>.
Can you tell us a little about the application? That might help to
establish some boundaries, and within those boundaries an easier way to
calculate the size needed.
