Posted to user@couchdb.apache.org by "Geir Magnusson Jr." <ge...@pobox.com> on 2009/02/21 00:16:50 UTC

quick poll - how many databases do you have?

I'm pondering how people manage data, and one question I have is how  
many databases do people tend to have in their server instances?

IOW, do you configure your database server to only handle one  
database, or many?

geir



Re: quick poll - how many databases do you have?

Posted by lenz <no...@googlemail.com>.
I have some. iwantmyname uses 3 at the moment, and I try to use a "type" key
in documents to pack more docs into one DB. The next project I work on uses
one DB for most stuff (probably a second one for user preferences).
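
A minimal sketch of the "type" key idea, assuming plain Python dicts stand in
for documents in a single database. The document contents, the field values
and the helper name `ids_by_type` are made up for illustration; in CouchDB
itself this grouping would normally be done by a view.

```python
from collections import defaultdict

# Hypothetical documents of different kinds sharing one database.
docs = [
    {"_id": "d1", "type": "domain", "name": "example.com"},
    {"_id": "d2", "type": "customer", "name": "Alice"},
    {"_id": "d3", "type": "domain", "name": "example.org"},
    {"_id": "d4", "note": "no type key, so it is skipped"},
]


def ids_by_type(documents):
    """Group document ids by their "type" key, ignoring untyped documents."""
    index = defaultdict(list)
    for doc in documents:
        if "type" in doc:
            index[doc["type"]].append(doc["_id"])
    return dict(index)


grouped = ids_by_type(docs)
```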

cheers
lenz

On Sun, Feb 22, 2009 at 10:33 AM, Chris Anderson <jc...@apache.org> wrote:

> On Sat, Feb 21, 2009 at 1:09 PM, Patrick Antivackis
> <pa...@gmail.com> wrote:
> > Depends on the application.
> > Some just one db, some 4 databases. In fact i split the data depending on
> > the way i will use replication.
> >
>
> For my music project, I put the parsed web pages and xml feeds into
> one database, and the parsed mp3 id3 tags etc in another db. This
> helps because while I want to rotate the web-crawl data over time, I
> want to continue to accumulate resource metadata...
>
> With them in separate dbs I can throw out the old crawl data but keep
> the file metadata around.
>
> --
> Chris Anderson
> http://jchris.mfdz.com
>



-- 
iWantMyName.com
painless domain registration (finally)

Re: quick poll - how many databases do you have?

Posted by Chris Anderson <jc...@apache.org>.
On Sat, Feb 21, 2009 at 1:09 PM, Patrick Antivackis
<pa...@gmail.com> wrote:
> Depends on the application.
> Some just one db, some 4 databases. In fact i split the data depending on
> the way i will use replication.
>

For my music project, I put the parsed web pages and xml feeds into
one database, and the parsed mp3 id3 tags etc in another db. This
helps because while I want to rotate the web-crawl data over time, I
want to continue to accumulate resource metadata...

With them in separate dbs I can throw out the old crawl data but keep
the file metadata around.

-- 
Chris Anderson
http://jchris.mfdz.com

Re: quick poll - how many databases do you have?

Posted by Patrick Antivackis <pa...@gmail.com>.
Depends on the application.
Some just one db, some 4 databases. In fact i split the data depending on
the way i will use replication.

2009/2/21 Geir Magnusson Jr. <ge...@pobox.com>

> yep - that's very much in line w/ my thinking on how people are/will build
> apps
>
>
> On Feb 20, 2009, at 6:40 PM, Jan Lehnardt wrote:
>
>
>> On 21 Feb 2009, at 00:16, Geir Magnusson Jr. wrote:
>>
>>  IOW, do you configure your database server to only handle one database,
>>> or many?
>>>
>>
>> Many. For certain applications it makes sense to have a db per user.
>>
>> Cheers
>> Jan
>> --
>>
>>
>

Re: quick poll - how many databases do you have?

Posted by "Geir Magnusson Jr." <ge...@pobox.com>.
yep - that's very much in line w/ my thinking on how people are/will  
build apps

On Feb 20, 2009, at 6:40 PM, Jan Lehnardt wrote:

>
> On 21 Feb 2009, at 00:16, Geir Magnusson Jr. wrote:
>
>> IOW, do you configure your database server to only handle one  
>> database, or many?
>
> Many. For certain applications it makes sense to have a db per user.
>
> Cheers
> Jan
> --
>


Re: quick poll - how many databases do you have?

Posted by Andreas Pieper <ap...@url.de>.
On Thu, Feb 26, 2009 at 2:45 PM, Jan Lehnardt <ja...@apache.org> wrote:
>
> On 26 Feb 2009, at 14:39, Andreas Pieper wrote:
>
>> let me chime in with another thing that concerns me wrt
>> database-per-user: How do I manage view function updates with this
>> scenario? Do I have to push the same map/reduce code to every
>> database, which is an unbounded number? I feel that deployments can
>> get quite hairy in such a setting.
>
> deployment on 1M users in a single DB will have other issues :)
>
> A common solution is to do "lazy-updating" users as they become
> active in the system while pre-loading your 10% (or whatever) most
> active users.
>
> But that all depends on your application and general advice is not
> really applicable.

You're certainly right. So the answer is "yes" I guess, and there is
no specific support for "shared" map/reduce functions between
databases (which is fine).

andi


> Cheers
> Jan
> --
>
>



-- 
andreas pieper
berlin

Re: quick poll - how many databases do you have?

Posted by Jan Lehnardt <ja...@apache.org>.
On 26 Feb 2009, at 14:39, Andreas Pieper wrote:

> let me chime in with another thing that concerns me wrt
> database-per-user: How do I manage view function updates with this
> scenario? Do I have to push the same map/reduce code to every
> database, which is an unbounded number? I feel that deployments can
> get quite hairy in such a setting.

deployment on 1M users in a single DB will have other issues :)

A common solution is to do "lazy-updating" users as they become
active in the system while pre-loading your 10% (or whatever) most
active users.

But that all depends on your application and general advice is not
really applicable.
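
A rough sketch of the lazy-updating idea, under loud assumptions: each
per-user database is modelled as an in-memory dict, and the design document
carries a hypothetical `app_version` field used to decide whether it is
stale. The names `CURRENT_DDOC`, `ensure_design_doc` and `preload` are
invented here. A real deployment would talk to CouchDB over HTTP and handle
revisions, which this sketch does not attempt.

```python
# The design document every per-user database should eventually hold.
CURRENT_DDOC = {
    "_id": "_design/app",
    "app_version": 2,  # hypothetical marker, not a CouchDB field
    "views": {
        "by_type": {
            "map": "function(doc) { if (doc.type) emit(doc.type, null); }"
        }
    },
}


def ensure_design_doc(db, ddoc=CURRENT_DDOC):
    """Write ddoc into db if it is missing or stale. Return True if written."""
    existing = db.get(ddoc["_id"])
    if existing is not None and existing.get("app_version") == ddoc["app_version"]:
        return False  # already up to date, nothing to push
    db[ddoc["_id"]] = dict(ddoc)
    return True


def preload(databases, activity, fraction=0.10):
    """Eagerly update the most active fraction of users; return names updated."""
    ranked = sorted(databases, key=lambda name: activity.get(name, 0), reverse=True)
    top = ranked[: max(1, int(len(ranked) * fraction))]
    return [name for name in top if ensure_design_doc(databases[name])]
```

Everything outside the pre-loaded set would call `ensure_design_doc` when the
user next becomes active, for example at login.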

Cheers
Jan
--


Re: quick poll - how many databases do you have?

Posted by Andreas Pieper <ar...@googlemail.com>.
Hi,

let me chime in with another thing that concerns me wrt
database-per-user: How do I manage view function updates with this
scenario? Do I have to push the same map/reduce code to every
database, which is an unbounded number? I feel that deployments can
get quite hairy in such a setting.

andi

Re: quick poll - how many databases do you have?

Posted by Brian Candler <B....@pobox.com>.
On Wed, Feb 25, 2009 at 01:03:21AM +1100, Sho Fukamachi wrote:
>
> On 21/02/2009, at 10:40 AM, Jan Lehnardt wrote:
>
>> Many. For certain applications it makes sense to have a db per user.
>
> Careful with that approach if there is any chance of having a lot of  
> users, or you don't control the machine. All the database files go into 
> one directory.

Another problem is if there are lots of writes across many databases.
Separate head seeks will be required for each database.

Re: quick poll - how many databases do you have?

Posted by Jan Lehnardt <ja...@apache.org>.
On 24 Feb 2009, at 21:27, Stefan Karpinski wrote:

> It seems awkward to have to name your databases (external) so that the
> directory structure of the storage system (internal) is happy. Would it
> not make sense for couchdb to automatically store databases in a tree to
> avoid this concern? SHA1 hashing the name of the database and then using
> the first k letters of the hex hash value would make a lot of sense. That
> would allow database names to contain anything you want without having to
> worry about whether the filesystem allows those characters, and it's
> inherently case insensitive. It also would automatically balance the
> number of items in each directory.

It would also create a non-obvious mapping of databases on the
filesystem to databases through the HTTP API (poor admins!). Antony
Blakey proposed a patch to make databases look like
<7bit-ascii-slug>-<unique-hash>, which would address the "anything"
issue, but he never finished it.

But this (and your proposal as well) would still be open to putting
everything in <dbdir>/a/b/c if you craft your db names so that their
hashes start with "abc".

I like the /-trick.

If there's further discussion on this, it should happen on dev@. Thanks.

Cheers
Jan
--




>
> On Tue, Feb 24, 2009 at 7:12 AM, Jan Lehnardt <ja...@apache.org> wrote:
>
>>
>> On 24 Feb 2009, at 15:03, Sho Fukamachi wrote:
>>
>>
>>> On 21/02/2009, at 10:40 AM, Jan Lehnardt wrote:
>>>
>>> Many. For certain applications it makes sense to have a db per user.
>>>>
>>>
>>> Careful with that approach if there is any chance of having a lot of
>>> users, or you don't control the machine. All the database files go
>>> into one directory.
>>>
>>> Personally I consider the "too many files in one directory" hype to be
>>> a little overdone but still wouldn't want more than a few thousand, and
>>> if you're not root, I've seen low limits set by quota systems as well
>>> which could be an issue in managed environments.
>>>
>>> Just something to bear in mind ...
>>>
>>
>>
>> Quoting http://wiki.apache.org/couchdb/HTTP_database_API
>>
>> All database files are stored in a single directory on the file system.
>> If your database name includes a /, CouchDB will create a sub-directory
>> structure in the database directory. That is, for a database named
>> his/her, the database file will be available at $dbdir/his/her.couch.
>> This is useful when you have a large number of databases and your file
>> system does not like that.
>>
>> Cheers
>> Jan
>> --
>>
>>


Re: quick poll - how many databases do you have?

Posted by Stefan Karpinski <st...@gmail.com>.
It seems awkward to have to name your databases (external) so that the
directory structure of the storage system (internal) is happy. Would it not
make sense for couchdb to automatically store databases in a tree to avoid
this concern? SHA1 hashing the name of the database and then using the first
k letters of the hex hash value would make a lot of sense. That would allow
database names to contain anything you want without having to worry about
whether the filesystem allows those characters, and it's inherently case
insensitive. It also would automatically balance the number of items in each
directory.
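
A minimal sketch of the hashing scheme proposed above. This is not something
CouchDB implements; the function name `hashed_db_path`, using one directory
level per hex character, and naming the file after the full digest are all
assumptions made for illustration.

```python
import hashlib
from pathlib import PurePosixPath


def hashed_db_path(db_dir, db_name, k=2):
    """Place a database under k directory levels taken from its SHA1 hex digest."""
    digest = hashlib.sha1(db_name.encode("utf-8")).hexdigest()
    # The first k hex characters each become one directory level. The file
    # name is the full digest, so it never contains characters taken from
    # the database name itself.
    return PurePosixPath(db_dir, *digest[:k], digest + ".couch")


path = hashed_db_path("dbdir", "users/alice@example.com", k=3)
```

The hex digest only uses 0-9 and a-f, so the resulting paths are safe on
case-insensitive filesystems, and with a uniform hash each level fans out
into at most 16 subdirectories. The cost is visible in the code as well: the
file name no longer tells you which database it holds.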

On Tue, Feb 24, 2009 at 7:12 AM, Jan Lehnardt <ja...@apache.org> wrote:

>
> On 24 Feb 2009, at 15:03, Sho Fukamachi wrote:
>
>
>> On 21/02/2009, at 10:40 AM, Jan Lehnardt wrote:
>>
>>  Many. For certain applications it makes sense to have a db per user.
>>>
>>
>> Careful with that approach if there is any chance of having a lot of
>> users, or you don't control the machine. All the database files go into one
>> directory.
>>
>> Personally I consider the "too many files in one directory" hype to be a
>> little overdone but still wouldn't want more than a few thousand, and if
>> you're not root, I've seen low limits set by quota systems as well which
>> could be an issue in managed environments.
>>
>> Just something to bear in mind ...
>>
>
>
> Quoting http://wiki.apache.org/couchdb/HTTP_database_API
>
>  All database files are stored in a single directory on the file system. If
> your database name includes a /, CouchDB will create a sub-directory
> structure in the database directory. That is, for a database named his/her,
> the database file will be available at $dbdir/his/her.couch. This is useful
> when you have a large number of databases and your file system does not
> like that.
>
> Cheers
> Jan
> --
>
>

Re: quick poll - how many databases do you have?

Posted by Jan Lehnardt <ja...@apache.org>.
On 26 Feb 2009, at 11:38, Sho Fukamachi wrote:

>
> On 25/02/2009, at 2:12 AM, Jan Lehnardt wrote:
>>
>> Quoting http://wiki.apache.org/couchdb/HTTP_database_API
>>
>> All database files are stored in a single directory on the file system.
>> If your database name includes a /, CouchDB will create a sub-directory
>> structure in the database directory. That is, for a database named
>> his/her, the database file will be available at $dbdir/his/her.couch.
>> This is useful when you have a large number of databases and your file
>> system does not like that.
>
> Oops. I stand corrected!
>
> Thanks for the heads up, I'd completely missed that (pretty cool)  
> tip...

Thanks for raising this issue, it still needs to be dealt with :)

Cheers
Jan
--


Re: quick poll - how many databases do you have?

Posted by Sho Fukamachi <sh...@gmail.com>.
On 25/02/2009, at 2:12 AM, Jan Lehnardt wrote:
>
> Quoting http://wiki.apache.org/couchdb/HTTP_database_API
>
>  All database files are stored in a single directory on the file system.
> If your database name includes a /, CouchDB will create a sub-directory
> structure in the database directory. That is, for a database named
> his/her, the database file will be available at $dbdir/his/her.couch.
> This is useful when you have a large number of databases and your file
> system does not like that.

Oops. I stand corrected!

Thanks for the heads up, I'd completely missed that (pretty cool) tip...

Sho



> Cheers
> Jan
> --
>


Re: quick poll - how many databases do you have?

Posted by Jan Lehnardt <ja...@apache.org>.
On 24 Feb 2009, at 15:03, Sho Fukamachi wrote:

>
> On 21/02/2009, at 10:40 AM, Jan Lehnardt wrote:
>
>> Many. For certain applications it makes sense to have a db per user.
>
> Careful with that approach if there is any chance of having a lot of  
> users, or you don't control the machine. All the database files go  
> into one directory.
>
> Personally I consider the "too many files in one directory" hype to  
> be a little overdone but still wouldn't want more than a few  
> thousand, and if you're not root, I've seen low limits set by quota  
> systems as well which could be an issue in managed environments.
>
> Just something to bear in mind ...


Quoting http://wiki.apache.org/couchdb/HTTP_database_API

   All database files are stored in a single directory on the file system.
If your database name includes a /, CouchDB will create a sub-directory
structure in the database directory. That is, for a database named
his/her, the database file will be available at $dbdir/his/her.couch.
This is useful when you have a large number of databases and your file
system does not like that.
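
The mapping described in the quoted text can be sketched as follows. This
only mirrors the wiki description and is not CouchDB's own code; the helper
names `db_file_path` and `db_url_segment` and the example directory are
hypothetical. The percent-encoding of the slash in the URL is an assumption
about how such a name would be addressed over HTTP.

```python
from pathlib import PurePosixPath
from urllib.parse import quote


def db_file_path(db_dir, db_name):
    """Return the on-disk path for db_name, one sub-directory per "/" segment."""
    parts = db_name.split("/")
    if any(part in ("", ".", "..") for part in parts):
        raise ValueError(f"invalid database name: {db_name!r}")
    *subdirs, leaf = parts
    return PurePosixPath(db_dir, *subdirs, leaf + ".couch")


def db_url_segment(db_name):
    """Encode the name as a single URL path segment ("/" becomes "%2F")."""
    return quote(db_name, safe="")


path = db_file_path("/var/lib/couchdb", "his/her")
segment = db_url_segment("his/her")
```

With a database-per-user setup, a name such as users/ab/abigail would spread
the files over sub-directories instead of one flat directory.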

Cheers
Jan
--


Re: quick poll - how many databases do you have?

Posted by Sho Fukamachi <sh...@gmail.com>.
On 21/02/2009, at 10:40 AM, Jan Lehnardt wrote:

> Many. For certain applications it makes sense to have a db per user.

Careful with that approach if there is any chance of having a lot of  
users, or you don't control the machine. All the database files go  
into one directory.

Personally I consider the "too many files in one directory" hype to be  
a little overdone but still wouldn't want more than a few thousand,  
and if you're not root, I've seen low limits set by quota systems as  
well which could be an issue in managed environments.

Just something to bear in mind ...

Sho


> Cheers
> Jan
> --
>


Re: quick poll - how many databases do you have?

Posted by Jan Lehnardt <ja...@apache.org>.
On 21 Feb 2009, at 00:16, Geir Magnusson Jr. wrote:

> IOW, do you configure your database server to only handle one  
> database, or many?

Many. For certain applications it makes sense to have a db per user.

Cheers
Jan
--