Posted to user@couchdb.apache.org by Der Engel <en...@gmail.com> on 2014/04/08 20:45:33 UTC

Would this be a sane use case for couchdb?

Hi, I'm a newbie trying to write his first real web application. CouchDB
has interested me because of its HTTP API and because the data
model of the app will probably change over time.
The webapp is going to be invoicing software for a certain niche, so I
guess I probably need a DB where transactions are reliable and data
integrity is very good. I don't intend to run a cluster at first, but it
may become a necessity later on.

Would CouchDB be a good fit for this kind of application?

Thanks,
Der

Re: Would this be a sane use case for couchdb?

Posted by Jens Alfke <je...@couchbase.com>.

On Apr 8, 2014, at 11:45 AM, Der Engel <en...@gmail.com> wrote:

> The webapp is going to be invoicing software for a certain niche, so I
> guess I probably need a DB where transactions are reliable and data
> integrity is very good.

CouchDB doesn’t offer transactions, at least not in a traditional sense. While updating a document is atomic, updating multiple documents is not; essentially each document update is its own little transaction. (Even if you use the _bulk_docs command.)

If you can accept that, data integrity is great. The append-only file format is almost immune to corruption, and databases are very easy to back up efficiently (via replication.) The MVCC feature prevents you from accidentally overwriting the wrong version of a document, or from doing a read-edit-write cycle that smashes someone else’s intervening write.
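As a rough illustration of the MVCC behaviour described above, here is a minimal in-memory sketch in Python. It is not CouchDB's implementation: `TinyStore`, `Conflict`, and the revision format are hypothetical names made up for this example. The only idea taken from the paragraph above is that an update must name the revision it was based on, and is rejected if the document has changed since then.

```python
import hashlib
import json


class Conflict(Exception):
    """Raised when an update is based on a stale revision (hypothetical name)."""


class TinyStore:
    """Toy in-memory store that mimics revision-checked updates."""

    def __init__(self):
        self._docs = {}  # doc id -> (rev, body)

    def get(self, doc_id):
        rev, body = self._docs[doc_id]
        return dict(body, _id=doc_id, _rev=rev)

    def put(self, doc):
        doc_id = doc["_id"]
        body = {k: v for k, v in doc.items() if k not in ("_id", "_rev")}
        current = self._docs.get(doc_id)
        current_rev = current[0] if current else None
        # The caller must present the revision it read; otherwise reject.
        if doc.get("_rev") != current_rev:
            raise Conflict(doc_id)
        generation = int(current_rev.split("-")[0]) + 1 if current_rev else 1
        digest = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
        new_rev = f"{generation}-{digest[:8]}"
        self._docs[doc_id] = (new_rev, body)
        return new_rev


store = TinyStore()
store.put({"_id": "invoice-1", "total": 100})

# Two users read the same revision, then both try to write.
alice = store.get("invoice-1")
bob = store.get("invoice-1")

alice["total"] = 120
store.put(alice)  # succeeds: based on the current revision

bob["total"] = 90
try:
    store.put(bob)  # rejected: bob's copy is based on a stale revision
    bob_result = "written"
except Conflict:
    bob_result = "conflict"
```

In this sketch Bob's write is refused rather than silently overwriting Alice's; he would have to re-read the document and retry.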

The lack of transactions may seem bad, but often it just requires thinking differently about your data. In a document database you can aggressively *de*normalize by storing multiple small data items inside a single document, instead of in separate tables; a canonical example is an address card doc that stores arrays of phone numbers and emails. Such a document will always be internally consistent. It’s also possible to create a pseudo transaction as a document that contains a list of the IDs and revisions of other documents that are collectively consistent.
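To make the two ideas above concrete, here is a small sketch in Python. The field names, the ID scheme, and the `is_consistent` helper are all assumptions invented for illustration; nothing here is a CouchDB API. The first dict is a denormalized address card, the second is a "pseudo transaction" document that records which revisions of other documents belong together.

```python
import json

# An address card kept as one document: phone numbers and emails live in
# arrays inside it, so one document update changes them together.
address_card = {
    "_id": "card:jane-doe",  # hypothetical ID scheme
    "name": "Jane Doe",
    "phones": [{"type": "work", "number": "555-0100"}],
    "emails": ["jane@example.com"],
}

# A "pseudo transaction": a document listing the IDs and revisions of other
# documents that are collectively consistent.
manifest = {
    "_id": "txn:2014-04-08-001",
    "members": [
        {"id": "invoice:17", "rev": "3-abc"},
        {"id": "payment:42", "rev": "1-def"},
    ],
}


def is_consistent(manifest, current_revs):
    """Return True if every member is still at the revision the manifest names."""
    return all(current_revs.get(m["id"]) == m["rev"] for m in manifest["members"])


print(json.dumps(address_card, indent=2))
unchanged = is_consistent(manifest, {"invoice:17": "3-abc", "payment:42": "1-def"})
changed = is_consistent(manifest, {"invoice:17": "4-xyz", "payment:42": "1-def"})
```

A reader of the manifest can tell whether the set it describes is still intact: once any member moves to a newer revision, the check fails and the application has to decide what to do.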

If you do get to clustering, with any database, you’ll probably find that you have to lower your standards about transactions anyway. Clustered databases mostly respond to the constraints imposed by the CAP Theorem by relaxing Consistency. Those that don’t (by using shared locks) will instead suffer from lowered Availability or Partition-tolerance.

—Jens

Re: Would this be a sane use case for couchdb?

Posted by Jens Alfke <je...@couchbase.com>.

On Apr 9, 2014, at 6:05 PM, Scott Weber <sc...@sbcglobal.net> wrote:

> Couch refers to each table folder as a 'database', and each doc in it is a 'table'.  This is true from their point of view, because you can just add more data to a doc (add rows).

I’ve never seen anything that compares a CouchDB document to a table. They aren’t like that at all. 
Documents are comparable to rows in a relational database.
CouchDB doesn’t have any equivalent of a table because it doesn’t require rows/documents to have the same structure, and it doesn’t have different ID namespaces comparable to table primary keys.

> But security is limited just to the folder (database).

Read access is limited, yes. But you can have fine-grained write access on a per-user basis or by validating data in documents.

> The closest thing to a 'view' is a JavaScript function that enumerates each of the documents in a folder.  But you can't chain views.  And it can't see anything outside its folder (database).

CouchDB views aren’t really like SQL views, and there is a lot more to them than JavaScript functions. They’re indexes. As indexes they’re more powerful than SQL indexes, but the query capability is more primitive.
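The "views are indexes" point can be sketched in a few lines of Python. This is only an analogy, not how CouchDB stores or collates views: the function names and sample documents are invented, and real view map functions are written in JavaScript. The sketch runs a map function over every document, keeps the emitted rows sorted by key, and answers a key-range query with a binary search instead of a scan.

```python
import bisect


def map_invoice_by_customer(doc):
    """Hypothetical map function: emit (key, value) pairs for one document."""
    if doc.get("type") == "invoice":
        yield [doc["customer"], doc["date"]], doc["total"]


def build_index(docs, map_fn):
    """Run the map function over every document and sort the rows by key."""
    rows = [(key, doc["_id"], value) for doc in docs for key, value in map_fn(doc)]
    rows.sort(key=lambda row: (row[0], row[1]))
    return rows


def query(index, startkey, endkey):
    """Return rows with startkey <= key <= endkey using binary search."""
    keys = [row[0] for row in index]
    lo = bisect.bisect_left(keys, startkey)
    hi = bisect.bisect_right(keys, endkey)
    return index[lo:hi]


docs = [
    {"_id": "a", "type": "invoice", "customer": "acme", "date": "2014-03-01", "total": 100},
    {"_id": "b", "type": "invoice", "customer": "acme", "date": "2014-04-01", "total": 250},
    {"_id": "c", "type": "invoice", "customer": "zenith", "date": "2014-04-02", "total": 75},
    {"_id": "d", "type": "customer", "name": "acme"},
]

index = build_index(docs, map_invoice_by_customer)

# All invoices for one customer: a range over the compound key.
# "\uffff" is used here as a stand-in "high" string for the upper bound.
acme_rows = query(index, ["acme"], ["acme", "\uffff"])
acme_total = sum(value for _key, _doc_id, value in acme_rows)
```

The query side is primitive (exact keys and key ranges only), but the map function can emit whatever keys it likes, which is where the extra power over a plain column index comes from.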

> A document for each employee. It has SSN, payroll, and bunches of other confidential info.  If you stick all the docs in one folder, then you can run a view report (say, for an IRS form 941). But you have to lock it down so only the HR guy can read them, otherwise, every employee could create a simple URL to see everyone's personal data. So now you can't allow a user to update their own data because of the lockdown.

That’s assuming you even wanted to give users direct access to the database. It’s not necessary; you can have an app server that manages the database and has a web interface (or REST API or whatever) for users. Which is what you’d do with a traditional database like MySQL anyway.

I work on a DB server that’s compatible with CouchDB (but not based on it) that does support exactly this kind of fine-grained access control. Contact me off-list if you’re interested (I’ve gotten flak recently for mentioning it here because it’s not strictly speaking CouchDB.)

> And all the code is stored in strings that are embedded in JSON documents, so the strings have to be stripped of CRLF, making the code unreadable.  Kind of like 'minified' script code. There is no easy way to upload it

This is what “CouchApp” development frameworks like Kanso are for — they let you store your resources and view functions in local files and then compile/minify them and upload them to the database for you.
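The general idea behind such tooling can be sketched in Python. This is not how Kanso or any particular tool is implemented; `build_design_doc` and the sample view are made up for illustration, and in practice the source would be read from a local file rather than a string constant. The sketch keeps the JavaScript readable and multi-line, wraps it in a design document, and lets the JSON serializer handle the line breaks.

```python
import json

# Readable, multi-line JavaScript source, as it might sit in a local file.
MAP_SOURCE = """function (doc) {
  if (doc.type === "invoice") {
    emit(doc.customer, doc.total);
  }
}"""


def build_design_doc(name, views):
    """Wrap view sources (view name -> map function source) in a design document."""
    return {
        "_id": "_design/" + name,
        "views": {view: {"map": source} for view, source in views.items()},
    }


design_doc = build_design_doc("invoices", {"by_customer": MAP_SOURCE})
payload = json.dumps(design_doc)

# The serialized payload is a single line; the line breaks in the function
# source are carried as \n escapes and come back intact when parsed.
round_tripped = json.loads(payload)["views"]["by_customer"]["map"]
print(payload)
```

The point of the round trip is that the source does not have to be edited in its one-line form: the file on disk stays readable, and only the uploaded JSON payload carries the escaped version.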

> There isn't anything (or I haven't found anything) like WebStorm, Visual Studio, or Fusion where you can just launch an IDE and start creating.  So the lack of tools hampers development severely.

*Shrug* I’ve never used an IDE like that for web development. There will never be special IDE support for cutting-edge systems because the tools always lag behind; for example, I don’t know if there are IDEs for Rails nowadays, but it used to be everyone used regular editors like TextMate or Emacs. IMHO saying this hampers development is an excuse (kind of like saying you can’t write good songs because you have the wrong brand of guitar.) The early Rails developers created websites like Basecamp that went beyond what anyone had done before, with no special tool support.

Re-reading that, it sounds too much like the grumpy grandpa complaining about how lazy today’s kids are. I recognize that IDEs are great and can make things easier or faster. But they’re not necessary, and there are other advantages to working with a newer system before everyone else catches up to it.

—Jens

Re: Would this be a sane use case for couchdb?

Posted by Scott Weber <sc...@sbcglobal.net>.

Well, this will likely start a flame war, but in my opinion, no.

This is based only on my opinion.  And it comes from my experience as a user who has had to write apps for both SQL and CouchDB (not an expert in either area). So what follows are some disparate thoughts.

Couch is fast.  It stores documents.  Every document is in something akin to a folder, which they call a database.  Think of individual rows (docs) in a table (a database folder).

The whole interface is done with HTTP calls (GET, POST, PUT...).  That is a big learning curve.  And there is a lot of documentation out there, all online, but it's spread out.  If you want to see how to create a view, one place has it.  If you want to figure out how to create a user, it's on a completely different web site.  Be prepared to spend a lot of time searching different sites for answers.  So far, at least, I haven't found any info that differs between two sites when they do cover the same topic.

The mailing list is usually very helpful (unlike a different, unnamed C++ project where I pointed out to the group that a bug caused an access violation crash, and was told how dare I accuse them of having a bug like that).

There is no referential integrity, because there is no foreign key.  And consequently, no transactions.

Couch refers to each table folder as a 'database', and each doc in it is a 'table'.  This is true from their point of view, because you can just add more data to a doc (add rows).  But security is limited just to the folder (database).

The closest thing to a 'view' is a JavaScript function that enumerates each of the documents in a folder.  But you can't chain views.  And it can't see anything outside its folder (database).

Imagine this:  A document for each employee. It has SSN, payroll, and bunches of other confidential info.  If you stick all the docs in one folder, then you can run a view report (say, for an IRS form 941). But you have to lock it down so only the HR guy can read them, otherwise, every employee could create a simple URL to see everyone's personal data. So now you can't allow a user to update their own data because of the lockdown.
Alternatively, you could put every employee in a separate folder (database) and restrict users to their DB folder, but you can't write a server side JS view to pull everyone for the report.
Alternatively, you could break the data up (flatten and normalize it). Then you run the risk of losing referential integrity.

As for writing and debugging JS code, there is no debugger.  The system has evolved only to the level of 'print' to a log.  And all the code is stored in strings that are embedded in JSON documents, so the strings have to be stripped of CRLF, making the code unreadable.  Kind of like 'minified' script code. There is no easy way to upload it, and if it crashes, no easy way to debug, because the error reported for the entire function is simply "Error on line #1".  So be prepared to do a lot of trial and error...

All these issues (security, debugging, uploading, and managing tools) can be worked around. But it takes a lot of effort. So be prepared to write a ton of your own scripts to try to make life easier, because no one really enjoys writing command lines that are 120+ characters.

There isn't anything (or I haven't found anything) like WebStorm, Visual Studio, or Fusion where you can just launch an IDE and start creating.  So the lack of tools hampers development severely.

To conclude my tome, don't believe that I am saying Couch has no value, or that you should stay away from it.  It's a really cool KISS-based database system, especially for those of us who are tired of DBAs who make monster schemas that take up walls to display.  However, like anything you want to learn, start out with something minuscule and learn from there.

As I said, this is only my opinion from my experiences. I hope it doesn't get me excommunicated from the list :-)



