Posted to user@couchdb.apache.org by Benjamin Smith <li...@benjamindsmith.com> on 2009/06/18 04:46:29 UTC
High volume couchDB?
I recently ran across this project while doing research into Erlang for High
Availability, and from what I can see, it may be exactly what I've been
looking for.
We've been using a clustered filesystem API, developed in-house, for our large
PHP application to store files and related semi-structured data. We see
perhaps 250,000 file operations per day, with 3 nodes storing the data. Think:
server-level RAID 1. Total data size is ~1 TB, growing about 50% per year.
What I'm looking for:
1) Ability to store miscellaneous files. (Mixed: PDFs, JPGs, ISO images, text files, etc.)
2) Ability to store related metadata close by (time stamps, ownership data,
application-specific data, etc.). We do this now by keeping a "sister" file,
with a ".mdt" extension, containing the metadata serialized in PHP format.
3) Redundancy: zero data loss in the event of a server failure. We achieve
this now with our own in-house file server daemon running under xinetd.
Conceptually, it's similar to WebDAV, but lighter weight.
4) Failover: ability to keep working even with partial cluster failure.
5) Healing: ability to get "back together" when downed servers are restored.
6) Performance that degrades gracefully: What happens when the screws get put
to CouchDB? What kinds of loads can it sustain given mid-range hardware?
7) Off-site backups: disaster recovery plans. Currently we're using rsync run
nightly.
8) Reliability: It should "just work" without needing regular babysitting.
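For context on item 2, the sister-file scheme can be sketched roughly like this (a minimal Python illustration; the real implementation is in PHP, and JSON stands in here for PHP's serialize() format — the function names and fields are hypothetical):

```python
import json
import time

def store_with_metadata(path, content, owner, extra=None):
    """Write a file plus a sibling ".mdt" file holding its metadata (sketch)."""
    with open(path, "wb") as f:
        f.write(content)
    metadata = {
        "timestamp": time.time(),  # time stamp
        "owner": owner,            # ownership data
        "extra": extra or {},      # application-specific data
    }
    with open(path + ".mdt", "w") as f:
        json.dump(metadata, f)

def load_metadata(path):
    """Read the sibling ".mdt" metadata for a stored file."""
    with open(path + ".mdt") as f:
        return json.load(f)
```

The appeal of CouchDB here, as I understand it, is that the file (as an attachment) and its metadata (as document fields) could live in one document instead of two loosely coupled files.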
Am I right in reading that CouchDB accomplishes all/most/many of these goals?
If not all of them, which would need watching?
Thanks!