Posted to dev@couchdb.apache.org by Damien Katz <da...@apache.org> on 2008/12/02 20:34:39 UTC

1.0.0 wishlist/roadmap

Here is some stuff I'd like to see in a 1.0.0 release. Everything is  
open for discussion.

- Built-in reduce functions to avoid unnecessary JS overhead -

Count, Sum, Avg, Min, Max, Std dev. others?
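
A minimal sketch of why built-ins help: each of these aggregates can be
computed and re-combined without ever crossing into the JS process. The
Python below is illustrative only, not CouchDB code; all names are
invented.

    # Combinable aggregate for count/sum/min/max (avg = sum/count).
    # A built-in reducer only needs the two folds shown here.
    def reduce_stats(values):
        # Fold raw emitted numbers into a partial aggregate.
        return {"count": len(values), "sum": sum(values),
                "min": min(values), "max": max(values)}

    def rereduce_stats(partials):
        # Combine partial aggregates from different btree nodes.
        return {"count": sum(p["count"] for p in partials),
                "sum": sum(p["sum"] for p in partials),
                "min": min(p["min"] for p in partials),
                "max": max(p["max"] for p in partials)}

    stats = rereduce_stats([reduce_stats([1, 2]), reduce_stats([3, 4])])
    print(stats["sum"] / stats["count"])  # avg = 2.5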

- Restrict database read access -

Right now any user can read any database; we need to be able to
restrict that, at least at the whole-database level.

- Replication performance enhancements -

Adam Kocoloski has some replication patches that greatly improve  
replication performance.

- Revision stemming: It should be possible to limit the number of  
revisions tracked -

By default each document edit produces a revision id that is tracked
indefinitely. This guarantees conflicts versus subsequent edits can
always be distinguished in ad-hoc replication; however, the
forever-growing list of revisions isn't always desirable. This can be
addressed by limiting the number tracked and purging the oldest
revisions. The downside is that if the revision tracking limit is N,
then anyone who hasn't replicated a document since its last N edits
will see a spurious edit conflict.
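
A toy sketch of the stemming rule (illustrative only, not the actual
implementation): keep just the newest N entries of a revision path,
which is exactly what makes ancestors older than N edits unmatchable.

    # Revision history runs oldest to newest; stemming keeps the last N.
    def stem_revisions(rev_history, limit):
        return rev_history[-limit:]

    history = ["1-a", "2-b", "3-c", "4-d", "5-e"]
    print(stem_revisions(history, 3))  # ['3-c', '4-d', '5-e']
    # A replica that last saw "1-a" can no longer find a common
    # ancestor here, so it reports the spurious conflict noted above.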

- Lucene/Full-text indexing integration -

We have this working in side patches; it needs to be integrated into
trunk and with the view engine.

- Incremental document replication -

We need at the minimum the ability to incrementally replicate only the
attachments that have changed in a document. This will save lots of
network IO, and CouchDB can be a version control system with document
diffs added as attachments.

This can work for document fields too, but the overhead may not be  
worth it.
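
The core decision is simple to sketch (names invented; this is not the
replicator's actual code): compare per-attachment content digests on
source and target, and send only the ones that differ.

    # Decide which attachments to copy by comparing content digests.
    # 'source' and 'target' map attachment name -> digest.
    def attachments_to_replicate(source, target):
        return [name for name, digest in source.items()
                if target.get(name) != digest]

    src = {"diff-001": "md5-aaa", "diff-002": "md5-bbb"}
    tgt = {"diff-001": "md5-aaa"}
    print(attachments_to_replicate(src, tgt))  # ['diff-002']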

- Built-in authentication module(s) -

The ability to host a CouchDB database used for HTTP authentication
schemes. If storing passwords, they would need to be stored encrypted
and decrypted on demand by the authentication process.
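
A hedged sketch of "stored encrypted, decrypted on demand", using the
third-party Python 'cryptography' package purely for illustration; how
CouchDB would actually do this is not specified in this thread.

    # Illustration only: reversible encryption of a stored credential,
    # decrypted on demand by the authentication process.
    # Requires: pip install cryptography
    from cryptography.fernet import Fernet

    key = Fernet.generate_key()    # server-side secret, kept out of the db
    f = Fernet(key)
    stored = f.encrypt(b"s3cret")  # what would live in the auth database
    assert f.decrypt(stored) == b"s3cret"  # done at authentication time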

- View server enhancements (stale/partial index option) -

Chris Anderson has a side branch for this we need to finish and put  
into trunk.

- View index compaction -

View indexes grow forever, and need to be compacted in a similar way
to how the storage files are compacted. This work will tie into the
View Server enhancements.

- Document integrity/deterministic revid -

For the sake of end-to-end document integrity, we need a way to hash a
document's contents, and since we already have revision ids, I think
the revision ids should be the hashes. The hashed document should be a
canonical json representation, and it should have the _id and _rev
fields in it. The _rev will be the PREVIOUS revision ID/hash the edit
is based on, or blank for a new edit. Then the _rev is replaced with
the new hash value.
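
A sketch of that scheme (sorted keys and compact separators stand in
for the unspecified canonical form, and the hash choice is likewise an
assumption):

    # Deterministic revid: hash a canonical JSON form of the document
    # in which _rev holds the PREVIOUS revision ("" for a new edit),
    # then replace _rev with the resulting hash.
    import hashlib, json

    def next_rev(doc, prev_rev=""):
        to_hash = dict(doc, _rev=prev_rev)  # _id stays in; _rev = parent
        canonical = json.dumps(to_hash, sort_keys=True,
                               separators=(",", ":"))
        return hashlib.sha1(canonical.encode("utf-8")).hexdigest()

    doc = {"_id": "mydoc", "body": "hello"}
    doc["_rev"] = next_rev(doc)               # first edit: blank parent
    doc["_rev"] = next_rev(doc, doc["_rev"])  # next edit chains the hash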

- Fully tail append writes -

CouchDB uses zero-overwrite storage, but not fully tail-append storage.
Document json bodies are stored in internal buffers, written
consecutively, one after another, until the buffer is completely full;
then another buffer is created at the end of the file for more
documents. File attachments are written to similar buffers as well.
Btree updates are always tail-append: each update to a btree, even if
it's a deletion, causes new writes to the end of the file. Once the
document, attachments and indexes are committed (fsync), the header is
then written and flushed to disk, and that is always stored right at
the beginning of the file (requiring another seek).

Document updates to CouchDB require 2 fsyncs with ~3 seeks for full
committal and index consistency. This is true whether you write 1 or
1000 documents in a single transaction (bulk update); you still need
~3 seeks. Using conventional transaction journalling, it's possible to
get the committal down to a single seek and fsync, and worry about
ensuring file and index consistency asynchronously, often in batch
mode with other committed updates. This can perform very well, but it
has downsides like extra complexity and increased memory usage as data
is cached waiting to be flushed to disk, and it must do special
consistency checks and fix-ups on startup if there is a crash.

If CouchDB used tail-append storage for everything, then all document
updates could be completely flushed, with full file consistency, with
a single seek and, depending on the file system, a single fsync. All
the disk updates (documents, file attachments, indexes and file
header) occur as appends to the end of the file.

The biggest changes will be in how file attachments and the headers
are written and read, and in the performance characteristics of view
indexing, as documents will no longer be packed into contiguous buffers.

File attachments will be written in chunks, with the last chunk being
an index to the other chunks.

Headers will be specially signed blocks written to the end of the  
file. Reading the header on database open will require scanning the  
file from the end, since the file might have partial updates that  
didn't complete since the last update.
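
A simplified sketch of that backward scan (the block size, signature,
and layout are invented here; the real format would differ and would
also checksum the header):

    import os

    BLOCK = 4096
    MAGIC = b"couch-hdr!"  # invented signature marking a header block

    def find_last_header(path):
        size = os.path.getsize(path)
        with open(path, "rb") as fd:
            pos = (size // BLOCK) * BLOCK
            while pos >= 0:
                fd.seek(pos)
                block = fd.read(BLOCK)
                if block.startswith(MAGIC):  # last committed header
                    return pos, block
                pos -= BLOCK  # skip trailing partial/unsigned writes
        raise IOError("no valid header found")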

The performance of the views will be impacted as the documents are  
more likely to be fragmented across the storage file. But they will  
still be in the order they will be accessed for indexing, so the read  
seeks are always moving forward. Also, the act of compacting the  
storage file will result in the documents being tightly packed again.

- Streaming document updates with attachment writes -

Using mime multipart encoding, it should be possible to send all parts
of a document in a single http request, with the json and binary
attachments sent as different mime parts. Attachments can be streamed
to disk as bytes are received, keeping total memory overhead to a
minimum. Attachments can also be written to disk in compressed format
and served over http by default in that compressed format, using no
CPU for compression at read time, though decompression is required if
the client doesn't support the compression format.
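
A sketch of what such a request body could look like; the part layout
and the multipart/related content type are assumptions, since this API
did not exist yet when this was written.

    import json

    boundary = "abc123"
    doc = {"_id": "mydoc",
           "_attachments": {"photo.jpg": {"follows": True}}}
    photo = b"...jpeg bytes..."

    # JSON part first, then the raw attachment bytes as a second part.
    body = (("--%s\r\nContent-Type: application/json\r\n\r\n%s\r\n"
             "--%s\r\nContent-Type: image/jpeg\r\n\r\n")
            % (boundary, json.dumps(doc), boundary)).encode("ascii") \
           + photo + ("\r\n--%s--\r\n" % boundary).encode("ascii")
    # PUT 'body' in one request with
    # Content-Type: multipart/related; boundary=abc123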


- Partitioning/Clustering Support -

Clustering for failover and load balancing is a priority. Large
database support via partitioning may not make 1.0.





Re: 1.0.0 wishlist/roadmap

Posted by Antony Blakey <an...@gmail.com>.
On 03/12/2008, at 6:04 AM, Damien Katz wrote:

> - Incremental document replication -
>
> We need at the minimum the ability to incrementally replicate only
> the attachments that have changed in a document. This will save lots
> of network IO, and CouchDB can be a version control system with
> document diffs added as attachments.

And differentially replicate such attachments where possible.

- A plugin mechanism for deployment -

Allowing separately compiled and deployed extensions to Couch.

I have a branch here: http://github.com/AntonyBlakey/couchdb/tree/external
  that includes both the simplest possible plugin mechanism and a
cleaned version of Paul Davis's external that includes update_seq in
the external protocol (and changes db -> db_name for consistency).

Antony Blakey
-------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

Borrow money from pessimists - they don't expect it back.
   -- Steven Wright



Re: 1.0.0 wishlist/roadmap

Posted by Jan Lehnardt <ja...@apache.org>.
  - Statistics: A module that collects runtime statistics (how many
hits, etc.) and exports them to Futon and other tools for inspection,
and to SNMP for monitoring.

Cheers
Jan
--


Re: 1.0.0 wishlist/roadmap

Posted by Noah Slater <ns...@apache.org>.
On Fri, Dec 05, 2008 at 05:52:38PM +0100, Benoit Chesneau wrote:
> I think I'm just tired of seeing such answers every time you talk about features
> in an open-source project ;)

Well, it's a nice way of nudging people into contributing. With the exception of
Damien, we were all casual users at one point.

-- 
Noah Slater, http://tumbolia.org/nslater

Re: 1.0.0 wishlist/roadmap

Posted by Benoit Chesneau <bc...@gmail.com>.
On Fri, Dec 5, 2008 at 8:15 PM, Chris Anderson <jc...@apache.org> wrote:
> On Fri, Dec 5, 2008 at 8:52 AM, Benoit Chesneau <bc...@gmail.com> wrote:
>> Anyway, so the tool should be in Erlang? I take it. If possible, I would
>> like it not to rely on the HTTP API for this one, and just add a function
>> that could be used in the Erlang shell and then called by any script.
>>
>
> The tool could also be written as a couchjs script, which could just
> put the data to stdout. However, that would be using the HTTP api.
> (couchjs could use a little work to make command-line arguments
> available in the runtime, but even without changes it should be
> possible to build a proof-of-concept)
>
>
> --
> Chris Anderson
> http://jchris.mfdz.com
>

Well, the command line is needed here. I will give it a shot this
weekend, and may bug the ML and IRC if I have any questions.

-- 
- benoît

Re: 1.0.0 wishlist/roadmap

Posted by Antony Blakey <an...@gmail.com>.
On 06/12/2008, at 8:49 AM, Chris Anderson wrote:

> On Fri, Dec 5, 2008 at 2:08 PM, Antony Blakey  
> <an...@gmail.com> wrote:
>>
>> Backup/Restore are trivial to write, and an Erlang plugin will be  
>> easy
>
> An HTTP request against
> /db/_all_docs_by_seq?include_docs=true&all_revs=true* could be
> streamed to a file. The result would be a single JSON list of all the
> uncompacted changes. I wonder if we could get mochiweb to gzip this by
> setting the right headers?

Doesn't that leave out non-inline attachments?

Antony Blakey
--------------------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

Always have a vision. Why spend your life making other people’s dreams?
  -- Orson Welles (1915-1985)


Re: 1.0.0 wishlist/roadmap

Posted by Jedediah Smith <je...@silencegreys.com>.
Using the actual view functions would be awfully convenient. People are
going to want Comet feeds of their fancy generated views. Having to
rebuild them on the client would be a letdown.

How about a view parameter that adds a seq field to each row and orders
results by seq instead of key (but you still get the key)? Could that
even work for reduced views, where seq would be reduced with max()?

With any Comet scheme, wouldn't you want to return seq with all DB reads
so that you can sync up your Comet-gotten data with non-Comet data?

Chris Anderson wrote:
> On Thu, Dec 11, 2008 at 12:58 PM, Damien Katz <da...@apache.org> wrote:
>> I'm thinking of removing the _all_docs_by_seq HTTP view and replacing
>> it with something that will allow for more flexibility and also allow
>> for Comet-like events, by providing a filter function that finds
>> documents meeting given criteria, with immediate notification when new
>> documents that meet those criteria are saved. This is meant to be used
>> by the replicator and external indexers; trying to also make it act
>> like a regular view while supporting all the other stuff is pointlessly
>> complex.
>>
> 
> I think if we support an HTTP resource that provides GET and HEAD
> responses, representing a range of documents, in a JSON format like
> the one views use now, I'd call it a view.
> 
> GET /db/_by_seq?since=200&full=true
> 
> could return all changes since seq 200. And if a client makes the
> request when the db seq is 150, the response will just hang (like
> Comet) until 51 seq #s are incremented. We could optimize with a batch
> parameter, so the client could request that CouchDB never send less
> than N seqs in a single response. Or there could be an option to keep
> the connection open while waiting for new seqs to increment. (I think
> I'm describing a system Damien remarked about in IRC.)
> 
> I guess I think there's some gain to be had if the replication stream
> JSON format is similar to what comes back from a view request. For one
> thing, it would make it simpler to implement other non-CouchDB
> replication clients, which is a good thing.
> 

Re: 1.0.0 wishlist/roadmap

Posted by Chris Anderson <jc...@apache.org>.
On Thu, Dec 11, 2008 at 12:58 PM, Damien Katz <da...@apache.org> wrote:
>
> I'm thinking of removing the _all_docs_by_seq HTTP view and replacing
> it with something that will allow for more flexibility and also allow
> for Comet-like events, by providing a filter function that finds
> documents meeting given criteria, with immediate notification when new
> documents that meet those criteria are saved. This is meant to be used
> by the replicator and external indexers; trying to also make it act
> like a regular view while supporting all the other stuff is pointlessly
> complex.
>

I think if we support an HTTP resource that provides GET and HEAD
responses, representing a range of documents, in a JSON format like
the one views use now, I'd call it a view.

GET /db/_by_seq?since=200&full=true

could return all changes since seq 200. And if a client makes the
request when the db seq is 150, the response will just hang (like
Comet) until 51 seq #s are incremented. We could optimize with a batch
parameter, so the client could request that CouchDB never send less
than N seqs in a single response. Or there could be an option to keep
the connection open while waiting for new seqs to increment. (I think
I'm describing a system Damien remarked about in IRC.)

I guess I think there's some gain to be had if the replication stream
JSON format is similar to what comes back from a view request. For one
thing, it would make it simpler to implement other non-CouchDB
replication clients, which is a good thing.
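
A client-side sketch of consuming such a feed; the endpoint, parameters
and row shape are the proposal above, not a shipped API:

    import json
    from urllib.request import urlopen

    def follow_changes(base, since=0):
        while True:
            # Blocks, Comet-style, until seqs past 'since' exist.
            url = "%s/db/_by_seq?since=%d&full=true" % (base, since)
            with urlopen(url) as resp:
                rows = json.load(resp)["rows"]
            for row in rows:
                since = max(since, row["seq"])  # checkpoint as we go
                yield row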

-- 
Chris Anderson
http://jchris.mfdz.com

Re: 1.0.0 wishlist/roadmap

Posted by Jan Lehnardt <ja...@apache.org>.
On 11 Dec 2008, at 22:12, David Schoonover wrote:

> The wishlist is long, but the two missing killer features for me are:
> - Native Erlang API

You can already use the internal API, but it is going to be changed and
polished for 1.0.


> - DB Sharding

(CouchDB)-client sharding is possible today. DB partitioning transparent
to a client is not planned for 1.0.

Cheers
Jan
--


>
>
> On Thu, Dec 11, 2008 at 4:02 PM, Antony Blakey <antony.blakey@gmail.com 
> > wrote:
>>
>> On 12/12/2008, at 7:28 AM, Damien Katz wrote:
>>
>>>
>>> On Dec 11, 2008, at 3:43 PM, Chris Anderson wrote:
>>>
>>>> On Fri, Dec 5, 2008 at 2:30 PM, Damien Katz <da...@apache.org>  
>>>> wrote:
>>>>>
>>>>> Yes yes yes! This is exactly how it should work, pull all the  
>>>>> docs in a
>>>>> single http request and also record the last seq num. Then later
>>>>> incrementally pull new changes using the seq num, lather rinse  
>>>>> repeat.
>>>>> Restore by POSTing the docs in bulk updates.
>>>>>
>>>>
>>>> With the same view (_all_docs_by_seq with super_include_docs),  
>>>> could
>>>> recipient-triggered replication be accomplished in a single HTTP
>>>> request? This might speed up replication a bunch as well.
>>>>
>>>
>>> I'm thinking of removing the _all_docs_by_seq HTTP view and replacing
>>> it with something that will allow for more flexibility and also allow
>>> for Comet-like events, by providing a filter function that finds
>>> documents meeting given criteria, with immediate notification when new
>>> documents that meet those criteria are saved. This is meant to be used
>>> by the replicator and external indexers; trying to also make it act
>>> like a regular view while supporting all the other stuff is pointlessly
>>> complex.
>>
>> Mmmm, tasty!
>>
>> Antony Blakey
>> --------------------------
>> CTO, Linkuistics Pty Ltd
>> Ph: 0438 840 787
>>
>> Success is not the key to happiness. Happiness is the key to success.
>> -- Albert Schweitzer
>>
>>
>
>
>
> -- 
> LOVE DAVE
>


Re: 1.0.0 wishlist/roadmap

Posted by David Schoonover <da...@gmail.com>.
The wishlist is long, but the two missing killer features for me are:
- Native Erlang API
- DB Sharding

On Thu, Dec 11, 2008 at 4:02 PM, Antony Blakey <an...@gmail.com> wrote:
>
> On 12/12/2008, at 7:28 AM, Damien Katz wrote:
>
>>
>> On Dec 11, 2008, at 3:43 PM, Chris Anderson wrote:
>>
>>> On Fri, Dec 5, 2008 at 2:30 PM, Damien Katz <da...@apache.org> wrote:
>>>>
>>>> Yes yes yes! This is exactly how it should work, pull all the docs in a
>>>> single http request and also record the last seq num. Then later
>>>> incrementally pull new changes using the seq num, lather rinse repeat.
>>>> Restore by POSTing the docs in bulk updates.
>>>>
>>>
>>> With the same view (_all_docs_by_seq with super_include_docs), could
>>> recipient-triggered replication be accomplished in a single HTTP
>>> request? This might speed up replication a bunch as well.
>>>
>>
>> I'm thinking of removing the _all_docs_by_seq HTTP view and replacing
>> it with something that will allow for more flexibility and also allow
>> for Comet-like events, by providing a filter function that finds
>> documents meeting given criteria, with immediate notification when new
>> documents that meet those criteria are saved. This is meant to be used
>> by the replicator and external indexers; trying to also make it act
>> like a regular view while supporting all the other stuff is pointlessly
>> complex.
>
> Mmmm, tasty!
>
> Antony Blakey
> --------------------------
> CTO, Linkuistics Pty Ltd
> Ph: 0438 840 787
>
> Success is not the key to happiness. Happiness is the key to success.
>  -- Albert Schweitzer
>
>



-- 
LOVE DAVE

Re: 1.0.0 wishlist/roadmap

Posted by Antony Blakey <an...@gmail.com>.
On 12/12/2008, at 7:28 AM, Damien Katz wrote:

>
> On Dec 11, 2008, at 3:43 PM, Chris Anderson wrote:
>
>> On Fri, Dec 5, 2008 at 2:30 PM, Damien Katz <da...@apache.org>  
>> wrote:
>>> Yes yes yes! This is exactly how it should work, pull all the docs  
>>> in a
>>> single http request and also record the last seq num. Then later
>>> incrementally pull new changes using the seq num, lather rinse  
>>> repeat.
>>> Restore by POSTing the docs in bulk updates.
>>>
>>
>> With the same view (_all_docs_by_seq with super_include_docs), could
>> recipient-triggered replication be accomplished in a single HTTP
>> request? This might speed up replication a bunch as well.
>>
>
> I'm thinking of removing the _all_docs_by_seq HTTP view and replacing
> it with something that will allow for more flexibility and also allow
> for Comet-like events, by providing a filter function that finds
> documents meeting given criteria, with immediate notification when new
> documents that meet those criteria are saved. This is meant to be used
> by the replicator and external indexers; trying to also make it act
> like a regular view while supporting all the other stuff is pointlessly
> complex.

Mmmm, tasty!

Antony Blakey
--------------------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

Success is not the key to happiness. Happiness is the key to success.
  -- Albert Schweitzer


Re: 1.0.0 wishlist/roadmap

Posted by Damien Katz <da...@apache.org>.
On Dec 11, 2008, at 3:43 PM, Chris Anderson wrote:

> On Fri, Dec 5, 2008 at 2:30 PM, Damien Katz <da...@apache.org> wrote:
>> Yes yes yes! This is exactly how it should work, pull all the docs  
>> in a
>> single http request and also record the last seq num. Then later
>> incrementally pull new changes using the seq num, lather rinse  
>> repeat.
>> Restore by POSTing the docs in bulk updates.
>>
>
> With the same view (_all_docs_by_seq with super_include_docs), could
> recipient-triggered replication be accomplished in a single HTTP
> request? This might speed up replication a bunch as well.
>

I'm thinking of removing the _all_docs_by_seq HTTP view and replacing
it with something that will allow for more flexibility and also allow
for Comet-like events, by providing a filter function that finds
documents meeting given criteria, with immediate notification when new
documents that meet those criteria are saved. This is meant to be used
by the replicator and external indexers; trying to also make it act
like a regular view while supporting all the other stuff is pointlessly
complex.

-Damien

Re: 1.0.0 wishlist/roadmap

Posted by Chris Anderson <jc...@apache.org>.
On Fri, Dec 5, 2008 at 2:30 PM, Damien Katz <da...@apache.org> wrote:
> Yes yes yes! This is exactly how it should work, pull all the docs in a
> single http request and also record the last seq num. Then later
> incrementally pull new changes using the seq num, lather rinse repeat.
> Restore by POSTing the docs in bulk updates.
>

With the same view (_all_docs_by_seq with super_include_docs), could
recipient-triggered replication be accomplished in a single HTTP
request? This might speed up replication a bunch as well.

-- 
Chris Anderson
http://jchris.mfdz.com

Re: 1.0.0 wishlist/roadmap

Posted by Damien Katz <da...@apache.org>.
Yes yes yes! This is exactly how it should work, pull all the docs in  
a single http request and also record the last seq num. Then later  
incrementally pull new changes using the seq num, lather rinse repeat.  
Restore by POSTing the docs in bulk updates.
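
As a sketch of that loop (the endpoints and parameters follow this
thread's proposals, not any shipped API, and error handling is elided):

    import json
    from urllib.request import urlopen, Request

    def dump(base, since, out_path):
        # Pull everything newer than 'since'; return the new checkpoint.
        url = ("%s/db/_all_docs_by_seq?include_docs=true&startkey=%d"
               % (base, since))
        with urlopen(url) as resp:
            rows = json.load(resp)["rows"]
        with open(out_path, "w") as out:
            json.dump(rows, out)
        return max((r["key"] for r in rows), default=since)

    def restore(base, in_path):
        # POST the saved docs back in one bulk update.
        docs = [r["doc"] for r in json.load(open(in_path))]
        req = Request("%s/db/_bulk_docs" % base,
                      data=json.dumps({"docs": docs}).encode("utf-8"),
                      headers={"Content-Type": "application/json"})
        urlopen(req).read()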

-Damien


On Dec 5, 2008, at 5:19 PM, Chris Anderson wrote:

> On Fri, Dec 5, 2008 at 2:08 PM, Antony Blakey  
> <an...@gmail.com> wrote:
>>
>> Backup/Restore are trivial to write, and an Erlang plugin will be  
>> easy
>
> An HTTP request against
> /db/_all_docs_by_seq?include_docs=true&all_revs=true* could be
> streamed to a file. The result would be a single JSON list of all the
> uncompacted changes. I wonder if we could get mochiweb to gzip this by
> setting the right headers?
>
> * I think we might need to add more power to include_docs before it
> can pull every _rev, but it's worth a shot.
>
> Chris
>
> -- 
> Chris Anderson
> http://jchris.mfdz.com


Re: 1.0.0 wishlist/roadmap

Posted by Chris Anderson <jc...@apache.org>.
On Fri, Dec 5, 2008 at 2:08 PM, Antony Blakey <an...@gmail.com> wrote:
>
> Backup/Restore are trivial to write, and an Erlang plugin will be easy

An HTTP request against
/db/_all_docs_by_seq?include_docs=true&all_revs=true* could be
streamed to a file. The result would be a single JSON list of all the
uncompacted changes. I wonder if we could get mochiweb to gzip this by
setting the right headers?

* I think we might need to add more power to include_docs before it
can pull every _rev, but it's worth a shot.

Chris

-- 
Chris Anderson
http://jchris.mfdz.com

Re: 1.0.0 wishlist/roadmap

Posted by Antony Blakey <an...@gmail.com>.
On 06/12/2008, at 5:45 AM, Chris Anderson wrote:

> On Fri, Dec 5, 2008 at 8:52 AM, Benoit Chesneau  
> <bc...@gmail.com> wrote:
>> Anyway, so the tool should be in Erlang? I take it. If possible, I would
>> like it not to rely on the HTTP API for this one, and just add a function
>> that could be used in the Erlang shell and then called by any script.
>>
>
> The tool could also be written as a couchjs script, which could just
> put the data to stdout. However, that would be using the HTTP api.
> (couchjs could use a little work to make command-line arguments
> available in the runtime, but even without changes it should be
> possible to build a proof-of-concept)

Backup/Restore are trivial to write, and an Erlang plugin will be easy  
- it could respond with a tar or zip format stream over HTTP. The  
reverse could happen for restore, although I've had problems with  
Couch accepting chunked format input, which you would want. I've not  
checked whether Couch fully streams in both directions.

Anyway, I have a commercial requirement for this, so I'll do it unless  
someone else gets there before me.

Antony Blakey
-------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

A reasonable man adapts himself to suit his environment. An  
unreasonable man persists in attempting to adapt his environment to  
suit himself. Therefore, all progress depends on the unreasonable man.
   -- George Bernard Shaw



Re: 1.0.0 wishlist/roadmap

Posted by Chris Anderson <jc...@apache.org>.
On Fri, Dec 5, 2008 at 8:52 AM, Benoit Chesneau <bc...@gmail.com> wrote:
> Anyway, so the tool should be in Erlang? I take it. If possible, I would
> like it not to rely on the HTTP API for this one, and just add a function
> that could be used in the Erlang shell and then called by any script.
>

The tool could also be written as a couchjs script, which could just
put the data to stdout. However, that would be using the HTTP api.
(couchjs could use a little work to make command-line arguments
available in the runtime, but even without changes it should be
possible to build a proof-of-concept)


-- 
Chris Anderson
http://jchris.mfdz.com

Re: 1.0.0 wishlist/roadmap

Posted by Benoit Chesneau <bc...@gmail.com>.
On Fri, Dec 5, 2008 at 5:32 PM, Noah Slater <ns...@apache.org> wrote:
> On Fri, Dec 05, 2008 at 05:23:36PM +0100, Benoit Chesneau wrote:
>> > You mean, written in Erlang? We don't want to add any more dependencies.
>>
>> That's exactly why it could be in the release, not another tool/dependency
>> to install.
>
> No, what I mean is... the current tools are written in Python so we would need
> an Erlang replacement for them. I was simply being explicit about what the
> requirements would be.
>
>> > So, patches welcome!
>> >
>>
>> Maybe I missed the point; I thought we were talking about features for 1.0.
>> I know that patches are welcome, thanks for reminding me anyway.
>
> Hey, I'm just being jovial! If you really want this, you could send us a patch
> and it would easily make the next release.
>

I think I'm just tired of seeing such answers every time you talk about
features in an open-source project ;)

Anyway, so the tool should be in Erlang? I take it. If possible, I would
like it not to rely on the HTTP API for this one, and just add a function
that could be used in the Erlang shell and then called by any script.

- benoît

Re: 1.0.0 wishlist/roadmap

Posted by Noah Slater <ns...@apache.org>.
On Fri, Dec 05, 2008 at 05:23:36PM +0100, Benoit Chesneau wrote:
> > You mean, written in Erlang? We don't want to add any more dependencies.
>
> That's exactly why it could be in the release, not another tool/dependency
> to install.

No, what I mean is... the current tools are written in Python so we would need
an Erlang replacement for them. I was simply being explicit about what the
requirements would be.

> > So, patches welcome!
> >
>
> Maybe I missed the point; I thought we were talking about features for 1.0.
> I know that patches are welcome, thanks for reminding me anyway.

Hey, I'm just being jovial! If you really want this, you could send us a patch
and it would easily make the next release.

-- 
Noah Slater, http://tumbolia.org/nslater

Re: 1.0.0 wishlist/roadmap

Posted by Benoit Chesneau <bc...@gmail.com>.
On Fri, Dec 5, 2008 at 4:17 PM, Noah Slater <ns...@apache.org> wrote:
> On Fri, Dec 05, 2008 at 04:08:16PM +0100, Benoit Chesneau wrote:
>> I would like to have some dump/restore tools *provided with* the couchdb
>> release. I know there are external tools, but it would be better if
>> something were included with couchdb.
>
> You mean, written in Erlang? We don't want to add any more dependencies.

That's exactly why it could be in the release, not another tool/dependency
to install.

> So, patches welcome!
>

Maybe I missed the point; I thought we were talking about features for 1.0.
I know that patches are welcome, thanks for reminding me anyway.


- benoît

Re: 1.0.0 wishlist/roadmap

Posted by Noah Slater <ns...@apache.org>.
On Fri, Dec 05, 2008 at 04:08:16PM +0100, Benoit Chesneau wrote:
> I would like to have some dump/restore tools *provided with* the couchdb
> release. I know there are external tools, but it would be better if
> something were included with couchdb.

You mean, written in Erlang? We don't want to add any more dependencies.

So, patches welcome!

-- 
Noah Slater, http://tumbolia.org/nslater

Re: 1.0.0 wishlist/roadmap

Posted by Benoit Chesneau <bc...@gmail.com>.
I would like to have some dump/restore tools *provided with* the couchdb
release. I know there are external tools, but it would be better if
something were included with couchdb.



- benoît

Re: 1.0.0 wishlist/roadmap

Posted by Jan Lehnardt <ja...@apache.org>.
On 3 Dec 2008, at 12:47, Volker Mische wrote:

> An additional feature would be that you can return any arbitrary  
> JSON to
> the view that will be attached to the resulting document. An example
> would be returning a distance between a point specified in the query  
> and
> a geometry in a document.

as opposed to the "rank" the protocol uses now, which is "limited" to
full-text search.

+1

Cheers
Jan
--


Re: 1.0.0 wishlist/roadmap

Posted by Jan Lehnardt <ja...@apache.org>.
On 3 Dec 2008, at 12:51, Noah Slater wrote:

> On Wed, Dec 03, 2008 at 10:47:11PM +1100, Volker Mische wrote:
>> An additional feature would be that you can return any arbitrary  
>> JSON to the
>> view that will be attached to the resulting document. An example  
>> would be
>> returning a distance between a point specified in the query and a  
>> geometry in
>> a document.
>
> I'm not sure I understand what it means to attach something to a  
> JSON document.

This is about the protocol CouchDB uses to talk to external indexers.
It is line-based text now, and will be line-based JSON with the
external2 branch. It currently hardcodes that the document id and a
rank are returned; this should be more flexible.
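
As a sketch of what line-based JSON between CouchDB and an indexer
could look like (every field name below is invented; the external2
branch defines the real shapes):

    import json
    import sys

    for line in sys.stdin:
        query = json.loads(line)  # one JSON request object per line
        # ...look up 'query' in the external index; stubbed here...
        rows = [{"id": "doc1", "json": {"distance": 4.2}}]
        sys.stdout.write(json.dumps({"rows": rows}) + "\n")
        sys.stdout.flush()  # line-based protocols must flush per line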

Cheers
Jan
--

Re: Introduction

Posted by David Pratt <fa...@gmail.com>.
Hi Jan and Johan. I can confirm Safari 3 with Futon has been working
fine for me on mac. I think it's jQuery that is used in Futon, which
claims broad support for all major browsers. That said, it should only
be a matter of fixing bugs to achieve broader browser compatibility.

Regards,
David

On Dec 3, 2008, at 10:19 AM, Jan Lehnardt wrote:

> Hi Johan,
>
> welcome to CouchDB :) Security & authentication landed in trunk two
> weeks ago, but we are just compiling a list of things we want to
> have done for the upcoming 0.9 and the later 1.0 release. Feel free
> to contribute to the discussion and pick something to implement or
> fix :) There's also the bug tracker:
> https://issues.apache.org/jira/browse/COUCHDB
>
> Any help is greatly appreciated.
>
> Regarding Futon: We only support Firefox 3, but Safari should work,  
> too. Patches for other browsers are welcome :)
>
> Cheers
> Jan
> --


Re: Introduction

Posted by Jan Lehnardt <ja...@apache.org>.
Hi Johan,

welcome to CouchDB :) Security & authentication landed in trunk two
weeks ago, but we are just compiling a list of things we want to have
done for the upcoming 0.9 and the later 1.0 release. Feel free to
contribute to the discussion and pick something to implement or fix :)
There's also the bug tracker: https://issues.apache.org/jira/browse/COUCHDB

Any help is greatly appreciated.

Regarding Futon: We only support Firefox 3, but Safari should work,  
too. Patches for other browsers are welcome :)

Cheers
Jan
--




Introduction

Posted by Johan Montelius <jo...@kth.se>.
Hi,

I'm new to this list and new to couchDB. I'm a senior lecturer at the Royal
Institute of Technology - KTH in Stockholm, Sweden, where I among other
things give courses in distributed systems and Internet security. Since I
have an interest in Erlang being used in distributed applications I
stumbled on couchDB and it got me interested. I hope to be able to
contribute to this project, but it will of course take me some time to get
familiar with the code and how things are linked together. I saw that
authentication is on the wish list and this might be something that I might
be able to contribute to. Also the replication scheme and fault tolerance
are of interest, as well as parallelisation. I'll start by using the system
to get a better understanding of what is actually provided and how.

btw: I'm an Opera user and something is obviously in conflict there; I
tried Konqueror also without success. I guess tweaking around different
browsers' JavaScript engines is not a high priority, but if you think this
is worth looking into I could give it a shot - otherwise I'll use FF like
everyone else.

   Johan


-- 
KTH ICT
Johan Montelius

"Slåttern den var bärgad, och bilen likaså. Jag trodde att den gick men  
det var jag som måste gå."  Det var samma dag som brandstation brann ner -  
Svenne Rubins

Re: 1.0.0 wishlist/roadmap

Posted by Volker Mische <vo...@gmail.com>.
Chris Anderson wrote:
> On Wed, Dec 3, 2008 at 5:20 AM, Volker Mische <vo...@gmail.com> wrote:
>> Noah Slater wrote:
>>> On Wed, Dec 03, 2008 at 10:47:11PM +1100, Volker Mische wrote:
>>>> An additional feature would be that you can return any arbitrary JSON to the
>>>> view that will be attached to the resulting document. An example would be
>>>> returning a distance between a point specified in the query and a geometry in
>>>> a document.
>>> I'm not sure I understand what it means to attach something to a JSON document.
>>>
>> What I meant by "attach" is that the result (custom JSON) from the
>> external service has to end up somehow in the final output of the view.
>>
> 
> This sounds like something you could do by having, say, a python or ruby
> (or js) view server that would query the external service while
> processing the view. Should be doable with the current trunk.

I don't want to have a dependency on a custom view server; the
arbitrary JSON should be "attached" to the document before it reaches
the view server.


Re: 1.0.0 wishlist/roadmap

Posted by Chris Anderson <jc...@apache.org>.
On Wed, Dec 3, 2008 at 5:20 AM, Volker Mische <vo...@gmail.com> wrote:
> Noah Slater wrote:
>> On Wed, Dec 03, 2008 at 10:47:11PM +1100, Volker Mische wrote:
>>> An additional feature would be that you can return any arbitrary JSON to the
>>> view that will be attached to the resulting document. An example would be
>>> returning a distance between a point specified in the query and a geometry in
>>> a document.
>>
>> I'm not sure I understand what it means to attach something to a JSON document.
>>
>
> What I meant by "attach" is that the result (custom JSON) from the
> external service has to end up somehow in the final output of the view.
>

This sounds like something you could do by having, say, a python or ruby
(or js) view server that would query the external service while
processing the view. Should be doable with the current trunk.


-- 
Chris Anderson
http://jchris.mfdz.com

Re: 1.0.0 wishlist/roadmap

Posted by David Pratt <fa...@gmail.com>.
Hello. I am currently developing with couchdb and would like to see
the full potential of erlang used to distribute map/reduce work for
CouchDB 1.0. This would permit launching more nodes to perform the
work in less time (i.e. 10 servers = 1/10 the time of 1 server).

I would like to launch more nodes while the work is required and
return to the original number of nodes afterwards. This would be
extremely beneficial and also consistent with circumstances where
queueing may be employed for imports (since these result in the
original map/reduce work). In this circumstance, based on the work
volume in the queue, n servers are launched to distribute the heavy
IO work of imports and terminated when no longer needed. Many thanks.

Regards,
David

Re: 1.0.0 wishlist/roadmap

Posted by Jan Lehnardt <ja...@apache.org>.
On 3 Dec 2008, at 14:20, Volker Mische wrote:

> Noah Slater wrote:
>> On Wed, Dec 03, 2008 at 10:47:11PM +1100, Volker Mische wrote:
>>> An additional feature would be that you can return any arbitrary  
>>> JSON to the
>>> view that will be attached to the resulting document. An example  
>>> would be
>>> returning a distance between a point specified in the query and a  
>>> geometry in
>>> a document.
>>
>> I'm not sure I understand what it means to attach something to a  
>> JSON document.
>>
>
> What I meant by "attach" is that the result (custom JSON) from the
> external service has to end up somehow in the final output of the
> view.
>

Cool, then I got it wrong, sorry!

Jan
--

Re: 1.0.0 wishlist/roadmap

Posted by Volker Mische <vo...@gmail.com>.
Noah Slater wrote:
> On Wed, Dec 03, 2008 at 10:47:11PM +1100, Volker Mische wrote:
>> An additional feature would be that you can return any arbitrary JSON to the
>> view that will be attached to the resulting document. An example would be
>> returning a distance between a point specified in the query and a geometry in
>> a document.
> 
> I'm not sure I understand what it means to attach something to a JSON document.
> 

What I meant by "attach" is that the result (custom JSON) from the
external service has to end up somehow in the final output of the view.

Re: 1.0.0 wishlist/roadmap

Posted by Noah Slater <ns...@apache.org>.
On Wed, Dec 03, 2008 at 10:47:11PM +1100, Volker Mische wrote:
> An additional feature would be that you can return any arbitrary JSON to the
> view that will be attached to the resulting document. An example would be
> returning a distance between a point specified in the query and a geometry in
> a document.

I'm not sure I understand what it means to attach something to a JSON document.

-- 
Noah Slater, http://tumbolia.org/nslater

Re: 1.0.0 wishlist/roadmap

Posted by Volker Mische <vo...@gmail.com>.
Hi,

my wishlist item might be combined with the Full-text indexing
integration. It's about the intersection between externally retrieved
results and a view.

I'd like to be able to have an external service (e.g. a spatial index)
that gets updated via the db notifier interface. It should be possible
to query it like it is done in the external2 branch, but instead of
returning a result directly to the client, the result will be returned
to the view. The returned result has to include the document ID, so it
can be intersected with the IDs the view returns.

An additional feature would be that you can return any arbitrary JSON to
the view that will be attached to the resulting document. An example
would be returning a distance between a point specified in the query and
a geometry in a document.
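
A sketch of that intersection (all shapes invented): the external
service returns rows of {id, custom JSON}; keep only view rows whose
document ID matches, and merge the custom JSON in.

    # Intersect an external index's results with a view's rows and
    # attach the external service's custom JSON (e.g. a distance).
    def intersect(view_rows, external_rows):
        extra = {r["id"]: r["json"] for r in external_rows}
        return [dict(row, external=extra[row["id"]])
                for row in view_rows if row["id"] in extra]

    view = [{"id": "a", "value": 1}, {"id": "b", "value": 2}]
    spatial = [{"id": "b", "json": {"distance": 4.2}}]
    print(intersect(view, spatial))
    # [{'id': 'b', 'value': 2, 'external': {'distance': 4.2}}]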

Cheers,
  Volker



Re: 1.0.0 wishlist/roadmap

Posted by Jan Lehnardt <ja...@apache.org>.
  - A native Erlang API.

Cheers
Jan
--


Re: 1.0.0 wishlist/roadmap

Posted by Jedediah Smith <je...@silencegreys.com>.

Damien Katz wrote:
> - Restrict database read access -
> 
> Right now any user can read any database, we need to be able to restrict 
> that at least on a whole database level.
>
> - Built-in authentication module(s) -
>
> The ability to host a CouchDB database used for HTTP authentication
> schemes. If storing passwords, they would need to be stored encrypted,
> decrypted on demand by the authentication process.

I wouldn't launch with anything less than extensible, per-doc read/write
control, as this is vital for pure couch web services and AJAX apps.

I've been brainstorming this and it's tricky, especially with views.
I'll write up some ideas when I have time.


> - Revision stemming: It should be possible to limit the number of 
> revisions tracked -
> 
> By default each document edit produces a revision id that is tracked 
> indefinitely. This guarantees conflicts versus subsequent edits can 
> always be distinguished in ad-hoc replication; however, the forever 
> growing list of revisions isn't always desirable. This can be addressed 
> by limiting the number tracked and purging the oldest revisions. The 
> downside is that if the revision tracking limit is N, then anyone who 
> hasn't replicated a document since its last N edits will see a spurious 
> edit conflict.

/db/_purge_revisions?before_seq=12345
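
That endpoint is only a suggestion, so the following is a purely
hypothetical sketch of how a client might call it; the URL and the
choice of POST are assumptions:

    # Purely hypothetical: purge revision history older than a given
    # update sequence. Neither the endpoint nor the parameter exists yet.
    import urllib.request

    req = urllib.request.Request(
        "http://localhost:5984/db/_purge_revisions?before_seq=12345",
        method="POST",  # assumption: a destructive call would be a POST
    )
    with urllib.request.urlopen(req) as resp:
        print(resp.status)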


> The _rev will be the PREVIOUS revision ID/hash the edit is 
> based on, or blank if a new edit. Then the _rev is replaced with the new 
> hash value.

This is quite clever.
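
A toy sketch of the scheme as described, assuming MD5 over a
sorted-key canonical JSON form (both the hash and the canonicalization
are placeholders for whatever would actually be chosen):

    # Toy sketch of the proposed deterministic revid: hash a canonical
    # JSON form of the document with _rev set to the previous hash.
    import hashlib
    import json

    def rev_hash(doc, prev_rev=""):
        body = dict(doc, _rev=prev_rev)  # _rev holds the PREVIOUS hash
        canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
        return hashlib.md5(canonical.encode("utf-8")).hexdigest()

    doc = {"_id": "a", "value": 1}
    rev1 = rev_hash(doc)        # new edit: previous rev is blank
    rev2 = rev_hash(doc, rev1)  # next edit chains off rev1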

Re: 1.0.0 wishlist/roadmap

Posted by Jan Lehnardt <ja...@apache.org>.
Hi Damien,

thanks for taking the time to compile these lists.

I'd like to see parallel Map/Reduce execution in 1.0. Right now, view
index creation runs in a single process where it could run in multiple
processes. Would it make sense to spawn N `couchjs` instances (N =
number of CPUs/cores, or configurable) to run the map and the reduce
stages in parallel? This would move the bottleneck (if it isn't there
already) towards disk I/O rather than JSON serialization.
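
A minimal sketch of the idea, assuming documents can be fed to a pool
of worker processes and the emitted key/value pairs merged afterwards;
the function and pool size are illustrative, not CouchDB's actual view
server protocol:

    # Illustrative only: run a map function over documents in N worker
    # processes, then merge the emitted (key, value) rows, the way N
    # couchjs instances might be used.
    from multiprocessing import Pool

    def map_doc(doc):
        # Stand-in for a JS map function executed by couchjs.
        return [(doc["type"], 1)] if "type" in doc else []

    docs = [{"_id": "a", "type": "post"}, {"_id": "b", "type": "comment"}]

    if __name__ == "__main__":
        with Pool(processes=4) as pool:  # N = number of cores, say
            rows = [kv for out in pool.map(map_doc, docs) for kv in out]
        print(sorted(rows))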

Cheers
Jan
--


Re: 1.0.0 wishlist/roadmap

Posted by Kerr Rainey <ke...@rokera.com>.
I think you want to have a roadmap document in the wiki that collects
everything together, and then put individual tasks in JIRA.

--
Kerr

2008/12/4 Jan Lehnardt <ja...@apache.org>:
>
> On 3 Dec 2008, at 17:08, Jan Lehnardt wrote:
>
>>
>> On 3 Dec 2008, at 17:01, Noah Slater wrote:
>>
>>> Should we get all this down on the wiki?
>>
>> Yes. I was waiting for more input (and deferring because of lack of time).
>> Feel free :)
>
> Actually, I am unsure whether the wiki or JIRA is the right place. Any
> preference?
>
> Cheers
> Jan
> --
>

Re: 1.0.0 wishlist/roadmap

Posted by Jan Lehnardt <ja...@apache.org>.
On 3 Dec 2008, at 17:08, Jan Lehnardt wrote:

>
> On 3 Dec 2008, at 17:01, Noah Slater wrote:
>
>> Should we get all this down on the wiki?
>
> Yes. I was waiting for more input (and deferring because of lack of  
> time). Feel free :)

Actually, I am unsure whether the wiki or JIRA is the right place. Any  
preference?

Cheers
Jan
--

Re: 1.0.0 wishlist/roadmap

Posted by Jan Lehnardt <ja...@apache.org>.
On 3 Dec 2008, at 17:01, Noah Slater wrote:

> Should we get all this down on the wiki?

Yes. I was waiting for more input (and deferring because of lack of  
time). Feel free :)

Cheers
Jan
--

Re: 1.0.0 wishlist/roadmap

Posted by Noah Slater <ns...@apache.org>.
Should we get all this down on the wiki?

-- 
Noah Slater, http://tumbolia.org/nslater

Re: 1.0.0 wishlist/roadmap

Posted by Paul Davis <pa...@gmail.com>.
On Wed, Dec 3, 2008 at 10:42 AM, Adam Kocoloski
<ad...@gmail.com> wrote:
> On Dec 3, 2008, at 10:29 AM, Paul Davis wrote:
>
>> Not sure if I have this right in my head, but AFAIK replication is
>> pushing from one db to _bulk_docs on another. We might want to have
>> pull-oriented replication to account for things like NAT.
>
> Hi Paul, I think we have that covered.  Replication can be
>
> local -> local
> local -> remote
> remote -> local
> remote -> remote
>
> so if NAT is a problem, just do a remote -> local replication; that is,
> always initiate the replication from the target DB. In that case it's all
> GET requests and local writes.  Let me know if I misunderstood you.  Best, Adam
>

I haven't played with replication at all. If there's replication via
GET, then I'm satisfied on that point.

Thanks,
Paul

Re: 1.0.0 wishlist/roadmap

Posted by Adam Kocoloski <ad...@gmail.com>.
On Dec 3, 2008, at 10:29 AM, Paul Davis wrote:

> Not sure if I have this right in my head, but AFAIK replication is
> pushing from one db to _bulk_docs on another. We might want to have
> pull-oriented replication to account for things like NAT.

Hi Paul, I think we have that covered.  Replication can be

local -> local
local -> remote
remote -> local
remote -> remote

so if NAT is a problem, just do a remote -> local replication; that is,
always initiate the replication from the target DB. In that case it's
all GET requests and local writes.  Let me know if I misunderstood you.
Best, Adam
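
For illustration, that pull direction is just a POST to _replicate
with a remote source and a local target; a minimal sketch (host and
database names are placeholders):

    # Sketch: trigger remote -> local (pull) replication, so the local
    # node only issues GETs against the source and writes locally.
    import json
    import urllib.request

    body = json.dumps({
        "source": "http://remote.example.com:5984/somedb",  # placeholder
        "target": "somedb",                                 # local db
    }).encode("utf-8")

    req = urllib.request.Request(
        "http://localhost:5984/_replicate",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        print(resp.read().decode())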

Re: 1.0.0 wishlist/roadmap

Posted by Paul Davis <pa...@gmail.com>.
While we're throwing stuff out there:

Some sort of URL prettifier for CouchDB-hosted applications. Perhaps a
method for setting up Routes-type mappings?

Not sure if I have this right in my head, but AFAIK replication is
pushing from one db to _bulk_docs on another. We might want to have
pull-oriented replication to account for things like NAT.

On Wed, Dec 3, 2008 at 10:17 AM, Jan Lehnardt <ja...@apache.org> wrote:
> Adding:
>
>  - Listening on multiple network addresses: Having MochiWeb bind to multiple
> (mixed IPv4 and IPv6) IP addresses and ports would help when setting up
> CouchDB in highly customized setups. We have a patch for listening on a
> second port on the same IP address; this, combined with a bit of
> couch_config magic, should do the trick.
>
> Cheers
> Jan
> ==

Re: 1.0.0 wishlist/roadmap

Posted by Jan Lehnardt <ja...@apache.org>.
Adding:

  - Listening on multiple network addresses: Having MochiWeb bind to
multiple (mixed IPv4 and IPv6) IP addresses and ports would help when
setting up CouchDB in highly customized setups. We have a patch for
listening on a second port on the same IP address; this, combined with
a bit of couch_config magic, should do the trick.
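
A rough sketch of what the config side might look like, with entirely
speculative syntax for the multi-listener part (today there is just a
single bind address and port):

    ; Hypothetical couch_config ini sketch; only bind_address and port
    ; reflect current options, the rest is speculative.
    [httpd]
    bind_address = 127.0.0.1
    port = 5984
    ; speculative: extra listeners, mixed IPv4 and IPv6
    extra_listeners = 192.168.1.10:5984, [::1]:5984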

Cheers
Jan
==
