Posted to user@couchdb.apache.org by Karl Seguin <ka...@openmymind.net> on 2011/10/24 08:28:05 UTC

noobie questions

Hi,
I'm just starting to learn CouchDB and I've accumulated four questions.

First, I only found some vague references, but are views only updated on a
read request (assuming there's something to update)? Is there no built-in
mechanism to have views updated in the background, say for every X
changed/new documents or every X seconds?

Secondly, it seems like if you want to update a document, you need to send
the complete document over. I understand that, given CouchDB's versioning,
this makes sense. In theory, would it be possible for CouchDB to expose an
API that allows updates to specific fields, where the backend would clone
the document and overwrite the field? Again, I know that isn't possible
with the current API, I'm just wondering if there's anything that would stop
that from working. You'd essentially send over the doc_id, rev, the field
name and the new value.

Third, does any bulk update or delete need to be done in code by looping
through the result of a view? Say I want to delete all the posts older than
1 year. I create a view keyed by the post date, query the view with my
specific filter, and then loop through it deleting each document? It's pretty
much the same story for updates, but I can use the bulk update API. There's no
direct analog to delete from posts where date < ?, it's more of a select id
from posts where date < ? then delete those ids. Right?

Fourth, and I'm sorry for asking this, I realize it's asked a lot, but I
couldn't figure it out despite that... I'm trying to retrieve all of the
posts with a specific tag, sorted by date. My view looks like:

function(doc) {
  if (doc.doc_type != 'post') { return; }
  for(var tag in doc.tags) {
    emit([doc.dated, doc.tags[tag]], null);
  }
}

I was hoping that a query like this might work:

for row in db.view('application/post_by_tags', key='[{}, "blah"]'):
  print row

But it doesn't.

Thanks for the help,
Karl

Re: noobie questions

Posted by Karl Seguin <ka...@openmymind.net>.
Thanks a lot for all the answers. I got everything I wanted :)

Cheers,
Karl

On Mon, Oct 24, 2011 at 4:00 PM, CGS <cg...@gmail.com> wrote:

> Hi Karl,
>
> First of all, welcome!
>
> I am no expert, but few things I can clarify for you from what I understood
> from the documentation posted at:
>
> I. http://guide.couchdb.org/editions/1/en/index.html
> II. mens.de/:/couchdbref (the short version of the first)
>
> 1. Views are completed at the first time the request is triggered and
> updated only when the DB is updated. So, for example, if you want a view to
> be updated at certain number of documents, you just need to make a bulk
> operation on the DB. You can find more at
> http://guide.couchdb.org/editions/1/en/views.html (efficient lookup).
>
> 2. Depending on what language (programming, scripting etc.) you use, there
> are different application which can help you. I used only one, CouchBEAM,
> which is based on Erlang and it provides the API you ask for: retrieve a
> document, retrieve the value of a key from the document, modify only that
> value (or key) as you like it and save the document (don't forget to remove
> _rev from the modified document before saving it - I usually kept forgetting
> that) without you caring about what's behind. I don't think this is the only
> application of this type, so, just check the CouchDB list of application
> based on this product and choose which one is the most suitable for you.
>
> 3. Check page 2 (bulk operation) from II for _delete. You need two
> operations only: retrieve the _rev's for the documents you want to delete
> and a bulk operation to delete them. It is even simpler if you create a view
> to give you back all those _rev's and make a script to connect with the bulk
> operation. Another option is to keep those posts, but not to show them.
>
> 4. CouchDB is designed for web. Therefore, you can pass the keys values as
> queries in the address bar. Check the documentation I mentioned at point 1
> and you will have there some query examples like
> |/blog/_design/docs/_view/by_date?key="2009/01/30 18:04:11"|.
>
> I hope this will help you for the start. From my experience, I can say
> there are two things you have to keep them in mind when you operate with
> CouchDB:
> A. The documents are not necessarily within the same format (containing the
> same set of keys). That is usually forgotten by the SQL developers and here
> some may find CouchDB more difficult.
> B. CouchDB has JavaScript to interface with the web. If there is something
> missing in CouchDB as functionality, there can be added by the means of
> JavaScript.
>
> Sometimes is not easy to work with CouchDB and there are things which can
> be done in SQL faster or easier, but once you get a bit of grip on CouchDB,
> you will see it's not so hard. I repeat, I am not an expert in CouchDB and I
> haven't used all its functions yet because I wasn't in need of using them
> all. But as much as I used CouchDB, JavaScript and Erlang were powerful
> enough to create what I needed.
>
> Cheers,
> CGS
>
>
>
>
>

Re: noobie questions

Posted by CGS <cg...@gmail.com>.
Hi Karl,

First of all, welcome!

I am no expert, but there are a few things I can clarify for you, based on
what I understood from the documentation posted at:

I. http://guide.couchdb.org/editions/1/en/index.html
II. mens.de/:/couchdbref (the short version of the first)

1. Views are built the first time they are requested and are updated only
when the DB has been updated. So, for example, if you want a view to be
updated after a certain number of documents, you just need to make a bulk
operation on the DB. You can find more at
http://guide.couchdb.org/editions/1/en/views.html (efficient lookup).

2. Depending on what language (programming, scripting, etc.) you use,
there are different applications which can help you. I have used only one,
CouchBEAM, which is based on Erlang and provides the API you ask for:
retrieve a document, retrieve the value of a key from the document,
modify only that value (or key) as you like and save the document
(don't forget to remove _rev from the modified document before saving it
- I usually kept forgetting that) without having to care about what happens
behind the scenes. I don't think this is the only application of this type,
so just check the CouchDB list of applications based on this product and
choose whichever is most suitable for you.

3. Check page 2 (bulk operation) from II for _delete. You need two 
operations only: retrieve the _rev's for the documents you want to 
delete and a bulk operation to delete them. It is even simpler if you 
create a view to give you back all those _rev's and make a script to 
connect with the bulk operation. Another option is to keep those posts, 
but not to show them.

4. CouchDB is designed for the web. Therefore, you can pass key values
as query parameters in the address bar. Check the documentation I mentioned
at point 1 and you will find some query examples there, like
|/blog/_design/docs/_view/by_date?key="2009/01/30 18:04:11"|.

I hope this will help you get started. From my experience, I can say
there are two things you have to keep in mind when you work with
CouchDB:
A. Documents do not necessarily share the same format (containing
the same set of keys). SQL developers usually forget that, and this is
where some may find CouchDB more difficult.
B. CouchDB has JavaScript to interface with the web. If some
functionality is missing in CouchDB, it can be added by means of
JavaScript.

Sometimes it is not easy to work with CouchDB, and there are things which
can be done faster or more easily in SQL, but once you get a bit of a grip
on CouchDB, you will see it's not so hard. I repeat, I am not an expert in
CouchDB and I haven't used all its functions yet because I haven't needed
them all. But for as much as I have used CouchDB, JavaScript and
Erlang were powerful enough to create what I needed.

Cheers,
CGS






Re: noobie questions

Posted by Alon Keren <al...@gmail.com>.
Hi,

On 24 October 2011 08:28, Karl Seguin <ka...@openmymind.net> wrote:

> Hi,
> I'm just starting to learn CouchDB and I've accumulated four questions.
>
> First, I only found some vague references, but are views only updated on
> read request? (assuming there's something to update). There's no built-in
> mechanism to have views updated in the background say for every X
> changed/new documents or X seconds?
>

Views by default are updated when queried, just before the read is
performed. You can alter how a specific query affects the view, using the
'stale' parameter (see
http://wiki.apache.org/couchdb/HTTP_view_API#Querying_Options).
To update a view every N changes, perhaps you could use the 'stale=ok' for
all queries but the N-th.
To update a view every N seconds, you could use 'stale=ok' for all queries,
while having some task running in the background that queries the view
without 'stale' every N seconds.
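
The "every N-th query" idea above could be sketched roughly as follows. This is only an illustration built on the standard library: the host and port, the database name "blog", and both helper names are assumptions, while 'stale=ok' and the 'application/post_by_tags' view come from this thread.

```python
import json
from urllib.parse import quote, urlencode

def view_url(base, db, ddoc, view, **params):
    """Build a view query URL; key-like parameters are JSON-encoded."""
    json_params = {"key", "startkey", "endkey"}
    encoded = {
        name: json.dumps(value) if name in json_params else value
        for name, value in params.items()
    }
    path = f"{base}/{quote(db)}/_design/{quote(ddoc)}/_view/{quote(view)}"
    return f"{path}?{urlencode(encoded)}" if encoded else path

def stale_for_request(n, every=10):
    """Query parameters for the n-th request: only every `every`-th one
    omits 'stale', so only that request waits for the view to be updated."""
    return {} if n % every == 0 else {"stale": "ok"}

# Assumed local server and database name, for illustration only.
BASE = "http://localhost:5984"
urls = [
    view_url(BASE, "blog", "application", "post_by_tags", **stale_for_request(n))
    for n in range(1, 11)
]
```

The same view_url helper would serve the "every N seconds" variant: ordinary readers always pass stale="ok", and a background task requests the plain URL on a timer.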


>
> Secondly, it seems like if you want to update a document, you need to send
> the complete document over. I understand that, given CouchDB's versioning,
> this makes sense. In theory, would it be possible for CouchDB to expose an
> API to allow updates to specific fields, then on the backend, it would
> clone
> the document and overwrite the field.  Again, I know that isn't possible
> with the current API, I'm just wondering if there's anything that would
> stop
> that from working. You'd essentially send over the doc_id, rev, the field
> name and the new value.
>

Check out Update Handlers:
http://wiki.apache.org/couchdb/Document_Update_Handlers
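
As a rough sketch of how an update handler could cover the "send only the field name and the new value" case: the handler itself is JavaScript stored in a design document, and the client only sends a small request body. The handler name "set_field", the body layout, the database name and the host are all assumptions made for illustration. The server still writes a complete new revision of the document; only the transfer gets smaller.

```python
import json
from urllib.parse import quote

# JavaScript source of a hypothetical update handler named "set_field".
# CouchDB calls it with the stored document (or null) and the request.
SET_FIELD_JS = """
function(doc, req) {
  if (!doc) { return [null, 'no such document']; }
  var change = JSON.parse(req.body);
  doc[change.field] = change.value;
  return [doc, 'updated'];
}
""".strip()

def design_doc(ddoc):
    """Design document body carrying the update handler."""
    return {"_id": "_design/" + ddoc, "updates": {"set_field": SET_FIELD_JS}}

def update_url(base, db, ddoc, handler, doc_id):
    """URL to PUT the field change to: .../_update/<handler>/<doc_id>."""
    parts = [db, "_design", ddoc, "_update", handler, doc_id]
    return base + "/" + "/".join(quote(part, safe="") for part in parts)

# Assumed server, database and document id, for illustration only.
url = update_url("http://localhost:5984", "blog", "application",
                 "set_field", "post-1")
request_body = json.dumps({"field": "title", "value": "New title"})
```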


>
> Third, any bulk update or delete needs to be done in code by looping through
> the result of a view? Say, I want to delete all the posts older than 1
> year.
> I create a view keyed by the post date, I query the view with my specific
> filter, and then I loop through it deleting each document? It's pretty much
> the same story for updates, but I can use the bulk update api. There's no
> direct analog to delete from posts where date < ?, it's more of a select id
> from posts where date < ? then delete those ids. Right?
>

No, there isn't.
You can also delete documents in bulk. See:
http://wiki.apache.org/couchdb/HTTP_Bulk_Document_API#Modify_Multiple_Documents_With_a_Single_Request
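
A minimal sketch of the "select the ids, then delete them" flow with the bulk API, assuming the view rows carry the revision as {"rev": ...} in their value (the shape _all_docs uses). The helper name and the sample rows are made up for illustration; the resulting body would be POSTed to the database's _bulk_docs resource.

```python
import json

def bulk_delete_payload(rows):
    """Turn rows carrying a document id and revision into a _bulk_docs body
    that marks each document as deleted."""
    return {
        "docs": [
            {"_id": row["id"], "_rev": row["value"]["rev"], "_deleted": True}
            for row in rows
        ]
    }

# Hypothetical rows, e.g. from a date-keyed view queried with an endkey
# of one year ago; the ids and revisions are invented.
rows = [
    {"id": "post-1", "key": "2010-01-15", "value": {"rev": "1-abc"}},
    {"id": "post-2", "key": "2010-03-02", "value": {"rev": "3-def"}},
]
body = json.dumps(bulk_delete_payload(rows))
```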


>
> Fourth, and I'm sorry for asking this, I realize it's asked a lot, but I
> couldn't figure it out despite that... I'm trying to retrieve all of the
> posts with a specific tag, sorted by date. My view looks like:
>
> function(doc) {
>  if (doc.doc_type != 'post') { return; }
>  for(var tag in doc.tags) {
>    emit([doc.dated, doc.tags[tag]], null);
>  }
> }
>
> I was hoping that a query like this might work:
>
> for row in db.view('application/post_by_tags', key='[{}, "blah"]'):
>  print row
>
> But it doesn't.
>
>
You're not looking for a single emitted row, so using the 'key' parameter is
wrong here.
Instead, you need to supply a range of keys - using the 'startkey' and
'endkey' parameters.
You want all keys for a specific tag to be in the same key-range. So, first,
you'd want all keys of emitted rows to be prefixed by the tag-name.
Next, you want the emitted rows for each tag to have an internal order. That
means that the second element of the key should be the date:

function(doc) {
 if (doc.doc_type != 'post') { return; }
 for(var tag in doc.tags) {
   emit([doc.tags[tag], doc.dated], doc);
 }
}

Query it with ?startkey=["blah"]&endkey=["blah", {}]
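
To see why putting the tag first makes the range work, here is a small in-memory imitation of the view and the range query. It is a simplification, not CouchDB itself: the collation below only knows that strings sort before objects and that arrays compare element by element, the rows carry the document id instead of the whole doc, and the sample documents are invented.

```python
def collation_key(key):
    """Simplified stand-in for view collation of array keys."""
    def rank(part):
        if isinstance(part, dict):
            return (1, "")   # {} sorts after any string
        return (0, part)     # plain strings (tags, dates)
    return tuple(rank(part) for part in key)

def map_post(doc):
    """Python rendering of the map function above: one row per tag."""
    if doc.get("doc_type") != "post":
        return []
    return [([tag, doc["dated"]], doc["_id"]) for tag in doc.get("tags", [])]

def query(rows, startkey, endkey):
    """Return rows whose key lies in [startkey, endkey], in key order."""
    lo, hi = collation_key(startkey), collation_key(endkey)
    ordered = sorted(rows, key=lambda row: collation_key(row[0]))
    return [row for row in ordered if lo <= collation_key(row[0]) <= hi]

# Invented sample documents.
docs = [
    {"_id": "p1", "doc_type": "post", "dated": "2011-10-02", "tags": ["blah", "db"]},
    {"_id": "p2", "doc_type": "post", "dated": "2011-09-15", "tags": ["blah"]},
    {"_id": "p3", "doc_type": "post", "dated": "2011-10-01", "tags": ["blog"]},
    {"_id": "c1", "doc_type": "comment", "dated": "2011-10-03", "tags": ["blah"]},
]
rows = [row for doc in docs for row in map_post(doc)]
blah_posts = [doc_id for _key, doc_id in query(rows, ["blah"], ["blah", {}])]
print(blah_posts)
```

All rows for one tag sit next to each other in the index, ordered by date, so ["blah"] and ["blah", {}] bracket exactly that run.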


> Thanks for the help,
> Karl
>

Re: noobie questions

Posted by "Johannes J. Schmidt" <sc...@netzmerk.com>.
Hi Karl,

regarding your question to update a specific part of a document you
might want to read
http://wiki.apache.org/couchdb/Document_Update_Handlers

g jo



Re: noobie questions

Posted by Jamie Talbot <ja...@jamietalbot.com>.
Hi,

Is the Wiki out of date?  From
http://wiki.apache.org/couchdb/Introduction_to_CouchDB_views

"Note that by default views are not created and updated when a
document is saved, but rather, when they are accessed. As a result,
the first access might take some time depending on the size of your
data while CouchDB creates the view. If preferable the views can also
be updated when a document is saved using an external script that
calls the views when updates have been made."

It also lists an example of how to update views asynchronously:

http://wiki.apache.org/couchdb/Regenerating_views_on_update?action=show&redirect=RegeneratingViewsOnUpdate

You could also consider stale=update_after, which will give you a
quick return at the cost of potentially out of date data, but which
will also trigger a view update, so the next queries will potentially
be more up to date.

I'll leave Bulk Operations and View Collation for experts to answer,
but I suspect for part 4 that you'll have more joy if you emit the tag
first, then the date and collate something like

startkey=["blah", "2010-10"]&endkey=["blah", "2010-10{}"]

Cheers,

Jamie.

On Mon, Oct 24, 2011 at 17:11, Roger Rohrbach <ro...@ecstatic.com> wrote:
> I'll answer the easy question: views are updated at document insertion time.
>
> On Oct 24, 2011, at 8:28 AM, Karl Seguin wrote:
>
>> First, I only found some vague references, but are views only updated on
>> read request? (assuming there's something to update). There's no built-in
>> mechanism to have views updated in the background say for every X
>> changed/new documents or X seconds?
>
>



-- 
---
http://jamietalbot.com

Re: noobie questions

Posted by Roger Rohrbach <ro...@ecstatic.com>.
Sorry for the blatantly incorrect reply. I've experienced the view recompilation hiccup, so I should know better. I *do* know better. I'm not sure what I was smoking.

On Oct 24, 2011, at 9:11 AM, Roger Rohrbach wrote:

> I'll answer the easy question: views are updated at document insertion time.


Re: noobie questions

Posted by Roger Rohrbach <ro...@ecstatic.com>.
I'll answer the easy question: views are updated at document insertion time.

On Oct 24, 2011, at 8:28 AM, Karl Seguin wrote:

> First, I only found some vague references, but are views only updated on
> read request? (assuming there's something to update). There's no built-in
> mechanism to have views updated in the background say for every X
> changed/new documents or X seconds?