Posted to user@couchdb.apache.org by Jim Puls <ji...@nondifferentiable.com> on 2009/06/18 20:11:32 UTC

Why is CouchDB frequently re-indexing my views from scratch?

I've noticed this morning that my CouchDB (0.9.0) server is re-indexing
all 19000 records in a database for a design doc:

View Group Indexer	events _design/Event	<0.19507.9>	Processed 10186 of 19210 changes (53%)

Any ideas on why this might be happening? It's happening over and over  
again, killing my site. It was happily indexing incrementally until  
this morning, too, which confuses me.

-> jp


Re: Why is CouchDB frequently re-indexing my views from scratch?

Posted by Paul Davis <pa...@gmail.com>.
And also check the language and design options values.

On Thu, Jun 18, 2009 at 4:10 PM, Damien Katz<da...@apache.org> wrote:
> Is it possible it's a character encoding issue? Are there any non-ascii
> chars in your view funs that are maybe getting changed in the round trip?
> Also, do the order of the views change?
>
> If it is a bug in couchdb, can you create a test case that shows the
> problem?
>
> -Damien
>
>
> On Jun 18, 2009, at 4:02 PM, Jim Puls wrote:
>
>> On Jun 18, 2009, at 12:46 PM, Paul Davis wrote:
>>
>>> On Thu, Jun 18, 2009 at 3:42 PM, Damien Katz<da...@apache.org> wrote:
>>>>
>>>>
>>>> I don't know what CouchRest is doing, but CouchDB does check for changes
>>>> in
>>>> the view, not just that the view design doc changed. It's always
>>>> possible
>>>> there is a bug in the view indexer, but I'm betting it's CouchRest
>>>> modifying
>>>> a view definition.
>>>>
>>>> -Damien
>>>>
>>>
>>> First place you might want to check is if its adding spaces or
>>> newlines to the end of a view.
>>
>> I copied the contents of the "views" field from Futon for the design doc,
>> revision 172. Then I restarted the app server and did the same for the
>> design doc, revision 174. They're byte-for-byte identical.
>>
>> -> jp
>
>

Re: Why is CouchDB frequently re-indexing my views from scratch?

Posted by Chris Anderson <jc...@apache.org>.
On Thu, Jun 18, 2009 at 3:08 PM, Jim Puls<ji...@nondifferentiable.com> wrote:
> I couldn't figure out how to file a bug (seriously!) but here's how I'm
> testing it. I have a short Ruby file "test_reindexing.rb":
>
> require 'rubygems'
> require 'couchrest'
>
> class SomeModel < CouchRest::ExtendedDocument
>
>  use_database CouchRest.database!('http://localhost:5984/testing')
>
>  def self.load
>    20000.times do |i|
>      some_model = new
>      some_model.type = 'foo'
>      some_model.save
>    end
>  end
>
>  property :type
>
>  view_by :bar, :map => %q{
>      function (event) {
>        if (event.type === 'quux' && event.bar) {
>          emit(event.bar, null);
>        }
>      }
>    }
> end
>
> where I can first populate the database:
> ruby -r test_reindexing -e SomeModel.load
>
> and then I can have it query a view and see the indexing every time:
> ruby -r test_reindexing -e SomeModel.by_bar
>
> To verify that there are no changes other than the revision number and
> ordering of the views, I take the MD5 hash of the sorted list of
> characters in the view record:
>
> ruby -r test_reindexing -e SomeModel.by_bar && curl
> http://localhost:5984/testing/_design/SomeModel 2>/dev/null | ruby -ne
> '$_.sub! /"_rev":"[^"]+",/, "";puts $_.split(//).sort.join("")' | openssl
> md5
>
> -> jp
>

> On Jun 18, 2009, at 2:20 PM, Damien Katz wrote:
>
>> Hmm, that's a bug in CouchDB IMO. Can you write a bug with a failing test
>> for that?

Damien's suggesting we should sort the views in some deterministic way
internally before taking our hash of them. I'll keep that in mind as
I'm gonna be delving into that code to implement the patch for

https://issues.apache.org/jira/browse/COUCHDB-218

I've added a note to that ticket
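
To illustrate what "sort the views deterministically before hashing" could mean, here is a minimal Ruby sketch. It is not CouchDB's implementation (that lives in the Erlang view code); `canonicalize` and `views_digest` are hypothetical helper names, MD5 is chosen only because it is used elsewhere in this thread, and the design docs are assumed to be plain Hashes with string keys.

```ruby
require 'json'
require 'digest/md5'

# Recursively rebuild hashes with their keys in sorted order, so that two
# structurally equal documents serialize to the same bytes.
def canonicalize(value)
  case value
  when Hash
    value.keys.sort.each_with_object({}) do |key, sorted|
      sorted[key] = canonicalize(value[key])
    end
  when Array
    value.map { |item| canonicalize(item) }
  else
    value
  end
end

# Digest of the "views" member only (hypothetical helper, not a CouchDB API).
def views_digest(design_doc)
  Digest::MD5.hexdigest(JSON.generate(canonicalize(design_doc.fetch('views', {}))))
end

map_fun = 'function (doc) { emit(doc._id, null); }'

# Same two views, written in a different order.
first  = { 'views' => { 'by_bar' => { 'map' => map_fun }, 'all' => { 'map' => map_fun } } }
second = { 'views' => { 'all' => { 'map' => map_fun }, 'by_bar' => { 'map' => map_fun } } }

puts JSON.generate(first) == JSON.generate(second)  # naive serialization depends on order
puts views_digest(first) == views_digest(second)    # canonical digest does not
```

With a digest like this, a client that only reorders the views would not look like it had changed them.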

>>
>> -Damien
>>
>> On Jun 18, 2009, at 5:07 PM, Jim Puls wrote:
>>
>>> On Jun 18, 2009, at 1:10 PM, Damien Katz wrote:
>>>
>>>> Also, do the order of the views change?
>>>
>>> Indeed, I see when I do a raw HTTP GET of the design document that
>>> CouchRest is changing the order of the views. Futon, of course - actually my
>>> browser parsing the JSON data - was nice enough to alphabetize them for me,
>>> completely obscuring the problem.
>>>
>>> -> jp
>>>
>
>



-- 
Chris Anderson
http://jchrisa.net
http://couch.io

Re: Why is CouchDB frequently re-indexing my views from scratch?

Posted by Jim Puls <ji...@nondifferentiable.com>.
I couldn't figure out how to file a bug (seriously!) but here's how  
I'm testing it. I have a short Ruby file "test_reindexing.rb":

require 'rubygems'
require 'couchrest'

class SomeModel < CouchRest::ExtendedDocument

   use_database CouchRest.database!('http://localhost:5984/testing')

   def self.load
     20000.times do |i|
       some_model = new
       some_model.type = 'foo'
       some_model.save
     end
   end

   property :type

   view_by :bar, :map => %q{
       function (event) {
         if (event.type === 'quux' && event.bar) {
           emit(event.bar, null);
         }
       }
     }
end

where I can first populate the database:
ruby -r test_reindexing -e SomeModel.load

and then I can have it query a view and see the indexing every time:
ruby -r test_reindexing -e SomeModel.by_bar

To verify that there are no changes other than the revision number and
ordering of the views, I take the MD5 hash of the sorted list of
characters in the view record:

ruby -r test_reindexing -e SomeModel.by_bar && \
  curl http://localhost:5984/testing/_design/SomeModel 2>/dev/null |
  ruby -ne '$_.sub! /"_rev":"[^"]+",/, ""; puts $_.split(//).sort.join("")' |
  openssl md5

-> jp

On Jun 18, 2009, at 2:20 PM, Damien Katz wrote:

> Hmm, that's a bug in CouchDB IMO. Can you write a bug with a failing  
> test for that?
>
> -Damien
>
> On Jun 18, 2009, at 5:07 PM, Jim Puls wrote:
>
>> On Jun 18, 2009, at 1:10 PM, Damien Katz wrote:
>>
>>> Also, do the order of the views change?
>>
>> Indeed, I see when I do a raw HTTP GET of the design document that  
>> CouchRest is changing the order of the views. Futon, of course -  
>> actually my browser parsing the JSON data - was nice enough to  
>> alphabetize them for me, completely obscuring the problem.
>>
>> -> jp
>>


Re: Why is CouchDB frequently re-indexing my views from scratch?

Posted by Damien Katz <da...@apache.org>.
Hmm, that's a bug in CouchDB IMO. Can you write a bug with a failing  
test for that?

-Damien

On Jun 18, 2009, at 5:07 PM, Jim Puls wrote:

> On Jun 18, 2009, at 1:10 PM, Damien Katz wrote:
>
>> Also, do the order of the views change?
>
> Indeed, I see when I do a raw HTTP GET of the design document that  
> CouchRest is changing the order of the views. Futon, of course -  
> actually my browser parsing the JSON data - was nice enough to  
> alphabetize them for me, completely obscuring the problem.
>
> -> jp
>


Re: Why is CouchDB frequently re-indexing my views from scratch?

Posted by Jim Puls <ji...@nondifferentiable.com>.
On Jun 18, 2009, at 1:10 PM, Damien Katz wrote:

> Also, do the order of the views change?

Indeed, I see when I do a raw HTTP GET of the design document that  
CouchRest is changing the order of the views. Futon, of course -  
actually my browser parsing the JSON data - was nice enough to  
alphabetize them for me, completely obscuring the problem.
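
For anyone else chasing this, a small Ruby sketch of the check, assuming the two raw response bodies have already been fetched (for example with curl) and pasted in as strings. The view names and map functions below are illustrative, and `view_order` is a hypothetical helper. The point is that comparing parsed documents hides the difference, because Ruby's Hash equality ignores key order, while the key order taken from the raw JSON shows it.

```ruby
require 'json'

# Names of the views in the order they appear in the raw JSON text.
# JSON.parse keeps object members in source order (Ruby 1.9 and later).
def view_order(raw_json)
  JSON.parse(raw_json).fetch('views', {}).keys
end

# Two illustrative response bodies: same views, different order.
rev_a = '{"_id":"_design/SomeModel","views":{"all":{"map":"function (doc) {}"},"by_bar":{"map":"function (doc) {}"}}}'
rev_b = '{"_id":"_design/SomeModel","views":{"by_bar":{"map":"function (doc) {}"},"all":{"map":"function (doc) {}"}}}'

# Hash equality does not consider key order, so this comparison passes.
puts JSON.parse(rev_a) == JSON.parse(rev_b)

# The order of the view names, however, differs between the two bodies.
puts view_order(rev_a).inspect
puts view_order(rev_b).inspect
```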

-> jp


Re: Why is CouchDB frequently re-indexing my views from scratch?

Posted by Damien Katz <da...@apache.org>.
Is it possible it's a character encoding issue? Are there any non-ASCII
chars in your view funs that are maybe getting changed in the round
trip? Also, does the order of the views change?

If it is a bug in couchdb, can you create a test case that shows the  
problem?

-Damien


On Jun 18, 2009, at 4:02 PM, Jim Puls wrote:

> On Jun 18, 2009, at 12:46 PM, Paul Davis wrote:
>
>> On Thu, Jun 18, 2009 at 3:42 PM, Damien Katz<da...@apache.org>  
>> wrote:
>>>
>>>
>>> I don't know what CouchRest is doing, but CouchDB does check for  
>>> changes in
>>> the view, not just that the view design doc changed. It's always  
>>> possible
>>> there is a bug in the view indexer, but I'm betting it's CouchRest  
>>> modifying
>>> a view definition.
>>>
>>> -Damien
>>>
>>
>> First place you might want to check is if its adding spaces or
>> newlines to the end of a view.
>
> I copied the contents of the "views" field from Futon for the design  
> doc, revision 172. Then I restarted the app server and did the same  
> for the design doc, revision 174. They're byte-for-byte identical.
>
> -> jp


Re: Why is CouchDB frequently re-indexing my views from scratch?

Posted by Jim Puls <ji...@nondifferentiable.com>.
On Jun 18, 2009, at 12:46 PM, Paul Davis wrote:

> On Thu, Jun 18, 2009 at 3:42 PM, Damien Katz<da...@apache.org> wrote:
>>
>>
>> I don't know what CouchRest is doing, but CouchDB does check for  
>> changes in
>> the view, not just that the view design doc changed. It's always  
>> possible
>> there is a bug in the view indexer, but I'm betting it's CouchRest  
>> modifying
>> a view definition.
>>
>> -Damien
>>
>
> First place you might want to check is if its adding spaces or
> newlines to the end of a view.

I copied the contents of the "views" field from Futon for the design  
doc, revision 172. Then I restarted the app server and did the same  
for the design doc, revision 174. They're byte-for-byte identical.

-> jp

Re: Why is CouchDB frequently re-indexing my views from scratch?

Posted by Paul Davis <pa...@gmail.com>.
The first place you might want to check is whether it's adding spaces or
newlines to the end of a view.

On Thu, Jun 18, 2009 at 3:42 PM, Damien Katz<da...@apache.org> wrote:
>
> On Jun 18, 2009, at 3:32 PM, Jim Puls wrote:
>
>> On Jun 18, 2009, at 11:53 AM, Damien Katz wrote:
>>
>>> On Jun 18, 2009, at 2:42 PM, Nadav Samet wrote:
>>>
>>>> Does the design doc change? Views are regenerated following any edit of
>>>> the
>>>> design doc that contains them.
>>>
>>> Yes, good question. Though one small correction, it's not just any edit
>>> of the design document, only a change to a view will trigger the reindexing.
>>> You can edit other parts of a design doc just fine.
>>>
>>> -Damien
>>
>> It appears that every time my server restarts, CouchRest is re-saving the
>> design doc for its models. This brings up two questions:
>>
>> 1. Should CouchRest be doing this? Should I find a way around it so that
>> the design docs don't get re-saved unless they change? I seem to recall that
>> previous versions took a hash of the view contents to avoid this very
>> problem.
>> 2. Should CouchDB be reindexing views that were re-saved, unchanged from
>> their previous content?
>>
>
> I don't know what CouchRest is doing, but CouchDB does check for changes in
> the view, not just that the view design doc changed. It's always possible
> there is a bug in the view indexer, but I'm betting it's CouchRest modifying
> a view definition.
>
> -Damien
>

Re: Why is CouchDB frequently re-indexing my views from scratch?

Posted by Damien Katz <da...@apache.org>.
On Jun 18, 2009, at 3:32 PM, Jim Puls wrote:

> On Jun 18, 2009, at 11:53 AM, Damien Katz wrote:
>
>> On Jun 18, 2009, at 2:42 PM, Nadav Samet wrote:
>>
>>> Does the design doc change? Views are regenerated following any  
>>> edit of the
>>> design doc that contains them.
>>
>> Yes, good question. Though one small correction, it's not just any  
>> edit of the design document, only a change to a view will trigger  
>> the reindexing. You can edit other parts of a design doc just fine.
>>
>> -Damien
>
> It appears that every time my server restarts, CouchRest is re-saving
> the design doc for its models. This brings up two questions:
>
> 1. Should CouchRest be doing this? Should I find a way around it so  
> that the design docs don't get re-saved unless they change? I seem  
> to recall that previous versions took a hash of the view contents to  
> avoid this very problem.
> 2. Should CouchDB be reindexing views that were re-saved, unchanged  
> from their previous content?
>

I don't know what CouchRest is doing, but CouchDB does check for  
changes in the view, not just that the view design doc changed. It's  
always possible there is a bug in the view indexer, but I'm betting  
it's CouchRest modifying a view definition.

-Damien

Re: Why is CouchDB frequently re-indexing my views from scratch?

Posted by Chris Anderson <jc...@apache.org>.
On Thu, Jun 18, 2009 at 12:32 PM, Jim Puls<ji...@nondifferentiable.com> wrote:
> On Jun 18, 2009, at 11:53 AM, Damien Katz wrote:
>
>> On Jun 18, 2009, at 2:42 PM, Nadav Samet wrote:
>>
>>> Does the design doc change? Views are regenerated following any edit of
>>> the
>>> design doc that contains them.
>>
>> Yes, good question. Though one small correction, it's not just any edit of
>> the design document, only a change to a view will trigger the reindexing.
>> You can edit other parts of a design doc just fine.
>>
>> -Damien
>
> It appears that every time my server restarts, CouchRest is re-saving the
> design doc for its models. This brings up two questions:
>
> 1. Should CouchRest be doing this? Should I find a way around it so that the
> design docs don't get re-saved unless they change? I seem to recall that
> previous versions took a hash of the view contents to avoid this very
> problem.
> 2. Should CouchDB be reindexing views that were re-saved, unchanged from
> their previous content?
>

This is a known CouchRest issue. It generates the design docs from the
model, and the generated output often differs from one run to the next
for no good reason. I haven't dug in to figure out why or what to do
about it, as I don't use that feature anymore (I prefer to manage my
views by hand using couchapp).

CouchRest is probably changing something subtle with the model each
time it boots. It's a fairly serious issue for an ORM library, but as
I said I don't use that feature, so I'm not in a position to change
it. Generally the process of writing and using CouchRest was enough to
convince me that I don't want to deal with such a thick library around
CouchDB. I still use CouchRest but I stick with the raw Document
class, myself.

Chris


-- 
Chris Anderson
http://jchrisa.net
http://couch.io

Re: Why is CouchDB frequently re-indexing my views from scratch?

Posted by Jim Puls <ji...@nondifferentiable.com>.
On Jun 18, 2009, at 11:53 AM, Damien Katz wrote:

> On Jun 18, 2009, at 2:42 PM, Nadav Samet wrote:
>
>> Does the design doc change? Views are regenerated following any  
>> edit of the
>> design doc that contains them.
>
> Yes, good question. Though one small correction, it's not just any  
> edit of the design document, only a change to a view will trigger  
> the reindexing. You can edit other parts of a design doc just fine.
>
> -Damien

It appears that every time my server restarts, CouchRest is re-saving  
the design doc for its models. This brings up two questions:

1. Should CouchRest be doing this? Should I find a way around it so  
that the design docs don't get re-saved unless they change? I seem to  
recall that previous versions took a hash of the view contents to  
avoid this very problem.
2. Should CouchDB be reindexing views that were re-saved, unchanged  
from their previous content?
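
To make question 1 concrete, this is roughly the guard I have in mind, as a sketch only. It is not CouchRest's actual code: `save_design_doc_if_changed`, the `views_digest` field name, and the plain Hash standing in for the database are all made up for illustration. It relies on the point made earlier in the thread that editing non-view parts of a design doc does not trigger reindexing, so keeping a digest next to the views should be harmless.

```ruby
require 'json'
require 'digest/md5'

# Digest that ignores the order of the view names: Hash#sort yields
# [name, definition] pairs ordered by name, which are then serialized.
def views_digest(views)
  Digest::MD5.hexdigest(JSON.generate(views.sort))
end

# Save the design doc only when its views differ from what is stored.
# `store` is a plain Hash standing in for the database (hypothetical).
def save_design_doc_if_changed(store, id, views)
  digest = views_digest(views)
  existing = store[id]
  return :unchanged if existing && existing['views_digest'] == digest

  store[id] = { '_id' => id, 'views' => views, 'views_digest' => digest }
  :saved
end

all_fun = 'function (doc) { emit(doc._id, null); }'
bar_fun = 'function (doc) { if (doc.bar) { emit(doc.bar, null); } }'

# The same views as generated on two boots, in a different order.
boot_1 = { 'by_bar' => { 'map' => bar_fun }, 'all' => { 'map' => all_fun } }
boot_2 = { 'all' => { 'map' => all_fun }, 'by_bar' => { 'map' => bar_fun } }

store = {}
puts save_design_doc_if_changed(store, '_design/SomeModel', boot_1)
puts save_design_doc_if_changed(store, '_design/SomeModel', boot_2)
```

The second call leaves the stored document alone, so the server never sees a new revision and has nothing to re-examine.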

-> jp


Re: Why is CouchDB frequently re-indexing my views from scratch?

Posted by Damien Katz <da...@apache.org>.
On Jun 18, 2009, at 2:42 PM, Nadav Samet wrote:

> Does the design doc change? Views are regenerated following any edit  
> of the
> design doc that contains them.

Yes, good question. Though one small correction, it's not just any  
edit of the design document, only a change to a view will trigger the  
reindexing. You can edit other parts of a design doc just fine.
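
As a rough model of that rule (a sketch, not the indexer's code): only the parts of the design doc that the index depends on should matter. The field list below is an assumption; "views" comes from the paragraph above, while "language" and "options" are included only because they are mentioned elsewhere in this thread as worth checking. `index_relevant_change?` is a hypothetical name, and the comparison uses Ruby's Hash equality, which ignores key order.

```ruby
# Would replacing old_doc with new_doc change anything the view index
# depends on? The list of fields is an assumption, not taken from CouchDB.
def index_relevant_change?(old_doc, new_doc)
  %w[views language options].any? { |field| old_doc[field] != new_doc[field] }
end

base = {
  '_id'      => '_design/Event',
  'language' => 'javascript',
  'views'    => { 'by_bar' => { 'map' => 'function (doc) { emit(doc.bar, null); }' } }
}

# Editing some other member of the design doc: no rebuild expected.
with_note = base.merge('note' => 'edited by hand')

# Editing a map function: rebuild expected.
new_views = { 'by_bar' => { 'map' => 'function (doc) { emit(doc.bar, 1); }' } }
with_new_map = base.merge('views' => new_views)

puts index_relevant_change?(base, with_note)
puts index_relevant_change?(base, with_new_map)
```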

-Damien

>
> -Nadav
>
> On Thu, Jun 18, 2009 at 11:41 AM, Jim Puls  
> <ji...@nondifferentiable.com>wrote:
>
>> It finishes just fine - as far as I can tell - and then starts over a
>> (couple of?) requests for the view later.
>> -> jp
>>
>>
>> On Jun 18, 2009, at 11:34 AM, Damien Katz wrote:
>>
>>> Does it ever finish? It might be a particular document it's timing out on
>>> and is restarting the indexing.
>>>
>>> -Damien
>>>
>>>
>>> On Jun 18, 2009, at 2:11 PM, Jim Puls wrote:
>>>
>>>> I've noticed this morning that my CouchDB (0.9.0) server is re-indexing
>>>> all 19000 records in a database for a design doc:
>>>>
>>>> View Group Indexer
>>>> events _design/Event
>>>> <0.19507.9>
>>>> Processed 10186 of 19210 changes (53%)
>>>>
>>>> Any ideas on why this might be happening? It's happening over and  
>>>> over
>>>> again, killing my site. It was happily indexing incrementally  
>>>> until this
>>>> morning, too, which confuses me.
>>>>
>>>> -> jp
>>>>
>>>>
>>


Re: Why is CouchDB frequently re-indexing my views from scratch?

Posted by Nadav Samet <th...@gmail.com>.
Does the design doc change? Views are regenerated following any edit of the
design doc that contains them.

-Nadav

On Thu, Jun 18, 2009 at 11:41 AM, Jim Puls <ji...@nondifferentiable.com>wrote:

> It finishes just fine - as far as I can tell - and then starts over a
> (couple of?) requests for the view later.
> -> jp
>
>
> On Jun 18, 2009, at 11:34 AM, Damien Katz wrote:
>
>> Does it ever finish? It might be a particular document it's timing out on
>> and is restarting the indexing.
>>
>> -Damien
>>
>>
>> On Jun 18, 2009, at 2:11 PM, Jim Puls wrote:
>>
>>> I've noticed this morning that my CouchDB (0.9.0) server is re-indexing
>>> all 19000 records in a database for a design doc:
>>>
>>> View Group Indexer
>>> events _design/Event
>>> <0.19507.9>
>>> Processed 10186 of 19210 changes (53%)
>>>
>>> Any ideas on why this might be happening? It's happening over and over
>>> again, killing my site. It was happily indexing incrementally until this
>>> morning, too, which confuses me.
>>>
>>> -> jp
>>>
>>>
>

Re: Why is CouchDB frequently re-indexing my views from scratch?

Posted by Jim Puls <ji...@nondifferentiable.com>.
It finishes just fine - as far as I can tell - and then starts over a  
(couple of?) requests for the view later.
-> jp

On Jun 18, 2009, at 11:34 AM, Damien Katz wrote:

> Does it ever finish? It might be a particular document it's timing  
> out on and is restarting the indexing.
>
> -Damien
>
>
> On Jun 18, 2009, at 2:11 PM, Jim Puls wrote:
>
>> I've noticed this morning that my CouchDB (0.9.0) server is re-indexing
>> all 19000 records in a database for a design doc:
>>
>> View Group Indexer
>> events _design/Event
>> <0.19507.9>
>> Processed 10186 of 19210 changes (53%)
>>
>> Any ideas on why this might be happening? It's happening over and  
>> over again, killing my site. It was happily indexing incrementally  
>> until this morning, too, which confuses me.
>>
>> -> jp
>>


Re: Why is CouchDB frequently re-indexing my views from scratch?

Posted by Damien Katz <da...@apache.org>.
Does it ever finish? It might be a particular document it's timing out  
on and is restarting the indexing.

-Damien


On Jun 18, 2009, at 2:11 PM, Jim Puls wrote:

> I've noticed this morning that my CouchDB (0.9.0) server is re-indexing
> all 19000 records in a database for a design doc:
>
> View Group Indexer
> events _design/Event
> <0.19507.9>
> Processed 10186 of 19210 changes (53%)
>
> Any ideas on why this might be happening? It's happening over and  
> over again, killing my site. It was happily indexing incrementally  
> until this morning, too, which confuses me.
>
> -> jp
>