Posted to dev@couchdb.apache.org by Damien Katz <da...@apache.org> on 2009/01/23 05:18:08 UTC

Long running task status monitoring

I just checked in code that lets you check the status of long-running
tasks, such as view index builds and compaction.

During a long view build or compaction, if you want to see the status  
of what's happening, simply GET _active_tasks and you'll get back a  
list of JSON objects describing the currently running tasks.

Example results while 2 tasks are running:
[{"type":"Database Compaction","task":"speed","status":"Copied 10001 of 39001 changes (25%)","pid":"<0.78.0>"},
 {"type":"View Group Indexer","task":"speed _design/test","status":"Processed 0 of 39001 changes (0%)","pid":"<0.91.0>"}]
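
For anyone who wants to poll this from a script, here is a minimal Python sketch. It assumes a CouchDB node listening at http://localhost:5984 (an assumption, adjust for your setup), and the helper names fetch_active_tasks and summarize are made up for illustration. The sample string is the example result from this message, rejoined onto single lines so the sketch runs without a server.

```python
import json
from urllib.request import urlopen

# Assumed address of a local CouchDB node; not taken from this thread.
BASE_URL = "http://localhost:5984"


def fetch_active_tasks(base_url=BASE_URL):
    """GET _active_tasks and decode the JSON list of running tasks."""
    with urlopen(base_url + "/_active_tasks") as resp:
        return json.loads(resp.read().decode("utf-8"))


def summarize(tasks):
    """Turn each task object into one human-readable line."""
    return ["%s [%s]: %s" % (t["type"], t["task"], t["status"]) for t in tasks]


# The example result from this message, used so the sketch runs offline.
SAMPLE = (
    '[{"type":"Database Compaction","task":"speed",'
    '"status":"Copied 10001 of 39001 changes (25%)","pid":"<0.78.0>"},'
    '{"type":"View Group Indexer","task":"speed _design/test",'
    '"status":"Processed 0 of 39001 changes (0%)","pid":"<0.91.0>"}]'
)

if __name__ == "__main__":
    for line in summarize(json.loads(SAMPLE)):
        print(line)
```

Against a live server you would call summarize(fetch_active_tasks()) instead of parsing SAMPLE.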

We should probably add task tracking code for replication as well.

-Damien

Re: Long running task status monitoring

Posted by kowsik <ko...@gmail.com>.
Neat! If there's one thing that I'm constantly worried about on a
write-heavy site, it's compaction. The file sizes grow pretty
rapidly because of the append-only nature of Couch. Knowing the
status can help the rest of the app throttle itself if necessary.

One suggestion though: instead of preformatting the status, can you
emit actual JSON objects?

For example:

"status": {
    "total": 39001,
    "current": 10001
}

and so on, which makes it machine-parsable. Otherwise the caller has to
resort to regex magic, which is just painful.
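
To make the pain concrete, this is roughly the regex a caller would need with the preformatted text. It is a Python sketch only: the pattern is inferred from the two status strings in Damien's example, so treat it as an assumption about the format rather than a spec, and parse_status is a hypothetical helper name.

```python
import re

# Pattern inferred from the two sample strings in this thread, e.g.
# "Copied 10001 of 39001 changes (25%)"; other task types may differ.
STATUS_RE = re.compile(
    r"^(?P<verb>\w+) (?P<current>\d+) of (?P<total>\d+) changes \((?P<percent>\d+)%\)$"
)


def parse_status(text):
    """Return {"current": int, "total": int}, or None if the text doesn't match."""
    m = STATUS_RE.match(text)
    if m is None:
        return None
    return {"current": int(m.group("current")), "total": int(m.group("total"))}


if __name__ == "__main__":
    print(parse_status("Copied 10001 of 39001 changes (25%)"))
```

Any change to the wording of the status string silently breaks this, which is the case for structured fields.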

Thanks,

K.


Re: Long running task status monitoring

Posted by Chris Anderson <jc...@apache.org>.
On Thu, Jan 22, 2009 at 8:18 PM, Damien Katz <da...@apache.org> wrote:
> I just checked in code to allow the checking of the status of long running
> tasks, like view indexes and compaction.
>
> During a long view build or compaction, if you want to see the status of
> what's happening, simply GET _active_tasks and you'll get back a list of
> JSON objects describing the currently running tasks.
>
> Example results while 2 tasks are running:
> [{"type":"Database Compaction","task":"speed","status":"Copied 10001 of
> 39001 changes (25%)","pid":"<0.78.0>"},
> {"type":"View Group Indexer","task":"speed _design/test","status":"Processed
> 0 of 39001 changes (0%)","pid":"<0.91.0>"}]
>
> We should probably add task tracking code for replication as well.
>

This is the dev list, so I'll add some dev notes based on my reading
of the source. Damien, please correct me if I get this wrong.

A process can register itself with the couch_task_status server, by calling

couch_task_status:add_task(Type, TaskName, StatusText)

then it can update the status as it progresses by calling

couch_task_status:update(StatusText)

It can optionally throttle its own updates (e.g. updates coming too
soon after the previous one are ignored) by calling

couch_task_status:set_update_frequency(Msecs)

A task is removed from the active task list when the process that
registered it dies. Also, each process can only have a single active
task.
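
To pin down those semantics, here is a toy in-memory model, written in Python rather than Erlang purely as an illustration. It is not the couch_task_status code. The class name, the explicit pid argument (the real calls use the calling process implicitly), the ValueError on a second registration, and counting the throttle window from registration time are all assumptions of this sketch.

```python
import time


class TaskStatusModel:
    """Toy model of the behaviour described above (not CouchDB code)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock  # injectable so throttling can be exercised in tests
        self._tasks = {}     # pid -> task record

    def add_task(self, pid, task_type, task_name, status_text):
        # Each process can only have a single active task.
        if pid in self._tasks:
            raise ValueError("process already has an active task")
        self._tasks[pid] = {
            "type": task_type,
            "task": task_name,
            "status": status_text,
            "min_interval": 0.0,          # seconds; 0 means no throttling
            "last_update": self._clock(),
        }

    def set_update_frequency(self, pid, msecs):
        # Updates arriving sooner than this after the last accepted one are dropped.
        self._tasks[pid]["min_interval"] = msecs / 1000.0

    def update(self, pid, status_text):
        """Record a new status; return False if the update was throttled."""
        task = self._tasks[pid]
        now = self._clock()
        if now - task["last_update"] < task["min_interval"]:
            return False
        task["status"] = status_text
        task["last_update"] = now
        return True

    def process_died(self, pid):
        # The task disappears when its owning process dies.
        self._tasks.pop(pid, None)

    def all(self):
        """Return the active tasks in the shape of the _active_tasks example."""
        return [
            {"pid": pid, "type": t["type"], "task": t["task"], "status": t["status"]}
            for pid, t in self._tasks.items()
        ]
```

A replication process would follow the same pattern: register once, call update as it goes, and let its exit clear the entry.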

Adding this to replication should be straightforward, if anyone wants
to have a go at it.

Chris

-- 
Chris Anderson
http://jchris.mfdz.com

Re: Long running task status monitoring

Posted by Damien Katz <da...@apache.org>.
Yes, errors during the test suite aren't necessarily bugs. Many of the
tests forcibly restart the database server; errors that occur
during these restarts can be ignored.

If you are getting the errors in production though, that's likely a bug.

-Damien


On Jan 23, 2009, at 11:44 AM, Chris Anderson wrote:

> I've been seeing that exception as well, but only occasionally. My
> first guess is that it is related to creating and destroying the
> test suite db in such a short time span.
>
> Sent from my iPhone
>
> On Jan 23, 2009, at 2:20 AM, Robert Dionne <bo...@gmail.com>  
> wrote:
>
>> This is very cool, I'll check it out.
>>
>> After a build with yesterday's changes I started seeing these  
>> exceptions again: http://gist.github.com/50826
>>
>> This was fixed (#213) a few days ago. I lowered max_open_databases  
>> on my machine to 50 and it still occurs. It only occurs once when  
>> running the tests from Futon, which all pass. However when running  
>> the tests standalone from the runner script, you'll see two or  
>> three. Running these requires the patch I submitted to couch_js.c
>>
>> I'll try to run this down (no pun intended :)
>>
>> Bob


Re: Long running task status monitoring

Posted by Chris Anderson <jc...@gmail.com>.
I've been seeing that exception as well, but only occasionally. My
first guess is that it is related to creating and destroying the test
suite db in such a short time span.

Sent from my iPhone

On Jan 23, 2009, at 2:20 AM, Robert Dionne <bo...@gmail.com> wrote:

> This is very cool, I'll check it out.
>
> After a build with yesterday's changes I started seeing these  
> exceptions again: http://gist.github.com/50826
>
> This was fixed (#213) a few days ago. I lowered max_open_databases  
> on my machine to 50 and it still occurs. It only occurs once when  
> running the tests from Futon, which all pass. However when running  
> the tests standalone from the runner script, you'll see two or  
> three. Running these requires the patch I submitted to couch_js.c
>
> I'll try to run this down (no pun intended :)
>
> Bob

Re: Long running task status monitoring

Posted by Paul Davis <pa...@gmail.com>.
On Fri, Jan 23, 2009 at 5:20 AM, Robert Dionne <bo...@gmail.com> wrote:
> This is very cool, I'll check it out.
>
> After a build with yesterday's changes I started seeing these exceptions
> again: http://gist.github.com/50826
>
> This was fixed (#213) a few days ago. I lowered max_open_databases on my
> machine to 50 and it still occurs. It only occurs once when running the
> tests from Futon, which all pass. However when running the tests standalone
> from the runner script, you'll see two or three. Running these requires the
> patch I submitted to couch_js.c
>
> I'll try to run this down (no pun intended :)
>

My guess is that those are Erlang-ism errors, i.e. notifications that
a gen_server was forcibly killed and so exited with a status other
than normal. I tried adding a conditional block to couch_file.erl
that printed a single line saying the gen_server was being killed and
then exited normally, to see if that would make the reports go away,
but I still seemed to be getting them. I still think that's the right
part of the code to be looking at, though.

HTH,
Paul Davis


Re: Long running task status monitoring

Posted by Robert Dionne <bo...@gmail.com>.
This is very cool, I'll check it out.

After a build with yesterday's changes I started seeing these  
exceptions again: http://gist.github.com/50826

This was fixed (#213) a few days ago. I lowered max_open_databases on
my machine to 50 and it still occurs. It occurs only once when
running the tests from Futon, which all pass. However, when running
the tests standalone from the runner script, you'll see two or three.
Running these requires the patch I submitted to couch_js.c.

I'll try to run this down (no pun intended :)

Bob

