Posted to dev@couchdb.apache.org by Damien Katz <da...@apache.org> on 2008/10/27 22:31:20 UTC

Fwd: near view state server success

I'm bringing this conversation into couchdb-dev in case others are  
interested.

-Damien

Begin forwarded message:

> From: Damien Katz <da...@apache.org>
> Date: October 27, 2008 3:43:15 PM CDT
> To: Chris Anderson <jc...@apache.org>
> Subject: Re: near view state server success
>
> Overall this looks good, but it looks like you've gone down a little bit
> of a wrong path with the purge_seq nums. The "clients" should only be
> sending update seq nums in the get_updated messages, and the
> view_group servers should never worry about the purge_seq_nums
> stuff; only the updater process worries about it, and for now, don't
> worry about it at all.
>
> I understand what you are saying about the gen_server stuff and
> resending the client message. I dug down into the gen_server code
> and found how to send the right message to yourself, but I think
> it's a bad idea, as it relies on under-documented things that
> could change. I think the best option is to switch to gen_server
> and have it send a "resend" response to the client, and the client
> resends the message.
>
> The separate SpawnFun and SpawnArgs aren't necessary, because you can
> use a closure to capture the spawn arguments and create the custom function:
>
> SpawnArgs = Foo(),
> SpawnFun  = fun () -> do_spawn(SpawnArgs) end,
>
> Finally, I think the initial update seq num in the couch_view_group
> server should be -1 instead of 0. That way, when the no_update
> client asks for the group object, you won't return a nil if the
> updater process hasn't returned yet. Once the updater returns an
> initial group, it will have a seq_num of 0 or greater.
>
> -Damien
>
> On Oct 24, 2008, at 9:25 PM, Chris Anderson wrote:
>
>> Damien,
>>
>> Here's the latest progress on the update=false patch.
>>
>> I've attached the diff, but you can also access the repository here:
>>
>> http://github.com/jchris/couchdb/tree/update-false
>>
>> I stopped short of converting to gen_server, because I think I don't
>> understand gen_server well enough to see how to replace
>> couch_view_group:server_loop with a gen_server. It relies on resending
>> itself the same message it just received, and the
>> couch_view:get_updated_group is synchronous, but the server_loop is
>> asynchronous.
>>
>> I attempted to pass the purge_seq into couch_view_group, and set up
>> guards so that it would only return view indexes whose SeqId and
>> PurgeSeq were both greater than or equal to the database's at request
>> time. All the tests still pass, except purge, which now hangs instead
>> of failing.
>>
>> The reason seems to be that purging can make the database's update
>> sequence move backwards (based on what I'm getting in my debug logs).
>> Is that possible, or am I reading it wrong?
>>
>> If purge can move the sequence number down, then couch_view_group (and
>> not just couch_view_updater) will need to know about the difference
>> between a single purge (+1) and more than one purge, so it can
>> invalidate indexes it's hanging onto.
>>
>> I've got time to work on this again, so I hope I can finish it in the
>> next few weeks. Any advice you have is appreciated.
>>
>> Chris
>>
>>
>> On Wed, Oct 8, 2008 at 2:51 AM, Chris Anderson <jc...@apache.org>  
>> wrote:
>>> Damien,
>>>
>>> I'm fairly certain I'm on the right track now. All the tests are
>>> passing, except for purge. My guess is that purging doesn't increment
>>> the database's seq_id. We can probably treat purge_seq_id like seq_id,
>>> in the guard clauses. I wanted to make sure I'm in the right ballpark
>>> before I start to polish this code.
>>>
>>> To that end, the whitespace is a mess, and I haven't tested
>>> update=false (I'm fairly certain it will work). I'll definitely want
>>> to clean this up before I commit it, but I'd like to get a thumbs up
>>> from you before I put the time into polishing it.
>>>
>>> Chris
>>>
>>> --
>>> Chris Anderson
>>> http://jchris.mfdz.com
>>>
>>
>>
>>
>> -- 
>> Chris Anderson
>> http://jchris.mfdz.com
>> <upfalse.diff>
>

Re: near view state server success

Posted by Ayende Rahien <ay...@ayende.com>.

Getting a stale snapshot of the view should be very easy. Until you commit
the transaction, you can just get the current node.

On Sat, Nov 8, 2008 at 5:53 PM, Chris Anderson <jc...@apache.org> wrote:

> On Fri, Nov 7, 2008 at 11:06 PM, Ayende Rahien <ay...@ayende.com> wrote:
> > I want to be able to get a response in a limited amount of time. Even if
> > that response is optionally stale (marked by a header?)
> >
>
> This would be ideal. The hard part here is getting access to a stale
> view representation from disk. I'm not sure you can get a btree root
> from the disk that is known to be consistent, unless it just came
> from the updater process. Maybe it'd be simple enough to do... but in
> any case, it's not really the feature that this patch is working on,
> although the patch does lay the groundwork for these more complex view
> query modes.
>
> Damien, maybe you can chime in about getting consistent but stale view
> indexes from the disk?
>
> > The absolute worst thing to happen is to have a request just wait for 5
> > minutes before timing out or returning a result.
> > That ties up resources across the entire network.
>
> Once again, I think it really is the site operator's responsibility to
> keep any user-facing views up to a certain degree of readiness. In
> some cases, a quick error may be better than waiting 5 seconds (or 5
> minutes) for a view to generate. But in other cases, we'd send the
> error instead of generating the view, when the view would have
> generated in less than a second. There's no way to know ahead of
> time how long it will take to update a view. Luckily, keeping your
> views up to date makes all these problems go away.
>
> Chris
>
>
> --
> Chris Anderson
> http://jchris.mfdz.com
>

Re: near view state server success

Posted by Chris Anderson <jc...@apache.org>.

On Fri, Nov 7, 2008 at 11:06 PM, Ayende Rahien <ay...@ayende.com> wrote:
> I want to be able to get a response in a limited amount of time. Even if that
> response is optionally stale (marked by a header?)
>

This would be ideal. The hard part here is getting access to a stale
view representation from disk. I'm not sure you can get a btree root
from the disk that is known to be consistent, unless it just came
from the updater process. Maybe it'd be simple enough to do... but in
any case, it's not really the feature that this patch is working on,
although the patch does lay the groundwork for these more complex view
query modes.

Damien, maybe you can chime in about getting consistent but stale view
indexes from the disk?

> The absolute worst thing to happen is to have a request just wait for 5
> minutes before timing out or returning a result.
> That ties up resources across the entire network.

Once again, I think it really is the site operator's responsibility to
keep any user-facing views up to a certain degree of readiness. In
some cases, a quick error may be better than waiting 5 seconds (or 5
minutes) for a view to generate. But in other cases, we'd send the
error instead of generating the view, when the view would have
generated in less than a second. There's no way to know ahead of
time how long it will take to update a view. Luckily, keeping your
views up to date makes all these problems go away.

Chris

-- 
Chris Anderson
http://jchris.mfdz.com

Re: near view state server success

Posted by Jan Lehnardt <ja...@apache.org>.

On Nov 9, 2008, at 07:30, Ask Bjørn Hansen wrote:

> On Nov 7, 2008, at 23:06, Ayende Rahien wrote:
>
>> I want to be able to get a response in a limited amount of time. Even
>> if that response is optionally stale (marked by a header?)
>
> Just ask for the view not to be updated for the things that want a  
> fast response (and have another process occasionally update the view).

That doesn't work (at the moment) for views that have never been built
or are updating. This thread discusses how this feature is actually
supposed to work and how it is named. See previous mails :)

Cheers
Jan
--

Re: near view state server success

Posted by Ask Bjørn Hansen <as...@apache.org>.

On Nov 7, 2008, at 23:06, Ayende Rahien wrote:

> I want to be able to get a response in a limited amount of time. Even
> if that response is optionally stale (marked by a header?)

Just ask for the view not to be updated for the things that want a  
fast response (and have another process occasionally update the view).

   - ask

Re: near view state server success

Posted by Ayende Rahien <ay...@ayende.com>.

I want to be able to get a response in a limited amount of time. Even if that
response is optionally stale (marked by a header?)

The absolute worst thing to happen is to have a request just wait for 5
minutes before timing out or returning a result.
That ties up resources across the entire network.

On Sat, Nov 8, 2008 at 8:52 AM, Paul Davis <pa...@gmail.com> wrote:

> On Sat, Nov 8, 2008 at 1:36 AM, Chris Anderson <jc...@apache.org> wrote:
> > On Mon, Oct 27, 2008 at 1:31 PM, Damien Katz <da...@apache.org> wrote:
> >> I'm bringing this conversation into couchdb-dev in case others are
> >> interested.
> >>
> >> -Damien
> >>
> >> Begin forwarded message:
> >>
> >>> From: Damien Katz <da...@apache.org>
> >>> Date: October 27, 2008 3:43:15 PM CDT
> >>> To: Chris Anderson <jc...@apache.org>
> >>> Subject: Re: near view state server success
> >>>
> >>> Overall this looks good, but it looks like you've gone down a little bit of
> >>> a wrong path with the purge_seq nums. The "clients" should only be sending
> >>> update seq nums in the get_updated messages, and the view_group servers
> >>> should never worry about the purge_seq_nums stuff; only the updater process
> >>> worries about it, and for now, don't worry about it at all.
> >
> > Cool. I pulled out the purge stuff, and I'll just ignore the failing
> > purge test for now. One of these days I'll sit down and come to
> > understand the full purge implementation, but for now it's probably
> > better to skip it.
> >
> >>>
> >>> I think the best option
> >>> is to switch to gen_server and have it send a "resend" response to the
> >>> client, and the client resends the message.
> >
> > I'm just not sure how to implement this. When I start to think it
> > through, I always come to the conclusion that the way I'm doing it now
> > is simpler. I understand that gen_server has its benefits, but
> > everything I think of ends up having explicit receive and ! calls
> > somewhere, even if they end up living in the client around the place
> > where it expects the "resend" response.
> >
> > Keeping the interlocking receive calls in a function like
> > couch_view_group:server_loop just seems like the simplest option.
> > Maybe you see a clear way to do this with gen_server. I just can't see
> > how to go down that route without just reimplementing something like
> > couch_view_group:server_loop by another name in another module.
> >
> >>>
> >>> The separate SpawnFun and SpawnArgs aren't necessary, because you can
> >>> use a closure to capture the spawn arguments and create the custom function:
> >>>
> >>> SpawnArgs = Foo(),
> >>> SpawnFun  = fun () -> do_spawn(SpawnArgs) end,
> >>>
> >
> > Duh, thanks. The code is a bit simpler now, and just as fast. Erlang
> > closures FTW.
> >
> >>> Finally, I think the initial update seq num in the couch_view_group
> >>> server should be -1 instead of 0.
> >
> > Done.
> >
> > All these changes are available in the update-false branch of my git repo.
> >
> > http://github.com/jchris/couchdb/tree/update-false
> >
> >
> > Paul Davis and I had a chat on IRC about the fact that the name of
> > this feature is not a very good description for what it actually
> > provides. Basically the only win this internal cache reliably provides
> > is the possibility of lower latency (with the tradeoff of out-of-date
> > views). It doesn't give the ability to "peek" at intermediate and
> > potentially inconsistent view states while the view is building, nor
> > does it give you the ability to pull the most recent consistent view
> > state from disk if the view hasn't yet been accessed since server
> > boot.
> >
> > The best succinct name I could come up with for this feature is
> > stale=ok, because in the case of an ungenerated or uncached view, it
> > could potentially behave just like a normal view request (that is,
> > wait to respond until the view is updated.)
> >
> > I'm mostly excited about this work because it lays the foundation for
> > a way to get progress reports on currently building views. We could
> > just add another receive clause for group_status that adds the latest
> > status to the state, and then an HTTP API for asking the view_group
> > for its status.
> >
> > It's really too bad that gen_server doesn't provide an easy way to
> > stick a message back into the mailbox after initiating an action based
> > on it. That would make all this so simple.
> >
> > --
> > Chris Anderson
> > http://jchris.mfdz.com
> >
>
> Also, for those interested parties paying attention, I'd like to see if
> we can get a show of hands on the underlying issue that Chris and I
> spent quite some time discussing.
>
> As I see it, there are three main theoretical points:
>
> 1. (My interpretation) Give me view results with a guaranteed
> millisecond response time even if it means throwing an error or just
> returning no results.
> 2. (What Chris argues for) Try and give me a quick response, but wait
> for consistent data if need be. Chris makes the good argument that
> this method relies on the dev/admin team to ensure the view generation
> is never too far out of date.
> 3. A subtle difference that only makes sense with knowledge of the
> internals, but boils down to "give me a true dirty read of a view
> generation in progress"
>
> I think all three are valid positions to take. Yet we all have our
> preconceived notions on what the feature would be used for and how it
> might be implemented. If everyone agrees on a specific interpretation,
> I think it'd keep Chris and me from going 15 rounds when we don't have
> a whiteboard to draw furiously on.
>
> Paul
>

Re: near view state server success

Posted by Paul Davis <pa...@gmail.com>.

On Sat, Nov 8, 2008 at 1:36 AM, Chris Anderson <jc...@apache.org> wrote:
> On Mon, Oct 27, 2008 at 1:31 PM, Damien Katz <da...@apache.org> wrote:
>> I'm bringing this conversation into couchdb-dev in case others are
>> interested.
>>
>> -Damien
>>
>> Begin forwarded message:
>>
>>> From: Damien Katz <da...@apache.org>
>>> Date: October 27, 2008 3:43:15 PM CDT
>>> To: Chris Anderson <jc...@apache.org>
>>> Subject: Re: near view state server success
>>>
>>> Overall this looks good, but it looks like you've gone down a little bit of a
>>> wrong path with the purge_seq nums. The "clients" should only be sending update
>>> seq nums in the get_updated messages, and the view_group servers should
>>> never worry about the purge_seq_nums stuff; only the updater process worries
>>> about it, and for now, don't worry about it at all.
>
> Cool. I pulled out the purge stuff, and I'll just ignore the failing
> purge test for now. One of these days I'll sit down and come to
> understand the full purge implementation, but for now it's probably
> better to skip it.
>
>>>
>>> I think the best option
>>> is to switch to gen_server and have it send a "resend" response to the
>>> client, and the client resends the message.
>
> I'm just not sure how to implement this. When I start to think it
> through, I always come to the conclusion that the way I'm doing it now
> is simpler. I understand that gen_server has its benefits, but
> everything I think of ends up having explicit receive and ! calls
> somewhere, even if they end up living in the client around the place
> where it expects the "resend" response.
>
> Keeping the interlocking receive calls in a function like
> couch_view_group:server_loop just seems like the simplest option.
> Maybe you see a clear way to do this with gen_server. I just can't see
> how to go down that route without just reimplementing something like
> couch_view_group:server_loop by another name in another module.
>
>>>
>>> The separate SpawnFun and SpawnArgs aren't necessary, because you can use a
>>> closure to capture the spawn arguments and create the custom function:
>>>
>>> SpawnArgs = Foo(),
>>> SpawnFun  = fun () -> do_spawn(SpawnArgs) end,
>>>
>
> Duh, thanks. The code is a bit simpler now, and just as fast. Erlang
> closures FTW.
>
>>> Finally, I think the initial update seq num in the couch_view_group server
>>> should be -1 instead of 0.
>
> Done.
>
> All these changes are available in the update-false branch of my git repo.
>
> http://github.com/jchris/couchdb/tree/update-false
>
>
> Paul Davis and I had a chat on IRC about the fact that the name of
> this feature is not a very good description for what it actually
> provides. Basically the only win this internal cache reliably provides
> is the possibility of lower latency (with the tradeoff of out-of-date
> views). It doesn't give the ability to "peek" at intermediate and
> potentially inconsistent view states while the view is building, nor
> does it give you the ability to pull the most recent consistent view
> state from disk if the view hasn't yet been accessed since server
> boot.
>
> The best succinct name I could come up with for this feature is
> stale=ok, because in the case of an ungenerated or uncached view, it
> could potentially behave just like a normal view request (that is,
> wait to respond until the view is updated.)
>
> I'm mostly excited about this work because it lays the foundation for
> a way to get progress reports on currently building views. We could
> just add another receive clause for group_status that adds the latest
> status to the state, and then an HTTP API for asking the view_group
> for its status.
>
> It's really too bad that gen_server doesn't provide an easy way to
> stick a message back into the mailbox after initiating an action based
> on it. That would make all this so simple.
>
> --
> Chris Anderson
> http://jchris.mfdz.com
>

Also, for those interested parties paying attention, I'd like to see if
we can get a show of hands on the underlying issue that Chris and I
spent quite some time discussing.

As I see it, there are three main theoretical points:

1. (My interpretation) Give me view results with a guaranteed
millisecond response time even if it means throwing an error or just
returning no results.
2. (What Chris argues for) Try and give me a quick response, but wait
for consistent data if need be. Chris makes the good argument that
this method relies on the dev/admin team to ensure the view generation
is never too far out of date.
3. A subtle difference that only makes sense with knowledge of the
internals, but boils down to "give me a true dirty read of a view
generation in progress"

I think all three are valid positions to take. Yet we all have our
preconceived notions on what the feature would be used for and how it
might be implemented. If everyone agrees on a specific interpretation,
I think it'd keep Chris and me from going 15 rounds when we don't have
a whiteboard to draw furiously on.

Paul

Re: near view state server success

Posted by Chris Anderson <jc...@apache.org>.

On Mon, Oct 27, 2008 at 1:31 PM, Damien Katz <da...@apache.org> wrote:
> I'm bringing this conversation into couchdb-dev in case others are
> interested.
>
> -Damien
>
> Begin forwarded message:
>
>> From: Damien Katz <da...@apache.org>
>> Date: October 27, 2008 3:43:15 PM CDT
>> To: Chris Anderson <jc...@apache.org>
>> Subject: Re: near view state server success
>>
>> Overall this looks good, but it looks like you've gone down a little bit of a
>> wrong path with the purge_seq nums. The "clients" should only be sending update
>> seq nums in the get_updated messages, and the view_group servers should
>> never worry about the purge_seq_nums stuff; only the updater process worries
>> about it, and for now, don't worry about it at all.

Cool. I pulled out the purge stuff, and I'll just ignore the failing
purge test for now. One of these days I'll sit down and come to
understand the full purge implementation, but for now it's probably
better to skip it.

>>
>> I think the best option
>> is to switch to gen_server and have it send a "resend" response to the
>> client, and the client resends the message.

I'm just not sure how to implement this. When I start to think it
through, I always come to the conclusion that the way I'm doing it now
is simpler. I understand that gen_server has its benefits, but
everything I think of ends up having explicit receive and ! calls
somewhere, even if they end up living in the client around the place
where it expects the "resend" response.

Keeping the interlocking receive calls in a function like
couch_view_group:server_loop just seems like the simplest option.
Maybe you see a clear way to do this with gen_server. I just can't see
how to go down that route without just reimplementing something like
couch_view_group:server_loop by another name in another module.
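
For reference, the "resend" idea quoted above could be sketched roughly
as follows. This is only an illustration, not code from the update-false
branch: the module name resend_sketch, the {Seq, Group} state, the
set_group/3 stand-in for the updater, and the 10 ms back-off are all
assumptions made up for the example.

```erlang
-module(resend_sketch).
-behaviour(gen_server).

-export([start_link/0, get_updated_group/2, set_group/3]).
-export([init/1, handle_call/3, handle_cast/2]).

start_link() ->
    gen_server:start_link(?MODULE, [], []).

%% Client side: resend the same request for as long as the server asks.
get_updated_group(Pid, RequestSeq) ->
    case gen_server:call(Pid, {get_updated, RequestSeq}) of
        {ok, Group} ->
            {ok, Group};
        resend ->
            timer:sleep(10),  % crude back-off, an assumption of this sketch
            get_updated_group(Pid, RequestSeq)
    end.

%% Hypothetical stand-in for the updater process delivering a new group.
set_group(Pid, Seq, Group) ->
    gen_server:cast(Pid, {new_group, Seq, Group}).

%% State is {Seq, Group}; -1 means no group has been built yet.
init([]) ->
    {ok, {-1, nil}}.

handle_call({get_updated, RequestSeq}, _From, {Seq, Group} = State)
  when Seq >= RequestSeq ->
    {reply, {ok, Group}, State};
handle_call({get_updated, _RequestSeq}, _From, State) ->
    %% Not caught up yet. A real server would make sure an updater is
    %% running here; the sketch only tells the client to try again.
    {reply, resend, State}.

handle_cast({new_group, Seq, Group}, _State) ->
    {noreply, {Seq, Group}}.
```

The explicit receive and ! calls disappear, but the cost shows up as the
polling loop in get_updated_group/2, which is one reason this version may
not be simpler than server_loop.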

>>
>> The separate SpawnFun and SpawnArgs aren't necessary, because you can use a
>> closure to capture the spawn arguments and create the custom function:
>>
>> SpawnArgs = Foo(),
>> SpawnFun  = fun () -> do_spawn(SpawnArgs) end,
>>

Duh, thanks. The code is a bit simpler now, and just as fast. Erlang
closures FTW.

>> Finally, I think the initial update seq num in the couch_view_group server
>> should be -1 instead of 0.

Done.
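
To illustrate why -1 matters, here is a minimal sketch of the kind of
guard involved. The module name seq_guard_sketch, the lookup/2 function,
and the {GroupSeq, Group} tuple are hypothetical names for this example
only; the actual couch_view_group code is shaped differently.

```erlang
-module(seq_guard_sketch).
-export([lookup/2]).

%% lookup(RequestSeq, {GroupSeq, Group})
%% A no_update client asks with RequestSeq = 0, i.e. "anything you have".
lookup(RequestSeq, {GroupSeq, Group}) when GroupSeq >= RequestSeq ->
    {ok, Group};
lookup(_RequestSeq, _State) ->
    %% No group is recent enough; the caller has to wait for the updater.
    wait.
```

With an initial state of {0, nil}, lookup(0, {0, nil}) matches the first
clause and hands back nil. With {-1, nil}, the guard fails and the caller
waits until the updater has returned a real group.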

All these changes are available in the update-false branch of my git repo.

http://github.com/jchris/couchdb/tree/update-false

Paul Davis and I had a chat on IRC about the fact that the name of
this feature is not a very good description for what it actually
provides. Basically the only win this internal cache reliably provides
is the possibility of lower latency (with the tradeoff of out-of-date
views). It doesn't give the ability to "peek" at intermediate and
potentially inconsistent view states while the view is building, nor
does it give you the ability to pull the most recent consistent view
state from disk if the view hasn't yet been accessed since server
boot.

The best succinct name I could come up with for this feature is
stale=ok, because in the case of an ungenerated or uncached view, it
could potentially behave just like a normal view request (that is,
wait to respond until the view is updated.)

I'm mostly excited about this work because it lays the foundation for
a way to get progress reports on currently building views. We could
just add another receive clause for group_status that adds the latest
status to the state, and then an HTTP API for asking the view_group
for its status.

It's really too bad that gen_server doesn't provide an easy way to
stick a message back into the mailbox after initiating an action based
on it. That would make all this so simple.
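
One gen_server idiom that gets close, without touching the mailbox, is
to not reply right away: return {noreply, State} from handle_call/3,
keep the From handle, and call gen_server:reply/2 once the updater
delivers. Below is a minimal sketch under stated assumptions; the module
name deferred_reply_sketch, the state record, set_group/3 as a stand-in
for the updater, and the shape of the group_status reply are invented
for illustration and may not map cleanly onto couch_view_group.

```erlang
-module(deferred_reply_sketch).
-behaviour(gen_server).

-export([start_link/0, get_updated_group/2, group_status/1, set_group/3]).
-export([init/1, handle_call/3, handle_cast/2]).

%% seq = -1 means no group yet; waiting holds parked {RequestSeq, From} pairs.
-record(state, {seq = -1, group = nil, waiting = []}).

start_link() ->
    gen_server:start_link(?MODULE, [], []).

get_updated_group(Pid, RequestSeq) ->
    gen_server:call(Pid, {get_updated, RequestSeq}, infinity).

group_status(Pid) ->
    gen_server:call(Pid, group_status).

%% Hypothetical stand-in for the updater process delivering a new group.
set_group(Pid, Seq, Group) ->
    gen_server:cast(Pid, {new_group, Seq, Group}).

init([]) ->
    {ok, #state{}}.

handle_call({get_updated, RequestSeq}, _From,
            #state{seq = Seq, group = Group} = State)
  when Seq >= RequestSeq ->
    {reply, {ok, Group}, State};
handle_call({get_updated, RequestSeq}, From,
            #state{waiting = Waiting} = State) ->
    %% Park the caller; it stays blocked inside gen_server:call/3.
    {noreply, State#state{waiting = [{RequestSeq, From} | Waiting]}};
handle_call(group_status, _From,
            #state{seq = Seq, waiting = Waiting} = State) ->
    %% Report the current seq and how many callers are parked.
    {reply, {Seq, length(Waiting)}, State}.

handle_cast({new_group, Seq, Group}, #state{waiting = Waiting} = State) ->
    %% Answer every parked caller that the new group satisfies.
    {Ready, StillWaiting} =
        lists:partition(fun({RequestSeq, _From}) -> Seq >= RequestSeq end,
                        Waiting),
    [gen_server:reply(From, {ok, Group}) || {_RequestSeq, From} <- Ready],
    {noreply, State#state{seq = Seq, group = Group, waiting = StillWaiting}}.
```

The client stays a plain synchronous call, and a group_status clause
fits in as just another handle_call/3 clause.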

-- 
Chris Anderson
http://jchris.mfdz.com