Posted to dev@couchdb.apache.org by Benoit Chesneau <bc...@gmail.com> on 2010/08/20 11:09:04 UTC

splitting the code in different apps or rewrite httpd layer

Hi all,

I have been working a lot on the httpd code these days, and the more I
work on it, the more I think we should refactor it to make it easier to
hack and extend. There is a lot of code in one module (couch_httpd_db),
and recent issues like vhost and location rewriting could be easier to
solve if we had a better-organized http layer, in my opinion.

Currently (in 1.0.1 or trunk) a request goes through:

request -> couch_httpd loop -> request_handler -> check vhost and
possibly rewrite url -> request_int -> request_db -> request
doc | request _design | request attachment | request global handler |
request misc handler

with an extra level: request_design -> rewrite handler |
show | list | update | view ... and request_int, which catches all errors
and is responsible for sending an error response if anything went wrong
and wasn't caught in the other layers.

It could be simpler. For example, we could make it more resource
oriented than it is: 1 module, 1 resource. Refactoring the httpd code
would also allow us to reuse more code than we do now, maybe by wrapping
the api.
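
To make the "1 module, 1 resource" idea concrete, here is a minimal sketch. It is written in Python rather than Erlang purely for illustration, and every name in it (db_resource, doc_resource, design_resource, route, handle) is hypothetical, not an existing CouchDB function. It shows one small handler per resource, plus a single outer function that turns uncaught errors into error responses, which is the role request_int plays in the chain described above.

```python
# Hypothetical illustration only: none of these names exist in CouchDB.
from typing import Callable, List, Tuple

Response = Tuple[int, str]
Handler = Callable[[str, List[str]], Response]


def db_resource(method: str, parts: List[str]) -> Response:
    """Handles /{db}."""
    return 200, f"db:{parts[0]}"


def doc_resource(method: str, parts: List[str]) -> Response:
    """Handles /{db}/{docid}."""
    return 200, f"doc:{parts[0]}/{parts[1]}"


def design_resource(method: str, parts: List[str]) -> Response:
    """Handles /{db}/_design/{name}/... (show, list, update, view, rewrite)."""
    return 200, f"design:{parts[0]}/{parts[2]}"


def route(path: str) -> Tuple[Handler, List[str]]:
    """Pick exactly one resource handler from the shape of the path."""
    parts = [p for p in path.split("/") if p]
    if len(parts) == 1:
        return db_resource, parts
    if len(parts) >= 3 and parts[1] == "_design":
        return design_resource, parts
    if len(parts) == 2:
        return doc_resource, parts
    raise LookupError(path)


def handle(method: str, path: str) -> Response:
    """Single place where uncaught errors become error responses."""
    try:
        handler, parts = route(path)
        return handler(method, parts)
    except LookupError:
        return 404, "not_found"
    except Exception as exc:  # last-resort catch, like request_int
        return 500, f"error:{exc}"
```

With this shape, adding a new kind of resource means adding one handler and one routing rule, without touching the other handlers.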

How:

- Some time ago we started to port it to webmachine with davisp, but we
didn't finish. Maybe now is a good time? Or do we want to go another
way?

- If we go ahead with this refactoring, it could also be a good time to
split couchdb into different apps: couchdb-core and couchdb, for example
(maybe couchdb-appengine?), so we could develop each level independently
and make the code history cleaner.


Thoughts?


- benoit

Re: splitting the code in different apps or rewrite httpd layer

Posted by Jan Lehnardt <ja...@apache.org>.
On 23 Aug 2010, at 13:46, Benoit Chesneau wrote:

> On Mon, Aug 23, 2010 at 1:07 PM, Robert Dionne
> <di...@dionne-associates.com> wrote:
>> 
>> 
>> 
>> On Aug 22, 2010, at 4:58 PM, Mikeal Rogers wrote:
>> 
>>> One idea that was floated at least once was to replace all the code we currently have on top of mochiweb directly with webmachine.
>> 
>> If I recall, Paul Davis did some prototyping work on this at one point.
>> 
> 
> Yes, some parts are in his repo, some others in mine. But that work is
> 6 months old now.

Does that mean you consider it a failed experiment? If yes, why? If not,
should we get some effort going to finish the code and get it into trunk?

Cheers
Jan
-- 


Re: splitting the code in different apps or rewrite httpd layer

Posted by Benoit Chesneau <bc...@gmail.com>.
On Mon, Aug 23, 2010 at 1:07 PM, Robert Dionne
<di...@dionne-associates.com> wrote:
>
>
>
> On Aug 22, 2010, at 4:58 PM, Mikeal Rogers wrote:
>
>> One idea that was floated at least once was to replace all the code we currently have on top of mochiweb directly with webmachine.
>
> If I recall, Paul Davis did some prototyping work on this at one point.
>

Yes, some parts are in his repo, some others in mine. But that work is
6 months old now.

- benoît

Re: splitting the code in different apps or rewrite httpd layer

Posted by Robert Dionne <di...@dionne-associates.com>.


On Aug 22, 2010, at 4:58 PM, Mikeal Rogers wrote:

> One idea that was floated at least once was to replace all the code we currently have on top of mochiweb directly with webmachine.

If I recall, Paul Davis did some prototyping work on this at one point.





Re: splitting the code in different apps or rewrite httpd layer

Posted by Mikeal Rogers <mi...@gmail.com>.
One idea that was floated at least once was to replace all the code we currently have on top of mochiweb directly with webmachine.

This would make extensions and improvements follow the already well-defined patterns provided by webmachine.
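
A rough sketch of what such a pattern looks like, in Python for illustration only. It is loosely modelled on webmachine's resource callbacks (allowed_methods and resource_exists are named after them), but this is not webmachine's real API: the Resource base class, DocResource, to_json and decide are hypothetical, and the decision flow is reduced to three steps.

```python
# Simplified, hypothetical sketch of a webmachine-style resource.
from typing import Dict, List, Optional, Tuple


class Resource:
    """Defaults that a concrete resource overrides where it differs."""

    def allowed_methods(self) -> List[str]:
        return ["GET"]

    def resource_exists(self, path: str) -> bool:
        return True

    def to_json(self, path: str) -> str:
        return "{}"


class DocResource(Resource):
    """Serves documents from an in-memory dict standing in for a database."""

    def __init__(self, docs: Dict[str, str]) -> None:
        self.docs = docs

    def resource_exists(self, path: str) -> bool:
        return path in self.docs

    def to_json(self, path: str) -> str:
        return self.docs[path]


def decide(resource: Resource, method: str, path: str) -> Tuple[int, Optional[str]]:
    """One fixed decision flow shared by every resource."""
    if method not in resource.allowed_methods():
        return 405, None
    if not resource.resource_exists(path):
        return 404, None
    return 200, resource.to_json(path)
```

The point of the pattern is that status-code logic lives in one shared flow, so each resource only answers small questions about itself.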

-Mikeal


Re: splitting the code in different apps or rewrite httpd layer

Posted by Filipe David Manana <fd...@apache.org>.
On Sun, Aug 22, 2010 at 7:37 PM, Benoit Chesneau <bc...@gmail.com> wrote:

> It seems that no one except us is interested in that ;) Anyway, I
> think that in the case of the indexer it would be very useful to have
> a generic way to add some kind of handler, allowing anyone to plug
> their own stuff into the system.
>

I haven't replied before; however, I'm interested in that as well.
I think right now it makes sense to split only the couch_httpd_db.erl
functions.

Having modules with REST-like names (couch_db_resource,
couch_doc_resource, etc.) sounds good. I'm open to other suggestions.

>
> Also, refactoring would allow us to add comments to the code, which
> would help with code review.
>

+1

>
> - benoit
>



-- 
Filipe David Manana,
fdmanana@gmail.com, fdmanana@apache.org

"Reasonable men adapt themselves to the world.
 Unreasonable men adapt the world to themselves.
 That's why all progress depends on unreasonable men."

Re: splitting the code in different apps or rewrite httpd layer

Posted by Klaus Trainer <kl...@web.de>.
> It seems that no one except us is interested in that ;)

I'm indeed very interested in that. However, as I won't be able to
contribute much to the refactoring, I didn't feel I had much to say on
the matter. Nonetheless, here are my two cents.

Recently, I've spent a few hours diving into the source code of riak and
riak_core. In doing so, I got the impression that, with regard to
modularization and organizing the codebase around its abstractions, the
Riak guys are one step ahead in the evolution of their codebase.

Note that I've only looked at some parts of Riak's and CouchDB's
codebase, respectively. Also note that at this point in time, my
knowledge of both Riak's and CouchDB's codebase is still quite limited.

Cheers,
Klaus





Re: splitting the code in different apps or rewrite httpd layer

Posted by Benoit Chesneau <bc...@gmail.com>.
On Fri, Aug 20, 2010 at 1:32 PM, Volker Mische <vo...@gmail.com> wrote:
> +1 for a refactor.
>
> GeoCouch duplicates a lot of code. I tried to keep the names as similar
> (though still meaningful) to the original ones as possible, so one can
> see where the duplicated code is.
>
> I would love to see that everyone who wants a new kind of indexer just
> needs to provide the data structure, and that all the design document
> handling, updater (group) handling, list functions etc. are done
> automatically.
>
> Cheers,
>  Volker
>
It seems that no one except us is interested in that ;) Anyway, I
think that in the case of the indexer it would be very useful to have a
generic way to add some kind of handler, allowing anyone to plug their
own stuff into the system.

Also, refactoring would allow us to add comments to the code, which
would help with code review.

- benoit

Re: splitting the code in different apps or rewrite httpd layer

Posted by Nicolas Dufour <nr...@gmail.com>.
Hello,

I remember following your great tutorial on how to create a new
indexer (numidx).
It was a very good way to build a new one, but it did show that the
internal architecture needs a serious refactoring.
You get the impression that everything is scattered wherever it might
fit.

Also, I saw that Riak made such a transformative change in their source
code and seemed to obtain a nice result with it: a clean way to
separate concerns.

I'm not an official couchdb dev, but such a change would really help us all.
(I can help with it too ;) ).

Thank you,

Nicolas Dufour
nrdufour@gmail.com

--
“Investment in knowledge pays the best interest.”
                               —Benjamin Franklin



On Fri, Aug 20, 2010 at 7:32 AM, Volker Mische <vo...@gmail.com> wrote:
> +1 for a refactor.
>
> GeoCouch duplicates a lot of code. I tried to keep the names as similar
> (though still meaningful) to the original ones as possible, so one can
> see where the duplicated code is.
>
> I would love to see that everyone who wants a new kind of indexer just
> needs to provide the data structure, and that all the design document
> handling, updater (group) handling, list functions etc. are done
> automatically.
>
> Cheers,
>  Volker
>

Re: splitting the code in different apps or rewrite httpd layer

Posted by Volker Mische <vo...@gmail.com>.
+1 for a refactor.

GeoCouch duplicates a lot of code. I tried to keep the names as 
similar (though still meaningful) to the original ones as possible, so 
one can see where the duplicated code is.

I would love to see that everyone who wants a new kind of indexer just 
needs to provide the data structure, and that all the design document 
handling, updater (group) handling, list functions etc. are done 
automatically.
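
As a sketch of that split between "the data structure" and "everything else", the following Python is illustrative only. None of it is CouchDB or GeoCouch code: IndexStructure, ExactMatchIndex and GenericIndexer are hypothetical names, and the update loop is a deliberately simplified stand-in for the real updater.

```python
# Hypothetical sketch: these classes are illustrative, not CouchDB code.
from typing import Any, Callable, Dict, Iterable, List, Optional, Protocol, Set, Tuple


class IndexStructure(Protocol):
    """The only part a new kind of indexer would have to provide."""

    def insert(self, key: Any, doc_id: str) -> None: ...
    def remove(self, doc_id: str) -> None: ...
    def query(self, **params: Any) -> List[str]: ...


class ExactMatchIndex:
    """Toy structure: maps a key to the ids of documents that produced it."""

    def __init__(self) -> None:
        self._by_key: Dict[Any, Set[str]] = {}

    def insert(self, key: Any, doc_id: str) -> None:
        self._by_key.setdefault(key, set()).add(doc_id)

    def remove(self, doc_id: str) -> None:
        for ids in self._by_key.values():
            ids.discard(doc_id)

    def query(self, **params: Any) -> List[str]:
        return sorted(self._by_key.get(params["key"], set()))


class GenericIndexer:
    """Shared machinery: tracks the last processed update sequence and
    feeds document changes to whichever structure is plugged in."""

    def __init__(self, structure: IndexStructure, key_fun: Callable[[dict], Any]) -> None:
        self.structure = structure
        self.key_fun = key_fun
        self.update_seq = 0

    def update(self, changes: Iterable[Tuple[int, str, Optional[dict]]]) -> None:
        """changes: (seq, doc_id, doc) tuples; doc is None for a deletion."""
        for seq, doc_id, doc in changes:
            if seq <= self.update_seq:
                continue  # already indexed
            self.structure.remove(doc_id)
            if doc is not None:
                self.structure.insert(self.key_fun(doc), doc_id)
            self.update_seq = seq

    def query(self, **params: Any) -> List[str]:
        return self.structure.query(**params)
```

A spatial index would then only replace ExactMatchIndex, for example with an R-tree, while the sequence tracking and update loop stay shared.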

Cheers,
   Volker

On 20.08.2010 13:06, Robert Dionne wrote:
> +1
>
> I would change the "or" in the subject line to "and", i.e. do both :)
>
> I think this is an excellent idea and a good time to start this. At a conceptual level CouchDB is dirt simple internally. This fact and its use of Erlang should, in my opinion, be seen as its main advantage. One way to leverage that advantage is to enable programmers who want to extend couch. I know of at least three projects [1,2,3] that have done this. A good measure of a successful refactor would be how much code these projects could throw away.
>
> In my terminology prototype [3] I'm currently using bitcask for persistence, so I basically only extend the HTTP front-end piece and need programmatic access to the b-tree storage layer. All this needs to be is some sort of mapping that lets one run a function over the b-tree, support for ranges, and access to changes.
>
> Doing this is a thankless task; anyone already deeply familiar with the internals would likely have little *interest* (academic, financial, etc.) in it. CouchDB now runs on phones and in the cloud, which is awesome and of course a strong argument for maintaining the simple design. As the complexity of the code base increases, however, the use of Erlang becomes a barrier to entry.
>
> Best,
>
> Bob
>
> [1] http://github.com/normanb/couchdb-multiview
> [2] http://github.com/vmx/couchdb
> [3] http://github.com/bdionne/bitstore


Re: splitting the code in different apps or rewrite httpd layer

Posted by Robert Dionne <di...@dionne-associates.com>.
+1

I would change the "or" in the subject line to "and", i.e. do both :)

I think this is an excellent idea and a good time to start this. At a conceptual level CouchDB is dirt simple internally. This fact and its use of Erlang should, in my opinion, be seen as its main advantage. One way to leverage that advantage is to enable programmers who want to extend couch. I know of at least three projects [1,2,3] that have done this. A good measure of a successful refactor would be how much code these projects could throw away.

In my terminology prototype [3] I'm currently using bitcask for persistence, so I basically only extend the HTTP front-end piece and need programmatic access to the b-tree storage layer. All this needs to be is some sort of mapping that lets one run a function over the b-tree, support for ranges, and access to changes.
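
A minimal sketch of that kind of storage interface, in Python for illustration. OrderedStore and its methods are hypothetical, and a sorted list stands in for the b-tree. It only shows the three things asked for: running a function over the entries, restricting it to a key range, and reading changes since a sequence number.

```python
# Hypothetical sketch; a sorted key list stands in for the b-tree.
import bisect
from typing import Any, Callable, Dict, List, Optional, Tuple


class OrderedStore:
    def __init__(self) -> None:
        self._keys: List[str] = []           # kept sorted
        self._values: Dict[str, Any] = {}
        self._changes: List[Tuple[int, str]] = []

    def put(self, key: str, value: Any) -> int:
        """Store a value and return the update sequence assigned to it."""
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value
        seq = len(self._changes) + 1
        self._changes.append((seq, key))
        return seq

    def fold(self, fun: Callable[[str, Any, Any], Any], acc: Any,
             start: Optional[str] = None, end: Optional[str] = None) -> Any:
        """Run fun(key, value, acc) over keys in [start, end], in key order."""
        lo = 0 if start is None else bisect.bisect_left(self._keys, start)
        hi = len(self._keys) if end is None else bisect.bisect_right(self._keys, end)
        for key in self._keys[lo:hi]:
            acc = fun(key, self._values[key], acc)
        return acc

    def changes(self, since: int = 0) -> List[Tuple[int, str]]:
        """(seq, key) pairs recorded after the given sequence number."""
        return [c for c in self._changes if c[0] > since]
```

An extension such as an HTTP front end with its own persistence would then depend only on put, fold and changes rather than on the storage internals.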

Doing this is a thankless task; anyone already deeply familiar with the internals would likely have little *interest* (academic, financial, etc.) in it. CouchDB now runs on phones and in the cloud, which is awesome and of course a strong argument for maintaining the simple design. As the complexity of the code base increases, however, the use of Erlang becomes a barrier to entry.

Best,

Bob

[1] http://github.com/normanb/couchdb-multiview
[2] http://github.com/vmx/couchdb
[3] http://github.com/bdionne/bitstore



