Posted to dev@couchdb.apache.org by Damien Katz <da...@apache.org> on 2008/10/02 21:48:51 UTC

CouchDB HTTP refactoring and new extension support

I just checked in the first round of refactoring of the front-end HTTPd
code. Previously, nearly 100% of the CouchDB httpd was in
couch_httpd.erl; now it has been divided up into couch_httpd_db.erl,
couch_httpd_view.erl and couch_httpd_misc_handlers.erl.

The couch_httpd module now mostly provides a wrapper around mochiweb
and implements a dispatch facility for finding the correct module to
handle incoming HTTP requests.

On startup, CouchDB reads the ini config files to figure out which
module and function, if any, should be invoked in response to special
HTTP requests.

Here is an example of a default.ini read at startup:
[httpd_global_handlers]
/ = {couch_httpd_misc_handlers, handle_welcome_req, <<"Welcome">>}
_utils = {couch_httpd_misc_handlers, handle_utils_dir_req, "../../share/www"}
_all_dbs = {couch_httpd_misc_handlers, handle_all_dbs_req}
_config = {couch_httpd_misc_handlers, handle_config_req}
_replicate = {couch_httpd_misc_handlers, handle_replicate_req}
_uuids = {couch_httpd_misc_handlers, handle_uuids_req}
_restart = {couch_httpd_misc_handlers, handle_restart_req}

[httpd_db_handlers]
_view = {couch_httpd_view, handle_view_req}
_temp_view = {couch_httpd_view, handle_temp_view_req}

[daemons]
view_manager={couch_view, start_link, []}
db_update_notifier={couch_db_update_notifier_sup, start_link, []}
full_text_query={couch_ft_query, start_link, []}
query_servers={couch_query_servers, start_link, []}
httpd={couch_httpd, start_link, []}

/ end

The [httpd_global_handlers] entries name the modules and functions that
get invoked for special URLs (plus an optional third argument). The ini
key/values are read into an in-memory dictionary, and every incoming
request URL is parsed to see if its first path segment matches a
special key. For example, for a request like
"GET /_utils/images/image.gif", CouchDB will parse the URL to get the
"_utils" part, find the matching "_utils" handler in the handler
dictionary, then invoke the handler with the couch_http request object.

If there is no matching httpd_global_handler, then CouchDB hands the
request off to the couch_httpd_db module, where it might invoke one of
the [httpd_db_handlers] for it. The couch_httpd_db module first looks at
the second URL path segment (example: in "GET /db/_view/foo", "_view"
is the second path segment). If it finds a db handler for that segment,
it opens the database and invokes the handler with the HTTP request and
the database passed in as the context. If no handler matches, the
couch_httpd_db module attempts to serve the request itself (including
some special URLs, like _all_docs and _compact).
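
To make the lookup order concrete, here is a minimal sketch of the
two-level dispatch described above. It is not the actual couch_httpd
code: the module name dispatch_sketch, the use of maps and funs in place
of the ini-backed dictionary, and the return tuples are all assumptions
made for illustration.

```erlang
-module(dispatch_sketch).
-export([dispatch/3, demo/0]).

%% Hypothetical model, not the real couch_httpd implementation.
%% Global: map of first path segment  -> fun(Path).
%% Db:     map of second path segment -> fun(Path, DbName).
dispatch(Path, Global, Db) ->
    Segments = string:lexemes(Path, "/"),
    %% The root URL has no segments, so it is looked up under the "/" key.
    Key = case Segments of
              [] -> "/";
              [First | _] -> First
          end,
    case maps:find(Key, Global) of
        {ok, Handler} ->
            Handler(Path);
        error ->
            dispatch_db(Segments, Path, Db)
    end.

%% No global handler matched, so the first segment is taken as a db name.
dispatch_db([DbName, Second | _], Path, Db) ->
    case maps:find(Second, Db) of
        {ok, Handler} -> Handler(Path, DbName);
        error -> {db_default, DbName, Path}
    end;
dispatch_db([DbName], Path, _Db) ->
    {db_default, DbName, Path};
dispatch_db([], Path, _Db) ->
    {not_found, Path}.

demo() ->
    Global = #{"/" => fun(_Path) -> welcome end,
               "_utils" => fun(Path) -> {utils, Path} end},
    Db = #{"_view" => fun(Path, DbName) -> {view, DbName, Path} end},
    [dispatch("/_utils/images/image.gif", Global, Db),
     dispatch("/db/_view/foo", Global, Db),
     dispatch("/db/_all_docs", Global, Db),
     dispatch("/", Global, Db)].
```

In demo/0, the first request is caught by the global "_utils" entry, the
second falls through to the "_view" db handler with "db" as the database
name, the third matches no handler and is left to the db module's own
default handling, and the last hits the "/" entry.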

This will allow for custom CouchDB database extensions. A simple
example that's currently disabled by default is
couch_httpd_misc_handlers:increment_update_seq_req/2. Its purpose is to
allow a client to increment the database update seq# and have it
returned to the client. This was needed by someone using CouchDB as an
IMAP storage backend, but probably isn't generally useful, so it is not
enabled by default. Anyone who wants it can enable this extension by
adding this to their local.ini file:

[httpd_db_handlers]
_increment_update_seq = {couch_httpd_misc_handlers, increment_update_seq_req}

Once enabled, whenever a client does a
"POST /db/_increment_update_seq", it will invoke the handler.

A handler entry can have an optional third element, which must be a
valid Erlang term and is always passed to the handler as an extra
argument. In the main example, we pass the welcome message as an
argument like this:
/ = {couch_httpd_misc_handlers, handle_welcome_req, <<"Welcome">>}
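
As a rough illustration of what "a valid Erlang term" means here, the
sketch below turns an ini value string into a term with the standard
erl_scan and erl_parse modules, then calls the handler with or without
the extra argument. The module name handler_spec_sketch is hypothetical,
and both the parsing approach and the (Req, Arg) argument order are
assumptions, not a description of how CouchDB does it.

```erlang
-module(handler_spec_sketch).
-export([parse/1, invoke/2]).

%% Parse an ini value such as "{mod, fun, Arg}" into an Erlang term.
%% erl_scan needs a trailing "." to see a complete term.
parse(Value) ->
    {ok, Tokens, _End} = erl_scan:string(Value ++ "."),
    {ok, Term} = erl_parse:parse_term(Tokens),
    Term.

%% Two-element spec: call the handler with the request only.
invoke({Mod, Fun}, Req) ->
    Mod:Fun(Req);
%% Three-element spec: pass the extra term along as a second argument.
%% (The argument order is an assumption for this sketch.)
invoke({Mod, Fun, Arg}, Req) ->
    Mod:Fun(Req, Arg).
```

For example, parse/1 applied to the welcome entry's value yields the
tuple {couch_httpd_misc_handlers, handle_welcome_req, <<"Welcome">>}. A
value that is not a well-formed term makes the pattern match in parse/1
fail, which is one way such a config error could surface at startup.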

The daemon support causes CouchDB to load up new OTP server
processes. You provide a name as the key and the module, function and
start arguments as the value, and CouchDB will attempt to load the
modules and start the subprocesses. If the subprocesses crash,
CouchDB will restart them just like any other OTP server process. These
OTP processes can also spawn external child OS processes if
necessary.
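
For anyone who wants to try this, a daemon module can be as small as the
gen_server sketched below. The module name my_daemon and its ping/0 call
are made up for illustration; the only thing taken from the config
format above is that the start function is reached through a
{Module, Function, Args} entry, so a hypothetical ini line for it would
be my_daemon={my_daemon, start_link, []}.

```erlang
-module(my_daemon).
-behaviour(gen_server).

%% Public API. start_link/0 matches an ini entry with an empty arg list.
-export([start_link/0, ping/0]).
%% gen_server callbacks.
-export([init/1, handle_call/3, handle_cast/2]).

start_link() ->
    %% Register locally under the module name so callers can find it.
    gen_server:start_link({local, ?MODULE}, ?MODULE, [], []).

%% Example request an HTTP handler could make to the daemon.
ping() ->
    gen_server:call(?MODULE, ping).

init([]) ->
    %% State is just a counter of pings served.
    {ok, 0}.

handle_call(ping, _From, Count) ->
    NewCount = Count + 1,
    {reply, {pong, NewCount}, NewCount}.

handle_cast(_Msg, State) ->
    {noreply, State}.
```

Because start_link/0 returns {ok, Pid} in the usual OTP way, a process
like this can be supervised and restarted after a crash; the counter
state would start again from zero after a restart.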

By combining daemons and new HTTP handlers, it is possible to
create new CouchDB services, like a full text search engine written in
Erlang. The search engine daemon will keep the indexes up to date, and
the HTTP handlers will process incoming requests and query the
indexes, likely by interacting with the daemon.

I think we might still need to provide a way for CouchDB to find 3rd
party extension modules. I'm thinking that should probably be an ini
setting, with multiple directories for Erlang to search for modules.

Feedback welcome. Remember, nothing is set in stone, and much can
still be done to further organize the code. Fire away.

-Damien