Posted to dev@couchdb.apache.org by Jason Smith <jh...@couch.io> on 2010/08/19 14:30:23 UTC

Vhosting Requirements (was: Re: [jira] Commented: (COUCHDB-230) Add Support for Rewritable URL)

On Thu, Aug 19, 2010 at 01:48, Benoit Chesneau (JIRA) <ji...@apache.org> wrote:

>
>    [
> https://issues.apache.org/jira/browse/COUCHDB-230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12899954#action_12899954]
>
> Benoit Chesneau commented on COUCHDB-230:
> -----------------------------------------
>
> the $ stuff is already in the code with the vhost refactoring. The only
> thing to add is this path stuff which is a little more complex btw. I will
> work on it this week, it will likely land  on friday.
>

Woah! Can we all please take a step back and talk about what problem this
solves?

Benoit, your work is *excellent*. However, on behalf of CouchDB
administrators, I must say these proposed vhost rules look complicated, and
difficult or impossible to support in a reverse-proxy environment.

What exactly is the problem that vhosts and _rewrite cannot solve?

I have no final opinion. However I think a CouchApp or Futon feature which
manages vhosts might be better, long-term, despite being more work to
implement at this time.

-- 
Jason Smith
Couchio Hosting

Re: Vhosting Requirements (was: Re: [jira] Commented: (COUCHDB-230) Add Support for Rewritable URL)

Posted by Jason Smith <jh...@couch.io>.
On Fri, Aug 20, 2010 at 00:37, Benoit Chesneau <bc...@gmail.com> wrote:

> On Thu, Aug 19, 2010 at 7:22 PM, Jason Smith <jh...@couch.io> wrote:
> > On Fri, Aug 20, 2010 at 00:13, Benoit Chesneau <bc...@gmail.com> wrote:
> >
> >> A couchapp should be  "domain" independant, this is the principle of a
> >> couchapp . So I can replicate anywhere and not only in
> >> centralizedhost.com . Following this principle, it sound weird to set
> >> an hostname in the CouchApp.
> >>
> >
> > That is true.
> >
> > But partially, the reason couchapps run immediately after replication is
> > because all couchapps are still very simple. In the future they will be
> > like the mature PHP apps. First you copy to the target. Then you run a
> > one-time config to input your email address, site name, theme preferences,
> > etc. etc. In there might be the vhost/rewrite questions.
> >
> there is no reason to not keep the simplicity while the couchapps
> feature grow. And I personnaly hope that CouchDB will help to remove
> the need of a centralized hosting and just use these services as a
> facility to put online for a time our data.
>
> But that's just the way I see it. Today it's already possible to had
> an hostname to any couchapp and write an handler listening on each dbs
> to get updates in ddocs and then set hostname.
>
> What could be interresting is y to ease management of  modules in
> couch so you can add your mod_vhost like you can do on apache.
>

I am also very interested in that. It is very easy with Erlang modules. I
think the only problem is not many people know Erlang. When I wrote my first
auth handler, it was I think 15 lines of code. And I edited local.ini.
Voila!


-- 
Jason Smith
Couchio Hosting

Re: Vhosting Requirements (was: Re: [jira] Commented: (COUCHDB-230) Add Support for Rewritable URL)

Posted by Benoit Chesneau <bc...@gmail.com>.
> But that's just the way I see it. Today it's already possible to had
s/had/add

sorry for the typos.

Re: Vhosting Requirements (was: Re: [jira] Commented: (COUCHDB-230) Add Support for Rewritable URL)

Posted by Benoit Chesneau <bc...@gmail.com>.
On Thu, Aug 19, 2010 at 7:22 PM, Jason Smith <jh...@couch.io> wrote:
> On Fri, Aug 20, 2010 at 00:13, Benoit Chesneau <bc...@gmail.com> wrote:
>
>> A couchapp should be  "domain" independant, this is the principle of a
>> couchapp . So I can replicate anywhere and not only in
>> centralizedhost.com . Following this principle, it sound weird to set
>> an hostname in the CouchApp.
>>
>
> That is true.
>
> But partially, the reason couchapps run immediately after replication is
> because all couchapps are still very simple. In the future they will be like
> The mature PHP apps. First you copy to the target. Then you run a one-time
> config to input your email address, site name, theme preferences, etc. etc.
> In there might be the vhost/rewrite questions.
>
There is no reason not to keep the simplicity while couchapp features
grow. And I personally hope that CouchDB will help remove the need for
centralized hosting, so that we just use these services as a facility
to put our data online for a time.

But that's just the way I see it. Today it's already possible to add
a hostname to any couchapp and write a handler listening on each db
to get updates in ddocs and then set the hostname.

What could be interesting is to ease the management of modules in
couch, so you can add your own mod_vhost like you can on Apache.
- benoit

Re: Vhosting Requirements (was: Re: [jira] Commented: (COUCHDB-230) Add Support for Rewritable URL)

Posted by Jason Smith <jh...@couch.io>.
On Fri, Aug 20, 2010 at 00:13, Benoit Chesneau <bc...@gmail.com> wrote:

> A couchapp should be  "domain" independant, this is the principle of a
> couchapp . So I can replicate anywhere and not only in
> centralizedhost.com . Following this principle, it sound weird to set
> an hostname in the CouchApp.
>

That is true.

But partially, the reason couchapps run immediately after replication is
because all couchapps are still very simple. In the future they will be like
the mature PHP apps. First you copy to the target. Then you run a one-time
config to input your email address, site name, theme preferences, etc. etc.
In there might be the vhost/rewrite questions.

-- 
Jason Smith
Couchio Hosting

Re: Vhosting Requirements (was: Re: [jira] Commented: (COUCHDB-230) Add Support for Rewritable URL)

Posted by Benoit Chesneau <bc...@gmail.com>.
> A couchapp should be  "domain" independant, this is the principle of a
s/domain/couchdb node

Re: Vhosting Requirements (was: Re: [jira] Commented: (COUCHDB-230) Add Support for Rewritable URL)

Posted by Benoit Chesneau <bc...@gmail.com>.
On Thu, Aug 19, 2010 at 7:00 PM, Jason Smith <jh...@couch.io> wrote:
> On Thu, Aug 19, 2010 at 22:23, Benoit Chesneau <bc...@gmail.com> wrote:
>
>> To answer in a more generic way. As a "sys admin" or "operatives", you
>> are already habit to vhosts or Locations with http servers. This is
>> the same system here. Nothing new, just a syntax different.
>>
>> I think also there is 2 views of couchdb in oposition here. One is
>> about managing couchdb in multi mass hosting other is to see couchdb
>> as a standalone db that could be used on any device.
>>
>
> Back around 0.9, there was a huge discussion about "transactional" bulk
> inserts.
>
> The prevailing view was that CouchDB does not show a behavior unless it can
> show that behavior in all possible deployments.
>
> I am asking if the same logic applies here.
>
> In the future when every device has CouchDB and every ISP runs it and every
> carrier caches it and everything, apps need to work there too. That is why I
> am attracted to a simple vhost implementation.
>
>
>> This feature isn't yet implemented. It only works for domain names.
>> You don't need to use a wildcard. If you're worried about your users
>> and the fact they could use this feature, I am thinking this is just
>> an operative concern. If you don't want that your user set vhosts like
>> this then you will have to think to a way to do this. I'm not sure
>> that couchdb should concern about it, especially since you can
>> whitelist _config. You could also create your own _config handler that
>> filter entries .
>>
>
> Since I wrote the whitelist feature, yes I am aware how I could work around
> supporting this.
>
> I have operational concerns. However I am simply asking that the community
> think very hard about what this means before permanently committing to it.
>
> Specifically, the fact is CouchDB is a fantastic platform with very, very
> few good applications. CouchDB is years ahead of the popular couchapps with
> respect to polish and features.
>
> My point is, are we bending CouchDB in order to cover for the fact that
> there is no good App Admin tool out there?
>
> Maybe the answer is no. Okay, fine, as long as the decision is deliberated
> and understood by everybody, cool! I do agree about two things:
>
>  1. It is difficult for beginners to develop on localhost:5984 and then push
> to couch.domain.tld. vhost is managed separate from the ddoc and even the
> DB, so it does not replicate. It totally sucks.
>  2. It is hard to manage vhosts. You have cURL and you have Futon, which did
> not even work correctly in 1.0.0.
>
> It would bother me if I do not know which "Host" header to reply to without
> inspecting all database and ddocs.
>
> My preference is dumb vhosts and a smart rewriter. For example, if the
> rewrite rule could see the req object and Host header, it could redirect to
> the correct db, ddoc, and path.
>

A couchapp should be "domain" independent; this is the principle of a
couchapp. So I can replicate anywhere and not only to
centralizedhost.com. Following this principle, it sounds weird to set
a hostname in the CouchApp.

As for the difficulty of handling a couchapp behind a vhost or not, it's
already handled by the current system:

You can set a rewrite rule to always use the database, or detect whether
you were rewritten by comparing req.requested_path with req.path.
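
A minimal sketch of that comparison, written in Python for brevity. The `req` dict is a hypothetical stand-in for the request object a design-document function receives; it assumes `requested_path` holds the path segments the client asked for and `path` holds the segments after rewriting.

```python
def was_rewritten(req):
    """Return True when the path the client requested differs from the
    path that was finally served, i.e. a vhost or _rewrite rule fired."""
    # Fall back to `path` when `requested_path` is absent, so a request
    # that never went through a rewriter compares equal to itself.
    requested = req.get("requested_path", req["path"])
    return list(requested) != list(req["path"])


# Hypothetical request objects, for illustration only.
direct = {
    "path": ["blog", "_design", "sofa", "_show", "post", "hello"],
    "requested_path": ["blog", "_design", "sofa", "_show", "post", "hello"],
}
via_vhost = {
    "path": ["blog", "_design", "sofa", "_show", "post", "hello"],
    "requested_path": ["post", "hello"],
}
```

An app could use such a check to decide whether the links it generates should be relative to the vhost root or to the full database path.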

Re: Vhosting Requirements (was: Re: [jira] Commented: (COUCHDB-230) Add Support for Rewritable URL)

Posted by Jason Smith <jh...@couch.io>.
On Thu, Aug 19, 2010 at 22:23, Benoit Chesneau <bc...@gmail.com> wrote:

> To answer in a more generic way. As a "sys admin" or "operatives", you
> are already habit to vhosts or Locations with http servers. This is
> the same system here. Nothing new, just a syntax different.
>
> I think also there is 2 views of couchdb in oposition here. One is
> about managing couchdb in multi mass hosting other is to see couchdb
> as a standalone db that could be used on any device.
>

Back around 0.9, there was a huge discussion about "transactional" bulk
inserts.

The prevailing view was that CouchDB does not show a behavior unless it can
show that behavior in all possible deployments.

I am asking if the same logic applies here.

In the future when every device has CouchDB and every ISP runs it and every
carrier caches it and everything, apps need to work there too. That is why I
am attracted to a simple vhost implementation.


> This feature isn't yet implemented. It only works for domain names.
> You don't need to use a wildcard. If you're worried about your users
> and the fact they could use this feature, I am thinking this is just
> an operative concern. If you don't want that your user set vhosts like
> this then you will have to think to a way to do this. I'm not sure
> that couchdb should concern about it, especially since you can
> whitelist _config. You could also create your own _config handler that
> filter entries .
>

Since I wrote the whitelist feature, yes I am aware how I could work around
supporting this.

I have operational concerns. However I am simply asking that the community
think very hard about what this means before permanently committing to it.

Specifically, the fact is CouchDB is a fantastic platform with very, very
few good applications. CouchDB is years ahead of the popular couchapps with
respect to polish and features.

My point is, are we bending CouchDB in order to cover for the fact that
there is no good App Admin tool out there?

Maybe the answer is no. Okay, fine, as long as the decision is deliberated
and understood by everybody, cool! I do agree about two things:

 1. It is difficult for beginners to develop on localhost:5984 and then push
to couch.domain.tld. The vhost is managed separately from the ddoc and even the
DB, so it does not replicate. It totally sucks.
 2. It is hard to manage vhosts. You have cURL and you have Futon, which did
not even work correctly in 1.0.0.

It would bother me if I do not know which "Host" header to reply to without
inspecting all database and ddocs.

My preference is dumb vhosts and a smart rewriter. For example, if the
rewrite rule could see the req object and Host header, it could redirect to
the correct db, ddoc, and path.
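
To make the "smart rewriter" idea concrete, here is a rough Python sketch, not CouchDB code. It assumes a hypothetical `req` dict with `headers` and `path` keys, and a single made-up rule mapping `<ddoc>.<db>.mydomain.tld` to that design document's rewriter.

```python
import re

# Hypothetical rule: <ddoc>.<db>.mydomain.tld maps to that ddoc's rewriter.
HOST_RULE = re.compile(r"^([^.]+)\.([^.]+)\.mydomain\.tld$")


def route(req):
    """Map a request to an internal path using only its Host header.

    Returns None when the Host header matches no rule.
    """
    host = req["headers"].get("Host", "")
    host = host.split(":", 1)[0].lower()   # drop any port, ignore case
    match = HOST_RULE.match(host)
    if match is None:
        return None
    ddoc, db = match.groups()
    return "/%s/_design/%s/_rewrite%s" % (db, ddoc, req["path"])
```

With this shape the vhost table stays a plain list of host names, and all the matching logic lives in one place that can see the whole request.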

-- 
Jason Smith
Couchio Hosting

Re: Vhosting Requirements (was: Re: [jira] Commented: (COUCHDB-230) Add Support for Rewritable URL)

Posted by Benoit Chesneau <bc...@gmail.com>.
>
> My feeling is when you need
>
>  * pattern matching (the concept, not Erlang)
>  * capturing substrings
>  * building new strings based on the captured parts
>  * Understood by *most* programmers and sysadmins is lagniappe
>

what is not understandable in

$val1.test.$val2.domain.tld = /$val1/$val2 ?

which is like

[val1, "test", val2, "domain", "tld"]

compared to


(.*)\.test\.(.*)\.domain\.tld = /$1/$2

Sorry, but I think the second one is not really friendly.
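
For comparison, the two notations can be related mechanically. The Python sketch below is an illustration only, not the CouchDB implementation: it translates a `$variable` host pattern into an equivalent regular expression with named groups, then fills the target. The function names are made up for the example.

```python
import re


def pattern_to_regex(pattern):
    """Translate a dotted host pattern with $variables into a regex."""
    parts = []
    for label in pattern.split("."):
        if label.startswith("$"):
            # A variable captures exactly one dot-free label.
            parts.append("(?P<%s>[^.]+)" % label[1:])
        else:
            parts.append(re.escape(label))
    return re.compile("^" + r"\.".join(parts) + "$")


def rewrite(host, pattern, target):
    """Return `target` with $variables filled in, or None on no match."""
    match = pattern_to_regex(pattern).match(host)
    if match is None:
        return None
    # Replace the longest names first so $val1 never clobbers $val10.
    for name in sorted(match.groupdict(), key=len, reverse=True):
        target = target.replace("$" + name, match.group(name))
    return target
```

Here `rewrite("a.test.b.domain.tld", "$val1.test.$val2.domain.tld", "/$val1/$val2")` yields `/a/b`, so the variable form can be read as shorthand for a restricted regular expression.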

- benoit

Re: Vhosting Requirements (was: Re: [jira] Commented: (COUCHDB-230) Add Support for Rewritable URL)

Posted by Jason Smith <jh...@couch.io>.
On Thu, Aug 19, 2010 at 23:53, Noah Slater <ns...@apache.org> wrote:

>
> On 19 Aug 2010, at 17:48, Benoit Chesneau wrote:
>
> >>  3. Everybody knows them
> >
> > that's not true. http://xkcd.com/208/
> >
> > To answer to this "reinventing" . We had long discussions on how to
> > manage rewriting in couchapps. Some included "regexp", we fall in the
> > current system. Ie pattern matching because it was a lot easier to
> > understand and *simple*.  SInce the system is already in place, I
> > wanted to reuse it too, there is no need to have 2 dofferent ways to
> > handle rewriting.
> >
> > I would like to keep it *simple*, that's the key.
>
> We're approaching bike shed territory... :D
>

Sure. This is my final statement on the matter.

My feeling is when you need

 * pattern matching (the concept, not Erlang)
 * capturing substrings
 * building new strings based on the captured parts
 * Understood by *most* programmers and sysadmins is lagniappe

Then the burden of proof is on the person who says regular expressions are
inferior to an off-the-cuff implementation.

-- 
Jason Smith
Couchio Hosting

Re: Vhosting Requirements (was: Re: [jira] Commented: (COUCHDB-230) Add Support for Rewritable URL)

Posted by Noah Slater <ns...@apache.org>.
On 19 Aug 2010, at 17:48, Benoit Chesneau wrote:

>>  3. Everybody knows them
> 
> that's not true. http://xkcd.com/208/
> 
> To answer to this "reinventing" . We had long discussions on how to
> manage rewriting in couchapps. Some included "regexp", we fall in the
> current system. Ie pattern matching because it was a lot easier to
> understand and *simple*.  SInce the system is already in place, I
> wanted to reuse it too, there is no need to have 2 dofferent ways to
> handle rewriting.
> 
> I would like to keep it *simple*, that's the key.

We're approaching bike shed territory... :D


Re: Vhosting Requirements (was: Re: [jira] Commented: (COUCHDB-230) Add Support for Rewritable URL)

Posted by Benoit Chesneau <bc...@gmail.com>.
On Thu, Aug 19, 2010 at 6:30 PM, Jason Smith <jh...@couch.io> wrote:
> On Thu, Aug 19, 2010 at 23:14, Benoit Chesneau <bc...@gmail.com> wrote:
>
>> On Thu, Aug 19, 2010 at 5:56 PM, Noah Slater <ns...@apache.org> wrote:
>> >
>> > On 19 Aug 2010, at 16:49, Benoit Chesneau wrote:
>> >
>> >> Well asking without giving yourself some data make it irrelevant but
>> >
>> > Not really.
>> >
>> > I'm suggesting that if we're going to use performance as an argument, we
>> use data and not supposition.
>>
>> The way pattern matching works should explain by itself though ;)
>>
>
> AFAIK Erlang ships with Perl-compatible regular expressions written in C.
>
> re:compile()
>
Yes. so ?

> Anyway the whole performance argument is not relevant. Who cares? You are
> about to do a b-tree lookup from the hard disk. How much will regex vs.
> pattern match on a 20-byte string matter?

b-tree ? Where ?
>
> My point was, why are we inventing syntax out of our asses for every
> different situation? Regular expressions should be considered because:
>
>  1. They match patterns

Yes. Pattern matching does too, but we only match terms, and they are
common in Erlang.

>  2. They capture groups and can access them later

Yes. Who cares? A lot of URL rewriters do not use regexps because
they add an extra layer of complexity: webmachine in Erlang, Werkzeug in
Python, ...

>  3. Everybody knows them

that's not true. http://xkcd.com/208/


To answer this "reinventing" point: we had long discussions on how to
manage rewriting in couchapps. Some proposals included regexps, and we settled
on the current system, i.e. pattern matching, because it was a lot easier to
understand and *simple*. Since the system is already in place, I
wanted to reuse it too; there is no need to have two different ways to
handle rewriting.

I would like to keep it *simple*, that's the key.

Anyway, can we go back to the topic? I still don't understand how
vhosts in 1.0.1 or in trunk are a problem. On that point, I'm thinking we
could propose a request_handler that doesn't rely on vhosts at all and
make them an option. Would that be OK?

Re: Vhosting Requirements (was: Re: [jira] Commented: (COUCHDB-230) Add Support for Rewritable URL)

Posted by Jason Smith <jh...@couch.io>.
On Thu, Aug 19, 2010 at 23:14, Benoit Chesneau <bc...@gmail.com> wrote:

> On Thu, Aug 19, 2010 at 5:56 PM, Noah Slater <ns...@apache.org> wrote:
> >
> > On 19 Aug 2010, at 16:49, Benoit Chesneau wrote:
> >
> >> Well asking without giving yourself some data make it irrelevant but
> >
> > Not really.
> >
> > I'm suggesting that if we're going to use performance as an argument, we
> use data and not supposition.
>
> The way pattern matching works should explain by itself though ;)
>

AFAIK Erlang ships with Perl-compatible regular expressions written in C.

re:compile()

Anyway the whole performance argument is not relevant. Who cares? You are
about to do a b-tree lookup from the hard disk. How much will regex vs.
pattern match on a 20-byte string matter?

My point was, why are we inventing syntax out of our asses for every
different situation? Regular expressions should be considered because:

 1. They match patterns
 2. They capture groups and can access them later
 3. Everybody knows them

-- 
Jason Smith
Couchio Hosting

Re: Vhosting Requirements (was: Re: [jira] Commented: (COUCHDB-230) Add Support for Rewritable URL)

Posted by Benoit Chesneau <bc...@gmail.com>.
On Thu, Aug 19, 2010 at 5:56 PM, Noah Slater <ns...@apache.org> wrote:
>
> On 19 Aug 2010, at 16:49, Benoit Chesneau wrote:
>
>> Well asking without giving yourself some data make it irrelevant but
>
> Not really.
>
> I'm suggesting that if we're going to use performance as an argument, we use data and not supposition.

The way pattern matching works should explain it by itself though ;)

>
>> Pattern matching is more efficient in erlang. Why would would you use
>> regexp where it's not needed. regexp give an extra level of complexity
>> we don't need here.
>
> Should "we" be "I" in that sentence? :)

*I* was speaking for the project; maybe I'm too involved :/

Now, I won't try to prove anything; I'm too busy these days,
unfortunately. I just used what I thought was the right tool. The same
system is used in the _rewrite handler, so why use another way to do it?
Btw, regexps were tried for _rewrite too, and they would have slowed the
process and made it more difficult to use.

Now, I can remove this pattern matching from the code, but the problem
Jason raised will stay. The current system in trunk doesn't change
anything in the way vhosts are actually handled in 1.0.1, so the
problem Jason has is already in 1.0.1. What I see, on the other hand,
is that couchone.com is running. Also, I introduced the
X-Forwarded-Host header especially for reverse proxies, and it should solve
that.
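
As an illustration of the X-Forwarded-Host idea, here is a minimal Python sketch of how a server behind a reverse proxy might choose the host name to match vhosts against. It is an assumption-laden model, not the CouchDB code; the function name and the handling of comma-separated values are hypothetical.

```python
def effective_host(headers, forwarded_header="X-Forwarded-Host"):
    """Pick the host name that vhost matching should use."""
    # HTTP header names are case-insensitive, so normalise them first.
    lowered = {name.lower(): value for name, value in headers.items()}
    forwarded = lowered.get(forwarded_header.lower())
    if forwarded:
        # Assumption: with chained proxies the header may hold a
        # comma-separated list; the first entry is the client's host.
        return forwarded.split(",")[0].strip()
    return lowered.get("host", "")
```

The proxy keeps talking to the couch on its internal address, while the vhost lookup still sees the public name the client used.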

Re: Vhosting Requirements (was: Re: [jira] Commented: (COUCHDB-230) Add Support for Rewritable URL)

Posted by Noah Slater <ns...@apache.org>.
On 19 Aug 2010, at 16:49, Benoit Chesneau wrote:

> Well asking without giving yourself some data make it irrelevant but

Not really.

I'm suggesting that if we're going to use performance as an argument, we use data and not supposition.

> Pattern matching is more efficient in erlang. Why would would you use
> regexp where it's not needed. regexp give an extra level of complexity
> we don't need here.

Should "we" be "I" in that sentence? :)

Re: Vhosting Requirements (was: Re: [jira] Commented: (COUCHDB-230) Add Support for Rewritable URL)

Posted by Benoit Chesneau <bc...@gmail.com>.
On Thu, Aug 19, 2010 at 5:36 PM, Noah Slater <ns...@apache.org> wrote:
>
> On 19 Aug 2010, at 16:23, Benoit Chesneau wrote:
>
>> Regexps are complicated, Regexps are slow (especially in erlang).
>
> This is generally false.
>
> I'd want to seem some data to back it up as a justification.

Well, asking for data without giving some yourself makes it irrelevant, but:

http://en.wikipedia.org/wiki/Pattern_matching

http://www.tbray.org/ongoing/When/200x/2007/09/21/Erlang

and others on the web. Also, did you really try doing the same in
Erlang, compared to pattern matching?

>
> If you're going to allow this kind of pattern matching, I would use regexps until it was proven inefficient.
>

Pattern matching is more efficient in Erlang. Why would you use
regexps where they're not needed? Regexps add an extra level of complexity
we don't need here.

Re: Vhosting Requirements (was: Re: [jira] Commented: (COUCHDB-230) Add Support for Rewritable URL)

Posted by Noah Slater <ns...@apache.org>.
On 19 Aug 2010, at 16:23, Benoit Chesneau wrote:

> Regexps are complicated, Regexps are slow (especially in erlang).

This is generally false.

I'd want to see some data to back it up as a justification.

If you're going to allow this kind of pattern matching, I would use regexps until it was proven inefficient.


Re: Vhosting Requirements (was: Re: [jira] Commented: (COUCHDB-230) Add Support for Rewritable URL)

Posted by Benoit Chesneau <bc...@gmail.com>.
To answer in a more generic way: as a "sys admin" or "operative", you
are already used to vhosts or Locations with HTTP servers. This is
the same system here. Nothing new, just a different syntax.

I also think there are two opposing views of CouchDB here. One is
about managing CouchDB in mass multi-hosting; the other is to see CouchDB
as a standalone db that could be used on any device.

Now, answering the other questions:

On Thu, Aug 19, 2010 at 4:59 PM, Jason Smith <jh...@couch.io> wrote:
> On Thu, Aug 19, 2010 at 21:27, Benoit Chesneau <bc...@gmail.com> wrote:
>
>> Could you explain me how it's impossible compared to previous
>> behaviour ? It doesn't change anything technically. Please post all
>> your concern and a way to reproduce , I will have a look on it. Though
>> here hosting > 50 couch - trunk behingd couchdbproxy works. tested
>> yesterday.
>>
>
> ## The top priority question:
>
> How can proxies managed by the sysadmin or network admin know what to do?
>
> Now, vhost is explicit. Anybody with permission can query /_config/vhosts
> for all couches.

That has been the case since Jan added this feature. Nothing changed, and I
think that couchone is running.

>
> With wildcards, e.g. "*/blog", it is impossible to know all domains which
> the web server handles. When a query for  fooapp.foodb.mydomain.tld arrives,
> which couch should handle it?

This feature isn't implemented yet. It only works for domain names.
You don't need to use a wildcard. If you're worried about your users
and the fact that they could use this feature, I think this is just
an operational concern. If you don't want your users to set vhosts like
this, then you will have to think of a way to prevent it. I'm not sure
that CouchDB should be concerned about it, especially since you can
whitelist _config. You could also create your own _config handler that
filters entries.

Another simple way: use your own vhost rewriter.

In the ini file:

[httpd]
redirect_vhost_handler = {couchone_httpd, vhost_handler}

then

vhost_handler(MochiReq, VhostTarget) ->
    %% path as sent by the client, before any rewriting
    Path = MochiReq:get(raw_path),
    {"/" ++ TargetPath, _, _} = mochiweb_util:urlsplit_path(VhostTarget),
    %% the first segment of the vhost target is taken as the database name
    {DbName, _, _} = mochiweb_util:partition(TargetPath, "/"),

    %% get target path
    Target = case DbName of
        "_utils" ->
            ...
        _ ->
            VhostTarget ++ Path
    end,

    ?LOG_DEBUG("Vhost Target: '~p'~n", [Target]),

    %% remember the original path so handlers can see what was requested
    Headers = mochiweb_headers:enter("x-couchdb-vhost-path", Path,
        MochiReq:get(headers)),

    %% build a new mochiweb request pointing at the target
    MochiReq1 = mochiweb_request:new(MochiReq:get(socket),
                                      MochiReq:get(method),
                                      Target,
                                      MochiReq:get(version),
                                      Headers),
    %% cleanup forces mochiweb to reparse the raw uri
    MochiReq1:cleanup(),

    MochiReq1.


done.


>
> CouchDB adoption is growing. The network, system, and programming
> responsibilities are becoming different people. It needs to allow everybody
> to do their job.
>
> ## The medium priority question
>
> I have not decided how I feel about "smart" vhosts.
>
> Is it really wrong with 1,000 vhost entries? For sure, *something* must be
> smart to allow managing 1,000 couchapps, however I am not sure if the vhost
> is the best answer.
>
> One possibility is CouchApp. Currently, CouchApp has no idea about rewrites
> and vhosts, `rewrites.json` is plain old data. Maybe CouchApp is the best
> place to set vhosts and rewrites. The .couchapprc is 3 nested JSON objects!
> Is a good place to say, "when you push to this URL, please also configure
> vhosts so fooapp.foodb.mydomain.tld will go to _rewrite"?

I don't think it could work. You don't want to read all dbs to know whether
a host was set.

>
> Another possibility is a dedicated tool to manage CouchDB, perhaps within
> Futon, or perhaps standalone. Besides vhosts, there are many things that are
> becoming difficult to manage.
>
>  * Database _security object
>  * _config
>  * _users (mostly changing passwords but also role management is quite
> basic)
>
> When deploying a CouchApp, these must be synchronized between development
> and production (_users could replicate). In other words, vhost is not the
> only problem when you have 1,000 CouchApps. I think all these problems are
> related: there is no "app management console" yet.

I know of at least one, but it is not public yet.

>
> ## The low priority question:
>
> Finally, as a sysadmin, wonder about the $var -> $var syntax. I do not enjoy
> learning another domain-specific language just to get my job done. What does
> '$' mean? Maybe it is like shell variables? But shell variables do not use
> '$ when assigning, only when expanding. What other differences are there?
> What about underscore? Does $foo_bar match anything, or only
> anything+"_bar"? Ultimately, I must look in the documentation to make sure.

>
> There is already a syntax for this included in Erlang: regular expressions.
>
> (.*).(.*).mydomain.tld -> /$2/_design/$1/_rewrite
>
> Finally, I am happy to get new changes, just trying to figure out what is
> best for everybody.

Regexps are complicated, and regexps are slow (especially in Erlang). Here
we only do pattern matching. There is no complexity. The host is split
on dots ('.'). If a part between dots starts with a $, it is treated as
a variable. Wildcards are there as a convenience too. We could likely
use a list, but I think users prefer to just set a variable name
rather than having to split all the domain names themselves. If the
consensus is otherwise, that's fine; we could just use a list then.
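
The matching described here can be sketched in a few lines. The following Python model is an illustration under stated assumptions, not the Erlang code in trunk: it assumes a pattern and a host match only when they have the same number of labels, that `$name` binds one label, and that `*` accepts any single label.

```python
def match_host(pattern, host):
    """Match `host` against a dotted pattern, label by label.

    Returns a dict of variable bindings, or None when there is no match.
    """
    pattern_labels = pattern.split(".")
    host_labels = host.split(".")
    if len(pattern_labels) != len(host_labels):
        return None
    bindings = {}
    for expected, actual in zip(pattern_labels, host_labels):
        if expected.startswith("$"):
            bindings[expected[1:]] = actual      # variable: bind the label
        elif expected == "*":
            continue                             # wildcard: accept anything
        elif expected != actual:
            return None                          # literal: must be equal
    return bindings
```

No backtracking or compilation step is involved: each label is compared once, which is the simplicity argument being made.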

Re: Vhosting Requirements (was: Re: [jira] Commented: (COUCHDB-230) Add Support for Rewritable URL)

Posted by Benoit Chesneau <bc...@gmail.com>.
> '$' mean? Maybe it is like shell variables?

For you sysadmins who know only shell and figure that "$" is only used
in shell ;) this issue is now gone in the last commit. ":" is now used,
like in the _rewrite handler.

As for jchris's "issue", I think we should answer potential
problems in the ticket.

- benoit

Re: Vhosting Requirements (was: Re: [jira] Commented: (COUCHDB-230) Add Support for Rewritable URL)

Posted by Benoit Chesneau <bc...@gmail.com>.
On Thu, Aug 19, 2010 at 10:31 PM, J Chris Anderson <jc...@apache.org> wrote:
>
> On Aug 19, 2010, at 1:25 PM, Jason Smith wrote:
>
>> On Fri, Aug 20, 2010 at 02:28, J Chris Anderson <jc...@apache.org> wrote:
>>
>>> KEY POINT: Any CouchApp which is designed to require vhosts is
>>> automatically not capable of running on localhost.
>>>
>>> Until we solve this issue I'm not much interested in refining the existing
>>> vhost stuff.
>>>
>>
>> Many apologies but I'm having trouble following. Why can't a couchapp run on
>> localhost? Already you can vhost localhost:5984 to your app's rewriter. Is
>> the problem because that locks you out of Futon? (localhost.localdomain:5984
>> still work, as does machine_name.local on OSX or Ubuntu, or if you've
>> installed Avahi or Bonjour)
>>
>
> I'm not that fancy.
>
> My point is that if you have an app that requires a vhost to work, then you have to do some machine level configuration to get more than one (or maybe 2) vhosts, from a standard issue Mac or Windows box. You can't ask grandma to do that.
>
> If the vhost directive allowed matching on parts of the path, then you could have */foobar = /foo/_design/bar/_rewrite and then the user could just visit localhost:5984/foobar and have it work.
>
> I described this in more detail here https://issues.apache.org/jira/browse/COUCHDB-230
>
> Because all the configuration happens in the couch, the couchapp can self configure. As soon as you start requiring people who don't know what a URL is to edit /etc/hosts, you're hosed.
>
> Chris
>
>
updated the ticket.

Re: Vhosting Requirements (was: Re: [jira] Commented: (COUCHDB-230) Add Support for Rewritable URL)

Posted by Paul Davis <pa...@gmail.com>.
On Thu, Aug 19, 2010 at 4:57 PM, Jason Smith <jh...@couch.io> wrote:
> On Fri, Aug 20, 2010 at 03:31, J Chris Anderson <jc...@apache.org> wrote:
>
>> My point is that if you have an app that requires a vhost to work, then you
>> have to do some machine level configuration to get more than one (or maybe
>> 2) vhosts, from a standard issue Mac or Windows box. You can't ask grandma
>> to do that.
>>
>
> I'm not clear on "requires." You can still go to /db/_design/app/_rewrite/
> without vhost rules.
>
>
>> If the vhost directive allowed matching on parts of the path, then you
>> could have */foobar = /foo/_design/bar/_rewrite and then the user could just
>> visit localhost:5984/foobar and have it work.
>>
>
> That's a cool feature. But it has ramifications for reverse proxies. If the
> proxy gets a request for example.com/foobar it will not easily know where to
> send it because every couch on the back end has the same vhost setting:
> */foobar. In other words, you could no longer use the union of all
> _config/vhosts as a registry for all domains to serve. I brought up
> transactional _bulk_inserts because the conclusion was, if it works in any
> situation, it must work in all situations. But yes, the situations are
> different.
>
> Since the path queried is in the HTTP request, reverse proxies have no
> problem. Unless you are rewriting. A rewriter can change that path, which
> could trigger a different vhost setting, or could it? And if so, is there
> impact on secure_rewrites?
>
> vhost: */sofa = /blog/_design/sofa/_rewrite
> rewrite: {from: "pages/*", to:"../../../pages/*"}
> vhost: */pages = /pages/_design/pages/_rewrite
>
> This is contrived, I'm just working this out. But is the expectation here
> that /sofa/pages/foo would return the "foo" wiki page or 404?
>
> --
> Jason Smith
> Couchio Hosting
>

I think this is starting to muddy the waters between vhost matching
and path rewriting. AFAIK, a rewrite can't rewrite itself to a
different domain, which is good. It could, as you point out, rewrite to
match another rewrite pattern, but from the code I skimmed earlier I
think we only allow a single rewrite before we hit the internal
handling. I.e., it could conceivably be made recursive to allow
rewriting rewritten URLs, but that's going to get into possible abuse
land.

Re: Vhosting Requirements (was: Re: [jira] Commented: (COUCHDB-230) Add Support for Rewritable URL)

Posted by Jason Smith <jh...@couch.io>.
On Fri, Aug 20, 2010 at 03:31, J Chris Anderson <jc...@apache.org> wrote:

> My point is that if you have an app that requires a vhost to work, then you
> have to do some machine level configuration to get more than one (or maybe
> 2) vhosts, from a standard issue Mac or Windows box. You can't ask grandma
> to do that.
>

I'm not clear on "requires." You can still go to /db/_design/app/_rewrite/
without vhost rules.


> If the vhost directive allowed matching on parts of the path, then you
> could have */foobar = /foo/_design/bar/_rewrite and then the user could just
> visit localhost:5984/foobar and have it work.
>

That's a cool feature. But it has ramifications for reverse proxies. If the
proxy gets a request for example.com/foobar it will not easily know where to
send it because every couch on the back end has the same vhost setting:
*/foobar. In other words, you could no longer use the union of all
_config/vhosts as a registry for all domains to serve. I brought up
transactional _bulk_inserts because the conclusion was, if it works in any
situation, it must work in all situations. But yes, the situations are
different.

Since the path queried is in the HTTP request, reverse proxies have no
problem. Unless you are rewriting. A rewriter can change that path, which
could trigger a different vhost setting, or could it? And if so, is there
impact on secure_rewrites?

vhost: */sofa = /blog/_design/sofa/_rewrite
rewrite: {from: "pages/*", to:"../../../pages/*"}
vhost: */pages = /pages/_design/pages/_rewrite

This is contrived, I'm just working this out. But is the expectation here
that /sofa/pages/foo would return the "foo" wiki page or 404?

-- 
Jason Smith
Couchio Hosting

Re: Vhosting Requirements (was: Re: [jira] Commented: (COUCHDB-230) Add Support for Rewritable URL)

Posted by J Chris Anderson <jc...@apache.org>.
On Aug 19, 2010, at 1:25 PM, Jason Smith wrote:

> On Fri, Aug 20, 2010 at 02:28, J Chris Anderson <jc...@apache.org> wrote:
> 
>> KEY POINT: Any CouchApp which is designed to require vhosts is
>> automatically not capable of running on localhost.
>> 
>> Until we solve this issue I'm not much interested in refining the existing
>> vhost stuff.
>> 
> 
> Many apologies but I'm having trouble following. Why can't a couchapp run on
> localhost? Already you can vhost localhost:5984 to your app's rewriter. Is
> the problem because that locks you out of Futon? (localhost.localdomain:5984
> still work, as does machine_name.local on OSX or Ubuntu, or if you've
> installed Avahi or Bonjour)
> 

I'm not that fancy.

My point is that if you have an app that requires a vhost to work, then you have to do some machine level configuration to get more than one (or maybe 2) vhosts, from a standard issue Mac or Windows box. You can't ask grandma to do that.

If the vhost directive allowed matching on parts of the path, then you could have */foobar = /foo/_design/bar/_rewrite and then the user could just visit localhost:5984/foobar and have it work.

I described this in more detail here https://issues.apache.org/jira/browse/COUCHDB-230

Because all the configuration happens in the couch, the couchapp can self configure. As soon as you start requiring people who don't know what a URL is to edit /etc/hosts, you're hosed.
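
A minimal sketch of the path-matching vhost idea described above, assuming a rule of the form "host_pattern/first_segment = target". The function name, the rule format, and the matching behaviour are illustrative assumptions, not CouchDB's actual implementation.

```python
# Hypothetical sketch: resolve a path-matching vhost rule such as
#   */foobar = /foo/_design/bar/_rewrite
# The rule format and matching logic are assumptions for illustration.

def resolve_path_vhost(rules, host, path):
    """Return the rewritten target path, or None if no rule matches.

    `rules` maps "host_pattern/first_segment" to a target prefix.
    A host pattern of "*" matches any host.
    """
    segments = [s for s in path.split("/") if s]
    if not segments:
        return None
    first, rest = segments[0], segments[1:]
    for pattern, target in rules.items():
        host_pattern, _, prefix = pattern.partition("/")
        if host_pattern not in ("*", host):
            continue  # rule is bound to a different host
        if prefix == first:
            # Keep whatever followed the matched segment.
            return "/".join([target.rstrip("/")] + rest)
    return None


rules = {"*/foobar": "/foo/_design/bar/_rewrite"}
print(resolve_path_vhost(rules, "localhost:5984", "/foobar"))
print(resolve_path_vhost(rules, "localhost:5984", "/foobar/posts/1"))
print(resolve_path_vhost(rules, "localhost:5984", "/other"))
```

Because the rule keys on the path rather than the Host header, it would work on a plain localhost:5984 with no changes to /etc/hosts, which is the point being made.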

Chris



> -- 
> Jason Smith
> Couchio Hosting


Re: Vhosting Requirements (was: Re: [jira] Commented: (COUCHDB-230) Add Support for Rewritable URL)

Posted by Jason Smith <jh...@couch.io>.
On Fri, Aug 20, 2010 at 02:28, J Chris Anderson <jc...@apache.org> wrote:

> KEY POINT: Any CouchApp which is designed to require vhosts is
> automatically not capable of running on localhost.
>
> Until we solve this issue I'm not much interested in refining the existing
> vhost stuff.
>

Many apologies but I'm having trouble following. Why can't a couchapp run on
localhost? Already you can vhost localhost:5984 to your app's rewriter. Is
the problem because that locks you out of Futon? (localhost.localdomain:5984
still works, as does machine_name.local on OSX or Ubuntu, or if you've
installed Avahi or Bonjour)

-- 
Jason Smith
Couchio Hosting

Re: Vhosting Requirements (was: Re: [jira] Commented: (COUCHDB-230) Add Support for Rewritable URL)

Posted by Benoit Chesneau <bc...@gmail.com>.
On Thu, Aug 19, 2010 at 10:20 PM, Benoit Chesneau <bc...@gmail.com> wrote:
> On Thu, Aug 19, 2010 at 9:28 PM, J Chris Anderson <jc...@apache.org> wrote:
>
>> My top concern with all of this vhost stuff is much more basic, and I think needs to be addressed before we think about adding convenience features:
>>
>> On localhost, there is no such thing as a Domain Name (unless you are the type to hack your /etc/hosts, which is like 0.01% of people).
>>
>> KEY POINT: Any CouchApp which is designed to require vhosts is automatically not capable of running on localhost.
>>
>> Until we solve this issue I'm not much interested in refining the existing vhost stuff.
>>
>
> Where is the issue ? :)
>
Can you give an example of a "CouchApp which is designed to require
vhosts"? That isn't really needed today, even when using _rewrite.

- benoit

Re: Vhosting Requirements (was: Re: [jira] Commented: (COUCHDB-230) Add Support for Rewritable URL)

Posted by Benoit Chesneau <bc...@gmail.com>.
On Thu, Aug 19, 2010 at 9:28 PM, J Chris Anderson <jc...@apache.org> wrote:

> My top concern with all of this vhost stuff is much more basic, and I think needs to be addressed before we think about adding convenience features:
>
> On localhost, there is no such thing as a Domain Name (unless you are the type to hack your /etc/hosts, which is like 0.01% of people).
>
> KEY POINT: Any CouchApp which is designed to require vhosts is automatically not capable of running on localhost.
>
> Until we solve this issue I'm not much interested in refining the existing vhost stuff.
>

Where is the issue? :)

Re: Vhosting Requirements (was: Re: [jira] Commented: (COUCHDB-230) Add Support for Rewritable URL)

Posted by J Chris Anderson <jc...@apache.org>.
I'll admit I'm a little lost here too. What is the change to $foo supposed to allow for, and what is the downside to having it?

I think the example:

$app.$db.example.com = /$db/_design/$app/_rewrite

is actually pretty compelling *as a convenience feature.*

However, Jason is arguing that the convenience is putting some smarts in a place that would be better off dumb. This is the same argument that the Google/Verizon stuff is about: should the middleware part of the stack be smart or dumb? So far, dumb has been the clear winner in the general case, even when it is a pain in the specific case. We should reflect on that before we jump to making this change.

My top concern with all of this vhost stuff is much more basic, and I think needs to be addressed before we think about adding convenience features:

On localhost, there is no such thing as a Domain Name (unless you are the type to hack your /etc/hosts, which is like 0.01% of people).

KEY POINT: Any CouchApp which is designed to require vhosts is automatically not capable of running on localhost.

Until we solve this issue I'm not much interested in refining the existing vhost stuff.

Chris

On Aug 19, 2010, at 11:43 AM, Paul Davis wrote:

> On Thu, Aug 19, 2010 at 10:59 AM, Jason Smith <jh...@couch.io> wrote:
>> On Thu, Aug 19, 2010 at 21:27, Benoit Chesneau <bc...@gmail.com> wrote:
>> 
>>> Could you explain me how it's impossible compared to previous
>>> behaviour ? It doesn't change anything technically. Please post all
>>> your concern and a way to reproduce , I will have a look on it. Though
>>> here hosting > 50 couch - trunk behind couchdbproxy works. tested
>>> yesterday.
>>> 
>> 
>> ## The top priority question:
>> 
>> How can proxies managed by the sysadmin or network admin know what to do?
>> 
>> Now, vhost is explicit. Anybody with permission can query /_config/vhosts
>> for all couches.
>> 
>> With wildcards, e.g. "*/blog", it is impossible to know all domains which
>> the web server handles. When a query for  fooapp.foodb.mydomain.tld arrives,
>> which couch should handle it?
>> 
>> CouchDB adoption is growing. The network, system, and programming
>> responsibilities are becoming different people. It needs to allow everybody
>> to do their job.
>> 
> 
> Can you describe this in more detail? I don't think I understand your
> concerns very well. I'm not familiar with hosting setups so maybe I'm
> just missing something obvious. I just can't figure out  why a network
> administrator would need to reverse engineer the vhost settings.
> 
> As to the bike shedding on syntax I can only say that the non-regexp
> syntax looked fine to me. Though I understand the complaint about
> inventing syntax, instead of jumping for regexp's I would probably
> take a look at WebMachine's dispatcher mechanism as it reuses Erlang
> which I always found quite nifty.
> 
> And a side point on the regexp syntax you posted:
> 
>    (.*).(.*).mydomain.tld -> /$2/_design/$1/_rewrite
> 
> This is a pretty good example of why regexps really aren't such a hot
> idea. I'll give 10 internets to the first person that figures out how
> that pattern matches this domain:
> 
>    blog.davisp.mydomain.tld
> 
> One hint is that it wouldn't rewrite to /davisp/_design/blog/_rewrite.
> 
> And because its always funny:
> 
> Some people, when confronted with a problem, think
> “I know, I'll use regular expressions.”   Now they have two problems.
> - Jamie Zawinski
> 
> 
> HTH,
> Paul Davis


Re: Vhosting Requirements (was: Re: [jira] Commented: (COUCHDB-230) Add Support for Rewritable URL)

Posted by Benoit Chesneau <bc...@gmail.com>.
On Fri, Aug 20, 2010 at 8:55 AM, Benoit Chesneau <bc...@gmail.com> wrote:
>
> Now I reckon /db/_design/ddoc/_rewrite/ is a little ugly and that
> would be good to shorten this path. Maybe by using the ini like
> suggest jchris I think . Which would be the easiest actually, though
> I'm not fan of auto declaration which may cause a security issue
> (admin that forgot to logout , user replicating a non trusted
> couchapp, ..). Also there will be a limit to the number of apps
> possible. I've updated the ticket about these questions.
>
> - benoit
>

I posted a patch that rewrites the path via the ini conf, but it raises a new question:

https://issues.apache.org/jira/browse/COUCHDB-230?focusedCommentId=12900639&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12900639

patch :

https://issues.apache.org/jira/secure/attachment/12452619/0001-manage-aliases.patch

- benoît

Re: Vhosting Requirements (was: Re: [jira] Commented: (COUCHDB-230) Add Support for Rewritable URL)

Posted by Benoit Chesneau <bc...@gmail.com>.
On Thu, Aug 19, 2010 at 11:12 PM, Jason Smith <jh...@couch.io> wrote:
> On Fri, Aug 20, 2010 at 04:02, Paul Davis <pa...@gmail.com> wrote:
>
>> Still confused. A couchapp developer shouldn't require any sort of
>> configuration because that's not under their control. For a couchapp
>> to be couchappy, its going to be barred from *requiring* such
>> configuration or it'll never work on the wide number of clients that
>> would be expected to host them as Chris points out.
>>
>
> Indeed. However I think in practice, couch app developers demand that the
> URL looks reasonable. Benoit mentioned that vhost and _rewrite are already
> becoming very popular with developers.

Well, the two aren't incompatible. Today you can already write a CouchApp
that adapts its routing depending on whether it's behind a vhost or not.
There are 2 possibilities today:

1) Use the rewriter and fix a db path:

/db -> ../..
/db/* -> ../../*

and then use this path as default db path in your app. It will always
work if you run your app behind _rewrite.

2) Detect whether the path has been rewritten by comparing
req.requested_path to req.path in show, list, and update functions.
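
The second possibility can be sketched as follows. This is an illustration in Python rather than a real show function, and it assumes the request object is a plain dict whose "path" and "requested_path" fields are lists of path segments; that shape, and the helper name, are assumptions.

```python
# Hypothetical sketch: decide whether a request went through the rewriter
# by comparing the requested path with the path CouchDB finally handled.
# Modelling `req` as a dict of segment lists is an assumption.

def was_rewritten(req):
    """True if the client-visible path differs from the handled path."""
    requested = req.get("requested_path")
    return requested is not None and requested != req["path"]


handled = ["blog", "_design", "sofa", "_show", "post", "hello"]

direct = {"path": handled, "requested_path": list(handled)}
via_rewrite = {"path": handled, "requested_path": ["post", "hello"]}

print(was_rewritten(direct))       # False
print(was_rewritten(via_rewrite))  # True
```

An app could use a check like this to choose between short, rewritten links and full /db/_design/ddoc/... links when generating URLs.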

Now I reckon /db/_design/ddoc/_rewrite/ is a little ugly and that it
would be good to shorten this path. Maybe by using the ini, as
jchris suggested, I think. That would actually be the easiest, though
I'm not a fan of auto declaration, which may cause a security issue
(an admin who forgot to log out, a user replicating an untrusted
couchapp, ...). Also there will be a limit to the number of apps
possible. I've updated the ticket about these questions.

- benoit

Re: Vhosting Requirements (was: Re: [jira] Commented: (COUCHDB-230) Add Support for Rewritable URL)

Posted by Jason Smith <jh...@couch.io>.
On Fri, Aug 20, 2010 at 04:02, Paul Davis <pa...@gmail.com> wrote:

> Still confused. A couchapp developer shouldn't require any sort of
> configuration because that's not under their control. For a couchapp
> to be couchappy, its going to be barred from *requiring* such
> configuration or it'll never work on the wide number of clients that
> would be expected to host them as Chris points out.
>

Indeed. However I think in practice, couch app developers demand that the
URL looks reasonable. Benoit mentioned that vhost and _rewrite are already
becoming very popular with developers.

> For my curiosity, what happens when a vhost setting is wrong? Ie, Bob
> set a vhost to be blog.gene.com which should really go to Gene's
> couch? Or when there's no vhost set at all? Or when a request comes
> into a vhost that's unknown?
>

There is a pattern to handle that. Bob has a canonical address,
bob.service.com, assigned at signup.

Bob can add whatever vhost he wants. However service.com will only honor
those for which a DNS query returns a CNAME back to bob.service.com. Since
Bob cannot modify DNS records in gene.com, he cannot steal blog.gene.com.
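
A minimal sketch of that CNAME check, assuming the DNS lookup is supplied as a function that returns the CNAME target for a host (or None). No real DNS library is used here; the function name and the extra hostnames are hypothetical.

```python
# Hypothetical sketch: honor a user's vhost only if its CNAME points back
# to the canonical address assigned at signup. `lookup_cname` is an
# injected stand-in for a real DNS query.

def honored_vhosts(requested, canonical, lookup_cname):
    """Return the subset of `requested` hosts whose CNAME is `canonical`."""
    honored = []
    for host in requested:
        target = lookup_cname(host)
        # DNS answers may carry a trailing dot and mixed case.
        if target is not None and target.rstrip(".").lower() == canonical.lower():
            honored.append(host)
    return honored


# Fake DNS data for illustration only.
fake_dns = {
    "blog.bob.org": "bob.service.com.",
    "blog.gene.com": "gene.service.com.",
}

print(honored_vhosts(
    ["blog.bob.org", "blog.gene.com", "nodns.example"],
    "bob.service.com",
    fake_dns.get,
))  # ['blog.bob.org']
```

Bob's claim on blog.gene.com is dropped because he does not control that zone and so cannot make its CNAME point at his canonical address.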

-- 
Jason Smith
Couchio Hosting

Re: Vhosting Requirements (was: Re: [jira] Commented: (COUCHDB-230) Add Support for Rewritable URL)

Posted by Paul Davis <pa...@gmail.com>.
On Thu, Aug 19, 2010 at 4:21 PM, Jason Smith <jh...@couch.io> wrote:
> On Fri, Aug 20, 2010 at 01:43, Paul Davis <pa...@gmail.com> wrote:
>
>> > CouchDB adoption is growing. The network, system, and programming
>> > responsibilities are becoming different people. It needs to allow
>> everybody
>> > to do their job.
>> >
>>
>> Can you describe this in more detail? I don't think I understand your
>> concerns very well. I'm not familiar with hosting setups so maybe I'm
>> just missing something obvious. I just can't figure out  why a network
>> administrator would need to reverse engineer the vhost settings.
>>
>
> Obviously I think a lot about CouchDB deployments where the couchapp
> developer is neither the sysadmin responsible for running the couchdb
> program, nor the network admin responsible for moving the bytes around. For
> example, I wrote the config whitelist feature because the, let's say, DBA
> should not have the ability to change the couchdb listen address or port in
> some operational situations.
>
> In a common reverse-proxy situation, you get a request and you look at the
> Host header and *maybe* the path and you decide where to route that query
> to.
>
> Well now with DBAs having a big old time with vhosts, the reverse proxy
> needs to know what vhosts have been configured so when it encounters "Host:
> blog.example.com" it can say, "Oh right, blog.example.com was vhosted in
> Bob's CouchDB server, so I will route it there."
>
> The only way I think, let's call it the "CouchDB hosting community" knows
> how to do that is to query _config/vhosts for all the couches and populate
> the routing tables with those results.
>
> So I am not rejecting Benoit's idea, just saying that it will add burden to
> people who maintain reverse proxies in front of CouchDB.
>

Still confused. A couchapp developer shouldn't require any sort of
configuration because that's not under their control. For a couchapp
to be couchappy, it's going to be barred from *requiring* such
configuration or it'll never work on the wide number of clients that
would be expected to host them, as Chris points out.

As to the network admin side of things I'm confused why they would
ever trust routing tables to something they had no control over. I'd
think that they'd either treat them as the port and address settings
as specified by operations, or they'd just ignore them and route based
on OOB information.

For my curiosity, what happens when a vhost setting is wrong? I.e., Bob
set a vhost to be blog.gene.com which should really go to Gene's
couch? Or when there's no vhost set at all? Or when a request comes
into a vhost that's unknown?

>
>>
>> As to the bike shedding on syntax I can only say that the non-regexp
>> syntax looked fine to me. Though I understand the complaint about
>> inventing syntax, instead of jumping for regexp's I would probably
>> take a look at WebMachine's dispatcher mechanism as it reuses Erlang
>> which I always found quite nifty.
>>
>> And a side point on the regexp syntax you posted:
>>
>>    (.*).(.*).mydomain.tld -> /$2/_design/$1/_rewrite
>>
>
>> This is a pretty good example of why regexps really aren't such a hot
>> idea. I'll give 10 internets to the first person that figures out how
>> that pattern matches this domain:
>>
>
> I did simplify it for rhetorical purposes. I knew the dots didn't match
> themselves but I usually try to get away with the "greedy" matching before
> falling back to un-greedy.
>
> Of course, a hypothetical regex implementation would use Erlang's re module
> which is reasonably bug-free. As a sysadmin that gives me comfort. And when
> I discover a bug in my regular expression, there are countless resources.
>
> On the other hand, I am identifying the frustration that programmers and
> sysadmins get when they encounter an obviously new and unknown syntax.
>
> In other words: the principle of least surprise. CouchDB is HTTP for that
> reason. CouchDB is REST for that reason. View indexes are the same as
> databases for that reason. Applications are also documents for that reason.
> Replication is just another client for that reason.
>
> (However in an offline email I made it clear that I will happily defer to
> the implementation which exists!)
>
>> And because its always funny:
>>
>> Some people, when confronted with a problem, think
>> “I know, I'll use regular expressions.”   Now they have two problems.
>> - Jamie Zawinski
>>
>
> Very droll. But the point is given too much credit due to its clever
> presentation. My first day at Couchio I heard Mikeal say, "the problem with
> domain-specific languages is now everybody thinks he's a fucking language
> designer."
>
> --
> Jason Smith
> Couchio Hosting
>

Re: Vhosting Requirements (was: Re: [jira] Commented: (COUCHDB-230) Add Support for Rewritable URL)

Posted by Jason Smith <jh...@couch.io>.
On Fri, Aug 20, 2010 at 01:43, Paul Davis <pa...@gmail.com> wrote:

> > CouchDB adoption is growing. The network, system, and programming
> > responsibilities are becoming different people. It needs to allow
> everybody
> > to do their job.
> >
>
> Can you describe this in more detail? I don't think I understand your
> concerns very well. I'm not familiar with hosting setups so maybe I'm
> just missing something obvious. I just can't figure out  why a network
> administrator would need to reverse engineer the vhost settings.
>

Obviously I think a lot about CouchDB deployments where the couchapp
developer is neither the sysadmin responsible for running the couchdb
program, nor the network admin responsible for moving the bytes around. For
example, I wrote the config whitelist feature because the, let's say, DBA
should not have the ability to change the couchdb listen address or port in
some operational situations.

In a common reverse-proxy situation, you get a request and you look at the
Host header and *maybe* the path and you decide where to route that query
to.

Well now with DBAs having a big old time with vhosts, the reverse proxy
needs to know what vhosts have been configured so when it encounters "Host:
blog.example.com" it can say, "Oh right, blog.example.com was vhosted in
Bob's CouchDB server, so I will route it there."

The only way that, let's call it, the "CouchDB hosting community" knows
how to do that, I think, is to query _config/vhosts for all the couches and
populate the routing tables with those results.
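
A minimal sketch of that registry approach, assuming the _config/vhosts contents of each couch have already been fetched into plain dicts. The couch names and the function are hypothetical. It also shows why a rule like */foobar cannot serve as a routing key once several back ends define it.

```python
# Hypothetical sketch: build a reverse-proxy routing table from the union
# of every couch's _config/vhosts. Input is already-fetched data; no HTTP
# calls are made here.

def build_routing_table(vhosts_by_couch):
    """Map each vhost key to exactly one couch; report ambiguous keys."""
    table = {}
    conflicts = {}
    for couch, vhosts in vhosts_by_couch.items():
        for host in vhosts:
            if host in table and table[host] != couch:
                # Same key claimed by more than one back end.
                conflicts.setdefault(host, {table[host]}).add(couch)
            else:
                table[host] = couch
    for host in conflicts:
        table.pop(host, None)  # an ambiguous key cannot be routed
    return table, conflicts


vhosts_by_couch = {
    "bob-couch:5984": {
        "blog.example.com": "/blog/_design/sofa/_rewrite",
        "*/foobar": "/foo/_design/bar/_rewrite",
    },
    "gene-couch:5984": {
        "wiki.example.com": "/pages/_design/pages/_rewrite",
        "*/foobar": "/foo/_design/bar/_rewrite",
    },
}

table, conflicts = build_routing_table(vhosts_by_couch)
print(table)
print(sorted(conflicts))  # ['*/foobar']
```

Explicit hostnames route cleanly, while the wildcard path rule ends up in the conflict set because every couch declares it.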

So I am not rejecting Benoit's idea, just saying that it will add burden to
people who maintain reverse proxies in front of CouchDB.


>
> As to the bike shedding on syntax I can only say that the non-regexp
> syntax looked fine to me. Though I understand the complaint about
> inventing syntax, instead of jumping for regexp's I would probably
> take a look at WebMachine's dispatcher mechanism as it reuses Erlang
> which I always found quite nifty.
>
> And a side point on the regexp syntax you posted:
>
>    (.*).(.*).mydomain.tld -> /$2/_design/$1/_rewrite
>

> This is a pretty good example of why regexps really aren't such a hot
> idea. I'll give 10 internets to the first person that figures out how
> that pattern matches this domain:
>

I did simplify it for rhetorical purposes. I knew the dots didn't match
themselves but I usually try to get away with the "greedy" matching before
falling back to un-greedy.

Of course, a hypothetical regex implementation would use Erlang's re module
which is reasonably bug-free. As a sysadmin that gives me comfort. And when
I discover a bug in my regular expression, there are countless resources.

On the other hand, I am identifying the frustration that programmers and
sysadmins get when they encounter an obviously new and unknown syntax.

In other words: the principle of least surprise. CouchDB is HTTP for that
reason. CouchDB is REST for that reason. View indexes are the same as
databases for that reason. Applications are also documents for that reason.
Replication is just another client for that reason.

(However in an offline email I made it clear that I will happily defer to
the implementation which exists!)

> And because its always funny:
>
> Some people, when confronted with a problem, think
> “I know, I'll use regular expressions.”   Now they have two problems.
> - Jamie Zawinski
>

Very droll. But the point is given too much credit due to its clever
presentation. My first day at Couchio I heard Mikeal say, "the problem with
domain-specific languages is now everybody thinks he's a fucking language
designer."

-- 
Jason Smith
Couchio Hosting

Re: Vhosting Requirements (was: Re: [jira] Commented: (COUCHDB-230) Add Support for Rewritable URL)

Posted by Paul Davis <pa...@gmail.com>.
On Thu, Aug 19, 2010 at 10:59 AM, Jason Smith <jh...@couch.io> wrote:
> On Thu, Aug 19, 2010 at 21:27, Benoit Chesneau <bc...@gmail.com> wrote:
>
>> Could you explain me how it's impossible compared to previous
>> behaviour ? It doesn't change anything technically. Please post all
>> your concern and a way to reproduce , I will have a look on it. Though
>> here hosting > 50 couch - trunk behind couchdbproxy works. tested
>> yesterday.
>>
>
> ## The top priority question:
>
> How can proxies managed by the sysadmin or network admin know what to do?
>
> Now, vhost is explicit. Anybody with permission can query /_config/vhosts
> for all couches.
>
> With wildcards, e.g. "*/blog", it is impossible to know all domains which
> the web server handles. When a query for  fooapp.foodb.mydomain.tld arrives,
> which couch should handle it?
>
> CouchDB adoption is growing. The network, system, and programming
> responsibilities are becoming different people. It needs to allow everybody
> to do their job.
>

Can you describe this in more detail? I don't think I understand your
concerns very well. I'm not familiar with hosting setups so maybe I'm
just missing something obvious. I just can't figure out why a network
administrator would need to reverse engineer the vhost settings.

As to the bike shedding on syntax I can only say that the non-regexp
syntax looked fine to me. Though I understand the complaint about
inventing syntax, instead of jumping for regexps I would probably
take a look at WebMachine's dispatcher mechanism as it reuses Erlang,
which I always found quite nifty.

And a side point on the regexp syntax you posted:

    (.*).(.*).mydomain.tld -> /$2/_design/$1/_rewrite

This is a pretty good example of why regexps really aren't such a hot
idea. I'll give 10 internets to the first person that figures out how
that pattern matches this domain:

    blog.davisp.mydomain.tld

One hint is that it wouldn't rewrite to /davisp/_design/blog/_rewrite.
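
The puzzle above can be checked mechanically. This is a minimal sketch using Python's `re` module, on the assumption that it behaves like Erlang's `re` for a pattern this simple (greedy quantifiers, an unescaped `.` matching any character). The stricter pattern is only an illustration and was not proposed in the thread.

```python
import re

host = "blog.davisp.mydomain.tld"

# The pattern as posted: dots are unescaped and (.*) is greedy.
naive = re.compile(r"(.*).(.*).mydomain.tld")
print(naive.fullmatch(host).groups())   # ('blog.davis', '')

# A stricter variant: escape the dots and forbid dots inside a label.
strict = re.compile(r"([^.]+)\.([^.]+)\.mydomain\.tld")
print(strict.fullmatch(host).groups())  # ('blog', 'davisp')
```

With the posted pattern, the first group greedily swallows "blog.davis", the unescaped dot matches the "p", and the second group is left empty, so the rewrite target would not be /davisp/_design/blog/_rewrite.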

And because it's always funny:

Some people, when confronted with a problem, think
“I know, I'll use regular expressions.”   Now they have two problems.
- Jamie Zawinski


HTH,
Paul Davis

Re: Vhosting Requirements (was: Re: [jira] Commented: (COUCHDB-230) Add Support for Rewritable URL)

Posted by Jason Smith <jh...@couch.io>.
On Thu, Aug 19, 2010 at 21:27, Benoit Chesneau <bc...@gmail.com> wrote:

> Could you explain me how it's impossible compared to previous
> behaviour ? It doesn't change anything technically. Please post all
> your concern and a way to reproduce , I will have a look on it. Though
> here hosting > 50 couch - trunk behind couchdbproxy works. tested
> yesterday.
>

## The top priority question:

How can proxies managed by the sysadmin or network admin know what to do?

Now, vhost is explicit. Anybody with permission can query /_config/vhosts
for all couches.

With wildcards, e.g. "*/blog", it is impossible to know all domains which
the web server handles. When a query for  fooapp.foodb.mydomain.tld arrives,
which couch should handle it?

CouchDB adoption is growing. The network, system, and programming
responsibilities are falling to different people. CouchDB needs to allow
everybody to do their job.

## The medium priority question

I have not decided how I feel about "smart" vhosts.

Is it really wrong to have 1,000 vhost entries? For sure, *something* must be
smart to allow managing 1,000 couchapps, however I am not sure if the vhost
is the best answer.

One possibility is CouchApp. Currently, CouchApp has no idea about rewrites
and vhosts, `rewrites.json` is plain old data. Maybe CouchApp is the best
place to set vhosts and rewrites. The .couchapprc is 3 nested JSON objects!
Is it a good place to say, "when you push to this URL, please also configure
vhosts so fooapp.foodb.mydomain.tld will go to _rewrite"?

Another possibility is a dedicated tool to manage CouchDB, perhaps within
Futon, or perhaps standalone. Besides vhosts, there are many things that are
becoming difficult to manage.

 * Database _security object
 * _config
 * _users (mostly changing passwords but also role management is quite
   basic)

When deploying a CouchApp, these must be synchronized between development
and production (_users could replicate). In other words, vhost is not the
only problem when you have 1,000 CouchApps. I think all these problems are
related: there is no "app management console" yet.

## The low priority question:

Finally, as a sysadmin, I wonder about the $var -> $var syntax. I do not enjoy
learning another domain-specific language just to get my job done. What does
'$' mean? Maybe it is like shell variables? But shell variables do not use
'$' when assigning, only when expanding. What other differences are there?
What about underscore? Does $foo_bar match anything, or only
anything+"_bar"? Ultimately, I must look in the documentation to make sure.

There is already a syntax for this included in Erlang: regular expressions.

(.*).(.*).mydomain.tld -> /$2/_design/$1/_rewrite

Finally, I am happy to get new changes, just trying to figure out what is
best for everybody.
-- 
Jason Smith
Couchio Hosting

Re: Vhosting Requirements (was: Re: [jira] Commented: (COUCHDB-230) Add Support for Rewritable URL)

Posted by Benoit Chesneau <bc...@gmail.com>.
On Thu, Aug 19, 2010 at 4:27 PM, Benoit Chesneau <bc...@gmail.com> wrote:
> On Thu, Aug 19, 2010 at 4:22 PM, Jason Smith <jh...@couch.io> wrote:
>> On Thu, Aug 19, 2010 at 21:19, Benoit Chesneau <bc...@gmail.com> wrote:
>>
>>> On Thu, Aug 19, 2010 at 2:30 PM, Jason Smith <jh...@couch.io> wrote:
>>>
>>> > Woah! Can we all please take a step back and talk about what problem this
>>> > solves?
>>> Not everyone want to use a proxy to do smart vhosting. Not everyone do
>>> mass hosting. Some people just want to host their couchdb on port 80
>>> like some tshirt said.
>>>
>>
>> Yes, however these changes make it impossible to move a couch behind a
>> proxy. That would be unfortunate.
>>
> Could you explain me how it's impossible compared to previous
> behaviour ? It doesn't change anything technically. Please post all
> your concern and a way to reproduce , I will have a look on it. Though
> here hosting > 50 couch - trunk behind couchdbproxy works. tested
> yesterday.
>
>
> If you have specific concern like forbid a vhost (furon.couchone.com
> for ex-) you could use your own redirect vhost function it's just 4-6
> lines of code.
>
> - benoit
>
> - benoit.
>
s/furon/futon ...

Re: Vhosting Requirements (was: Re: [jira] Commented: (COUCHDB-230) Add Support for Rewritable URL)

Posted by Benoit Chesneau <bc...@gmail.com>.
On Thu, Aug 19, 2010 at 4:22 PM, Jason Smith <jh...@couch.io> wrote:
> On Thu, Aug 19, 2010 at 21:19, Benoit Chesneau <bc...@gmail.com> wrote:
>
>> On Thu, Aug 19, 2010 at 2:30 PM, Jason Smith <jh...@couch.io> wrote:
>>
>> > Woah! Can we all please take a step back and talk about what problem this
>> > solves?
>> Not everyone want to use a proxy to do smart vhosting. Not everyone do
>> mass hosting. Some people just want to host their couchdb on port 80
>> like some tshirt said.
>>
>
> Yes, however these changes make it impossible to move a couch behind a
> proxy. That would be unfortunate.
>
Could you explain to me how it's impossible compared to the previous
behaviour? It doesn't change anything technically. Please post all
your concerns and a way to reproduce them, and I will have a look. Though
here, hosting more than 50 couches (trunk) behind couchdbproxy works;
tested yesterday.


If you have a specific concern, like forbidding a vhost (furon.couchone.com,
for example), you could use your own redirect vhost function; it's just 4-6
lines of code.

- benoit

Re: Vhosting Requirements (was: Re: [jira] Commented: (COUCHDB-230) Add Support for Rewritable URL)

Posted by Jason Smith <jh...@couch.io>.
On Thu, Aug 19, 2010 at 21:19, Benoit Chesneau <bc...@gmail.com> wrote:

> On Thu, Aug 19, 2010 at 2:30 PM, Jason Smith <jh...@couch.io> wrote:
>
> > Woah! Can we all please take a step back and talk about what problem this
> > solves?
> Not everyone want to use a proxy to do smart vhosting. Not everyone do
> mass hosting. Some people just want to host their couchdb on port 80
> like some tshirt said.
>

Yes, however these changes make it impossible to move a couch behind a
proxy. That would be unfortunate.

-- 
Jason Smith
Couchio Hosting

Re: Vhosting Requirements (was: Re: [jira] Commented: (COUCHDB-230) Add Support for Rewritable URL)

Posted by Benoit Chesneau <bc...@gmail.com>.
On Thu, Aug 19, 2010 at 2:30 PM, Jason Smith <jh...@couch.io> wrote:

> Woah! Can we all please take a step back and talk about what problem this
> solves?
Not everyone wants to use a proxy to do smart vhosting. Not everyone does
mass hosting. Some people just want to host their couchdb on port 80,
like some T-shirt said.

>
> Benoit, your work is *excellent* however on behalf of couchdb
> administrators, I must say, these proposed vhost rules look complicated, and
> difficult or impossible to support in a reverse-proxy environment.

I don't see how it changed compared to the previous system.

before : check Host, use rewrite rule
now: check Host or X-Forwarded-Host , use rewrite rule

What was added is a way to make the rewrite rule smarter. So there is no
change for people using a reverse proxy. Moreover, work is now easier
for proxies, since they can use the X-Forwarded-Host header, which wasn't
possible before.
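
A minimal sketch of the host selection described above. Which header wins when both are present is an assumption here (X-Forwarded-Host first, then Host), and the function name is hypothetical.

```python
# Hypothetical sketch: pick the host name that vhost rules are matched
# against. Preferring X-Forwarded-Host over Host is an assumption.

def effective_host(headers):
    """Return the host to use for vhost matching, or None."""
    lowered = {name.lower(): value for name, value in headers.items()}
    forwarded = lowered.get("x-forwarded-host")
    if forwarded:
        # Proxies may append several hosts; the first is the original one.
        return forwarded.split(",")[0].strip()
    return lowered.get("host")


print(effective_host({"Host": "blog.example.com"}))
print(effective_host({
    "Host": "10.0.0.5:5984",
    "X-Forwarded-Host": "blog.example.com, proxy.internal",
}))
```

Both calls print blog.example.com: the second request reached the couch on an internal address, but the proxy preserved the public name in X-Forwarded-Host.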

>
> What exactly is the problem that vhosts and _rewrite cannot solve?

I have 1000 apps on 1000 dbs, and each of these apps wants to have a nice
domain name. Rather than having 1000 lines, I can now just do:

$appname.$dbname.mydomain.tld = /$dbname/_design/$appname/_rewrite

Another: rather than having 2 lines, for www. and ., I can just do:

*.domain.tld = /db/....

Problem to solve: deploying just a couchdb rather than a couchdb + a proxy + more.
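
A minimal sketch of how such $variable and wildcard rules could be applied to a host name. It assumes each $name or * stands for exactly one dot-separated label, so it does not cover matching the bare domain with a *. rule. The function name and matching rules are assumptions, not the code from the vhost refactoring.

```python
# Hypothetical sketch: match a host against a pattern such as
#   $appname.$dbname.mydomain.tld = /$dbname/_design/$appname/_rewrite
# Assumption: "$name" and "*" each match exactly one label.

def match_vhost(pattern, target, host):
    """Return `target` with variables substituted, or None if no match."""
    pattern_labels = pattern.split(".")
    host_labels = host.split(".")
    if len(pattern_labels) != len(host_labels):
        return None
    bindings = {}
    for p, h in zip(pattern_labels, host_labels):
        if p.startswith("$"):
            bindings[p] = h          # capture this label
        elif p != "*" and p != h:
            return None              # literal label mismatch
    result = target
    # Substitute longer names first so "$app" never clobbers "$appname".
    for name in sorted(bindings, key=len, reverse=True):
        result = result.replace(name, bindings[name])
    return result


print(match_vhost(
    "$appname.$dbname.mydomain.tld",
    "/$dbname/_design/$appname/_rewrite",
    "blog.davisp.mydomain.tld",
))  # /davisp/_design/blog/_rewrite
print(match_vhost("*.domain.tld", "/db", "www.domain.tld"))  # /db
```

One rule line then covers every app/db pair under the domain, which is the saving over 1000 explicit vhost entries.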

>
> I have no final opinion. However I think a CouchApp or Futon feature which
> manages vhosts might be better, long-term, despite being more work to
> implement at this time.

I don't see how it could work. If vhost isn't implemented on the server
side, it can't work.
>
- benoit