Posted to user@couchdb.apache.org by Robin Berjon <ro...@berjon.com> on 2012/09/18 10:16:55 UTC

Rewrite inside of a path segment

Hi all,

I am using rewrites to expose a nice API, but I am hitting a problem with them. The constraints I have are as follows (my app is not a blog, but I'll use that for examples as it'll be simpler). First, I would like these three basic operations to work:

GET /blog/this-is-my-title
PUT /blog/this-is-my-title
DELETE /blog/this-is-my-title

So far so good. But the second constraint is that "this-is-my-title" is not the ID of the relevant documents, but rather the value of one of their fields. It is unique within a type, but not unique across all types in the DB (otherwise I'd just use it as the ID and call it a day). So I generate IDs based on concatenating the type and that field.

I got this working easily for GET and PUT, using rewrites that point respectively to a view indexed on the right type plus field, and an update handler that generates the correct ID on the fly.
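
To make the ID scheme concrete, here is a minimal sketch of what such an update handler might look like. It assumes the ID has the form `type.field` (for example `blog.this-is-my-title`, in line with the `blog.` prefix used later in this thread) and that the rewrite rule passes the title in a query parameter called `slug`. The names `makeId`, `putBlog` and `slug` are illustrative assumptions, not taken from the actual application.

```javascript
// Hypothetical helper: build the document ID from the type and the unique field.
function makeId(type, slug) {
  return type + "." + slug;
}

// Sketch of a CouchDB update handler: function (doc, req) -> [doc, response].
// Assumes the rewrite rule supplies the slug as the query parameter "slug".
function putBlog(doc, req) {
  var body = JSON.parse(req.body);
  var slug = req.query.slug;
  if (!doc) {
    // No existing document was loaded: create one under the generated ID.
    doc = { _id: makeId("blog", slug), type: "blog", slug: slug };
  }
  for (var key in body) {
    // Do not let the payload overwrite reserved fields such as _id or _rev.
    if (key.charAt(0) !== "_") doc[key] = body[key];
  }
  return [doc, JSON.stringify({ ok: true, id: doc._id })];
}
```

In a real design document the helper would have to be inlined into the handler (or loaded as a CommonJS module), since each handler is stored as a single function string.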

For DELETE though, I'm stuck. I've tried a bunch of variations from rewriting to the document itself to rewriting to an update handler that sets the _deleted attribute but I can't seem to work my way to a solution.

The problem is that I need a rewrite that invariably accepts parameters that don't fall at the / boundary. Typically:

{
    from:   "/blog/*"
,   to:     "../../blog.*"
,   method: "DELETE"
}

And that doesn't work. It just invariably encodes the * or :id, or whatnot. Have I missed something? I of course could simply expose another ID for DELETE or ugly IDs everywhere, but that completely defeats the point of having rewrites in the first place.

Thanks for any suggestions!

-- 
Robin Berjon - http://berjon.com/ - @robinberjon


Re: Rewrite inside of a path segment

Posted by Robert Newson <rn...@apache.org>.
I'm not opposed to multiple ways to rewrite, though it does make a
good case for the URL rewriter to become a plugin rather than core
(and there are other features that might be better extracted as
optional plugins).

What I am opposed to is a rewrite method that will be so slow as to be
unusable. Excluding the roundtrip to couchjs itself, it seems the
rewrite function is transferred to the view server and compiled on
every call. We do this in other places, it's true, but it's
regrettable. Would this be fast enough, though?

Finally, the native view server would also need this feature; it can't
be JavaScript-only.

B.

On 26 October 2012 08:49, Benoit Chesneau <bc...@gmail.com> wrote:
> On Thu, Oct 25, 2012 at 9:21 PM, Robin Berjon <ro...@berjon.com> wrote:
>> On 25/10/2012 19:50 , Benoit Chesneau wrote:
>>>
>>> Well, the first version of the rewriter was based on a function [1]. After
>>> long discussions it wasn't accepted for performance reasons. I'm not
>>> sure we should accept it for now until we change the js evaluation.
>>
>>
>> Performance reasons would be problematic if it were the only option. But
>> since this is just one option, I reckon it should be okay. Also note that
>> I'd be happy to add caching at some point; I've been wondering if it should
>> vary with userCtx or not (I'm leaning towards not but unsure).
>>
> I don't think it's good to have multiple options to handle rewrites.
> Also, most of the time, the reason you want a function is that you
> aren't doing a fully RESTful rewriter. I think the proposal in
> couchapp-ng gives you a lot of flexibility by using regexps. In the end,
> this is what most frameworks do, and it is somewhat like mongrel2 in
> that respect.
>
> - benoit

Re: Rewrite inside of a path segment

Posted by Benoit Chesneau <bc...@gmail.com>.
On Thu, Oct 25, 2012 at 9:21 PM, Robin Berjon <ro...@berjon.com> wrote:
> On 25/10/2012 19:50 , Benoit Chesneau wrote:
>>
>> Well, the first version of the rewriter was based on a function [1]. After
>> long discussions it wasn't accepted for performance reasons. I'm not
>> sure we should accept it for now until we change the js evaluation.
>
>
> Performance reasons would be problematic if it were the only option. But
> since this is just one option, I reckon it should be okay. Also note that
> I'd be happy to add caching at some point; I've been wondering if it should
> vary with userCtx or not (I'm leaning towards not but unsure).
>
I don't think it's good to have multiple options to handle rewrites.
Also, most of the time, the reason you want a function is that you
aren't doing a fully RESTful rewriter. I think the proposal in
couchapp-ng gives you a lot of flexibility by using regexps. In the end,
this is what most frameworks do, and it is somewhat like mongrel2 in
that respect.
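
As a rough illustration of the regexp idea (this is NOT couchapp-ng's actual rule syntax, which is not shown in this thread; the rule shape and the name `regexRewrite` are assumptions), a regexp-based rule can capture part of a path segment and reuse it inside the target:

```javascript
// Illustrative regexp-based rules; the shape of these objects is hypothetical.
var rules = [
  { method: "DELETE", pattern: /^\/blog\/([^\/]+)$/, to: "../../blog.$1" }
];

// Return the rewritten path for the first matching rule, or null if none match.
function regexRewrite(method, path) {
  for (var i = 0; i < rules.length; i++) {
    var rule = rules[i];
    if (rule.method === method && rule.pattern.test(path)) {
      // "$1" in the target is replaced by the first capture group.
      return path.replace(rule.pattern, rule.to);
    }
  }
  return null;
}
```

Because the capture group is substituted as text, it can land in the middle of a segment, which is exactly what the "/"-bound wildcard cannot do.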

- benoit

Re: Rewrite inside of a path segment

Posted by Robin Berjon <ro...@berjon.com>.
On 25/10/2012 19:50 , Benoit Chesneau wrote:
> Well, the first version of the rewriter was based on a function [1]. After
> long discussions it wasn't accepted for performance reasons. I'm not
> sure we should accept it for now until we change the js evaluation.

Performance reasons would be problematic if it were the only option. But 
since this is just one option, I reckon it should be okay. Also note 
that I'd be happy to add caching at some point; I've been wondering if 
it should vary with userCtx or not (I'm leaning towards not but unsure).

> My other experiments for the rewriter are on the couchapp-ng repo [2].
> I have another try coming but it is using my new elixir [3] app
> engine.

These look very interesting, but since I bumped into needing this for a 
project I have, I wouldn't be averse to its inclusion (but then again 
you might have guessed that ;-).

-- 
Robin Berjon - http://berjon.com/ - @robinberjon

Re: Rewrite inside of a path segment

Posted by Benoit Chesneau <bc...@gmail.com>.
On Thu, Oct 25, 2012 at 6:41 PM, Ryan Ramage <ry...@gmail.com> wrote:
> Wow, I like the idea of a rewriter function. It does add a lot of
> power. Some things to consider...
>
> I know others had rewrites of the rewriter in mind, namely benoit and jhs.
> It would be nice if they thought this was in line with their ideas. If
> we introduce it now like this, we want to make sure it's compatible
> with those ideas.
>
> Can you add the function as part of an array of rewrites? maybe like this:
>
> rewrites : [
>    {from : "/something", to : "/somethingelse"},
>    "function(req, path) { ...}"
> ]
>
>
> If this moves ahead, I would also want to check that the
> erica/couchapp tools can handle this syntax. No one
> likes writing js functions inside of strings in a json file :)
>

Well, the first version of the rewriter was based on a function [1]. After
long discussions it wasn't accepted for performance reasons. I'm not
sure we should accept it for now until we change the js evaluation.

My other experiments for the rewriter are on the couchapp-ng repo [2].
I have another try coming but it is using my new elixir [3] app
engine.

- benoît

[1] https://github.com/benoitc/couchdb/commit/32783433b9f0030a9e6ac2f995d6cd7098794921
[2] https://github.com/benoitc/couchapp-ng
[3] http://elixir-lang.org/

Re: Rewrite inside of a path segment

Posted by Robin Berjon <ro...@berjon.com>.
On 25/10/2012 18:41 , Ryan Ramage wrote:
> Wow, I like the idea of a rewriter function.

Me too :)

> Can you add the function as part of an array of rewrites? maybe like this:
>
> rewrites : [
>     {from : "/something", to : "/somethingelse"},
>     "function(req, path) { ...}"
> ]

I considered that option, but I couldn't find a strong reason to justify 
the added complexity. And if no one likes writing functions in strings, 
I doubt anyone would enjoy writing multiple functions as string list 
items :)

Note though that there's nothing in this incompatible with any couchapp 
system — all that exists just works, and you can take advantage of this 
just by automatically stringifying functions (which as a matter of fact 
is what I'm doing).
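
A minimal sketch of that kind of automatic stringification, assuming the design document is first built as a plain JavaScript object (the `_id` value and the helper name `stringifyFunctions` are made up for illustration; this is not the code actually used):

```javascript
// Replacer for JSON.stringify: turn function values into their source text,
// leave everything else untouched.
function stringifyFunctions(key, value) {
  return typeof value === "function" ? value.toString() : value;
}

var ddoc = {
  _id: "_design/app", // hypothetical design document name
  rewrites: function (req, path) {
    // real rewriting logic would go here
    return false;
  }
};

// The resulting JSON carries "rewrites" as a string, ready to be uploaded.
var json = JSON.stringify(ddoc, stringifyFunctions, 2);
```

This way the function is written and linted as ordinary JavaScript, and only becomes a string at upload time.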

-- 
Robin Berjon - http://berjon.com/ - @robinberjon

Re: Rewrite inside of a path segment

Posted by Ryan Ramage <ry...@gmail.com>.
Wow, I like the idea of a rewriter function. It does add a lot of
power. Some things to consider...

I know others had rewrites of the rewriter in mind, namely benoit and jhs.
It would be nice if they thought this was in line with their ideas. If
we introduce it now like this, we want to make sure it's compatible
with those ideas.

Can you add the function as part of an array of rewrites? maybe like this:

rewrites : [
   {from : "/something", to : "/somethingelse"},
   "function(req, path) { ...}"
]


If this moves ahead, I would also want to check that the
erica/couchapp tools can handle this syntax. No one
likes writing js functions inside of strings in a json file :)



On Thu, Oct 25, 2012 at 10:29 AM, Robin Berjon <ro...@berjon.com> wrote:
> Hi Nathan,
>
>
> On 19/09/2012 22:24 , Nathan Vander Wilt wrote:
>>
>> On Sep 18, 2012, at 1:16 AM, Robin Berjon wrote:
>>>
>>> The problem is that I need a rewrite that invariably accepts
>>> parameters that don't fall at the / boundary. Typically:
>>>
>>> { from:   "/blog/*" ,   to:     "../../blog.*" ,   method:
>>> "DELETE" }
>>>
>>> And that doesn't work. It just invariably encodes the * or :id, or
>>> whatnot. Have I missed something? I of course could simply expose
>>> another ID for DELETE or ugly IDs everywhere, but that completely
>>> defeats the point of having rewrites in the first place.
>>
>>
>> You have not missed anything. The problem is that you're wanting a
>> rewrite that doesn't fall on / boundaries, and the simplistic rewrite
>> handler does not support this. I wish that rewrites were implemented
>> in JavaScript instead of this one-off metalanguage, but my dreams
>> aren't much practical value to you right now ;-)
>
>
> As a matter of fact, they are :)
>
> It took me a little while to get back to you because I liked your idea of
> having the rewrite work from JS. So I foolishly picked up an Erlang tutorial
> and went on to implement it. You can find the pull request at:
>
>     https://github.com/apache/couchdb/pull/38
>
> Here's a quick description of how it works:
>
> """
> Alternatively, the rewriting can be performed by a function. It is specified
> as follows:
>
>  {
>      ....
>      "rewrites": "function (req, path) {
>          // process the request, and return the rewrite
>      }"
>  }
>
> The function is called with the request object and a path string. The latter
> contains whatever path is left after removing everything up to _rewrite/.
>
> Which rewrite takes place depends on what the function returns, which can be
> one of three things:
>
>   - false, or a falsy value: indicates that the rewrite could not be
>     performed. This translates to a 500 error.
>   - "a/path": causes the rewrite to that path to take place, with the
>     original method being kept.
>   - { path: "a/path", method: "FOO" }: rewrites to that path and changes
>     the method to the one provided.
>
> Rewrite functions are meant to be devoid of side-effects and one should
> write under the assumption that they are being cached.
> """
>
> There's a bit more information given in the pull request, and in the tests.
>
> I've only given it a cursory run so far, but it works pretty well. I'd
> appreciate comments, screams, jokes, etc.
>
>
> --
> Robin Berjon - http://berjon.com/ - @robinberjon

Re: Rewrite inside of a path segment

Posted by Robin Berjon <ro...@berjon.com>.
Hi Nathan,

On 19/09/2012 22:24 , Nathan Vander Wilt wrote:
> On Sep 18, 2012, at 1:16 AM, Robin Berjon wrote:
>> The problem is that I need a rewrite that invariably accepts
>> parameters that don't fall at the / boundary. Typically:
>>
>> { from:   "/blog/*" ,   to:     "../../blog.*" ,   method:
>> "DELETE" }
>>
>> And that doesn't work. It just invariably encodes the * or :id, or
>> whatnot. Have I missed something? I of course could simply expose
>> another ID for DELETE or ugly IDs everywhere, but that completely
>> defeats the point of having rewrites in the first place.
>
> You have not missed anything. The problem is that you're wanting a
> rewrite that doesn't fall on / boundaries, and the simplistic rewrite
> handler does not support this. I wish that rewrites were implemented
> in JavaScript instead of this one-off metalanguage, but my dreams
> aren't much practical value to you right now ;-)

As a matter of fact, they are :)

It took me a little while to get back to you because I liked your idea 
of having the rewrite work from JS. So I foolishly picked up an Erlang 
tutorial and went on to implement it. You can find the pull request at:

     https://github.com/apache/couchdb/pull/38

Here's a quick description of how it works:

"""
Alternatively, the rewriting can be performed by a function. It is 
specified as follows:

  {
      ....
      "rewrites": "function (req, path) {
          // process the request, and return the rewrite
      }"
  }

The function is called with the request object and a path string. The 
latter contains whatever path is left after removing everything up to 
_rewrite/.

Which rewrite takes place depends on what the function returns, which 
can be one of three things:

   - false, or a falsy value: indicates that the rewrite could not be
     performed. This translates to a 500 error.
   - "a/path": causes the rewrite to that path to take place, with the
     original method being kept.
   - { path: "a/path", method: "FOO" }: rewrites to that path and changes
     the method to the one provided.

Rewrite functions are meant to be devoid of side-effects and one should 
write under the assumption that they are being cached.
"""

There's a bit more information given in the pull request, and in the tests.

I've only given it a cursory run so far, but it works pretty well. 
I'd appreciate comments, screams, jokes, etc.

-- 
Robin Berjon - http://berjon.com/ - @robinberjon

Re: Rewrite inside of a path segment

Posted by Nathan Vander Wilt <na...@calftrail.com>.
On Sep 18, 2012, at 1:16 AM, Robin Berjon wrote:

> The problem is that I need a rewrite that invariably accepts parameters that don't fall at the / boundary. Typically:
> 
> {
>    from:   "/blog/*"
> ,   to:     "../../blog.*"
> ,   method: "DELETE"
> }
> 
> And that doesn't work. It just invariably encodes the * or :id, or whatnot. Have I missed something? I of course could simply expose another ID for DELETE or ugly IDs everywhere, but that completely defeats the point of having rewrites in the first place.

You have not missed anything. The problem is that you're wanting a rewrite that doesn't fall on / boundaries, and the simplistic rewrite handler does not support this. I wish that rewrites were implemented in JavaScript instead of this one-off metalanguage, but my dreams aren't much practical value to you right now ;-)
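
To illustrate the limitation, here is a deliberately simplified model of a rewriter that only treats "*" as a wildcard when it is an entire path segment. This is not CouchDB's actual implementation (which is written in Erlang); `applyRule` is a made-up name and the model ignores ":name" bindings and query parameters.

```javascript
// Simplified model: "*" is only a wildcard when it is a whole "/" segment.
function applyRule(rule, path) {
  var from = rule.from.split("/").filter(Boolean);
  var parts = path.split("/").filter(Boolean);
  var rest = null;
  for (var i = 0; i < from.length; i++) {
    if (from[i] === "*") {
      // Bind everything from this segment onwards.
      rest = parts.slice(i).join("/");
      break;
    }
    if (from[i] !== parts[i]) return null; // no match
  }
  if (rest === null && parts.length !== from.length) return null;
  return rule.to.split("/").map(function (seg) {
    // Only a segment that is exactly "*" gets substituted;
    // "blog.*" is an ordinary literal segment in this model.
    return seg === "*" && rest !== null ? rest : seg;
  }).join("/");
}
```

With `to: "../../*"` the title is substituted, but with `to: "../../blog.*"` the star stays as a literal character, which matches the behaviour described in the question.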

What I did for the only situation where I've tried to use rewrites in earnest (a blog too, actually) was rely mostly on list functions to show documents by a different key ("path" or "slug" or something) rather than their ID. You can see my hacky rewrites and browse to the broader context via https://github.com/natevw/Glob/blob/master/rewrites.json and it's in use at http://n.exts.ch if you're interested.

Unfortunately, I don't think there's much you could do in the case of a DELETE like this where you can't use view tricks to do without the actual document _id. I'd generally recommend using the _id field *only* as a unique object identifier  — I do often prefix my doc._ids but only as a debugging/troubleshooting aid for my human eyes; any typing information should be encoded elsewhere in the document so code can just treat it as an opaque string.

So I think your options would be to either:
- use something like nginx or node.js to handle rewrites
- [haven't tested] hack up a POST update handler (http://wiki.apache.org/couchdb/Document_Update_Handlers) that takes in the _rev and _id and attempts to overwrite the existing document with one that has "_deleted":true
- restructure your app/document design so that it fits within CouchDB's very limited _rewrite capabilities
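
For the second option, an untested sketch of such an update handler could look like this (the name `deleteDoc` is made up; the handler still has to be invoked with the real document _id, so on its own it does not remove the need for a rewrite that can build that _id):

```javascript
// Untested sketch: an update handler that deletes the document it is
// invoked on by storing a deletion stub in its place.
function deleteDoc(doc, req) {
  if (!doc) {
    // Nothing was loaded for the requested _id.
    return [null, { code: 404, body: "not found" }];
  }
  // Keep _id and _rev, drop everything else, and mark the stub as deleted.
  return [{ _id: doc._id, _rev: doc._rev, _deleted: true }, "deleted"];
}
```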

As it seems end-users won't tend to linger on a deletion URL anyway, I'm wondering if the last option wouldn't be best: who cares if the request happens to have the full "blog."-prefixed identifier if that's the _id anyway?

hth,
-natevw