Posted to oak-dev@jackrabbit.apache.org by Jukka Zitting <ju...@gmail.com> on 2012/06/25 14:26:05 UTC

Native HTTP bindings for Oak

Hi,

As suggested in OAK-104 [1] and discussed briefly before, I think it
would be useful for Oak to have a native HTTP binding that allows
remote clients to talk directly with oak-core without having to go
through the JCR layer.

The main benefit of such a lower-level HTTP binding is that it can
better leverage the "everything is (JSON) content" nature of Oak
internals than a higher-level layer like our existing WebDAV(ex)
components that only assume a generic JCR content repository. This
makes the binding much more straightforward and probably helps avoid
performance or scalability bottlenecks down the line. It also makes it
easier to support a wider range of functionality without having to add
custom integration code for all features. For example, instead of
accessing separate namespace registration code, a client could
register a new namespace simply by posting it as a normal content
modification to the appropriate place in the content tree, like this:

    $ curl -d foo=http://foo.example.com/ns \
        http://localhost:8080/content/jcr:system/jcr:namespaces

Unlike a MicroKernel remoting layer, the Oak HTTP binding would still
have all the commit validation and access control logic in place, so
one couldn't use it to bypass the rules that govern more traditional
JCR-based access.

I'm not yet sure about what the binding should look like in detail,
but my general idea is to follow JSOP protocol ideas [2] and try to
keep things as simple and easy as possible for clients like JavaScript
libraries [3] and command line tools like curl. The initial draft code
I committed just recently to the new oak-http component uses a basic
URL to Node mapping and a simple JSON / binary JSON rendering based on
the contents of the HTTP Accept header. It would be nice to support
different kinds of content renderings and also to allow similar
content renderings to be POSTed or PUT to the server. The binding
should also accept basic HTML form POSTs with the
application/x-www-form-urlencoded media type.
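
A minimal sketch of the form POST case, assuming each form field maps
to one string property on the target node. The helper name
form_to_properties is hypothetical and is not taken from the oak-http
draft code:

```python
from urllib.parse import parse_qsl


def form_to_properties(body: str) -> dict:
    """Decode a urlencoded form body into {property name: string value}.

    Hypothetical helper for illustration; a later field with the same
    name overrides an earlier one.
    """
    return dict(parse_qsl(body, keep_blank_values=True))


print(form_to_properties("foo=http%3A%2F%2Ffoo.example.com%2Fns"))
```

A real binding would still have to decide how to treat repeated fields
(multi-valued properties) and non-string types; this sketch ignores both.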

By default the HTTP binding could simply use a fresh new session for
each HTTP request, but it should be possible for a client to request a
longer-lived session for more complex content modifications (import,
batch jobs, etc.) or for getting a stable snapshot for larger reads
(export, query, etc.) that shouldn't change while reading. I was
thinking of handling such cases by allowing the client to generate
such a session with a specific POST request that responds with a
redirect to a temporary session URL that exposes the normal content
tree as seen through that session. We'd use a lease mechanism to
control the lifetime of such server-side sessions.

As an example, something like the following interaction could be used
to acquire a new session (with an initial lifetime of one hour,
lease=3600), make some changes to the content tree within that
session, and finally commit those changes and release the session
(lease=0).

    $ curl -d lease=3600 http://localhost:8080/session
    http://localhost:8080/session/a9c49810bebf11e1afa70800200c9a66
    $ curl -d bar=baz \
        http://localhost:8080/session/a9c49810bebf11e1afa70800200c9a66/foo
    {"bar":"baz"}
    $ curl -d save=true -d lease=0 \
        http://localhost:8080/session/a9c49810bebf11e1afa70800200c9a66
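
The lease handling in this interaction could be sketched roughly as
follows. This is only an illustration under stated assumptions: the
class SessionRegistry, its method names, and the rule that a lease of
zero or less releases the session are hypothetical, and nothing here
is taken from the oak-http code.

```python
import time
import uuid


class SessionRegistry:
    """Hypothetical lease-based registry of server-side sessions."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._expiry = {}  # session id -> time at which the lease runs out

    def create(self, lease_seconds: float) -> str:
        """Register a new session and return its id (used in the session URL)."""
        session_id = uuid.uuid4().hex
        self._expiry[session_id] = self._clock() + lease_seconds
        return session_id

    def renew(self, session_id: str, lease_seconds: float) -> None:
        """Set a new lease; lease_seconds <= 0 releases the session."""
        if not self.is_alive(session_id):
            raise KeyError(session_id)
        if lease_seconds <= 0:
            del self._expiry[session_id]
        else:
            self._expiry[session_id] = self._clock() + lease_seconds

    def is_alive(self, session_id: str) -> bool:
        """Report whether the session exists and its lease has not run out."""
        expiry = self._expiry.get(session_id)
        if expiry is None:
            return False
        if self._clock() >= expiry:
            del self._expiry[session_id]  # lease ran out; drop the session
            return False
        return True
```

Passing the clock in as a parameter keeps expiry testable without
waiting for real time to pass.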

[1] https://issues.apache.org/jira/browse/OAK-104
[2] http://wiki.apache.org/jackrabbit/Jsop
[3] https://issues.apache.org/jira/browse/OAK-103

BR,

Jukka Zitting

Re: Native HTTP bindings for Oak

Posted by Jukka Zitting <ju...@gmail.com>.

Hi,

On Mon, Jun 25, 2012 at 2:26 PM, Jukka Zitting <ju...@gmail.com> wrote:
> As suggested in OAK-104 [1] and discussed briefly before, I think it
> would be useful for Oak to have a native HTTP binding that allows
> remote clients to talk directly with oak-core without having to go
> through the JCR layer.

Here's a quick example of this feature in action. First, let's build
Oak trunk and start the standalone jar:

    $ mvn clean install
    $ java -jar oak-run/target/oak-run-0.3-SNAPSHOT.jar
    Apache Jackrabbit Oak 0.3-SNAPSHOT
    Starting an in-memory repository
    http://localhost:8080/ -> [memory]

Then, in a separate terminal I'm using the HTTPie tool
(http://httpie.org/) to access the server:

    $ http -u -b http://localhost:8080/
    {"jcr:primaryType":"nt:unstructured"}

    $ http -u -b http://localhost:8080/ test:={}
    {"jcr:primaryType":"nt:unstructured","test":{}}

    $ http -u -b http://localhost:8080/test
    {}

    $ http -u -b http://localhost:8080/test a=x b=y c=z
    {"b":"y","c":"z","a":"x"}

    $ http -u -b http://localhost:8080/?d=2
    {"jcr:primaryType":"nt:unstructured","test":{"b":"y","c":"z","a":"x"}}

    $ http -u -b http://localhost:8080/ test:=null
    {"jcr:primaryType":"nt:unstructured"}

A lot of things still don't work as expected, but the above should
give a rough idea of the possibilities.
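
The exchange above suggests that posting a JSON object merges its
members into the target node, and that a null value removes the named
child. A minimal Python sketch of that merge rule, assuming exactly
those semantics (the helper name apply_update is hypothetical, and
this is not the oak-http code):

```python
import json


def apply_update(node: dict, update: dict) -> dict:
    """Return a copy of node with update applied; a None value removes the entry."""
    result = dict(node)
    for name, value in update.items():
        if value is None:
            result.pop(name, None)
        else:
            result[name] = value
    return result


root = {"jcr:primaryType": "nt:unstructured"}
# Add an empty child, as in "test:={}".
root = apply_update(root, json.loads('{"test": {}}'))
# Set properties on the child, as in "a=x b=y c=z" posted to /test.
child = apply_update(root["test"], {"a": "x", "b": "y", "c": "z"})
root = apply_update(root, {"test": child})
print(json.dumps(root))
# Remove the child again, as in "test:=null".
root = apply_update(root, json.loads('{"test": null}'))
print(json.dumps(root))
```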

BR,

Jukka Zitting

Re: Native HTTP bindings for Oak

Posted by Stefan Guggisberg <st...@gmail.com>.

On Wed, Jun 27, 2012 at 11:20 AM, Jukka Zitting <ju...@gmail.com> wrote:
> Hi,
>
> On Wed, Jun 27, 2012 at 10:25 AM, Angela Schreiber <an...@adobe.com> wrote:
>> i don't fully see the use case for such long living sessions.
>
> The rationale is the same as for the branch feature we added to the
> MicroKernel. Instead of having to maintain a separate transient tree
> abstraction on the client side (which might be troublesome given
> potentially limited storage capacity),

are you saying that each Node.addNode would trigger a server roundtrip?
i don't see how this could perform reasonably...

cheers
stefan

> it's better to be able to send
> uncommitted data to the server for storage in a temporary branch where
> it can be accessed using the existing tree abstraction already
> provided by Oak.
>
> Most notably the session feature allows us to use such a HTTP binding
> to implement remoting of the Oak API without the need for client-side
> storage space and associated extra code for managing it.
>
>> IMO also importing big trees and batch read/writing should be covered
>> by a single request.
>
> That quickly leads to increasingly complex server side features like
> filtering or conditional saves.
>
> For example, think of a client like the Sling engine that first
> resolves a path potentially with custom mappings, then follows a
> complex set of resource type references, and finally renders a
> representation of the resolved content based on the resource type
> definitions that were found. Ideally (for consistency and better
> caching support) it should be possible to perform the entire operation
> based on a stable snapshot of the repository, but there's no way that
> all the information required by such a process could be included in
> the response of a single Oak HTTP request.
>
> Exposing the branch feature as proposed avoids the need for complex
> server-side request processing logic and makes it easier to implement
> many client-side features that would otherwise have to use local
> storage or temporary content subtrees visible to other repository
> clients.
>
> BR,
>
> Jukka Zitting

Re: Native HTTP bindings for Oak

Posted by Angela Schreiber <an...@adobe.com>.

hi jukka

> On Wed, Jun 27, 2012 at 10:25 AM, Angela Schreiber<an...@adobe.com>  wrote:
>> i don't fully see the use case for such long living sessions.
>
> The rationale is the same as for the branch feature we added to the
> MicroKernel. Instead of having to maintain a separate transient tree
> abstraction on the client side (which might be troublesome given
> potentially limited storage capacity), it's better to be able to send
> uncommitted data to the server for storage in a temporary branch where
> it can be accessed using the existing tree abstraction already
> provided by Oak.
>
> Most notably the session feature allows us to use such a HTTP binding
> to implement remoting of the Oak API without the need for client-side
> storage space and associated extra code for managing it.

well.... that pretty much matches my very first mapping of JCR to
HTTP (using WebDAV) and it showed that having such a lazy client
didn't work out very well... in the end we ended up inventing JSOP
i.e. the various ways of batch reading and writing including the
diff-pattern that allows persisting a bunch of changes in a
single commit.

the whole story of the SPI and that extra layer we are having
now in oak (compared to jackrabbit 2) was/is that we want to be
able to have different non-java content repository layers such
as the php implementation with the aim to delegate the transient
part to those clients instead of keeping track of the transient 
modifications on the server.

kind regards
angela

Re: Native HTTP bindings for Oak

Posted by Julian Reschke <ju...@gmx.de>.

On 2012-06-27 14:57, Stefan Guggisberg wrote:
> for once i'm a 100% with felix ;)
>
> if i understand jukka's proposal correctly, it's about promoting an alternative
> public client api. in my understanding the ultimate goal of the oak project
> is to come up with a highly efficient & scalable jcr implementation.
> imo we should focus on this goal.
>
> i thought that we have a consensus of how the oak stack should be layered, i.e.
>
> app / sling / oak-jcr (trans. space) / [remoting /] oak-core [remoting
> /] oak-mk.
>
> now your proposal seems to imply a different architecture...
> ...

I would say that the approach to expose a different 
branch/revision/activity/session/whatever in a different URI space is 
completely orthogonal to the question whether we do OAK-over-HTTP or 
JCR-over-HTTP. And also, that is a good idea :-)

Best regards, Julian

Re: Native HTTP bindings for Oak

Posted by Jukka Zitting <ju...@gmail.com>.

Hi,

On Thu, Jun 28, 2012 at 8:39 AM, Lukas Kahwe Smith <ml...@pooteeweet.org> wrote:
> here is some info i sent to the JSR-333 list
>
> https://github.com/jackalope/jackalope/wiki/workspace.mirror
>
> the purpose is to allow editors to "draft" changes in a separate workspace, while still
> being able to see what other editors are doing in production.
>
> typo3 uses this approach. slides 14-17 somewhat illustrate the feature:
> http://www.slideshare.net/kfish/typo3-cr-in-phoenix
>
> but the slides do not illustrate how changes in the "live" workspace would become visible in "user-liga" workspace.
>
> from a google search I also just found this document:
> http://www-typo3.cs.ucl.ac.uk/help/typo3/working_in_a_draft_workspace/

Thanks for the pointers! This indeed seems similar to the rebase
operation we currently only execute internally within oak-core. It
sounds like we could expose a similar feature also through the HTTP
binding with something like this:

    $ curl -d rebase=true http://localhost:8080/branch/X

Note however that the branch feature here is mostly meant for
short-term changes (processing of a single HTTP request, or at most a
single user session). Longer term "branches" are probably better
handled with actual JCR-level workspaces with some functional
extensions and thus a bit orthogonal to the HTTP binding (they'd be
exposed over HTTP just like any other content).

BR,

Jukka Zitting

Re: Native HTTP bindings for Oak

Posted by Lukas Kahwe Smith <ml...@pooteeweet.org>.

On Jun 28, 2012, at 01:24 , Jukka Zitting wrote:

> Hi,
> 
> On Wed, Jun 27, 2012 at 6:31 PM, Lukas Kahwe Smith <ml...@pooteeweet.org> wrote:
>> i assume this session space will not cause duplicating the rest of the workspace
>> content that hasn't changed, right?
> 
> Right. No physical duplication of content takes place.
> 
>> if we also make it possible to sync such a session space with the workspace it's
>> based on, then this would become an amazingly powerful way to view the state of
>> the workspace with the staged changes, including any already committed concurrent writes.
>> 
>> this is very similar to the copy on write cloning with "shine through" of concurrent
>> changes that typo3 supports.
> 
> I'm not too familiar with this typo3 feature, but it doesn't sound
> like something that Oak already supports. We do have a "rebase"
> operation for pending changes in oak-core, which might be similar, but
> so far I didn't think of exposing such functionality through the Oak
> API. Sounds like something we should take a look at together.


here is some info i sent to the JSR-333 list

https://github.com/jackalope/jackalope/wiki/workspace.mirror

the purpose is to allow editors to "draft" changes in a separate workspace, while still being able to see what other editors are doing in production.

typo3 uses this approach. slides 14-17 somewhat illustrate the feature:
http://www.slideshare.net/kfish/typo3-cr-in-phoenix

but the slides do not illustrate how changes in the "live" workspace would become visible in "user-liga" workspace.

from a google search I also just found this document:
http://www-typo3.cs.ucl.ac.uk/help/typo3/working_in_a_draft_workspace/

regards,
Lukas Kahwe Smith
mls@pooteeweet.org

Re: Native HTTP bindings for Oak

Posted by Jukka Zitting <ju...@gmail.com>.

Hi,

On Wed, Jun 27, 2012 at 6:31 PM, Lukas Kahwe Smith <ml...@pooteeweet.org> wrote:
> i assume this session space will not cause duplicating the rest of the workspace
> content that hasn't changed, right?

Right. No physical duplication of content takes place.

> if we also make it possible to sync such a session space with the workspace it's
> based on, then this would become an amazingly powerful way to view the state of
> the workspace with the staged changes, including any already committed concurrent writes.
>
> this is very similar to the copy on write cloning with "shine through" of concurrent
> changes that typo3 supports.

I'm not too familiar with this typo3 feature, but it doesn't sound
like something that Oak already supports. We do have a "rebase"
operation for pending changes in oak-core, which might be similar, but
so far I didn't think of exposing such functionality through the Oak
API. Sounds like something we should take a look at together.

BR,

Jukka Zitting

Re: Native HTTP bindings for Oak

Posted by Lukas Kahwe Smith <ml...@pooteeweet.org>.

On 27.06.2012, at 18:02, Christian Stocker <ch...@liip.ch> wrote:

> Hi
> 
> On 27.06.12 15:07, Jukka Zitting wrote:
>> Hi,
>> 
>> On Wed, Jun 27, 2012 at 2:57 PM, Stefan Guggisberg
>> <st...@gmail.com> wrote:
>>> now your proposal seems to imply a different architecture...
>> 
>> You're reading far too much into this. I'm just thinking of exposing a
>> feature that seems like it could come in handy for some potential
>> clients. Doing so requires zero changes to our existing architecture
>> or APIs.
> 
> I agree with Jukka. Something like this could become very handy for
> "dumb" non-java clients. It's not that easy to implement a transient
> space in another language. Doable, and needed if you want the best
> possible performance. But sometimes it would be really helpful if I could
> just outsource that part to the server side, or start with a very dumb
> client and extend it later with more transient space features if there's
> a need.

i assume this session space will not cause duplicating the rest of the workspace content that hasn't changed, right?

if we also make it possible to sync such a session space with the workspace it's based on, then this would become an amazingly powerful way to view the state of the workspace with the staged changes, including any already committed concurrent writes.

this is very similar to the copy-on-write cloning with "shine through" of concurrent changes that typo3 supports.

regards,
Lukas

Re: Native HTTP bindings for Oak

Posted by Christian Stocker <ch...@liip.ch>.

Hi

On 27.06.12 15:07, Jukka Zitting wrote:
> Hi,
> 
> On Wed, Jun 27, 2012 at 2:57 PM, Stefan Guggisberg
> <st...@gmail.com> wrote:
>> now your proposal seems to imply a different architecture...
> 
> You're reading far too much into this. I'm just thinking of exposing a
> feature that seems like it could come in handy for some potential
> clients. Doing so requires zero changes to our existing architecture
> or APIs.

I agree with Jukka. Something like this could become very handy for
"dumb" non-java clients. It's not that easy to implement a transient
space in another language. Doable, and needed if you want the best
possible performance. But sometimes it would be really helpful if I could
just outsource that part to the server side, or start with a very dumb
client and extend it later with more transient space features if there's
a need.

chregu

> 
> BR,
> 
> Jukka Zitting
> 

-- 
Liip AG  //  Feldstrasse 133 //  CH-8004 Zurich
Tel +41 43 500 39 81 // Mobile +41 76 561 88 60
www.liip.ch // blog.liip.ch // GnuPG 0x0748D5FE

Re: Native HTTP bindings for Oak

Posted by Jukka Zitting <ju...@gmail.com>.

Hi,

On Wed, Jun 27, 2012 at 3:58 PM, Thomas Mueller <mu...@adobe.com> wrote:
> But this feature is unrelated to the branch/merge / workspace feature that
> Jukka proposed here (at least as far as I understand).

The branch feature goes a bit beyond what Bertrand describes in that
it is read-write instead of just a read-only view of a given revision.
But yes, it does support also Bertrand's use case.

BR,

Jukka Zitting

Re: Native HTTP bindings for Oak

Posted by Thomas Mueller <mu...@adobe.com>.

Hi Bertrand,

>FWIW, as a Sling user I agree that the "freeze my view of the
>repository for a while, whatever others are doing" feature looks like
>a useful one to expose over an Oak HTTP remoting layer.

Do you mean having a consistent snapshot of the data at a given point in
time? Yes, we want to support that using "revisions". Each read request
(getNodes) would contain a revision, and the result (when using the same
revision) would not change no matter what other clients modified.
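
The revision-based read described here can be illustrated with a small
sketch. It assumes a flat property map per revision and an append-only
list of snapshots; the class RevisionStore and its methods are
hypothetical and do not reproduce the MicroKernel getNodes signature.

```python
class RevisionStore:
    """Hypothetical append-only store: each commit yields a new revision."""

    def __init__(self, root: dict):
        self._revisions = [dict(root)]  # list index = revision number

    def head(self) -> int:
        """Return the most recent revision number."""
        return len(self._revisions) - 1

    def commit(self, changes: dict) -> int:
        """Apply property changes on top of the head and return the new revision."""
        snapshot = dict(self._revisions[-1])
        snapshot.update(changes)
        self._revisions.append(snapshot)
        return self.head()

    def read(self, revision: int) -> dict:
        """Return a copy of the content as of the given revision."""
        return dict(self._revisions[revision])


store = RevisionStore({"title": "draft"})
pinned = store.head()             # a reader pins this revision
store.commit({"title": "final"})  # another client commits meanwhile
print(store.read(pinned))         # the pinned view is unaffected
```

Copying the whole map on every commit is only for brevity; a real
store would share unchanged content between revisions.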

But this feature is unrelated to the branch/merge / workspace feature that
Jukka proposed here (at least as far as I understand).

Regards,
Thomas

Re: Native HTTP bindings for Oak

Posted by Bertrand Delacretaz <bd...@apache.org>.

On Wed, Jun 27, 2012 at 3:07 PM, Jukka Zitting <ju...@gmail.com> wrote:
> ...I'm just thinking of exposing a
> feature that seems like it could come in handy for some potential
> clients...

FWIW, as a Sling user I agree that the "freeze my view of the
repository for a while, whatever others are doing" feature looks like
a useful one to expose over an Oak HTTP remoting layer.

-Bertrand

Re: Native HTTP bindings for Oak

Posted by Stefan Guggisberg <st...@gmail.com>.

On Thu, Jun 28, 2012 at 11:16 AM, Jukka Zitting <ju...@gmail.com> wrote:
> Hi,
>
> On Thu, Jun 28, 2012 at 9:07 AM, Felix Meschberger <fm...@adobe.com> wrote:
>> How about remotely located oak-jcr installs ?
>>
>> As in: oak-jcr on box X talking to oak-core on box Y. This should probably be able to leverage this new  binding.
>>
>> Or do you envision some different Oak API remoting ?
>
> API remoting isn't my primary goal behind the HTTP binding, though it
> might also come in handy for that purpose.
>
> However, as Stefan pointed out (server roundtrips for each addNode
> call), the HTTP binding as such (i.e. as a one-to-one mapping for Oak
> API calls) probably isn't at the correct level of granularity for
> remote JCR access. A remote JCR client probably in any case needs a
> (potentially partial) client-side transient space like the one we have
> in jcr2spi if it wants to avoid the kind of massive performance hit
> you get with the JCR-RMI remoting layer. I'm not sure whether solving
> that issue on top of the Oak HTTP binding is worth the effort when we
> already have the SPI-based JCR remoting mechanism that works
> reasonably well.

"reasonably well" is IMO not good enough ;) the SPI-based JCR remoting
would need to go to several layers (jcr2spi, spi2jcr, oak-jcr, oak-core,...).
that's hardly efficient. whereas the obvious remoting layer (oak api) would
allow a much leaner layering (oak-jcr, remoted oak-core, ...).

cheers
stefan

>
> BR,
>
> Jukka Zitting

Re: Native HTTP bindings for Oak

Posted by Jukka Zitting <ju...@gmail.com>.

Hi,

On Thu, Jun 28, 2012 at 9:07 AM, Felix Meschberger <fm...@adobe.com> wrote:
> How about remotely located oak-jcr installs ?
>
> As in: oak-jcr on box X talking to oak-core on box Y. This should probably be able to leverage this new  binding.
>
> Or do you envision some different Oak API remoting ?

API remoting isn't my primary goal behind the HTTP binding, though it
might also come in handy for that purpose.

However, as Stefan pointed out (server roundtrips for each addNode
call), the HTTP binding as such (i.e. as a one-to-one mapping for Oak
API calls) probably isn't at the correct level of granularity for
remote JCR access. A remote JCR client probably in any case needs a
(potentially partial) client-side transient space like the one we have
in jcr2spi if it wants to avoid the kind of massive performance hit
you get with the JCR-RMI remoting layer. I'm not sure whether solving
that issue on top of the Oak HTTP binding is worth the effort when we
already have the SPI-based JCR remoting mechanism that works
reasonably well.

BR,

Jukka Zitting

Re: Native HTTP bindings for Oak

Posted by Felix Meschberger <fm...@adobe.com>.

Hi,

Am 28.06.2012 um 01:15 schrieb Jukka Zitting:

>> do you envision oak-jcr being a client of this http binding?
> 
> No, we already have the Oak API for that.

How about remotely located oak-jcr installs ?

As in: oak-jcr on box X talking to oak-core on box Y. This should probably be able to leverage this new  binding.

Or do you envision some different Oak API remoting ?

Regards
Felix

Re: Native HTTP bindings for Oak

Posted by Stefan Guggisberg <st...@gmail.com>.

On Thu, Jun 28, 2012 at 1:15 AM, Jukka Zitting <ju...@gmail.com> wrote:
> Hi,
>
> On Wed, Jun 27, 2012 at 6:49 PM, Stefan Guggisberg
> <st...@gmail.com> wrote:
>> well, statements like the following lead me to this assumption:
>
> OK, I can see how I could have given such an impression. Sorry about
> the poor wording. With "clients" in this context I meant just the
> clients of this proposed HTTP interface, not *all* Oak clients.
>
> Just like in Jackrabbit 2.x there are multiple ways for clients to
> access the repository (local JCR access, RMI remoting, WebDAV, DAVex
> remoting, etc.), the proposed HTTP binding for Oak is just one among
> the potentially many different ways in which an Oak repository could
> be accessed.
>
>> do you envision oak-jcr being a client of this http binding?
>
> No, we already have the Oak API for that.
>
>> what potential clients do you have in mind?
>
> Especially JavaScript applications running in the browser (a good
> example is Create.js [1]) or mobile apps running on various embedded
> systems (phones, tablets, cars, etc.). Potentially also non-Java
> server-side integration tools like Jackalope [2].

ok, makes sense. thanks for the clarifications.

cheers
stefan

>
> [1] http://createjs.org/
> [2] http://jackalope.github.com/
>
> BR,
>
> Jukka Zitting

Re: Native HTTP bindings for Oak

Posted by Jukka Zitting <ju...@gmail.com>.

Hi,

On Wed, Jun 27, 2012 at 6:49 PM, Stefan Guggisberg
<st...@gmail.com> wrote:
> well, statements like the following lead me to this assumption:

OK, I can see how I could have given such an impression. Sorry about
the poor wording. With "clients" in this context I meant just the
clients of this proposed HTTP interface, not *all* Oak clients.

Just like in Jackrabbit 2.x there are multiple ways for clients to
access the repository (local JCR access, RMI remoting, WebDAV, DAVex
remoting, etc.), the proposed HTTP binding for Oak is just one among
the potentially many different ways in which an Oak repository could
be accessed.

> do you envision oak-jcr being a client of this http binding?

No, we already have the Oak API for that.

> what potential clients do you have in mind?

Especially JavaScript applications running in the browser (a good
example is Create.js [1]) or mobile apps running on various embedded
systems (phones, tablets, cars, etc.). Potentially also non-Java
server-side integration tools like Jackalope [2].

[1] http://createjs.org/
[2] http://jackalope.github.com/

BR,

Jukka Zitting

Re: Native HTTP bindings for Oak

Posted by Stefan Guggisberg <st...@gmail.com>.

On Wed, Jun 27, 2012 at 3:07 PM, Jukka Zitting <ju...@gmail.com> wrote:
> Hi,
>
> On Wed, Jun 27, 2012 at 2:57 PM, Stefan Guggisberg
> <st...@gmail.com> wrote:
>> now your proposal seems to imply a different architecture...
>
> You're reading far too much into this.

well, statements like the following lead me to this assumption:

<quote>
a native HTTP binding that allows remote clients to talk directly
with oak-core without having to go through the JCR layer.
</quote>

<quote>
instead of accessing separate namespace registration code, a
client could register a new namespace simply by posting it as
a normal content modification to the appropriate place in the
content tree
</quote>

do you envision oak-jcr being a client of this http binding?

>  I'm just thinking of exposing a
> feature that seems like it could come in handy for some potential
> clients.

what potential clients do you have in mind?

cheers
stefan

> Doing so requires zero changes to our existing architecture
> or APIs.
>
> BR,
>
> Jukka Zitting

Re: Native HTTP bindings for Oak

Posted by Jukka Zitting <ju...@gmail.com>.

Hi,

On Wed, Jun 27, 2012 at 2:57 PM, Stefan Guggisberg
<st...@gmail.com> wrote:
> now your proposal seems to imply a different architecture...

You're reading far too much into this. I'm just thinking of exposing a
feature that seems like it could come in handy for some potential
clients. Doing so requires zero changes to our existing architecture
or APIs.

BR,

Jukka Zitting

Re: Native HTTP bindings for Oak

Posted by Angela Schreiber <an...@adobe.com>.

> for once i'm a 100% with felix ;)

LOL.... and in addition with angela :)

Re: Native HTTP bindings for Oak

Posted by Angela Schreiber <an...@adobe.com>.

hi stefan

> i thought that we have a consensus of how the oak stack should be layered, i.e.

that was my understanding as well

> app / sling / oak-jcr (trans. space) / [remoting /] oak-core [remoting
> /] oak-mk.

or alternatively:

app / non-java-content-repo-api / [remoting/] oak-core [remoting/] oak-mk

IMO keeping that alternative in mind was the very purpose for the
oak-api in the first place.

kind regards
angela

Re: Native HTTP bindings for Oak

Posted by Stefan Guggisberg <st...@gmail.com>.

On Wed, Jun 27, 2012 at 11:49 AM, Felix Meschberger <fm...@adobe.com> wrote:
> Hi,
>
> Am 27.06.2012 um 11:20 schrieb Jukka Zitting:
>
>> Hi,
>>
>> On Wed, Jun 27, 2012 at 10:25 AM, Angela Schreiber <an...@adobe.com> wrote:
>>> i don't fully see the use case for such long living sessions.
>
> FWIW, this was my first thought, too: This completely breaks stateless-ness of HTTP and introduces the use of Sessions. We can do that, but we have to know exactly, what it means and costs.
>
>>
>> The rationale is the same as for the branch feature we added to the
>> MicroKernel. Instead of having to maintain a separate transient tree
>> abstraction on the client side (which might be troublesome given
>> potentially limited storage capacity), it's better to be able to send
>> uncommitted data to the server for storage in a temporary branch where
>> it can be accessed using the existing tree abstraction already
>> provided by Oak.
>>
>> Most notably the session feature allows us to use such a HTTP binding
>> to implement remoting of the Oak API without the need for client-side
>> storage space and associated extra code for managing it.
>>
>>> IMO also importing big trees and batch read/writing should be covered
>>> by a single request.
>>
>> That quickly leads to increasingly complex server side features like
>> filtering or conditional saves.
>>
>> For example, think of a client like the Sling engine that first
>> resolves a path potentially with custom mappings, then follows a
>> complex set of resource type references, and finally renders a
>> representation of the resolved content based on the resource type
>> definitions that were found. Ideally (for consistency and better
>> caching support) it should be possible to perform the entire operation
>> based on a stable snapshot of the repository, but there's no way that
>> all the information required by such a process could be included in
>> the response of a single Oak HTTP request.
>
> With my Sling hat on, I am not sure about this example ;-)
>
> IMNSHO Sling should operate on JCR API and not on Oak Native HTTP binding.

for once i'm a 100% with felix ;)

if i understand jukka's proposal correctly, it's about promoting an alternative
public client api. in my understanding the ultimate goal of the oak project
is to come up with a highly efficient & scalable jcr implementation.
imo we should focus on this goal.

i thought that we have a consensus of how the oak stack should be layered, i.e.

app / sling / oak-jcr (trans. space) / [remoting /] oak-core [remoting
/] oak-mk.

now your proposal seems to imply a different architecture...

cheers
stefan


>
> Regards
> Felix
>
>
>>
>> Exposing the branch feature as proposed avoids the need for complex
>> server-side request processing logic and makes it easier to implement
>> many client-side features that would otherwise have to use local
>> storage or temporary content subtrees visible to other repository
>> clients.
>>
>> BR,
>>
>> Jukka Zitting
>

Re: Native HTTP bindings for Oak

Posted by Thomas Mueller <mu...@adobe.com>.

Hi,

I understand the point Felix is making. As of now, I would propose to drop
separate URI spaces.

I would also propose to drop the related MicroKernel branch/merge feature,
or at least not rely on the feature to be available. In my view, the
MicroKernel branch/merge feature which was introduced (relatively)
recently adds quite a bit of complexity to each MicroKernel implementation.

As far as I know, one reason why branch/merge was introduced was to reduce
complexity in oak-core, but the reduced complexity in oak-core increased
the complexity in each MicroKernel implementation. So we traded reduced
complexity in one part (one oak-core implementation) for added complexity
in many parts (multiple MicroKernel implementations).

The second reason why branch/merge was added is to support commits that
don't fit in memory. This is, if I remember correctly, not one of the
goals originally defined for Oak. We documented a goal of "> 100k nodes at
1kB each" at [1], that is 100 MB. That should fit in memory.
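
The size estimate above can be checked with a few lines of Python.
The only assumption is how "kB" and "MB" are read; decimal units
(k = 1000) are used here, and the binary equivalent is printed for
comparison:

```python
nodes = 100_000
bytes_per_node = 1_000  # 1 kB, taking k = 1000 (an assumption)
total_bytes = nodes * bytes_per_node

print(total_bytes / 1_000_000, "MB")         # decimal megabytes
print(round(total_bytes / 2**20, 1), "MiB")  # binary mebibytes
```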

[1] http://wiki.apache.org/jackrabbit/Goals%20and%20non%20goals%20for%20Jackrabbit%203

Regards,
Thomas

On 6/27/12 12:13 PM, "Jukka Zitting" <ju...@gmail.com> wrote:

>Hi,
>
>On Wed, Jun 27, 2012 at 11:49 AM, Felix Meschberger <fm...@adobe.com>
>wrote:
>> FWIW, this was my first thought, too: This completely breaks stateless-ness
>> of HTTP and introduces the use of Sessions.
>
>I think you're misreading the proposal. The feature uses separate URI
>spaces so all information needed to access such "sessions" is encoded
>in each request and depends on no shared state between the client and
>the server.
>
>> With my Sling hat on, I am not sure about this example ;-)
>
>I'm just using Sling as a concrete example of an existing complex
>client. A JavaScript application or another remote client could easily
>have just as complex content access requirements, and I think it would
>be good for us to be prepared for such cases.
>
>BR,
>
>Jukka Zitting

Re: Native HTTP bindings for Oak

Posted by Michael Dürig <md...@apache.org>.


On 27.6.12 13:52, Felix Meschberger wrote:
> Hi,
>
> Am 27.06.2012 um 13:49 schrieb Michael Dürig:
>
>>
>>
>> On 27.6.12 12:50, Felix Meschberger wrote:
>>> Hi,
>>>
>>> Am 27.06.2012 um 12:13 schrieb Jukka Zitting:
>>>
>>>> Hi,
>>>>
>>>> On Wed, Jun 27, 2012 at 11:49 AM, Felix Meschberger <fm...@adobe.com> wrote:
>>>>> FWIW, this was my first thought, too: This completely breaks stateless-ness
>>>>> of HTTP and introduces the use of Sessions.
>>>>
>>>> I think you're misreading the proposal. The feature uses separate URI
>>>> spaces so all information needed to access such "sessions" is encoded
>>>> in each request and depends on no shared state between the client and
>>>> the server.
>>>
>>> It's not about shared state but about state maintained on the server which means the exchange is not stateless any longer.
>>
>> Yes but that's no different from any other POST/PUT/DELETE request.
>
> Absolutely not. Per-se these requests are stateless.

They are stateless in the sense that there is no "user session" 
involved. But by definition they modify the server's state by changing, 
adding or removing a resource. This resembles Jukka's initial proposal 
very much.

Michael

>
> Jukka's proposal really introduces server state even though he "hides" it behind a resource URL. The state introduced is the JCR session kept on the server.
>
> Regards
> Felix
>
>>
>> Michael
>>
>>>
>>> The only difference to a Servlet API HttpSession is that the session key is part of the URI path instead of a request parameter or cookie.
>>>
>>> Regards
>>> Felix
>>>
>>
>

Re: Native HTTP bindings for Oak

Posted by Felix Meschberger <fm...@adobe.com>.

Hi,

Am 27.06.2012 um 13:49 schrieb Michael Dürig:

> 
> 
> On 27.6.12 12:50, Felix Meschberger wrote:
>> Hi,
>> 
>> Am 27.06.2012 um 12:13 schrieb Jukka Zitting:
>> 
>>> Hi,
>>> 
>>> On Wed, Jun 27, 2012 at 11:49 AM, Felix Meschberger <fm...@adobe.com> wrote:
>>>> FWIW, this was my first thought, too: This completely breaks stateless-ness
>>>> of HTTP and introduces the use of Sessions.
>>> 
>>> I think you're misreading the proposal. The feature uses separate URI
>>> spaces so all information needed to access such "sessions" is encoded
>>> in each request and depends on no shared state between the client and
>>> the server.
>> 
>> It's not about shared state but about state maintained on the server which means the exchange is not stateless any longer.
> 
> Yes but that's no different from any other POST/PUT/DELETE request.

Absolutely not. Per-se these requests are stateless.

Jukka's proposal really introduces server state even though he "hides" it behind a resource URL. The state introduced is the JCR session kept on the server.

Regards
Felix

> 
> Michael
> 
>> 
>> The only difference to a Servlet API HttpSession is that the session key is part of the URI path instead of a request parameter or cookie.
>> 
>> Regards
>> Felix
>> 
>

Re: Native HTTP bindings for Oak

Posted by Michael Dürig <md...@apache.org>.


On 27.6.12 12:50, Felix Meschberger wrote:
> Hi,
>
> Am 27.06.2012 um 12:13 schrieb Jukka Zitting:
>
>> Hi,
>>
>> On Wed, Jun 27, 2012 at 11:49 AM, Felix Meschberger <fm...@adobe.com> wrote:
>>> FWIW, this was my first thought, too: This completely breaks stateless-ness
>>> of HTTP and introduces the use of Sessions.
>>
>> I think you're misreading the proposal. The feature uses separate URI
>> spaces so all information needed to access such "sessions" is encoded
>> in each request and depends on no shared state between the client and
>> the server.
>
> It's not about shared state but about state maintained on the server which means the exchange is not stateless any longer.

Yes but that's no different from any other POST/PUT/DELETE request.

Michael

>
> The only difference to a Servlet API HttpSession is that the session key is part of the URI path instead of a request parameter or cookie.
>
> Regards
> Felix
>

Re: Native HTTP bindings for Oak

Posted by Jukka Zitting <ju...@gmail.com>.

Hi,

On Wed, Jun 27, 2012 at 4:02 PM, Felix Meschberger <fm...@adobe.com> wrote:
> Ah! Sounds much better now. Thanks a lot for the clarification.
>
> So
>
>  $ curl -X DELETE http://localhost:8080/branch/X
>
> would in fact drop the branch, right ?

Indeed.

BR,

Jukka Zitting

Re: Native HTTP bindings for Oak

Posted by Felix Meschberger <fm...@adobe.com>.

Hi,

Ah! Sounds much better now. Thanks a lot for the clarification.

So

  $ curl -X DELETE http://localhost:8080/branch/X

would in fact drop the branch, right ?

Regards
Felix

Am 27.06.2012 um 15:00 schrieb Jukka Zitting:

> Hi,
> 
> On Wed, Jun 27, 2012 at 12:50 PM, Felix Meschberger <fm...@adobe.com> wrote:
>> It's not about shared state but about state maintained on the server which means
>> the exchange is not stateless any longer.
> 
> I don't follow this argument; the entire repository is one big piece
> of server-side state.
> 
> Let's drop the term "session" here as it's clearly confusing things
> and call this feature "branching":
> 
>   $ curl http://localhost:8080/content
>   {}
>   $ curl -d create=true http://localhost:8080/branch
>   Location: http://localhost:8080/branch/X
>   $ curl http://localhost:8080/branch/X
>   {}
>   $ curl -d foo=bar http://localhost:8080/branch/X
>   {"foo":"bar"}
>   $ curl http://localhost:8080/content
>   {}
>   $ curl -d commit=true -d remove=true http://localhost:8080/branch/X
>   $ curl http://localhost:8080/content
>   {"foo":"bar"}
> 
> The only difference between such an operation and that of using a
> separate cloned subtree (or workspace) is that the latter is visible
> to all repository clients and the former only to those that have the
> relevant URI.
> 
> BR,
> 
> Jukka Zitting

Re: Native HTTP bindings for Oak

Posted by Jukka Zitting <ju...@gmail.com>.

Hi,

On Wed, Jun 27, 2012 at 12:50 PM, Felix Meschberger <fm...@adobe.com> wrote:
> It's not about shared state but about state maintained on the server which means
> the exchange is not stateless any longer.

I don't follow this argument; the entire repository is one big piece
of server-side state.

Let's drop the term "session" here as it's clearly confusing things
and call this feature "branching":

   $ curl http://localhost:8080/content
   {}
   $ curl -d create=true http://localhost:8080/branch
   Location: http://localhost:8080/branch/X
   $ curl http://localhost:8080/branch/X
   {}
   $ curl -d foo=bar http://localhost:8080/branch/X
   {"foo":"bar"}
   $ curl http://localhost:8080/content
   {}
   $ curl -d commit=true -d remove=true http://localhost:8080/branch/X
   $ curl http://localhost:8080/content
   {"foo":"bar"}

The only difference between such an operation and that of using a
separate cloned subtree (or workspace) is that the latter is visible
to all repository clients and the former only to those that have the
relevant URI.
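
The semantics of the curl exchange above can be modelled in a few lines
of Python. This is a toy in-memory sketch, not the oak-http code: the
class and method names are made up for illustration, the branch id is a
random hex string standing in for "X", and commit simply replaces the
main tree (a real implementation would have to merge or rebase
concurrent changes and run commit validation).

```python
import copy
import uuid


class BranchingStore:
    """Hypothetical model of the /content and /branch/X resources."""

    def __init__(self):
        self.content = {}    # what GET /content would return
        self.branches = {}   # branch id -> private copy of the content tree

    def create_branch(self):
        # Models POST create=true to /branch: snapshot the current content.
        branch_id = uuid.uuid4().hex
        self.branches[branch_id] = copy.deepcopy(self.content)
        return branch_id

    def read(self, branch_id=None):
        # Models GET /content (no id) or GET /branch/X (with id).
        tree = self.content if branch_id is None else self.branches[branch_id]
        return copy.deepcopy(tree)

    def write(self, branch_id, **properties):
        # Models POST foo=bar to /branch/X: only the branch changes.
        self.branches[branch_id].update(properties)
        return self.read(branch_id)

    def commit(self, branch_id, remove=True):
        # Models POST commit=true [remove=true] to /branch/X.
        # Simplification: the branch state replaces the main tree wholesale.
        self.content = copy.deepcopy(self.branches[branch_id])
        if remove:
            self.drop(branch_id)

    def drop(self, branch_id):
        # Models DELETE /branch/X: discard the branch without committing.
        del self.branches[branch_id]


store = BranchingStore()
x = store.create_branch()
store.write(x, foo="bar")
before_commit = store.read()   # main tree is still untouched
store.commit(x)
after_commit = store.read()    # branch changes are now visible
```

Only a holder of the branch id can see the uncommitted state, which is
the point made above about visibility being tied to the URI.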

BR,

Jukka Zitting

Re: Native HTTP bindings for Oak

Posted by Felix Meschberger <fm...@adobe.com>.

Hi,

Am 27.06.2012 um 12:13 schrieb Jukka Zitting:

> Hi,
> 
> On Wed, Jun 27, 2012 at 11:49 AM, Felix Meschberger <fm...@adobe.com> wrote:
>> FWIW, this was my first thought, too: This completely breaks stateless-ness
>> of HTTP and introduces the use of Sessions.
> 
> I think you're misreading the proposal. The feature uses separate URI
> spaces so all information needed to access such "sessions" is encoded
> in each request and depends on no shared state between the client and
> the server.

It's not about shared state but about state maintained on the server which means the exchange is not stateless any longer.

The only difference to a Servlet API HttpSession is that the session key is part of the URI path instead of a request parameter or cookie.

Regards
Felix

Re: Native HTTP bindings for Oak

Posted by Jukka Zitting <ju...@gmail.com>.

Hi,

On Wed, Jun 27, 2012 at 11:49 AM, Felix Meschberger <fm...@adobe.com> wrote:
> FWIW, this was my first thought, too: This completely breaks stateless-ness
> of HTTP and introduces the use of Sessions.

I think you're misreading the proposal. The feature uses separate URI
spaces so all information needed to access such "sessions" is encoded
in each request and depends on no shared state between the client and
the server.

> With my Sling hat on, I am not sure about this example ;-)

I'm just using Sling as a concrete example of an existing complex
client. A JavaScript application or another remote client could easily
have just as complex content access requirements, and I think it would
be good for us to be prepared for such cases.

BR,

Jukka Zitting

Re: Native HTTP bindings for Oak

Posted by Felix Meschberger <fm...@adobe.com>.

Hi,

Am 27.06.2012 um 11:20 schrieb Jukka Zitting:

> Hi,
> 
> On Wed, Jun 27, 2012 at 10:25 AM, Angela Schreiber <an...@adobe.com> wrote:
>> i don't fully see the use case for such long living sessions.

FWIW, this was my first thought, too: This completely breaks stateless-ness of HTTP and introduces the use of Sessions. We can do that, but we have to know exactly what it means and what it costs.

> 
> The rationale is the same as for the branch feature we added to the
> MicroKernel. Instead of having to maintain a separate transient tree
> abstraction on the client side (which might be troublesome given
> potentially limited storage capacity), it's better to be able to send
> uncommitted data to the server for storage in a temporary branch where
> it can be accessed using the existing tree abstraction already
> provided by Oak.
> 
> Most notably the session feature allows us to use such a HTTP binding
> to implement remoting of the Oak API without the need for client-side
> storage space and associated extra code for managing it.
> 
>> IMO also importing big trees and batch read/writing should be covered
>> by a single request.
> 
> That quickly leads to increasingly complex server side features like
> filtering or conditional saves.
> 
> For example, think of a client like the Sling engine that first
> resolves a path potentially with custom mappings, then follows a
> complex set of resource type references, and finally renders a
> representation of the resolved content based on the resource type
> definitions that were found. Ideally (for consistency and better
> caching support) it should be possible to perform the entire operation
> based on a stable snapshot of the repository, but there's no way that
> all the information required by such a process could be included in
> the response of a single Oak HTTP request.

With my Sling hat on, I am not sure about this example ;-)

IMNSHO Sling should operate on JCR API and not on Oak Native HTTP binding.

Regards
Felix


> 
> Exposing the branch feature as proposed avoids the need for complex
> server-side request processing logic and makes it easier to implement
> many client-side features that would otherwise have to use local
> storage or temporary content subtrees visible to other repository
> clients.
> 
> BR,
> 
> Jukka Zitting

Re: Native HTTP bindings for Oak

Posted by Jukka Zitting <ju...@gmail.com>.

Hi,

On Wed, Jun 27, 2012 at 10:25 AM, Angela Schreiber <an...@adobe.com> wrote:
> i don't fully see the use case for such long living sessions.

The rationale is the same as for the branch feature we added to the
MicroKernel. Instead of having to maintain a separate transient tree
abstraction on the client side (which might be troublesome given
potentially limited storage capacity), it's better to be able to send
uncommitted data to the server for storage in a temporary branch where
it can be accessed using the existing tree abstraction already
provided by Oak.

Most notably the session feature allows us to use such a HTTP binding
to implement remoting of the Oak API without the need for client-side
storage space and associated extra code for managing it.

> IMO also importing big trees and batch read/writing should be covered
> by a single request.

That quickly leads to increasingly complex server side features like
filtering or conditional saves.

For example, think of a client like the Sling engine that first
resolves a path potentially with custom mappings, then follows a
complex set of resource type references, and finally renders a
representation of the resolved content based on the resource type
definitions that were found. Ideally (for consistency and better
caching support) it should be possible to perform the entire operation
based on a stable snapshot of the repository, but there's no way that
all the information required by such a process could be included in
the response of a single Oak HTTP request.

Exposing the branch feature as proposed avoids the need for complex
server-side request processing logic and makes it easier to implement
many client-side features that would otherwise have to use local
storage or temporary content subtrees visible to other repository
clients.

BR,

Jukka Zitting

Re: Native HTTP bindings for Oak

Posted by Angela Schreiber <an...@adobe.com>.

hi jukka

makes a lot of sense to me and pretty much matches the goal of the
whole JSOP approach.

there is just one thing i don't feel totally comfortable with:

> By default the HTTP binding could simply use a fresh new session for
> each HTTP request, but it should be possible for a client to request a
> longer-lived session for more complex content modifications (import,
> batch jobs, etc.) [...]

i don't fully see the use case for such long living sessions.
IMO also importing big trees and batch read/writing should be covered
by a single request. in jackrabbit2 this wasn't possible for all
cases due to the fact that we only had a JCR repo on the server
side of the remoting. now that we finally have native implementation
that we can model according to our needs, i am not convinced that this
is sensible.

kind regards
angela

Re: Native HTTP bindings for Oak

Posted by Christian Stocker <ch...@liip.ch>.

Hi

We at PHPCR and Jackalope are of course very interested in those
HTTP bindings and would love to play with them, see if they fit our
needs and give you more input. Your suggestions below seem to make
sense to me.

Greetings

chregu


On 25.06.12 14:26, Jukka Zitting wrote:
> Hi,
> 
> As suggested in OAK-104 [1] and discussed briefly before, I think it
> would be useful for Oak to have a native HTTP binding that allows
> remote clients to talk directly with oak-core without having to go
> through the JCR layer.
> 
> The main benefit of such a lower-level HTTP binding is that it can
> better leverage the "everything is (JSON) content" nature of Oak
> internals than a higher-level layer like our existing WebDAV(ex)
> components that only assume a generic JCR content repository. This
> makes the binding much more straightforward and probably helps avoid
> performance or scalability bottlenecks down the line. It also makes it
> easier to support a wider range of functionality without having to add
> custom integration code for all features. For example, instead of
> accessing separate namespace registration code, a client could
> register a new namespace simply by posting it as a normal content
> modification to the appropriate place in the content tree, like this:
> 
>     $ curl -d foo=http://foo.example.com/ns http://localhost:8080/content/jcr:system/jcr:namespaces
> 
> Unlike a MicroKernel remoting layer, the Oak HTTP binding would still
> have all the commit validation and access control logic in place, so
> one couldn't use it to bypass the rules that govern more traditional
> JCR-based access.
> 
> I'm not yet sure about what the binding should look like in detail,
> but my general idea is to follow JSOP protocol ideas [2] and try to
> keep things as simple and easy as possible for clients like JavaScript
> libraries [3] and command line tools like curl. The initial draft code
> I committed just recently to the new oak-http component uses a basic
> URL to Node mapping and a simple JSON / binary JSON rendering based on
> the contents of the HTTP Accept header. It would be nice to support
> different kinds of content renderings and also to allow similar
> content renderings to be POSTed or PUT to the server. The binding
> should also accept basic HTML form POSTs with the
> application/x-www-form-urlencoded media type.
> 
> By default the HTTP binding could simply use a fresh new session for
> each HTTP request, but it should be possible for a client to request a
> longer-lived session for more complex content modifications (import,
> batch jobs, etc.) or for getting a stable snapshot for larger reads
> (export, query, etc.) that shouldn't change while reading. I was
> thinking of handling such cases by allowing the client to generate
> such a session with a specific POST request that responds with a
> redirect to a temporary session URL that exposes the normal content
> tree as seen through that session. We'd use a lease mechanism to
> control the lifetime of such server-side sessions.
> 
> As an example, something like the following interaction could be used
> to acquire a new session (with an initial lifetime of one hour,
> lease=3600), make some changes to the content tree within that
> session, and finally commit those changes and release the session
> (lease=0).
> 
>     $ curl -d lease=3600 http://localhost:8080/session
>     http://localhost:8080/session/a9c49810bebf11e1afa70800200c9a66
>     $ curl -d bar=baz http://localhost:8080/session/a9c49810bebf11e1afa70800200c9a66/foo
>     {"bar":"baz"}
>     $ curl -d save=true -d lease=0 http://localhost:8080/session/a9c49810bebf11e1afa70800200c9a66
> 
> [1] https://issues.apache.org/jira/browse/OAK-104
> [2] http://wiki.apache.org/jackrabbit/Jsop
> [3] https://issues.apache.org/jira/browse/OAK-103
> 
> BR,
> 
> Jukka Zitting
> 

-- 
Liip AG  //  Feldstrasse 133 //  CH-8004 Zurich
Tel +41 43 500 39 81 // Mobile +41 76 561 88 60
www.liip.ch // blog.liip.ch // GnuPG 0x0748D5FE

Re: Native HTTP bindings for Oak

Posted by Jukka Zitting <ju...@gmail.com>.

Hi,

On Mon, Jun 25, 2012 at 2:26 PM, Jukka Zitting <ju...@gmail.com> wrote:
> By default the HTTP binding could simply use a fresh new session for
> each HTTP request, but it should be possible for a client to request a
> longer-lived session for more complex content modifications (import,
> batch jobs, etc.) or for getting a stable snapshot for larger reads
> (export, query, etc.) that shouldn't change while reading. I was
> thinking of handling such cases by allowing the client to generate
> such a session with a specific POST request that responds with a
> redirect to a temporary session URL that exposes the normal content
> tree as seen through that session. We'd use a lease mechanism to
> control the lifetime of such server-side sessions.

As a nice extra benefit, such a solution gives us effective protection
against CSRF attacks if we require that all writes need to go through
such sessions, with the session URL acting as a token that the
potential attacker can't access or use.
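
To make the lease and token idea a bit more concrete, here is a minimal
sketch of a server-side registry. Everything in it is an assumption made
for illustration: the class and method names are hypothetical, the token
is a random 128-bit hex string playing the role of the last path segment
of the session URL, and the clock is injectable only so the expiry logic
can be exercised without waiting.

```python
import secrets
import time


class SessionRegistry:
    """Hypothetical lease-based registry of server-side sessions."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._expiry = {}   # token -> absolute expiry time

    def acquire(self, lease_seconds):
        # Models POST lease=3600 to /session: hand out an unguessable token.
        token = secrets.token_hex(16)
        self._expiry[token] = self._clock() + lease_seconds
        return token

    def renew(self, token, lease_seconds):
        # Models POST lease=N to /session/<token>; lease=0 releases it.
        if not self.is_valid(token):
            raise KeyError("unknown or expired session")
        if lease_seconds <= 0:
            del self._expiry[token]
        else:
            self._expiry[token] = self._clock() + lease_seconds

    def is_valid(self, token):
        expiry = self._expiry.get(token)
        if expiry is None:
            return False
        if self._clock() >= expiry:
            del self._expiry[token]   # lease ran out, forget the session
            return False
        return True

    def authorize_write(self, token):
        # Writes are accepted only through a live session URL, so a forged
        # cross-site request that cannot know the token is rejected.
        return token is not None and self.is_valid(token)
```

The protection rests entirely on the token being unguessable and never
leaking to other origins, so it would need to travel over TLS and stay
out of logs and Referer headers.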

BR,

Jukka Zitting