Posted to dev@chemistry.apache.org by David Nuescheler <da...@day.com> on 2009/12/14 09:05:32 UTC

Comparing Chemistry & OpenCMIS codebases

(was: Re: [PROPOSAL] OpenCMIS incubator for Content Management
Interoperability Services (CMIS))

Hi Jens,

thanks for pointing out the differences in the codebases.

I think all the points that you raise indicate that the code of the
opencmis proposal is mature and well thought through, which is
absolutely great.

...and as you point out there are definitely areas where the
chemistry code could be improved and extended.

Personally I am more than happy to support the evolution of
the current chemistry codebase as soon as feasible into the
directions you describe.
I think these are all good points that you bring up.

Of course I will leave Florent and some of the other more regular
committers who are definitely more intimate with the code to make
comments but I would really assume that we do not have any
fundamental architecture differences, but it is just the case that
in the chemistry codebase we have not gotten around to doing
some of those things.

It sounds like great news to me that architecture discussions like
these are happening in the open, so we can see how much alignment
we can get...

regards,
david

On Sun, Dec 13, 2009 at 7:44 PM, Jens Hübel <jh...@opentext.com> wrote:
> Hey Jukka,
>
> let's discuss some of the technical areas how to continue.
>
>> Would it at least make sense for the projects to share a common
>> org.apache.cmis package with things like constants defined in the
>> CMIS standard and other basic concepts that everyone can agree with?
>
> Definitely there are areas where sharing makes sense. The constants, etc are probably the simplest pieces. I see some more:
>
> Query Parser:
> Chemistry did a pretty good job with the query parser. Currently with focus on client side maybe not on our top list, but I do not see any reason to implement this a second time. And this is one of the more complex sub-projects.
>
> Web Service:
> Not sure how far you are here. But the provider implementation should be functionally complete with regard to the WS binding. You should take a closer look to see whether you can benefit from this.
>
> Caching:
> If I got it right there is no Caching yet in Chemistry. Caching (at least of the type system) is probably crucial for an acceptable performance of any CMIS implementation. We already have some form of caching, perhaps not yet the most sophisticated one. This will highly depend on the internal object structure that is cached, but I assume that at least parts of the code can be shared.
>
> This list isn't complete but this comes into my mind right now.
>
> We also should outline some more of the design differences:
>
>> Indeed in Chemistry the AtomPub client code doesn't use JAX-B based
>> serialization so doesn't share code with SOAP.
>
> This is very different compared to the OpenCMIS approach. Both bindings use the same XML and our intent was to share as much as possible between SOAP and Atom and keep the binding specifics to a minimum. The consumer of the API should not see a difference between the two bindings. On top of the JAXB generated class we have handcrafted APIs and implementations. As Florian pointed out we favor a two-level approach here with data transfer classes (being stateless with the exception of caching) and a stateful object-oriented client API.
>
>
> Testing:
> As mentioned above the client should not be aware of the binding in use. Our tests are therefore consumers of the API and agnostic to the binding. There would be an ObjectServiceTest but not an AtomXYZtest. What I see in Chemistry is a TCK based on AtomPub and directly operating on feeds etc. Is it intended to continue this way or is this more an interim step (you need to start with testing somewhere)? BTW our code isn't totally clean in this regard as well yet.
>
> AtomPub and Web Service:
> Looking at the chemistry code this seems to be (currently) focused very much on AtomPub. One of the OpenCMIS design goals is to treat both protocol bindings equally. WS is a first class citizen for OpenCMIS. What are the plans for Chemistry in regards of web services?
>
> And as far as I remember one question is still unanswered:
> What is the intended way in Chemistry how a client API consumer controls when a request to the server is made? Do you have any design around this so far?
>
> Jens



-- 
David Nuescheler
Chief Technology Officer
mailto: david.nuescheler@day.com

web:  http://www.day.com/ http://dev.day.com
twitter: @daysoftware

Re: Comparing Chemistry & OpenCMIS codebases

Posted by David Nuescheler <da...@day.com>.
hi gianugo,

good point... i will send them all a private invite to the chemistry list.
the good news was that Jens' prior email was to the chemistry list only, IIRC.

regards,
david

On Mon, Dec 14, 2009 at 12:20 PM, Gianugo Rabellino
<g....@sourcesense.com> wrote:
> David,
>
> great to see the discussion is heading towards technical grounds -
> just a side note: are you sure all the OpenCMIS folks are following
> chemistry-dev? Although I hate seeing noise on the incubator list, I'd
> still like to make sure everyone gets a shot...
>
> --
> Gianugo Rabellino
> M: +44 779 5364 932 / +39 389 44 26 846
> Sourcesense - making sense of Open Source: http://www.sourcesense.com
>




Re: Comparing Chemistry & OpenCMIS codebases

Posted by Gianugo Rabellino <g....@sourcesense.com>.
David,

great to see the discussion is heading towards technical grounds -
just a side note: are you sure all the OpenCMIS folks are following
chemistry-dev? Although I hate seeing noise on the incubator list, I'd
still like to make sure everyone gets a shot...

-- 
Gianugo Rabellino
M: +44 779 5364 932 / +39 389 44 26 846
Sourcesense - making sense of Open Source: http://www.sourcesense.com

Re: Comparing Chemistry & OpenCMIS codebases

Posted by Florent Guillaume <fg...@nuxeo.com>.
On Mon, Dec 14, 2009 at 9:05 AM, David Nuescheler
<da...@day.com> wrote:
> Of course I will leave Florent and some of the other more regular
> committers who are definitely more intimate with the code to make
> comments but I would really assume that we do not have any
> fundamental architecture differences, but it is just the case that
> in the chemistry codebase we have not gotten around to doing
> some of those things.

Yes, it must be stressed that while we have nice plans for Chemistry,
it takes time to write code, so a number of pieces of the puzzle are
planned but not yet there.

> On Sun, Dec 13, 2009 at 7:44 PM, Jens Hübel <jh...@opentext.com> wrote:
>>> Would it at least make sense for the projects to share a common
>>> org.apache.cmis package with things like constants defined in the
>>> CMIS standard and other basic concepts that everyone can agree with?
>>
>> Definitely there are areas where sharing makes sense. The constants, etc are probably the simplest pieces.

Note that from a software engineering point of view these constants
are trivial and are already written, and putting them in common code
will bring nothing (except maybe adding an additional JAR to the
distributions).

>> I see some more:
>>
>> Query Parser:
>> Chemistry did a pretty good job with the query parser. Currently with focus on client side maybe not on our top list, but I do not see any reason to implement this a second time. And this is one of the more complex sub-projects.

Thanks. And it definitely can still be improved; it has only been
lightly used as of now (though we have a customer using it together
with another ANTLR layer for the Tree -> SQL conversion in the Nuxeo
backend, including the JOIN features).

>> Web Service:
>> Not sure how far you are here. But the provider implementation should be functionally complete with regard to the WS binding. You should take a closer look to see whether you can benefit from this.

Yes, we haven't touched the client-side SOAP yet, as the focus in our
community has been mostly around REST. I've written server-side code
for the Nuxeo backend, but no client code. I don't expect much
difficulty, though, as the SPI is designed to be mapped in a
straightforward manner to the domain model and therefore to SOAP.

>> Caching:
>> If I got it right there is no Caching yet in Chemistry. Caching (at least of the type system) is probably crucial for an acceptable performance of any CMIS implementation. We already have some form of caching, perhaps not yet the most sophisticated one. This will highly depend on the internal object structure that is cached, but I assume that at least parts of the code can be shared.

Yes, no caching is done yet inside Chemistry. I thought I'd leave it to
a later optimization pass. The Repository uses a TypeManager
abstraction, which would be the natural place to do type caching. For
caching of CMIS objects, the Connection is where I want to put it.
Note that caching in a distributed system implies cache invalidation
for correctness, and therefore notifications from the server to the
caching client if you want things to be synchronous. This kind of
thing is hard to do correctly: if we don't want to use timeouts or
heuristics, we'll have to poll the Change Log to know when to
invalidate stuff... So: later :)
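
To make the type-caching idea concrete, here is a minimal Java sketch.
It is not Chemistry code: TypeInfo, TypeLookup, RemoteTypeLookup and
CachingTypeLookup are hypothetical names invented for illustration, and
invalidate() only stands in for whatever a Change Log poll would
eventually trigger.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a type definition; not the Chemistry class.
final class TypeInfo {
    final String id;
    TypeInfo(String id) { this.id = id; }
}

// Hypothetical lookup abstraction, loosely modelled on the TypeManager idea.
interface TypeLookup {
    TypeInfo getType(String typeId);
}

// Simulates a remote repository and counts how many requests reach it.
final class RemoteTypeLookup implements TypeLookup {
    final AtomicInteger requests = new AtomicInteger();

    @Override
    public TypeInfo getType(String typeId) {
        requests.incrementAndGet();
        return new TypeInfo(typeId);
    }
}

// Caching decorator: a type is fetched once, then served from memory
// until it is explicitly invalidated.
final class CachingTypeLookup implements TypeLookup {
    private final TypeLookup delegate;
    private final Map<String, TypeInfo> cache = new ConcurrentHashMap<>();

    CachingTypeLookup(TypeLookup delegate) { this.delegate = delegate; }

    @Override
    public TypeInfo getType(String typeId) {
        return cache.computeIfAbsent(typeId, delegate::getType);
    }

    // Assumed hook: called when a change-log poll reports a changed type.
    void invalidate(String typeId) { cache.remove(typeId); }
}
```

With this shape, two lookups of "cmis:document" reach the remote side
once, and a lookup after invalidate("cmis:document") reaches it again.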

>> This list isn't complete but this comes into my mind right now.
>>
>> We also should outline some more of the design differences:
>>
>>> Indeed in Chemistry the AtomPub client code doesn't use JAX-B based
>>> serialization so doesn't share code with SOAP.
>>
>> This is very different compared to the OpenCMIS approach. Both bindings use the same XML and our intent was to share as much as possible between SOAP and Atom and keep the binding specifics to a minimum. The consumer of the API should not see a difference between the two bindings.

I agree with that last statement, but this doesn't necessarily mean
the bindings have to share some code (although I see the benefits it
can bring).

>> On top of the JAXB generated class we have handcrafted APIs and implementations. As Florian pointed out we favor a two-level approach here with data transfer classes (being stateless with the exception of caching) and a stateful object-oriented client API.

This is probably where we differ most, but I haven't really studied
your code yet.

>> Testing:
>> As mentioned above the client should not be aware of the binding in use. Our tests are therefore consumers of the API and agnostic to the binding. There would be an ObjectServiceTest but not an AtomXYZtest. What I see in Chemistry is a TCK based on AtomPub and directly operating on feeds etc. Is it intended to continue this way or is this more an interim step (you need to start with testing somewhere)? BTW our code isn't totally clean in this regard as well yet.

The core of the tests today goes in BasicTestCase, which is an abstract
class used both by TestAtomPubClientServer and TestSimpleDirect, and
in the future TestSOAPClientServer and others. To test CMIS you need a
dialogue between a server and a client, and therefore these tests plug
together a Chemistry server and a Chemistry client. These tests are
the first ones I want to flesh out and make sure have complete code
coverage.
But the existence of a separate AtomPubServerTestCase that just tests a
server is important as well, as I want to ensure the protocol is
really tested and validated at the HTTP level. Otherwise, if we rely on
both the Chemistry client code and server code to be correct, we may
introduce idiosyncrasies that make the tests pass but don't strictly
follow the spec. This is partly redundant with the TCK, so some
refactoring will be needed there.
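
The "abstract test case shared by several bindings" pattern described
above can be sketched as follows. This is only an illustration under
assumptions: ContentClient, DirectClient, BasicClientTest and
DirectClientTest are hypothetical names, not the actual Chemistry
classes, and plain assertions are used instead of a test framework to
keep the sketch self-contained.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical minimal client API; the real Chemistry interfaces differ.
interface ContentClient {
    String createDocument(String name);
    String getName(String id);
}

// In-memory stand-in for a "direct" client that needs no network.
final class DirectClient implements ContentClient {
    private final Map<String, String> names = new HashMap<>();
    private int nextId = 1;

    @Override
    public String createDocument(String name) {
        String id = "doc-" + nextId++;
        names.put(id, name);
        return id;
    }

    @Override
    public String getName(String id) { return names.get(id); }
}

// Shared test logic: it only sees ContentClient, never the binding.
abstract class BasicClientTest {
    // Each binding-specific subclass supplies its own client wiring.
    protected abstract ContentClient makeClient();

    void testCreateAndRead() {
        ContentClient client = makeClient();
        String id = client.createDocument("report.txt");
        if (!"report.txt".equals(client.getName(id))) {
            throw new AssertionError("name did not round-trip");
        }
    }
}

// One subclass per binding; an AtomPub or SOAP variant would differ
// only in what makeClient() returns.
final class DirectClientTest extends BasicClientTest {
    @Override
    protected ContentClient makeClient() { return new DirectClient(); }
}
```

Calling new DirectClientTest().testCreateAndRead() runs the shared
check against the in-memory client; the same method would run unchanged
against any other subclass.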

>> AtomPub and Web Service:
>> Looking at the chemistry code this seems to be (currently) focused very much on AtomPub. One of the OpenCMIS design goals is to treat both protocol bindings equally. WS is a first class citizen for OpenCMIS. What are the plans for Chemistry in regards of web services?

The plan is to have SOAP be a complete first-class citizen as well
(for client and server sides).

>> And as far as I remember one question is still unanswered:
>> What is the intended way in Chemistry how a client API consumer controls when a request to the server is made? Do you have any design around this so far?

When using the SPI, every operation does a request to the server (by
definition of the SPI which is a direct mapping of the protocol).
When using the high-level API, Chemistry has an explicit
CMISObject.save() operation to flush object changes (allowing several
calls to setProperty() for instance, before doing a network request).
For other operations I usually expect the implementation to do direct
server requests, except that any caching layer will be allowed to
optimize things if required. I want the use of the high-level API to
be done on an abstracted Connection object that makes things as
efficient as possible given the constraints of the domain model, and
to be as object-oriented as possible, which will mean caching of
course.
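
As a rough illustration of the difference between the two levels, here
is a minimal Java sketch in which several setProperty() calls are
buffered and sent as one request on save(). UpdateSpi, RecordingSpi and
BufferedObject are hypothetical names made up for this example; they
are not the Chemistry SPI or CMISObject, and the recording class
replaces any real network call.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical low-level interface: every call is one server request.
interface UpdateSpi {
    void updateProperties(String objectId, Map<String, Object> changes);
}

// Records requests in memory instead of going over the network.
final class RecordingSpi implements UpdateSpi {
    final List<Map<String, Object>> requests = new ArrayList<>();

    @Override
    public void updateProperties(String objectId, Map<String, Object> changes) {
        // Copy the map so later changes by the caller do not alter the record.
        requests.add(new HashMap<>(changes));
    }
}

// Hypothetical high-level object: buffers changes until save() is called.
final class BufferedObject {
    private final String id;
    private final UpdateSpi spi;
    private final Map<String, Object> pending = new HashMap<>();

    BufferedObject(String id, UpdateSpi spi) {
        this.id = id;
        this.spi = spi;
    }

    // Local only: no request is made here.
    void setProperty(String name, Object value) { pending.put(name, value); }

    // Flushes all pending changes in a single request, if there are any.
    void save() {
        if (pending.isEmpty()) {
            return;
        }
        spi.updateProperties(id, pending);
        pending.clear();
    }
}
```

In this sketch two setProperty() calls followed by save() produce
exactly one recorded request carrying both properties, and a second
save() with nothing pending produces none.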

Florent

-- 
Florent Guillaume, Director of R&D, Nuxeo
Open Source, Java EE based, Enterprise Content Management (ECM)
http://www.nuxeo.com   http://www.nuxeo.org   +33 1 40 33 79 87