Posted to dev@jackrabbit.apache.org by David Rauschenbach <da...@synchronica.com> on 2008/02/08 23:15:48 UTC

Re: SPI caching, was: [jira] Resolved: (JCR-1361) Lock test assumes that changes in one session are immediately visible in different session

Yeah, I think I hear what you're saying, and I understand that tight
spot. I have to confess that I also use JCR2SPI as the primary client
for SPI. But I have performance problems to solve, so that <n> JCR calls
don't explode into <n>*6 SPI invocations, so I have to play all kinds of
tricks now, to divine extra information from the client and/or server.
BatchReadConfig is further evidence of the need for this kind of thing.

Anyway, I am also very shocked to see XPath and SQL being "distanced" in
JSR283. Because the point I was going to make, before that revelation,
was that an XPath or SQL query processor would be just the kind of
source of that extra intent that would allow the middleware harbor to
dispatch the correct number of ships across the channel.

It's interesting that JSR283 extols the virtues of multiple access
paradigms (XPath / SQL and hierarchical traversal), but at the same time
makes it seem like it's headed towards hierarchical traversal only.

It reminds me of what drives me crazy about the database market. Back in
the ISAM days of xBase, Paradox, Raima, etc., we could all traverse tens
of thousands of records per second. Then SQL caught on, and you could
then do both traversal and sets, which was like the best of both worlds.
Then a funny thing happened -- traversal disappeared, and was lost for
15 years, and data access became slow, since everything had to be
shoe-horned into SQL, or some ODBC/JDBC batch mode, which might require
turning off indexes or ACID protection! Whatever. Now mixed-mode
ISAM/SQL engines are slowly coming back, though engines that go out of
their way to support both are still few and far between.

It's hard to imagine getting much done with an Exchange server via
WebDAV, without query support. Talk about needing a shoe-horn, doing
everything via iteration! I'm guessing content repository vendors are
steering the ship, when what we have here is a very good API that also
works for content middleware.

Conceptually, I think the best way to think of SPI is to still pretend
there's WebDAV in the front and back, with SPI in the middle. If there's
a client doing JCR hierarchical traversal via JCR2SPI, then you end up
with small high-frequency SPI requests. If you do XPath or SQL over SPI,
then you end up with fewer, fatter SPI requests, like a PROPFIND. Or if
you're proxying JCR content without a JCR client per se at the front,
then you have an API that can relay the *content* of JSR170, without
needing to care too much about whether the front-end is JCR, WebDAV,
IMAP, RSS, or some other protocol endpoint.
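
To make the two request shapes concrete, here is a toy sketch. Everything
in it is hypothetical: CountingBackend is a stand-in that only counts round
trips, not the real SPI RepositoryService, and the node paths are made up.
It contrasts item-by-item traversal (one request per node) with a single
subtree fetch (one fat request, in the spirit of a deep PROPFIND or a query).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class RequestShapeSketch {
    // Hypothetical backend that counts round trips; NOT the real SPI interface.
    static final class CountingBackend {
        private final Map<String, List<String>> children = new HashMap<>();
        int roundTrips = 0;

        void add(String parent, String child) {
            children.computeIfAbsent(parent, k -> new ArrayList<>()).add(child);
        }

        // One round trip per node: the shape produced by hierarchical traversal.
        List<String> childrenOf(String path) {
            roundTrips++;
            return children.getOrDefault(path, Collections.emptyList());
        }

        // One round trip for a whole subtree: the "fewer, fatter" request shape.
        List<String> subtree(String root) {
            roundTrips++;
            List<String> out = new ArrayList<>();
            Deque<String> todo = new ArrayDeque<>();
            todo.push(root);
            while (!todo.isEmpty()) {
                String path = todo.pop();
                out.add(path);
                for (String c : children.getOrDefault(path, Collections.emptyList())) {
                    todo.push(c);
                }
            }
            return out;
        }
    }

    // Client-side recursive walk; returns the number of nodes visited.
    static int traverse(CountingBackend backend, String path) {
        int visited = 1;
        for (String child : backend.childrenOf(path)) {
            visited += traverse(backend, child);
        }
        return visited;
    }

    // Small made-up tree with five nodes in total.
    static CountingBackend sample() {
        CountingBackend b = new CountingBackend();
        b.add("/", "/a");
        b.add("/", "/b");
        b.add("/a", "/a/x");
        b.add("/a", "/a/y");
        return b;
    }

    public static void main(String[] args) {
        CountingBackend t = sample();
        int visited = traverse(t, "/");
        System.out.println(visited + " nodes, " + t.roundTrips + " round trips (traversal)");

        CountingBackend q = sample();
        int fetched = q.subtree("/").size();
        System.out.println(fetched + " nodes, " + q.roundTrips + " round trips (batched)");
    }
}
```

The traversal cost grows with the number of nodes, while the batched fetch
stays at one request whatever the subtree size. A real stack would sit
somewhere in between, depending on caching and batch-read configuration.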

From a middleware point of view, which I would call SPI's point of view,
you only need to have some idea of what you're dealing with, which is
nodes & properties, depths and namespaces, collections & filters,
queries and observation, and a session. It shouldn't matter whether JCR
is at the front end, or something else like a WebDAV proxy shaping more
specific requests.

That's just my 2 cents. I like SPI because of its simplicity. But
performance is problematic, and outside of my control right now, and I
have caches and NodeTypeManagers in my SPIs, even though I am not
supposed to. I also have my own PathElement comparators, to get SPI to
work, so that 0 (unspecified) and 1 (default) indexes are considered
equivalent, but that is another story...
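
For illustration only, a comparator of the kind described above might look
roughly like the sketch below. This is not the actual code: Elem is a
hypothetical stand-in for an SPI path element (a name plus a
same-name-sibling index), and the normalize helper is an assumption about
how the 0-versus-1 case could be folded together.

```java
import java.util.Comparator;

public class IndexNormalizingComparator {
    // Hypothetical stand-in for a path element; not the real SPI type.
    static final class Elem {
        final String name;
        final int index; // 0 = unspecified, 1 = default (first sibling)

        Elem(String name, int index) {
            this.name = name;
            this.index = index;
        }
    }

    // Treat an unspecified index (0) as the default index (1).
    static int normalize(int index) {
        return index == 0 ? 1 : index;
    }

    // Order by name first, then by the normalized index.
    static final Comparator<Elem> COMPARATOR =
            Comparator.comparing((Elem e) -> e.name)
                      .thenComparingInt(e -> normalize(e.index));

    static boolean equivalent(Elem a, Elem b) {
        return COMPARATOR.compare(a, b) == 0;
    }

    public static void main(String[] args) {
        // "child" and "child[1]" compare as equivalent; "child[2]" does not.
        System.out.println(equivalent(new Elem("child", 0), new Elem("child", 1)));
        System.out.println(equivalent(new Elem("child", 1), new Elem("child", 2)));
    }
}
```

Normalizing inside the comparator leaves the stored elements untouched, so
only lookups and comparisons are affected.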

David

On Fri, 2008-02-08 at 18:02 +0100, Marcel Reutegger wrote:
> David Rauschenbach wrote:
> > also worth mentioning is why a requested-depth argument is missing from
> > getItemInfos. It's just a little strange for the server to choose what to do,
> > or to have a pre-configured nodetype-specific batch strategy configured
> > there, when the client is where it's at, where it's known what's to be
> > requested.
> 
> our primary SPI client that we have in mind is jcr2spi. here we are in the same 
> tight spot. jcr2spi does not know in advance what properties a client will 
> request after it got a node. even if we had the ability in the SPI to pass a 
> hint, jcr2spi cannot make use of it in a reasonable way.
> 
> for jcr2spi there are only two patterns it can distinguish. a JCR client gets a 
> named item (getNode/Property()) or an iterator over items 
> (getNodes/Properties()). At least the latter should not result in individual 
> calls for each item.
> 
> regards
>   marcel

Re: SPI caching, was: [jira] Resolved: (JCR-1361) Lock test assumes that changes in one session are immediately visible in different session

Posted by Marcel Reutegger <ma...@gmx.net>.

David Rauschenbach wrote:
> Yeah, I think I hear what you're saying, and I understand that tight
> spot. I have to confess that I also use JCR2SPI as the primary client
> for SPI. But I have performance problems to solve, so that <n> JCR calls
> don't explode into <n>*6 SPI invocations, so I have to play all kinds of
> tricks now, to divine extra information from the client and/or server.

we should definitely improve that situation. so far we didn't invest too much 
time in performance analysis but first wanted to have an SPI stack that works 
correctly. I also think that it is now time to carefully analyze the message 
complexity for each JCR call and if needed change the SPI interfaces.

> I like SPI because of its simplicity. But performance is problematic, and
> outside of my control right now, [...]

please let us know what issues you have with the SPI stack. feedback is always 
welcome and gives us an additional view on the SPI that we probably overlooked 
in the past.

you are also welcome to gain control ;) if you have ideas how to improve the SPI 
stack or have patches, please let us know and we will be happy to consider them.

> [...] and I have caches and NodeTypeManagers in my SPIs, even though I am not
>  supposed to.

at some point node type definitions were requested extensively. JCR-1030 should 
have improved that situation.

> I also have my own PathElement comparators, to get SPI to work,
> so that 0 (unspecified) and 1 (default) indexes are considered equivalent,
> but that is another story...

Can you please describe in more detail why you had to do this?

regards
  marcel