Posted to dev@jena.apache.org by Paolo Castagna <ca...@googlemail.com> on 2011/02/25 09:17:45 UTC

Re: [jena-dev] TDB concurrency

Hi Danny,
Dave and Damian have already answered your question.
In the past, I had the same sort of doubts about MRSW myself.
A simple multi-threaded test can help you find the answer.

For clarity, it's MR XOR SW.

It's an exclusive OR: if you are writing, nobody can read.
Therefore a long write would block access to readers.
Similarly, a long read would block any write too.
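
As a rough illustration of the rule above, here is a minimal sketch of such
a multi-threaded test. It is an assumption-laden stand-in, not TDB code: it
uses the JDK's ReentrantReadWriteLock to model the MR XOR SW discipline, and
the class and method names (MrswDemo, otherReaderAllowed, otherWriterAllowed)
are made up for this example. TDB itself does not enforce the policy, so an
application would have to hold a lock like this around its own reads and
writes.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Illustrative only: a plain JDK read/write lock standing in for the
// application-level MRSW discipline. None of this is TDB or Jena API.
class MrswDemo {
    final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    // Calls tryLock() on a separate thread, so the result shows what
    // *another* reader or writer would see, not lock re-entrancy.
    private static boolean otherThreadCanLock(Lock l) {
        AtomicBoolean acquired = new AtomicBoolean(false);
        Thread t = new Thread(() -> {
            if (l.tryLock()) {          // returns immediately, never blocks
                acquired.set(true);
                l.unlock();
            }
        });
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return acquired.get();
    }

    boolean otherReaderAllowed() { return otherThreadCanLock(lock.readLock()); }
    boolean otherWriterAllowed() { return otherThreadCanLock(lock.writeLock()); }

    public static void main(String[] args) {
        MrswDemo d = new MrswDemo();

        d.lock.readLock().lock();       // this thread is now a reader
        try {
            System.out.println("while reading: another reader allowed = "
                    + d.otherReaderAllowed()
                    + ", a writer allowed = " + d.otherWriterAllowed());
        } finally {
            d.lock.readLock().unlock();
        }

        d.lock.writeLock().lock();      // this thread is now the single writer
        try {
            System.out.println("while writing: a reader allowed = "
                    + d.otherReaderAllowed()
                    + ", another writer allowed = " + d.otherWriterAllowed());
        } finally {
            d.lock.writeLock().unlock();
        }
    }
}
```

While the read lock is held, a second reader gets in but a writer does not;
while the write lock is held, nobody else gets in. That is the "exclusive
OR" in practice.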

There is an issue (i.e. New Feature) for this:
https://issues.apache.org/jira/browse/JENA-41
We should discuss technical details there.

Andy proposed "journaled file access".

I'd like to help on this and try to do a prototype as a proof of concept.
However, I am not an expert on this (i.e. I've never written a journaled
file access system before) and it does not appear to be only a "small
matter of programming". ;-)

There are a lot of details which are not clear to me.

Damian's suggestion works perfectly for you.

Paolo

PS:
I know it's a pain, but we are trying to move the mailing
list to Apache. Please, subscribe by sending an email to:
jena-dev-subscribe AT incubator.apache.org
jena-users-subscribe AT incubator.apache.org

Danny Ayers wrote:
> on the wiki it says:
> [[
> TDB provides a Multiple Reader or Single Writer (MRSW) policy for
> concurrency access. Applications are expected to adhere to this policy
> - it is not automatically checked.
> 
> One gotcha is Java iterators. An iterator that is moving over the
> database is making read operations and no updates to the dataset are
> possible while an iterator is being used.
> ]]
> 
> I'd like to check I'm reading this correctly - is it that many readers
> can access the data concurrently but the (one and only) writer should
> have an exclusive lock - and that lock should block reading..?
> 
> The scenario I'm looking at will be TDB shared between Fuseki and
> programmatic access (a Turtle editor).
> 
> I haven't yet really got a clue how I'll handle the sharing, so if
> anyone's got any code for a similar situation I'd be grateful for a
> pointer.
> 
> (Right now I've got the editing happening on a single memory model, so
> for the moment at least I can probably get away with access to TDB
> models through a read-(edit)-replace kind of cycle).
> 
> Cheers,
> Danny.
> 

Re: [jena-dev] TDB concurrency

Posted by Andy Seaborne <an...@epimorphics.com>.

On 26/02/11 08:47, Danny Ayers wrote:
> [on jena-users@incubator.apache.org now...right place finally]
>
> I've decided on Plan D : do nothing :)
>
> Just for ref, here's my decision process:
>
> The idea was for my Turtle editor app to have an integrated server, so
> any created data could automagically be served over HTTP (this ties in
> with the Semantic Web in a Box angle). One option that appealed to me
> further down the line was to give the app a headless (no Swing) option
> so the same setup could be used to deploy a semweb server. I'll be
> using TDB anyway, and Jetty seems like a natural choice for HTTP
> server.
>
> Plan A : code it up myself
> seems like quite a lot of work, especially given the stuff already
> done by others
>
> Plan B : Fuseki
> I had a poke around the download. I'd assumed it was just a couple
> of extra classes to glue TDB to Jetty, but there's quite a bit more,
> notably an impressive effort to package everything together to make a
> self-contained package. Cool, but for programmatic integration (e.g.
> using the same Jetty instance to serve HTML docs) this would mean I'd
> have to pull everything apart again. Also not having some kind of auth
> would be a problem.

I'd like to enable that (Fuseki as servlets), but with Fuseki at 0.2.0
things take a while ... you could rip the servlets out of the code base
for now. I presume it's possible to add the Fuseki jar as a dependency
and use it as a library. I have not tried it; this is theory.

	Andy

> Plan C : Apache Clerezza
> http://incubator.apache.org/clerezza/
> Reto pointed me in this direction as it used TDB by default and has a
> concurrency lock available. One very appealing aspect is it has a
> JAX-RS implementation which I've played with before, finding it a
> really intuitive way of hooking up web server wiring.
> But similarly it's not obvious how to just use the bits I want without
> also bringing along a load of clutter (for my specific app). Pulling
> down the svn trunk revealed 109 subpackages (maven-managed) which I
> must admit was a bit offputting.
>
> Plan D : do nothing
> My app will have a HTTP client anyway, I need to be able to GET remote
> stuff, and also the ability to be able to address remote systems via
> SPARQL Update and HTTP PUT/POST would be nice to have (also at some
> point I want to make a little adapter for the Talis Platform, so I can
> post graphs up there).
>
> I'm not sure how it would be done with Clerezza, but a completely
> independent install of Fuseki would offer most of the services I want.
> For the automagic publishing of my local TDB, I'm pretty sure that
> could be achieved by recording a datestamp of operations (i.e. an
> ultra-simple versioning system), so it'd only be necessary to transfer
> what's changed.
>
> Cheers,
> Danny.
>
>

Re: [jena-dev] TDB concurrency

Posted by Danny Ayers <da...@gmail.com>.
[on jena-users@incubator.apache.org now...right place finally]

I've decided on Plan D : do nothing :)

Just for ref, here's my decision process:

The idea was for my Turtle editor app to have an integrated server, so
any created data could automagically be served over HTTP (this ties in
with the Semantic Web in a Box angle). One option that appealed to me
further down the line was to give the app a headless (no Swing) option
so the same setup could be used to deploy a semweb server. I'll be
using TDB anyway, and Jetty seems like a natural choice for HTTP
server.

Plan A : code it up myself
seems like quite a lot of work, especially given the stuff already
done by others

Plan B : Fuseki
I had a poke around the download. I'd assumed it was just a couple
of extra classes to glue TDB to Jetty, but there's quite a bit more,
notably an impressive effort to package everything together to make a
self-contained package. Cool, but for programmatic integration (e.g.
using the same Jetty instance to serve HTML docs) this would mean I'd
have to pull everything apart again. Also not having some kind of auth
would be a problem.

Plan C : Apache Clerezza
http://incubator.apache.org/clerezza/
Reto pointed me in this direction as it used TDB by default and has a
concurrency lock available. One very appealing aspect is it has a
JAX-RS implementation which I've played with before, finding it a
really intuitive way of hooking up web server wiring.
But similarly it's not obvious how to just use the bits I want without
also bringing along a load of clutter (for my specific app). Pulling
down the svn trunk revealed 109 subpackages (maven-managed) which I
must admit was a bit offputting.

Plan D : do nothing
My app will have a HTTP client anyway, I need to be able to GET remote
stuff, and also the ability to be able to address remote systems via
SPARQL Update and HTTP PUT/POST would be nice to have (also at some
point I want to make a little adapter for the Talis Platform, so I can
post graphs up there).

I'm not sure how it would be done with Clerezza, but a completely
independent install of Fuseki would offer most of the services I want.
For the automagic publishing of my local TDB, I'm pretty sure that
could be achieved by recording a datestamp of operations (i.e. an
ultra-simple versioning system), so it'd only be necessary to transfer
what's changed.
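
A minimal sketch of that datestamp idea, under stated assumptions: the
names ChangeLog, Change, record and changesSince are hypothetical, the
statements are kept as plain strings rather than Jena triples, and
timestamps are passed in explicitly. It only shows the bookkeeping, not the
HTTP transfer to a remote Fuseki.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: an append-only log of timestamped changes, so that
// a sync step only has to ship what happened after the previous sync.
class ChangeLog {

    // One recorded operation: a statement was added or removed at a time.
    static final class Change {
        final long timestamp;     // e.g. milliseconds since the epoch
        final boolean added;      // true = added, false = removed
        final String statement;   // stand-in for a real triple

        Change(long timestamp, boolean added, String statement) {
            this.timestamp = timestamp;
            this.added = added;
            this.statement = statement;
        }
    }

    private final List<Change> changes = new ArrayList<>();

    // Record an operation together with its datestamp.
    synchronized void record(long timestamp, boolean added, String statement) {
        changes.add(new Change(timestamp, added, statement));
    }

    // Everything recorded strictly after lastSync, in recording order.
    synchronized List<Change> changesSince(long lastSync) {
        List<Change> out = new ArrayList<>();
        for (Change c : changes) {
            if (c.timestamp > lastSync) {
                out.add(c);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        ChangeLog log = new ChangeLog();
        log.record(100L, true, "<a> <b> <c> .");
        log.record(200L, false, "<a> <b> <c> .");
        log.record(300L, true, "<e> <f> <g> .");

        // Pretend the last successful publish happened at time 100.
        for (Change c : log.changesSince(100L)) {
            System.out.println((c.added ? "+ " : "- ") + c.statement);
        }
    }
}
```

The added and removed statements returned by changesSince could then be
turned into a SPARQL Update request or an HTTP PUT/POST, which is the part
this sketch leaves out.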

Cheers,
Danny.


-- 
http://danny.ayers.name

Re: [jena-dev] TDB concurrency

Posted by Danny Ayers <da...@gmail.com>.
Thanks for the pointers Paolo. Now subscribed to
jena-dev@incubator.apache.org

On 25 February 2011 09:17, Paolo Castagna <ca...@googlemail.com> wrote:

>
>
> Hi Danny,
> Dave and Damian have already answered your question.
> In the past, I had the same sort of doubts about MRSW myself.
> A simple multi-threaded test can help you find the answer.
>
> For clarity, it's MR XOR SW.
>
> It's an exclusive OR: if you are writing, nobody can read.
> Therefore a long write would block access to readers.
> Similarly, a long read would block any write too.
>
> There is an issue (i.e. New Feature) for this:
> https://issues.apache.org/jira/browse/JENA-41
> We should discuss technical details there.
>
> Andy proposed "journaled file access".
>
> I'd like to help on this and try to do a prototype as a proof of concept.
> However, I am not an expert on this (i.e. I've never written a journaled
> file access system before) and it does not appear to be only a "small
> matter of programming". ;-)
>
> There are a lot of details which are not clear to me.
>
> Damian's suggestion works perfectly for you.
>
> Paolo
>
> PS:
> I know it's a pain, but we are trying to move the mailing
> list to Apache. Please, subscribe by sending an email to:
> jena-dev-subscribe AT incubator.apache.org
> jena-users-subscribe AT incubator.apache.org
>
>
> Danny Ayers wrote:
> > on the wiki it says:
> > [[
> > TDB provides a Multiple Reader or Single Writer (MRSW) policy for
> > concurrency access. Applications are expected to adhere to this policy
> > - it is not automatically checked.
> >
> > One gotcha is Java iterators. An iterator that is moving over the
> > database is making read operations and no updates to the dataset are
> > possible while an iterator is being used.
> > ]]
> >
> > I'd like to check I'm reading this correctly - is it that many readers
> > can access the data concurrently but the (one and only) writer should
> > have an exclusive lock - and that lock should block reading..?
> >
> > The scenario I'm looking at will be TDB shared between Fuseki and
> > programmatic access (a Turtle editor).
> >
> > I haven't yet really got a clue how I'll handle the sharing, so if
> > anyone's got any code for a similar situation I'd be grateful for a
> > pointer.
> >
> > (Right now I've got the editing happening on a single memory model, so
> > for the moment at least I can probably get away with access to TDB
> > models through a read-(edit)-replace kind of cycle).
> >
> > Cheers,
> > Danny.
> >



-- 
http://danny.ayers.name