Posted to users@jena.apache.org by "Dr. André Lanka" <ma...@dr-lanka.de> on 2011/10/27 12:16:01 UTC

TDB: Concurrent writing to named graphs

Hello Jena users,

we use various named graphs in a single dataset and have massive 
parallel write access to the named graphs. I always get a 
ConcurrentModificationException even if I lock on model-level using 
model.enterCriticalSection(boolean).
By this I reckon that the MRSW pattern is applied on dataset level and 
not on named-graph-level.
Namely, parallel writing to _different_ named graphs (constructed by 
dataset.getNamedModel) will always fail as the graphs use the same 
dataset beneath.

Could someone confirm this?

Is there any possibility to permit parallel write access to different 
graphs?

Thanks in advance
André

-- 
Dr. André Lanka  *  0178 / 134 44 47  *  http://dr-lanka.de

Re: TDB: Concurrent writing to named graphs

Posted by Andy Seaborne <an...@apache.org>.
On 27/10/11 16:52, Laurent Pellegrino wrote:
> Hi Paolo, all,
>
> I am also interested in the transactional TDB datastore. Do you know
> if it is possible to add or to remove a quad while iterating over an
> Iter<Quad>  issued from a SPARQL query when a Write access mode is
> used?

No (well, sort of "no")

Transactions don't give you the ability to call Iterator.remove.  Within a 
transaction the usual Java-isms about manipulating things backing iterators 
still apply, and ARQ iterators don't support .remove (nor .add).

The 'sort of "no"' is that you could get multiple transactions 
(one read, one write) on a single thread to work, i.e. iterate on the 
reader and update via the writer.
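
The "usual Java-isms" can be shown without any Jena classes. The sketch below is a plain-Java stand-in, not ARQ or TDB code: strings play the role of quads, and the class and method names (IterateThenUpdate, removeWhileIterating, removeAfterIterating) are made up for illustration. It contrasts changing the collection behind a running iterator with reading first and updating afterwards.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;

// Plain-Java stand-in: strings play the role of quads; no Jena classes are used.
class IterateThenUpdate {

    // Removes from the list that backs the running iterator.
    // Returns false when the iterator detects the modification.
    static boolean removeWhileIterating(List<String> quads) {
        try {
            for (String q : quads) {
                if (q.startsWith("g1 ")) {
                    quads.remove(q); // changes the collection under the iterator
                }
            }
            return true;
        } catch (ConcurrentModificationException e) {
            return false;
        }
    }

    // Reads first, remembers what to delete, and updates once iteration is over.
    static int removeAfterIterating(List<String> quads) {
        List<String> toDelete = new ArrayList<>();
        for (String q : quads) {
            if (q.startsWith("g1 ")) {
                toDelete.add(q);
            }
        }
        quads.removeAll(toDelete);
        return toDelete.size();
    }

    // Four sample entries, two of them in the hypothetical graph "g1".
    static List<String> sample() {
        return new ArrayList<>(Arrays.asList(
                "g1 s1 p o", "g2 s2 p o", "g1 s3 p o", "g2 s4 p o"));
    }

    public static void main(String[] args) {
        System.out.println("remove while iterating succeeded: "
                + removeWhileIterating(sample()));
        List<String> quads = sample();
        System.out.println(removeAfterIterating(quads) + " removed, "
                + quads.size() + " left");
    }
}
```

Collecting the changes first is only one way around the restriction; it trades memory for safety and assumes the result set fits in memory.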

	Andy

>
> Kind Regards,
> Laurent

Re: TDB: Concurrent writing to named graphs

Posted by Laurent Pellegrino <la...@gmail.com>.
Hi Paolo, all,

I am also interested in the transactional TDB datastore. Do you know
if it is possible to add or to remove a quad while iterating over an
Iter<Quad> issued from a SPARQL query when a Write access mode is
used?

Kind Regards,
Laurent

On Thu, Oct 27, 2011 at 3:15 PM, Paolo Castagna
<ca...@googlemail.com> wrote:
> Hi André
>
> Dr. André Lanka wrote:
>>
>> Hi Paolo,
>>
>> On 27.10.2011 12:31, Paolo Castagna wrote:
>>>
>>> the concurrency documentation on TDB could certainly be improved! :-)
>>> http://openjena.org/wiki/TDB/JavaAPI#Concurrency
>>
>>> Can you try to get a lock from a DatasetGraphTDB and let us know if
>>> you still have problems?
>>
>> As expected the problems vanish, yet this blocks the complete store
>> (containing up to 1000 named graphs) whenever someone wants to write to it.
>> This brings me back to my initial question:
>>
>>
>>>> Is there any possibility to permit parallel write access to different
>>>> graphs?
>
> Which TDB version are you using?
> What is the average execution time of your write|read requests?
> How many write|read requests per second do you have?
>
> If you want to try a SNAPSHOT (be warned: it's a SNAPSHOT and it might have
> problems!), you could use TDB 0.9.0-incubating-SNAPSHOT here:
> https://repository.apache.org/content/repositories/snapshots/org/apache/jena/jena-tdb/
>
> Then you can try:
>
>    Location location = ...
>    StoreConnection sConn = StoreConnection.make(location);
>
>    // Write transaction: commit on success, abort on failure.
>    DatasetGraphTxn dsgTx = sConn.begin(ReadWrite.WRITE);
>    try {
>        ...
>        dsgTx.commit();
>    } catch (Exception e) {
>        dsgTx.abort();
>        ...
>    }
>
>
>    // Read transaction: same shape.
>    dsgTx = sConn.begin(ReadWrite.READ);
>    try {
>        // read via dsgTx here
>        dsgTx.commit();
>    } catch (Exception e) {
>        dsgTx.abort();
>    }
>
> Another warning: these are APIs which might or might not change again in
> future.
>
> See also:
>
>  - https://issues.apache.org/jira/browse/JENA-41 (now closed)
>  - https://cwiki.apache.org/confluence/display/JENA/TxTDB-design
>  - https://cwiki.apache.org/confluence/display/JENA/TxTDB
>
> If you are interested in low level details, a good starting point to look at
> is
> the test suite, here:
> http://svn.apache.org/repos/asf/incubator/jena/Jena2/TDB/trunk/src/test/java/com/hp/hpl/jena/tdb/transaction/
>
> A test using multiple threads to run read/write transactions is this one:
> http://svn.apache.org/repos/asf/incubator/jena/Jena2/TDB/trunk/src/test/java/com/hp/hpl/jena/tdb/transaction/T_TransSystem.java
>
> If you are curious and you want to have a look at the implementation,
> start from the com.hp.hpl.jena.tdb.transaction package, here:
> http://svn.apache.org/repos/asf/incubator/jena/Jena2/TDB/trunk/src/main/java/com/hp/hpl/jena/tdb/transaction/
>
> Paolo
>
>>
>> Thanks
>> André
>>
>>
>
>

Re: TDB: Concurrent writing to named graphs

Posted by Paolo Castagna <ca...@googlemail.com>.
Hi André

Dr. André Lanka wrote:
> Hi Paolo,
> 
> On 27.10.2011 12:31, Paolo Castagna wrote:
>> the concurrency documentation on TDB could certainly be improved! :-)
>> http://openjena.org/wiki/TDB/JavaAPI#Concurrency
> 
>> Can you try to get a lock from a DatasetGraphTDB and let us know if
>> you still have problems?
> 
> As expected the problems vanish, yet this blocks the complete store 
> (containing up to 1000 named graphs) whenever someone wants to write to 
> it. This brings me back to my initial question:
> 
> 
>>> Is there any possibility to permit parallel write access to different
>>> graphs?

Which TDB version are you using?
What is the average execution time of your write|read requests?
How many write|read requests per second do you have?

If you want to try a SNAPSHOT (be warned: it's a SNAPSHOT and it might have
problems!), you could use TDB 0.9.0-incubating-SNAPSHOT here:
https://repository.apache.org/content/repositories/snapshots/org/apache/jena/jena-tdb/

Then you can try:

     Location location = ...
     StoreConnection sConn = StoreConnection.make(location);

     // Write transaction: commit on success, abort on failure.
     DatasetGraphTxn dsgTx = sConn.begin(ReadWrite.WRITE);
     try {
         ...
         dsgTx.commit();
     } catch (Exception e) {
         dsgTx.abort();
         ...
     }


     // Read transaction: same shape.
     dsgTx = sConn.begin(ReadWrite.READ);
     try {
         // read via dsgTx here
         dsgTx.commit();
     } catch (Exception e) {
         dsgTx.abort();
     }

Another warning: these are APIs which might or might not change again in future.
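
The intended control flow (commit only when the work succeeded, abort otherwise) can be checked with a stand-in object. This is a plain-Java sketch, not the TDB API: RecordingTxn and TxnFlow are hypothetical names, and the stand-in only records which calls were made.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a transaction handle; it only records calls.
class RecordingTxn {
    final List<String> calls = new ArrayList<>();

    void commit() { calls.add("commit"); }

    void abort() { calls.add("abort"); }
}

class TxnFlow {
    // Runs the work inside one stand-in transaction and returns the calls made.
    static List<String> run(Runnable work) {
        RecordingTxn txn = new RecordingTxn(); // plays the role of begin(WRITE)
        try {
            work.run();
            txn.commit(); // reached only if the work did not throw
        } catch (RuntimeException e) {
            txn.abort();  // failure path: discard the changes
        }
        return txn.calls;
    }

    public static void main(String[] args) {
        System.out.println(run(() -> { }));
        System.out.println(run(() -> {
            throw new IllegalStateException("write failed");
        }));
    }
}
```

Keeping commit out of a finally block matters here: a finally block runs on both paths, so a commit placed there would also run after an abort.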

See also:

  - https://issues.apache.org/jira/browse/JENA-41 (now closed)
  - https://cwiki.apache.org/confluence/display/JENA/TxTDB-design
  - https://cwiki.apache.org/confluence/display/JENA/TxTDB

If you are interested in low level details, a good starting point to look at is
the test suite, here:
http://svn.apache.org/repos/asf/incubator/jena/Jena2/TDB/trunk/src/test/java/com/hp/hpl/jena/tdb/transaction/

A test using multiple threads to run read/write transactions is this one:
http://svn.apache.org/repos/asf/incubator/jena/Jena2/TDB/trunk/src/test/java/com/hp/hpl/jena/tdb/transaction/T_TransSystem.java

If you are curious and you want to have a look at the implementation,
start from the com.hp.hpl.jena.tdb.transaction package, here:
http://svn.apache.org/repos/asf/incubator/jena/Jena2/TDB/trunk/src/main/java/com/hp/hpl/jena/tdb/transaction/

Paolo

> 
> Thanks
> André
> 
> 


Re: TDB: Concurrent writing to named graphs

Posted by "Dr. André Lanka" <ma...@dr-lanka.de>.
Hi Paolo,

On 27.10.2011 12:31, Paolo Castagna wrote:
> the concurrency documentation on TDB could certainly be improved! :-)
> http://openjena.org/wiki/TDB/JavaAPI#Concurrency

> Can you try to get a lock from a DatasetGraphTDB and let us know if
> you still have problems?

As expected the problems vanish, yet this blocks the complete store 
(containing up to 1000 named graphs) whenever someone wants to write to 
it. This brings me back to my initial question:


>> Is there any possibility to permit parallel write access to different
>> graphs?

Thanks
André


-- 
Dr. André Lanka  *  0178 / 134 44 47  *  http://dr-lanka.de

Re: TDB: Concurrent writing to named graphs

Posted by Paolo Castagna <ca...@googlemail.com>.
Hi André,
the concurrency documentation on TDB could certainly be improved! :-)
http://openjena.org/wiki/TDB/JavaAPI#Concurrency

Can you try to get a lock from a DatasetGraphTDB and let us know if
you still have problems?

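     DatasetGraphTDB dsg = ...
     Lock lock = dsg.getLock();
     // Lock.READ is for readers; a writer passes Lock.WRITE instead.
     lock.enterCriticalSection(Lock.READ);
     try {
          ...
     } finally {
         // Always release the lock, even if the body throws.
         lock.leaveCriticalSection();
     }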
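
For readers unfamiliar with the enter/leave style, here is a plain-Java sketch of a multiple-reader/single-writer lock with the same shape. It is not Jena's Lock class: MrswLock and its helper methods are made up for illustration, built on java.util.concurrent, and the READ = true / WRITE = false constants are an assumption meant to mirror Lock.READ and Lock.WRITE.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Plain-Java stand-in with an enter/leave shape; this is NOT Jena's Lock class.
class MrswLock {
    static final boolean READ = true;   // assumption: mirrors Lock.READ
    static final boolean WRITE = false; // assumption: mirrors Lock.WRITE

    private final ReentrantReadWriteLock rw = new ReentrantReadWriteLock();

    // Many readers may enter together; a writer enters alone.
    void enterCriticalSection(boolean readLockRequested) {
        if (readLockRequested) {
            rw.readLock().lock();
        } else {
            rw.writeLock().lock();
        }
    }

    // Releases whichever lock this thread took.
    // The sketch does not support taking a read lock inside a write lock.
    void leaveCriticalSection() {
        if (rw.isWriteLockedByCurrentThread()) {
            rw.writeLock().unlock();
        } else {
            rw.readLock().unlock();
        }
    }

    // Number of read holds currently outstanding.
    int activeReaders() { return rw.getReadLockCount(); }

    // True while some thread holds the write lock.
    boolean writerActive() { return rw.isWriteLocked(); }

    public static void main(String[] args) {
        MrswLock lock = new MrswLock();
        lock.enterCriticalSection(MrswLock.WRITE);
        try {
            System.out.println("writer active: " + lock.writerActive());
        } finally {
            lock.leaveCriticalSection();
        }
    }
}
```

One such lock guarding a whole dataset means every writer waits for it, regardless of which named graph it touches.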

Paolo


Dr. André Lanka wrote:
> Hello Jena users,
> 
> we use various named graphs in a single dataset and have massive 
> parallel write access to the named graphs. I always get a 
> ConcurrentModificationException even if I lock on model-level using 
> model.enterCriticalSection(boolean).
> By this I reckon that the MRSW pattern is applied on dataset level and 
> not on named-graph-level.
> Namely, parallel writing to _different_ named graphs (constructed by 
> dataset.getNamedModel) will always fail as the graphs use the same 
> dataset beneath.
> 
> Could someone confirm this?
> 
> Is there any possibility to permit parallel write access to different 
> graphs?
> 
> Thanks in advance
> André
> 


Re: TDB: Concurrent writing to named graphs

Posted by Andy Seaborne <an...@apache.org>.
On 27/10/11 11:16, "Dr. André Lanka" wrote:
> Hello Jena users,
>
> we use various named graphs in a single dataset and have massive
> parallel write access to the named graphs. I always get a
> ConcurrentModificationException even if I lock on model-level using
> model.enterCriticalSection(boolean).
> By this I reckon that the MRSW pattern is applied on dataset level and
> not on named-graph-level.
> Namely, parallel writing to _different_ named graphs (constructed by
> dataset.getNamedModel) will always fail as the graphs use the same
> dataset beneath.
>
> Could someone confirm this?

Yes - this is the case.

>
> Is there any possibility to permit parallel write access to different
> graphs?

The (in-progress) transaction system will provide one active writer and 
multiple readers. It will also automatically control write requests, so 
that concurrent writers still result in one active writer and multiple readers.

If you want separate, truly concurrent models, they will need to be in 
separate datasets.
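
The difference between one dataset and separate datasets can be modelled in plain Java. In this sketch each ReentrantReadWriteLock stands in for the write lock of one dataset. That is an illustrative assumption, not TDB's actual implementation, and WriterOverlap is a made-up name. Two writer threads report whether they were ever inside their write sections at the same time.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Each lock stands in for one dataset's write lock (an assumption of this sketch).
class WriterOverlap {

    // Starts two writers, one per lock, and reports whether both were inside
    // their write sections at the same moment.
    static boolean writersOverlap(ReentrantReadWriteLock lockA,
                                  ReentrantReadWriteLock lockB) {
        AtomicInteger inside = new AtomicInteger();
        AtomicInteger maxInside = new AtomicInteger();
        CountDownLatch entered = new CountDownLatch(2);

        Thread a = new Thread(() -> write(lockA, inside, maxInside, entered));
        Thread b = new Thread(() -> write(lockB, inside, maxInside, entered));
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return maxInside.get() == 2;
    }

    private static void write(ReentrantReadWriteLock lock, AtomicInteger inside,
                              AtomicInteger maxInside, CountDownLatch entered) {
        lock.writeLock().lock();
        try {
            // Record how many writers are inside right now.
            maxInside.accumulateAndGet(inside.incrementAndGet(), Math::max);
            entered.countDown();
            // Stay inside briefly so the other writer can join, if it is able to.
            entered.await(500, TimeUnit.MILLISECONDS);
            inside.decrementAndGet();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            lock.writeLock().unlock();
        }
    }

    public static void main(String[] args) {
        ReentrantReadWriteLock shared = new ReentrantReadWriteLock();
        System.out.println("same dataset, writers overlap: "
                + writersOverlap(shared, shared));
        System.out.println("separate datasets, writers overlap: "
                + writersOverlap(new ReentrantReadWriteLock(),
                                 new ReentrantReadWriteLock()));
    }
}
```

With a shared lock the second writer waits until the first leaves, which matches the behaviour described for named graphs in a single dataset.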

	Andy

>
> Thanks in advance
> André
>