Posted to users@jena.apache.org by "Lebling, David (US SSA)" <da...@baesystems.com> on 2013/04/10 17:12:17 UTC

Concurrency in Jena/SDB

The method below is a somewhat over-decorated (with extra non-production checks and error log messages) version of "write a model to the SDB repository safely" that I am using. The Individual res (or rather, the in-memory model it is in) is written to the Model model (obtained via SDBFactory.connectNamedModel), replacing any current contents.

It attempts to protect the model update from horrible concurrency death during the process. However, I'm seeing that in cases where there are multiple readers and writers (in different services - i.e., different JVMs) beating on the particular model, the error messages with "****" appear, sometimes surprisingly often. When the first one appears, the second one also always appears.

The outcome in the above cases is a repository model that is empty when read later. The obvious idea is that between model.removeAll() and the later model.add() the reader is reading the temporarily empty model. I don't know if that idea is correct, but I don't have any others.

Any suggestions as to what I am doing wrong would be appreciated.

/**
* Single place to write or delete a model from the store
* @param model store-based model
* @param res in-memory resource to write into model, or null to just delete model
*/
void internalWrite(Model model, Individual res) {
   if (model != null) {
      try {
         logger.debug("begin write model");
         model.enterCriticalSection(Lock.WRITE);
         model.notifyEvent(GraphEvents.startRead); // TEST I don't know if this is worth doing
         model.begin();
         model.removeAll(); // clear out the existing statements in the repo model
         if (res != null) {
            Model base = res.getOntModel().getBaseModel();
            if (base.size() == 0) {
               logger.error("internalWrite: base model empty");
            }
            model.add(base); // add in-memory model to repository model
            if (model.size() == 0) {
               logger.error("internalWrite: model empty after add"); // **** if this happens ****
            }
         }
         else {
            logger.debug("internalWrite: resource passed is null: delete");
         }
         model.commit();
         if (model.size() == 0) {
            logger.error("internalWrite: model empty after commit"); // **** this also happens ****
         }
      }
      catch (SDBException ex) {
         model.abort();
         logger.error(ex);
      }
      finally {
         model.notifyEvent(GraphEvents.finishRead);  // TEST I don't know if this is worth doing
         model.leaveCriticalSection();
         logger.debug("end write model");
      }
   }
}

Thanks,

Dave

David Lebling
BAE Systems, Inc.

Re: Concurrency in Jena/SDB

Posted by Damian Steer <d....@bris.ac.uk>.

On 15 Apr 2013, at 14:27, Damian Steer <d....@bris.ac.uk> wrote:

> However it's not especially smart, so you can take control of the transaction yourself by setting autocommit to false on the JDBC connection. SDB will then assume the client is responsible for managing the transactions.

Sorry, forgot to add:

...and this is all model.begin() et al are doing under the hood.

Damian

Re: Concurrency in Jena/SDB

Posted by Andy Seaborne <an...@apache.org>.

Seeing that error message makes it a bit clearer to me. It's something
to do with the bulk loader (NNodeQuads1740763164) and some other action
has already inserted a node.  As the possible values for a row in the
node table are fixed for a given primary key, I wonder if it's possible
to insert but ignore duplicates (a duplicate row would be identical anyway).
And maybe the LOCK TABLE needs to be further out.

(/me hopes Damian has better insight here)

	Andy

On 15/04/13 20:58, Lebling, David (US SSA) wrote:
> Andy,
>
> Are you suggesting the use of Connection.setTransactionIsolation(something-other-than-TRANSACTION_READ_COMMITTED), which is the default? The SDB implementation already uses setAutoCommit, etc.
>
> I tried all higher values for setTransactionIsolation() than the default, using a freshly populated database, and the results are that at levels Connection.TRANSACTION_REPEATABLE_READ and Connection.TRANSACTION_SERIALIZABLE after a while the following exception is thrown:
>
> WARN [tomcat-http--10] (SDBConnection.java:338) - execUpdate: SQLException
> ERROR: duplicate key value violates unique constraint "nodes_pkey"
>    Detail: Key (hash)=(-8555076428185164481) already exists.
> LOCK TABLE Nodes; INSERT INTO Nodes (hash, lex, lang, datatype, type)
> SELECT NNodeQuads1740763164.n0 , NNodeQuads1740763164.n1 , NNodeQuads1740763164.n2 , NNodeQuads1740763164.n3 , NNodeQuads1740763164.n4
> FROM NNodeQuads1740763164 LEFT JOIN Nodes ON (NNodeQuads1740763164.n0=Nodes.hash)
> WHERE Nodes.hash IS NULL
>
> ERROR [tomcat-http--10] (PlannerInterfaceServiceEndPointImpl.java:237) - lp:PI2 comp:PIS id:12 action:exception msg:submitExternalRequest
> com.hp.hpl.jena.sdb.SDBException: Exception flushing
> 	at com.hp.hpl.jena.sdb.layout2.TupleLoaderBase.flush(TupleLoaderBase.java:234)
> 	at com.hp.hpl.jena.sdb.layout2.TupleLoaderBase.finish(TupleLoaderBase.java:169)
> 	at com.hp.hpl.jena.sdb.layout2.LoaderTuplesNodes.commitTuples(LoaderTuplesNodes.java:306)
> 	at com.hp.hpl.jena.sdb.layout2.LoaderTuplesNodes.flushTriples(LoaderTuplesNodes.java:235)
> 	at com.hp.hpl.jena.sdb.layout2.LoaderTuplesNodes.finishBulkUpdate(LoaderTuplesNodes.java:90)
> 	at com.hp.hpl.jena.sdb.graph.GraphSDB.finishBulkUpdate(GraphSDB.java:310)
> 	at com.hp.hpl.jena.sdb.graph.EventManagerSDB.notifyEvent(EventManagerSDB.java:41)
> 	at com.hp.hpl.jena.rdf.model.impl.ModelCom.notifyEvent(ModelCom.java:1543)
>
> In addition, these level settings had no impact on the error conditions I highlighted in my original message.
>
> Obviously I'm missing something here. Were you actually suggesting something different?
>
> Dave
>
> -----Original Message-----
> From: Andy Seaborne [mailto:andy@apache.org]
> Sent: Monday, April 15, 2013 10:10 AM
> To: users@jena.apache.org
> Subject: Re: Concurrency in Jena/SDB
>
> On 15/04/13 14:51, Lebling, David (US SSA) wrote:
>> Andy,
>>
>> Which part of that document is the part that might help? My code
>> already creates a new Store for each transaction. My code already uses
>> begin/commit.
>
> The fact you can manage the JDBC connection separately from the Store - that can tie to receiving servlet requests quite well.
>
> 	Andy
>
> You asked:
>> I was hoping you might have some guidance as to what you mean by "You
>> need to control the JDBC transaction in your code to get isolation
>> between JVMs."
>
>
>>
>> (I've read the document many times previously, by the way.)
>>
>> Dave L.
>>
>> -----Original Message----- From: Andy Seaborne
>> [mailto:andy@apache.org] Sent: Monday, April 15, 2013 9:42 AM To:
>> users@jena.apache.org Subject: Re: Concurrency in Jena/SDB
>>
>> This might help:
>>
>> http://jena.apache.org/documentation/sdb/javaapi.html#connection-management
>>
>>   Andy

RE: Concurrency in Jena/SDB

Posted by "Lebling, David (US SSA)" <da...@baesystems.com>.

Andy,

Are you suggesting the use of Connection.setTransactionIsolation(something-other-than-TRANSACTION_READ_COMMITTED), which is the default? The SDB implementation already uses setAutoCommit, etc.

I tried all higher values for setTransactionIsolation() than the default, using a freshly populated database, and the results are that at levels Connection.TRANSACTION_REPEATABLE_READ and Connection.TRANSACTION_SERIALIZABLE after a while the following exception is thrown:

WARN [tomcat-http--10] (SDBConnection.java:338) - execUpdate: SQLException
ERROR: duplicate key value violates unique constraint "nodes_pkey"
  Detail: Key (hash)=(-8555076428185164481) already exists.
LOCK TABLE Nodes; INSERT INTO Nodes (hash, lex, lang, datatype, type) 
SELECT NNodeQuads1740763164.n0 , NNodeQuads1740763164.n1 , NNodeQuads1740763164.n2 , NNodeQuads1740763164.n3 , NNodeQuads1740763164.n4
FROM NNodeQuads1740763164 LEFT JOIN Nodes ON (NNodeQuads1740763164.n0=Nodes.hash) 
WHERE Nodes.hash IS NULL

ERROR [tomcat-http--10] (PlannerInterfaceServiceEndPointImpl.java:237) - lp:PI2 comp:PIS id:12 action:exception msg:submitExternalRequest
com.hp.hpl.jena.sdb.SDBException: Exception flushing
	at com.hp.hpl.jena.sdb.layout2.TupleLoaderBase.flush(TupleLoaderBase.java:234)
	at com.hp.hpl.jena.sdb.layout2.TupleLoaderBase.finish(TupleLoaderBase.java:169)
	at com.hp.hpl.jena.sdb.layout2.LoaderTuplesNodes.commitTuples(LoaderTuplesNodes.java:306)
	at com.hp.hpl.jena.sdb.layout2.LoaderTuplesNodes.flushTriples(LoaderTuplesNodes.java:235)
	at com.hp.hpl.jena.sdb.layout2.LoaderTuplesNodes.finishBulkUpdate(LoaderTuplesNodes.java:90)
	at com.hp.hpl.jena.sdb.graph.GraphSDB.finishBulkUpdate(GraphSDB.java:310)
	at com.hp.hpl.jena.sdb.graph.EventManagerSDB.notifyEvent(EventManagerSDB.java:41)
	at com.hp.hpl.jena.rdf.model.impl.ModelCom.notifyEvent(ModelCom.java:1543)

In addition, these level settings had no impact on the error conditions I highlighted in my original message.

Obviously I'm missing something here. Were you actually suggesting something different?

Dave

-----Original Message-----
From: Andy Seaborne [mailto:andy@apache.org] 
Sent: Monday, April 15, 2013 10:10 AM
To: users@jena.apache.org
Subject: Re: Concurrency in Jena/SDB

On 15/04/13 14:51, Lebling, David (US SSA) wrote:
> Andy,
>
> Which part of that document is the part that might help? My code 
> already creates a new Store for each transaction. My code already uses 
> begin/commit.

The fact you can manage the JDBC connection separately from the Store - that can tie to receiving servlet requests quite well.

	Andy

You asked:
> I was hoping you might have some guidance as to what you mean by "You 
> need to control the JDBC transaction in your code to get isolation 
> between JVMs."

>
> (I've read the document many times previously, by the way.)
>
> Dave L.
>
> -----Original Message----- From: Andy Seaborne 
> [mailto:andy@apache.org] Sent: Monday, April 15, 2013 9:42 AM To:
> users@jena.apache.org Subject: Re: Concurrency in Jena/SDB
>
> This might help:
>
> http://jena.apache.org/documentation/sdb/javaapi.html#connection-management
>
>  Andy

Re: Concurrency in Jena/SDB

Posted by Andy Seaborne <an...@apache.org>.

On 15/04/13 14:51, Lebling, David (US SSA) wrote:
> Andy,
>
> Which part of that document is the part that might help? My code
> already creates a new Store for each transaction. My code already
> uses begin/commit.

The fact you can manage the JDBC connection separately from the Store -
that can tie to receiving servlet requests quite well.

	Andy

You asked:
> I was hoping you might have some guidance as to what you mean by "You
> need to control the JDBC transaction in your code to get isolation
> between JVMs."


>
> (I've read the document many times previously, by the way.)
>
> Dave L.
>
> -----Original Message----- From: Andy Seaborne
> [mailto:andy@apache.org] Sent: Monday, April 15, 2013 9:42 AM To:
> users@jena.apache.org Subject: Re: Concurrency in Jena/SDB
>
> This might help:
>
> http://jena.apache.org/documentation/sdb/javaapi.html#connection-management
>
>  Andy

RE: Concurrency in Jena/SDB

Posted by "Lebling, David (US SSA)" <da...@baesystems.com>.

Andy,

Which part of that document is the part that might help? My code already creates a new Store for each transaction. My code already uses begin/commit.

(I've read the document many times previously, by the way.)

Dave L.

-----Original Message-----
From: Andy Seaborne [mailto:andy@apache.org] 
Sent: Monday, April 15, 2013 9:42 AM
To: users@jena.apache.org
Subject: Re: Concurrency in Jena/SDB

This might help:

http://jena.apache.org/documentation/sdb/javaapi.html#connection-management

	Andy


On 15/04/13 14:38, David Jordan wrote:
>
> So every call to a method of Model or OntModel is done in a separate transaction? This could easily explain the poor performance I am getting, and those of others who have complained about SDB performance in this group.
>
> Your code example is very sparse. Are there calls to make to get to an associated JDBC transaction? All of the JDBC connection info is described in Jena's assembler files, so it seems like an API should be available to access the needed JDBC objects.
>
> -----Original Message-----
> From: Damian Steer [mailto:cmdms@bristol.ac.uk] On Behalf Of Damian Steer
> Sent: Monday, April 15, 2013 9:27 AM
> To: users@jena.apache.org
> Subject: Re: Concurrency in Jena/SDB
>
>
> On 15 Apr 2013, at 12:35, "Lebling, David (US SSA)" <da...@baesystems.com> wrote:
>
>> Andy,
>>
>> I was hoping you might have some guidance as to what you mean by "You need to control the JDBC transaction in your code to get isolation between JVMs."
>>
>> Is there some interface in SDB that will accomplish this, or does this require SQL-implementation-specific code, thus tearing down the SDB façade? If it does require such implementation-specific code, do you have any pointers to examples which do it correctly? Is it necessary to write one's own transaction begin/commit/abort methods?
>
> SDB (as you'd expect) handles transactions itself by default. All model operations happen in a single transaction, with a minor wrinkle around really big loads (it will start committing every 20,000 triples iirc to avoid huge transactions).
>
> You can also wrap operations like this to get some of that behaviour across a number of operations:
>
> model.notifyEvent(GraphEvents.startRead) ;
> try {
> ... do add/remove operations ...
> } finally {
> 	model.notifyEvent(GraphEvents.finishRead) ;
> }
>
> However it's not especially smart, so you can take control of the transaction yourself by setting autocommit to false on the JDBC connection. SDB will then assume the client is responsible for managing the transactions.
>
> Just remember to commit() and reset autocommit when you finish.
>
> Damian
>

Re: Concurrency in Jena/SDB

Posted by Andy Seaborne <an...@apache.org>.

This might help:

http://jena.apache.org/documentation/sdb/javaapi.html#connection-management

	Andy


On 15/04/13 14:38, David Jordan wrote:
>
> So every call to a method of Model or OntModel is done in a separate transaction? This could easily explain the poor performance I am getting, and those of others who have complained about SDB performance in this group.
>
> Your code example is very sparse. Are there calls to make to get to an associated JDBC transaction? All of the JDBC connection info is described in Jena's assembler files, so it seems like an API should be available to access the needed JDBC objects.
>
> -----Original Message-----
> From: Damian Steer [mailto:cmdms@bristol.ac.uk] On Behalf Of Damian Steer
> Sent: Monday, April 15, 2013 9:27 AM
> To: users@jena.apache.org
> Subject: Re: Concurrency in Jena/SDB
>
>
> On 15 Apr 2013, at 12:35, "Lebling, David (US SSA)" <da...@baesystems.com> wrote:
>
>> Andy,
>>
>> I was hoping you might have some guidance as to what you mean by "You need to control the JDBC transaction in your code to get isolation between JVMs."
>>
>> Is there some interface in SDB that will accomplish this, or does this require SQL-implementation-specific code, thus tearing down the SDB façade? If it does require such implementation-specific code, do you have any pointers to examples which do it correctly? Is it necessary to write one's own transaction begin/commit/abort methods?
>
> SDB (as you'd expect) handles transactions itself by default. All model operations happen in a single transaction, with a minor wrinkle around really big loads (it will start committing every 20,000 triples iirc to avoid huge transactions).
>
> You can also wrap operations like this to get some of that behaviour across a number of operations:
>
> model.notifyEvent(GraphEvents.startRead) ;
> try {
> ... do add/remove operations ...
> } finally {
> 	model.notifyEvent(GraphEvents.finishRead) ;
> }
>
> However it's not especially smart, so you can take control of the transaction yourself by setting autocommit to false on the JDBC connection. SDB will then assume the client is responsible for managing the transactions.
>
> Just remember to commit() and reset autocommit when you finish.
>
> Damian
>

Re: Concurrency in Jena/SDB

Posted by Damian Steer <d....@bris.ac.uk>.

On 15 Apr 2013, at 14:38, David Jordan <da...@sas.com> wrote:

> 
> So every call to a method of Model or OntModel is done in a separate transaction? This could easily explain the poor performance I am getting, and those of others who have complained about SDB performance in this group.

Poor performance typically has more to do with the cumulative overhead of large numbers of small operations than transactions per se. 

SDB tries to queue up added or removed triples in large chunks (c 20,000 triples), and execute them in a small number of RDB operations. Each call to add or remove will be a single chunk, so it's best to make them as big as possible.

The model.notifyEvent(GraphEvents.startRead) business is a way to take some control of that queuing and allow the queue to cross method boundaries. So, for example:

while (condition) { if (condition) model.add(statement); }

would benefit immensely from being wrapped in start/finishRead.

The problem is that you are queuing client side, so peeking at the contents of the model is a bad idea: the triples may not have been added.
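
To make the queuing point concrete, here is a toy sketch in plain Java. It is not SDB code: QueuedStore, startBatch and finishBatch are hypothetical names standing in for the store, GraphEvents.startRead and GraphEvents.finishRead, and the assumption that size() only sees rows already flushed is an illustration of the warning above, not a statement about SDB internals.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy stand-in for a store that queues writes client side; not SDB code. */
public class QueuedStore {
    private final Set<String> table = new HashSet<>();      // what the database holds
    private final List<String> pending = new ArrayList<>(); // adds not yet sent
    private boolean batching = false;

    /** Hypothetical analogue of notifyEvent(GraphEvents.startRead). */
    public void startBatch() { batching = true; }

    /** Hypothetical analogue of notifyEvent(GraphEvents.finishRead). */
    public void finishBatch() {
        batching = false;
        flush();
    }

    public void add(String triple) {
        pending.add(triple);
        if (!batching) {
            flush(); // outside a batch, every add is sent on its own
        }
    }

    /** Counts only what has reached the database, ignoring the queue. */
    public int size() { return table.size(); }

    private void flush() {
        table.addAll(pending);
        pending.clear();
    }

    public static void main(String[] args) {
        QueuedStore store = new QueuedStore();
        store.startBatch();
        store.add("s p o1");
        store.add("s p o2");
        System.out.println("inside batch: " + store.size()); // prints "inside batch: 0"
        store.finishBatch();
        System.out.println("after batch: " + store.size());  // prints "after batch: 2"
    }
}
```

If the real store behaves like this toy, a size() check placed between startRead and finishRead can report an empty model even though the adds are merely waiting in the queue, so such checks are better made after finishRead.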

Damian

RE: Concurrency in Jena/SDB

Posted by David Jordan <Da...@sas.com>.

So every call to a method of Model or OntModel is done in a separate transaction? This could easily explain the poor performance I am getting, and those of others who have complained about SDB performance in this group.

Your code example is very sparse. Are there calls to make to get to an associated JDBC transaction? All of the JDBC connection info is described in Jena's assembler files, so it seems like an API should be available to access the needed JDBC objects.

-----Original Message-----
From: Damian Steer [mailto:cmdms@bristol.ac.uk] On Behalf Of Damian Steer
Sent: Monday, April 15, 2013 9:27 AM
To: users@jena.apache.org
Subject: Re: Concurrency in Jena/SDB

On 15 Apr 2013, at 12:35, "Lebling, David (US SSA)" <da...@baesystems.com> wrote:

> Andy,
> 
> I was hoping you might have some guidance as to what you mean by "You need to control the JDBC transaction in your code to get isolation between JVMs."
> 
> Is there some interface in SDB that will accomplish this, or does this require SQL-implementation-specific code, thus tearing down the SDB façade? If it does require such implementation-specific code, do you have any pointers to examples which do it correctly? Is it necessary to write one's own transaction begin/commit/abort methods?

SDB (as you'd expect) handles transactions itself by default. All model operations happen in a single transaction, with a minor wrinkle around really big loads (it will start committing every 20,000 triples iirc to avoid huge transactions).

You can also wrap operations like this to get some of that behaviour across a number of operations:

model.notifyEvent(GraphEvents.startRead) ;
try {
... do add/remove operations ...
} finally {
	model.notifyEvent(GraphEvents.finishRead) ;
}

However it's not especially smart, so you can take control of the transaction yourself by setting autocommit to false on the JDBC connection. SDB will then assume the client is responsible for managing the transactions.

Just remember to commit() and reset autocommit when you finish.

Damian

Re: Concurrency in Jena/SDB

Posted by Damian Steer <d....@bris.ac.uk>.

On 15 Apr 2013, at 12:35, "Lebling, David (US SSA)" <da...@baesystems.com> wrote:

> Andy,
> 
> I was hoping you might have some guidance as to what you mean by "You need to control the JDBC transaction in your code to get isolation between JVMs."
> 
> Is there some interface in SDB that will accomplish this, or does this require SQL-implementation-specific code, thus tearing down the SDB façade? If it does require such implementation-specific code, do you have any pointers to examples which do it correctly? Is it necessary to write one's own transaction begin/commit/abort methods?

SDB (as you'd expect) handles transactions itself by default. All model operations happen in a single transaction, with a minor wrinkle around really big loads (it will start committing every 20,000 triples iirc to avoid huge transactions).

You can also wrap operations like this to get some of that behaviour across a number of operations:

model.notifyEvent(GraphEvents.startRead) ;
try {
... do add/remove operations ...
} finally {
	model.notifyEvent(GraphEvents.finishRead) ;
}

However it's not especially smart, so you can take control of the transaction yourself by setting autocommit to false on the JDBC connection. SDB will then assume the client is responsible for managing the transactions.

Just remember to commit() and reset autocommit when you finish.
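
A minimal sketch of that pattern, using only the standard java.sql.Connection interface. JdbcTx, Work and inTransaction are hypothetical names made up for this illustration; how the Connection is obtained and handed to SDB is left out here and depends on how the Store was created.

```java
import java.sql.Connection;
import java.sql.SQLException;

/** Hypothetical helper: run a unit of work inside one explicit JDBC transaction. */
public final class JdbcTx {
    private JdbcTx() {}

    @FunctionalInterface
    public interface Work {
        void run() throws Exception;
    }

    public static void inTransaction(Connection conn, Work work) throws Exception {
        boolean previous = conn.getAutoCommit(); // remember the caller's setting
        conn.setAutoCommit(false);               // take control of the transaction
        try {
            work.run();                          // e.g. the removeAll()/add() pair
            conn.commit();
        } catch (Exception ex) {
            try {
                conn.rollback();                 // undo everything done in work
            } catch (SQLException rollbackFailure) {
                ex.addSuppressed(rollbackFailure);
            }
            throw ex;
        } finally {
            conn.setAutoCommit(previous);        // reset autocommit when finished
        }
    }
}
```

The work passed in would be the model operations that must appear atomic to other JVMs, run against a model whose store uses the same JDBC connection. Whether that alone is enough for the multi-JVM case discussed in this thread is not something this sketch settles.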

Damian

RE: Concurrency in Jena/SDB

Posted by "Lebling, David (US SSA)" <da...@baesystems.com>.

As Joshua Bloch says, "concurrency is hard."

David Jordan,

The model.begin/model.commit/model.abort trio seems to work okay for all operations within a single JVM, although if you look at the code I included with my original question you'll see a lot of decoration that may or may not be required.

Where my problem happens is when multiple JVMs are reading and one is writing.

Damian Steer,

An explicit begin/end is what I have in my code (which you see in the original post in this thread). There is also a (possibly unneeded) pair of notifyEvents bookending the transaction. Are you suggesting I must also set autocommit to false for that code to work over multiple JVMs, or is explicit use of begin/end enough?

Dave L.

RE: Concurrency in Jena/SDB

Posted by David Jordan <Da...@sas.com>.

I have the same question. Surely the underlying SDB code begins/commits a transaction when it does updates to the database. Does it do so on an individual update operation? In which case you would lose the atomicity of multiple updates done in a single transaction. If it does do little transactions with each update, this could be a cause of performance issues, which I am definitely seeing. This should be fully explained and documented.

-----Original Message-----
From: Lebling, David (US SSA) [mailto:david.lebling@baesystems.com] 
Sent: Monday, April 15, 2013 7:35 AM
To: users@jena.apache.org
Subject: RE: Concurrency in Jena/SDB

Andy,

I was hoping you might have some guidance as to what you mean by "You need to control the JDBC transaction in your code to get isolation between JVMs."

Is there some interface in SDB that will accomplish this, or does this require SQL-implementation-specific code, thus tearing down the SDB façade? If it does require such implementation-specific code, do you have any pointers to examples which do it correctly? Is it necessary to write one's own transaction begin/commit/abort methods?

Thanks,

Dave

-----Original Message-----
From: Lebling, David (US SSA) [mailto:david.lebling@baesystems.com]
Sent: Thursday, April 11, 2013 9:15 AM
To: users@jena.apache.org
Subject: RE: Concurrency in Jena/SDB

Andy,

It has been used with Postgres and MySQL.

Dave

-----Original Message-----
From: Andy Seaborne [mailto:andy@apache.org]
Sent: Thursday, April 11, 2013 8:16 AM
To: users@jena.apache.org
Subject: Re: Concurrency in Jena/SDB

David,

You're not executing inside a database-level (JDBC) transaction.  I'm afraid model.begin/commit does not know about JDBC properly.  You need to control the JDBC transaction in your code to get isolation between JVMs.

Which SQL database are you using?

	Andy

RE: Concurrency in Jena/SDB

Posted by "Lebling, David (US SSA)" <da...@baesystems.com>.

Andy,

I was hoping you might have some guidance as to what you mean by "You need to control the JDBC transaction in your code to get isolation between JVMs."

Is there some interface in SDB that will accomplish this, or does this require SQL-implementation-specific code, thus tearing down the SDB façade? If it does require such implementation-specific code, do you have any pointers to examples which do it correctly? Is it necessary to write one's own transaction begin/commit/abort methods?

Thanks,

Dave

-----Original Message-----
From: Lebling, David (US SSA) [mailto:david.lebling@baesystems.com] 
Sent: Thursday, April 11, 2013 9:15 AM
To: users@jena.apache.org
Subject: RE: Concurrency in Jena/SDB

Andy,

It has been used with Postgres and MySQL.

Dave

-----Original Message-----
From: Andy Seaborne [mailto:andy@apache.org]
Sent: Thursday, April 11, 2013 8:16 AM
To: users@jena.apache.org
Subject: Re: Concurrency in Jena/SDB

David,

You're not executing inside a database-level (JDBC) transaction.  I'm afraid model.begin/commit does not know about JDBC properly.  You need to control the JDBC transaction in your code to get isolation between JVMs.

Which SQL database are you using?

	Andy

RE: Concurrency in Jena/SDB

Posted by "Lebling, David (US SSA)" <da...@baesystems.com>.

Andy,

It has been used with Postgres and MySQL.

Dave

-----Original Message-----
From: Andy Seaborne [mailto:andy@apache.org] 
Sent: Thursday, April 11, 2013 8:16 AM
To: users@jena.apache.org
Subject: Re: Concurrency in Jena/SDB

David,

You're not executing inside a database-level (JDBC) transaction.  I'm afraid model.begin/commit does not know about JDBC properly.  You need to control the JDBC transaction in your code to get isolation between JVMs.

Which SQL database are you using?

	Andy

On 10/04/13 16:12, Lebling, David (US SSA) wrote:
> The method below is a somewhat over-decorated (with extra non-production checks and error log messages) version of "write a model to the SDB repository safely" that I am using. The Individual res (or rather, the in-memory model it is in) is written to the Model model (obtained via SDBFactory.connectNamedModel), replacing any current contents.
>
> It attempts to protect the model update from horrible concurrency death during the process. However, I'm seeing that in cases where there are multiple readers and writers (in different services - i.e., different JVMs) beating on the particular model, the error messages with "****" appear, sometimes surprisingly often. When the first one appears, the second one also always appears.
>
> The outcome in the above cases is a repository model that is empty when read later. The obvious idea is that between model.removeAll() and the later model.add() the reader is reading the temporarily empty model. I don't know if that idea is correct, but I don't have any others.
>
> Any suggestions as to what I am doing wrong would be appreciated.
>
> /**
> * Single place to write or delete a model from the store
> * @param model store-based model
> * @param res in-memory resource to write into model, or null to just delete model
> */
> void internalWrite(Model model, Individual res) {
>     if (model != null) {
>        try {
>           logger.debug("begin write model");
>           model.enterCriticalSection(Lock.WRITE);
>           model.notifyEvent(GraphEvents.startRead); // TEST I don't know if this is worth doing
>           model.begin();
>           model.removeAll(); // clear out the existing statements in the repo model
>           if (res != null) {
>              Model base = res.getOntModel().getBaseModel();
>              if (base.size() == 0) {
>                 logger.error("internalWrite: base model empty");
>              }
>              model.add(base); // add in-memory model to repository model
>              if (model.size() == 0) {
>                 logger.error("internalWrite: model empty after add"); // **** if this happens ****
>              }
>           }
>           else {
>              logger.debug("internalWrite: resource passed is null: delete");
>           }
>           model.commit();
>           if (model.size() == 0) {
>              logger.error("internalWrite: model empty after commit"); // **** this also happens ****
>           }
>        }
>        catch (SDBException ex) {
>           model.abort();
>           logger.error(ex);
>        }
>        finally {
>           model.notifyEvent(GraphEvents.finishRead);  // TEST I don't know if this is worth doing
>           model.leaveCriticalSection();
>           logger.debug("end write model");
>        }
>     }
> }
>
> Thanks,
>
> Dave
>
> David Lebling
> BAE Systems, Inc.
>

Re: Concurrency in Jena/SDB

Posted by Andy Seaborne <an...@apache.org>.

David,

You're not executing inside a database-level (JDBC) transaction.  I'm 
afraid model.begin/commit does not know about JDBC properly.  You need 
to control the JDBC transaction in your code to get isolation between JVMs.
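
As a rough illustration of controlling the JDBC transaction directly, here is a minimal sketch that uses only the standard java.sql API. The class JdbcTx, the Work interface and the inTransaction method are hypothetical names made up for this example, not part of Jena or SDB, and how you obtain the java.sql.Connection that backs your SDB store is assumed to be handled elsewhere in your code. The idea is to turn auto-commit off, run the remove-then-add as one unit, and commit or roll back at the end, so other JVMs should not see the intermediate empty model.

```java
import java.sql.Connection;
import java.sql.SQLException;

/** Hypothetical helper: runs a unit of work inside one JDBC transaction. */
public final class JdbcTx {
    private JdbcTx() {}

    /** The work to run, e.g. model.removeAll() followed by model.add(base). */
    @FunctionalInterface
    public interface Work {
        void run() throws Exception;
    }

    /**
     * Assumption: conn is the same JDBC connection the store-backed model
     * uses, so everything done inside work goes through this transaction.
     */
    public static void inTransaction(Connection conn, Work work) throws Exception {
        boolean previousAutoCommit = conn.getAutoCommit();
        conn.setAutoCommit(false);            // start an explicit transaction
        try {
            work.run();
            conn.commit();                    // make all changes visible at once
        } catch (Exception ex) {
            try {
                conn.rollback();              // discard the partial update
            } catch (SQLException rollbackFailure) {
                ex.addSuppressed(rollbackFailure);
            }
            throw ex;
        } finally {
            conn.setAutoCommit(previousAutoCommit);
        }
    }
}
```

The body of internalWrite would then go inside the Work passed to JdbcTx.inTransaction, in place of model.begin() and model.commit(). Whether readers in other JVMs are fully isolated also depends on the isolation level configured for the database.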

Which SQL database are you using?

	Andy

On 10/04/13 16:12, Lebling, David (US SSA) wrote:
> The method below is a somewhat over-decorated (with extra non-production checks and error log messages) version of "write a model to the SDB repository safely" that I am using. The Individual res (or rather, the in-memory model it is in) is written to the Model model (obtained via SDBFactory.connectNamedModel), replacing any current contents.
>
> It attempts to protect the model update from horrible concurrency death during the process. However, I'm seeing that in cases where there are multiple readers and writers (in different services - i.e., different JVMs) beating on the particular model, the error messages with "****" appear, sometimes surprisingly often. When the first one appears, the second one also always appears.
>
> The outcome in the above cases is a repository model that is empty when read later. The obvious idea is that between model.removeAll() and the later model.add() the reader is reading the temporarily empty model. I don't know if that idea is correct, but I don't have any others.
>
> Any suggestions as to what I am doing wrong would be appreciated.
>
> /**
> * Single place to write or delete a model from the store
> * @param model store-based model
> * @param res in-memory resource to write into model, or null to just delete model
> */
> void internalWrite(Model model, Individual res) {
>     if (model != null) {
>        try {
>           logger.debug("begin write model");
>           model.enterCriticalSection(Lock.WRITE);
>           model.notifyEvent(GraphEvents.startRead); // TEST I don't know if this is worth doing
>           model.begin();
>           model.removeAll(); // clear out the existing statements in the repo model
>           if (res != null) {
>              Model base = res.getOntModel().getBaseModel();
>              if (base.size() == 0) {
>                 logger.error("internalWrite: base model empty");
>              }
>              model.add(base); // add in-memory model to repository model
>              if (model.size() == 0) {
>                 logger.error("internalWrite: model empty after add"); // **** if this happens ****
>              }
>           }
>           else {
>              logger.debug("internalWrite: resource passed is null: delete");
>           }
>           model.commit();
>           if (model.size() == 0) {
>              logger.error("internalWrite: model empty after commit"); // **** this also happens ****
>           }
>        }
>        catch (SDBException ex) {
>           model.abort();
>           logger.error(ex);
>        }
>        finally {
>           model.notifyEvent(GraphEvents.finishRead);  // TEST I don't know if this is worth doing
>           model.leaveCriticalSection();
>           logger.debug("end write model");
>        }
>     }
> }
>
> Thanks,
>
> Dave
>
> David Lebling
> BAE Systems, Inc.
>