Posted to users@jena.apache.org by François-Paul Servant <fr...@gmail.com> on 2015/09/05 17:20:18 UTC

Concurrency in a sparql servlet

trying to understand how to write a sparql servlet that handles read-write operations…

I read
- https://jena.apache.org/documentation/notes/concurrency-howto.html
- http://jena.apache.org/documentation/tdb/tdb_transactions.html
(though not working specifically with TDB)
- and http://stackoverflow.com/questions/18968971/is-it-possible-to-concurrently-write-to-the-same-dataset-file-but-to-different-n

I’m not sure I understand everything, so correct me if I’m wrong, and TIA if you can answer some of these questions:

- if a model can be updated within an application, then any call to jena (or at least any call that iterates over something) must be inside some kind of lock.
- for memory models, including datasets built around memory models, the only option is model.enterCriticalSection/leaveCriticalSection
- with TDB, use dataset.begin/dataset.end
- your code must not enter a locking section if you’re already in one. It looks like this means that you probably need one central, unique entry point in your code where you lock/unlock -- hmm, that doesn’t seem easy
- in a sparql query over a dataset with memory named graphs, if updates are possible, you must lock all the graphs (Dataset.listNames to get the names of the graphs, then loop over them to enterCriticalSection)
- if a sparql query doesn’t include updates, it is better to take just a READ lock
- so a big question is: how do you know whether a query contains updates or not?
- if a sparql query doesn’t include updates, is it enough to take a READ lock only on the graphs included in the query?
- (relevant only if the answer to the previous one is “yes”) do you get the graphs in a query with the union of getGraphURIs and getNamedGraphURIs?

that’s all for the moment, thank you.

Best Regards,

fps

Re: Concurrency in a sparql servlet

Posted by François-Paul Servant <fr...@gmail.com>.

Andy,

thank you very much, it’s very kind of you. Things are a lot clearer now.

fps

> On 5 Sept 2015, at 18:21, Andy Seaborne <an...@apache.org> wrote:

Re: Concurrency in a sparql servlet

Posted by Andy Seaborne <an...@apache.org>.

On 05/09/15 16:20, François-Paul Servant wrote:
> trying to understand how to write a sparql servlet that handles read-write operations…

The code in Fuseki is available to look at.

>
> I read
> - https://jena.apache.org/documentation/notes/concurrency-howto.html
> - http://jena.apache.org/documentation/tdb/tdb_transactions.html
> (though not working specifically with TDB)
> - and http://stackoverflow.com/questions/18968971/is-it-possible-to-concurrently-write-to-the-same-dataset-file-but-to-different-n
>
> I’m not sure I understand everything, so correct me if I’m wrong, and TIA if you can answer some of these questions:
>
> - if a model can be updated within an application, then any call to jena (or at least any call that iterates over something) must be inside some kind of lock.

Yes

> - for memory models, including datasets built around memory models, the only option is model.enterCriticalSection/leaveCriticalSection

No - DatasetGraphWithLock is another option, then use begin/end.

You can look at the code in Fuseki.

SPARQL_Query.execute.

> - with TDB, use dataset.begin/dataset.end

Yes

> - your code must not enter a locking section if you’re already in one. It looks like this means that you probably need one central, unique entry point in your code where you lock/unlock -- hmm, that doesn’t seem easy

Not sure what is behind the question - locks are Java's 
ReentrantReadWriteLock, so they are re-entrant (transactions are not - 
that would be a nested transaction, which is not supported)
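
To illustrate the re-entrancy point, here is a minimal sketch that uses only java.util.concurrent, not Jena's own lock classes. The class and method names are hypothetical. It shows that the same thread can enter a read lock twice, that a thread holding the write lock can also take the read lock, and that the reverse (upgrading from read to write) is refused by ReentrantReadWriteLock.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class ReentrancyDemo {
    // Enters the read lock twice on the same thread and returns the
    // number of read holds seen at the innermost point.
    static int nestedReadHolds(ReentrantReadWriteLock lock) {
        lock.readLock().lock();
        try {
            lock.readLock().lock(); // re-entering from the same thread does not block
            try {
                return lock.getReadHoldCount();
            } finally {
                lock.readLock().unlock();
            }
        } finally {
            lock.readLock().unlock();
        }
    }

    // A thread that holds the write lock may also acquire the read lock.
    static boolean readInsideWrite(ReentrantReadWriteLock lock) {
        lock.writeLock().lock();
        try {
            boolean acquired = lock.readLock().tryLock();
            if (acquired) {
                lock.readLock().unlock();
            }
            return acquired;
        } finally {
            lock.writeLock().unlock();
        }
    }

    // A thread that holds only the read lock cannot upgrade to the write lock.
    // tryLock() is used so the attempt fails fast instead of blocking forever.
    static boolean writeInsideRead(ReentrantReadWriteLock lock) {
        lock.readLock().lock();
        try {
            boolean acquired = lock.writeLock().tryLock();
            if (acquired) {
                lock.writeLock().unlock();
            }
            return acquired;
        } finally {
            lock.readLock().unlock();
        }
    }

    public static void main(String[] args) {
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        System.out.println("nested read holds: " + nestedReadHolds(lock));
        System.out.println("read inside write: " + readInsideWrite(lock));
        System.out.println("write inside read: " + writeInsideRead(lock));
    }
}
```

So nested read sections in helper methods are fine, but code that starts under a READ lock and then discovers it needs to write should release the read lock first and take the WRITE lock from the start.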

> - in a sparql query over a dataset with memory named graphs, if updates are possible, you must lock all the graphs (Dataset.listNames to get the names of the graphs, then loop over them to enterCriticalSection)

No - the Dataset must be locked.  Dataset.getLock()

Locking the models does not help protect the dataset in the general case.
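
The idea of one lock for the whole dataset can be sketched with a stand-in class. LockedDataset below is hypothetical and is not a Jena class; it only assumes that a dataset is a map from graph names to sets of triples. One reason per-graph locks fall short in such a structure is that adding a new named graph changes the map itself, which no individual graph lock covers.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for a dataset of in-memory named graphs (not a Jena class).
public class LockedDataset {
    private final Map<String, Set<String>> graphs = new HashMap<>();
    // A single lock guards both the map of graphs and the contents of every graph.
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    // Update path: exclusive lock on the whole dataset, even if only one graph changes.
    public void add(String graphName, String triple) {
        lock.writeLock().lock();
        try {
            graphs.computeIfAbsent(graphName, name -> new HashSet<>()).add(triple);
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Query path: shared lock on the whole dataset, so several readers can run together.
    public int countAll() {
        lock.readLock().lock();
        try {
            int total = 0;
            for (Set<String> graph : graphs.values()) {
                total += graph.size();
            }
            return total;
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

With Jena itself, the corresponding entry point is the lock returned by Dataset.getLock(), as stated above.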

> - if a sparql query doesn’t include updates, it is better to take just a READ lock

Yes

> - so a big question is: how do you know whether a query contains updates or not?

SPARQL Query and SPARQL Update are different languages precisely for 
this case.  This isn't SQL!

You can't parse SPARQL Update with the query parser - disjoint keywords.
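
Because the two languages are separate, a servlet can usually decide which lock to take before parsing anything. The sketch below assumes the request follows the SPARQL 1.1 Protocol, where a query arrives as a "query" parameter or an application/sparql-query body, and an update arrives as an "update" parameter or an application/sparql-update body. The class, enum and method names are hypothetical.

```java
import java.util.Map;

public class SparqlRequestKind {
    public enum Kind { QUERY, UPDATE, UNKNOWN }

    // Classifies a request from its form parameters and Content-Type header.
    // contentType may be null (for example on a GET request).
    public static Kind classify(Map<String, String> params, String contentType) {
        String type = contentType == null ? "" : contentType.toLowerCase();
        boolean isQuery = params.containsKey("query")
                || type.startsWith("application/sparql-query");
        boolean isUpdate = params.containsKey("update")
                || type.startsWith("application/sparql-update");
        if (isQuery == isUpdate) {
            // Neither or both: the servlet should reject the request.
            return Kind.UNKNOWN;
        }
        return isQuery ? Kind.QUERY : Kind.UPDATE;
    }

    // Queries only need the shared READ lock; updates need the exclusive WRITE lock.
    public static boolean needsWriteLock(Kind kind) {
        if (kind == Kind.UNKNOWN) {
            throw new IllegalArgumentException("request is neither a query nor an update");
        }
        return kind == Kind.UPDATE;
    }
}
```

The string is then handed to the matching parser (in Jena, QueryFactory for queries and UpdateFactory for updates), which rejects text written in the other language.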

> - if a sparql query doesn’t include updates, is it enough to take a READ lock only on the graphs included in the query?

No - one lock on the Dataset

> - (relevant only if the answer to the previous one is “yes”) do you get the graphs in a query with the union of getGraphURIs and getNamedGraphURIs?

>
> that’s all for the moment, thank you.
>
> Best Regards,
>
> fps
>