You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@jena.apache.org by "Claus Stadler (Jira)" <ji...@apache.org> on 2020/03/13 20:58:00 UTC

[jira] [Commented] (JENA-1861) Query not thread safe

    [ https://issues.apache.org/jira/browse/JENA-1861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17059073#comment-17059073 ] 

Claus Stadler commented on JENA-1861:
-------------------------------------

Here is example code that should eventually die due to a race condition, for me it takes a few seconds:

{code}
    @Test
    public void testRaceCondition() {
        Stream.generate(() -> QueryFactory.create("SELECT * { BIND(SHA256('foo') AS ?bar) }"))
            .peek(q -> q.setResultVars()) // <-- With this line commented out, the race condition happens earlier
             // Repeat q to increase chance to cause the race condition
            .forEach(q -> Arrays.asList(q, q, q, q, q, q, q, q).parallelStream()
                .forEach(query -> {
                    Model model = ModelFactory.createDefaultModel();
                    try(QueryExecution qe = QueryExecutionFactory.create(query, model)) {
                        ResultSetFormatter.consume(qe.execSelect());	
                    }
                }));
    }
{code}

> Query not thread safe
> ---------------------
>
>                 Key: JENA-1861
>                 URL: https://issues.apache.org/jira/browse/JENA-1861
>             Project: Apache Jena
>          Issue Type: Question
>          Components: ARQ
>    Affects Versions: Jena 3.14.0
>            Reporter: Claus Stadler
>            Priority: Major
>
> Executing the same query object on different RDFConnections is not thread safe:
> I ran into very misleading "NPE in NodeFactory.createLiteral" exceptions when computing SHA256 sums in parallel on different connections backed by different datasets/models using the SAME query object.
> I identified the cause as due to a race condition due to the digestCache used in [ExprDigest|https://github.com/apache/jena/blob/d95b7d295cebaeb2ea41029f4ee7781be94e5e85/jena-arq/src/main/java/org/apache/jena/sparql/expr/ExprDigest.java#L33]
> My first question is: Are Query objects - or rather expressions - supposed to carry execution state or is this rather a bug?
> I know that some parts of the Query object, such as result vars, are only initialized on request which makes use of the same Query object in different threads fragile to begin with.
> So my other question is: Given a Query object, is Jena supposed to allow for 'fully initializing' it, such that its execution using Jena's provided facilities (models, datasets, etc) is guaranteed to not modify its state?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)