Posted to dev@jena.apache.org by "Andy Seaborne (Jira)" <ji...@apache.org> on 2020/03/14 10:52:00 UTC

[jira] [Resolved] (JENA-1858) SERVICE in SPARQL blocks after a while

     [ https://issues.apache.org/jira/browse/JENA-1858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andy Seaborne resolved JENA-1858.
---------------------------------
    Fix Version/s: Jena 3.15.0
         Assignee: Andy Seaborne
       Resolution: Fixed

> SERVICE in SPARQL blocks after a while
> --------------------------------------
>
>                 Key: JENA-1858
>                 URL: https://issues.apache.org/jira/browse/JENA-1858
>             Project: Apache Jena
>          Issue Type: Bug
>          Components: ARQ
>    Affects Versions: Jena 3.14.0
>            Reporter: Claus Stadler
>            Assignee: Andy Seaborne
>            Priority: Major
>             Fix For: Jena 3.15.0
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> Hi once again :)
> I wanted to set up a quick RDF/SPARQL-based system for monitoring whether services are online or offline, along these lines:
> * A list of endpoints in [this dataset|https://github.com/SmartDataAnalytics/lodservatory/blob/master/sparql-endpoints-from-andre.ttl]
> * Have a CI process run this SPARQL query and publish/commit the results to a file
> {code}
> PREFIX eg: <http://www.example.org/>
> PREFIX dcat: <http://www.w3.org/ns/dcat#>
> CONSTRUCT {
>     ?s eg:serviceStatus ?status
> }
> {
>   ?s dcat:endpointURL ?e .
>   # Here we rely on Jena's substitution mechanism in QueryIterService.java, which is sufficient for my use case
>   SERVICE SILENT ?e { 
>     # If the request fails, we get a single binding without any variables bound
>     { SELECT ?t { ?x a ?t } LIMIT 1 }
>   }
>   BIND(IF(BOUND(?t), "online", "offline") AS ?status)
> }
> {code}
> However, the query blocks after a while because it exhausts the HTTP connection pool.
> I have not yet identified all sources, but one I could spot is here:
> * The InputStream opened at [Service.java#L172|https://github.com/apache/jena/blob/64253b9de5924006cdd46f1e3492a92031842d3b/jena-arq/src/main/java/org/apache/jena/sparql/engine/http/Service.java#L172] is not inside a try-catch block, so if the subsequent XML parsing fails, the stream is never closed.
> Maybe this points to other potential spots. I have a local Jena checkout and will try to find out whether there are any other leaks. My goal is to have the query complete on the whole endpoint list, even though many of the URLs refer to services that are by now broken.
> I am aware of the context settings in https://jena.apache.org/documentation/query/service.html but I have not adjusted them, timeouts in particular, because so far the issue really is the exhaustion of the connection pool.
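
The leak pattern described in the report can be illustrated with a small, self-contained Java sketch. It is not the code from Service.java and not the actual fix: TrackingStream, parse, readLeaky, readSafely and open are hypothetical names invented for this example, and parse is only a stand-in for the XML result parsing. The sketch compares closing the stream after a successful parse with closing it via try-with-resources, which also runs when parsing throws.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class ServiceStreamSketch {

    /** Hypothetical wrapper that records whether close() was called. */
    static final class TrackingStream extends FilterInputStream {
        boolean closed = false;

        TrackingStream(InputStream in) {
            super(in);
        }

        @Override
        public void close() throws IOException {
            closed = true;
            super.close();
        }
    }

    /** Stand-in for result parsing (assumption): fails unless the body starts with '<'. */
    static void parse(InputStream in) throws IOException {
        int first = in.read();
        if (first != '<') {
            throw new IOException("response body is not XML");
        }
    }

    /** Leaky shape: close() is only reached when parsing succeeds. */
    static boolean readLeaky(TrackingStream in) {
        try {
            parse(in);
            in.close(); // skipped if parse() throws
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    /** Safe shape: try-with-resources closes the stream on success and on failure. */
    static boolean readSafely(TrackingStream in) {
        try (InputStream body = in) {
            parse(body);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    /** Builds an in-memory stream that plays the role of an HTTP response body. */
    static TrackingStream open(String body) {
        return new TrackingStream(
                new ByteArrayInputStream(body.getBytes(StandardCharsets.UTF_8)));
    }

    public static void main(String[] args) {
        // A broken endpoint that answers with something other than XML.
        TrackingStream leaky = open("503 Service Unavailable");
        readLeaky(leaky);
        System.out.println("leaky closed: " + leaky.closed); // prints "leaky closed: false"

        TrackingStream safe = open("503 Service Unavailable");
        readSafely(safe);
        System.out.println("safe closed: " + safe.closed); // prints "safe closed: true"
    }
}
```

With a pooled HTTP client, a response stream that is never closed generally keeps its connection leased, so each failed parse in the leaky shape would take one connection out of circulation. That is consistent with the blocking behaviour reported above, although the sketch itself does not involve a real connection pool.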



--
This message was sent by Atlassian Jira
(v8.3.4#803005)