Posted to dev@jena.apache.org by Andy Seaborne <an...@apache.org> on 2019/07/07 16:11:15 UTC
Re: RDFSExptRuleReasoner / stopping early (JENA-1719)
On 30/06/2019 10:42, Dave Reynolds wrote:
> Having spent some more time on this I still can't find a safe fix for the
> underlying problem, but have created a brute-force workaround.
>
> https://github.com/apache/jena/pull/580
>
> As it says in the PR, properly fixing the underlying issue might need a
> serious redesign of the backward engine; it really shouldn't be this
> hard to clean up partial state, even with all the tabling. That's not
> something I can help with in the foreseeable future, hence going for a
> workaround.
>
> I suspect I've got the workflow wrong with this PR. I used my normal git
> process of developing the changes in a branch and then issuing a PR
> against that branch but maybe that's not the right thing to do with the
> apache setup. Apologies if I've messed up. Should I have done this in my
> own fork instead?
It's worked and I've merged it. A branch in the main repo has a similar
flow, including needing any local tidy-up afterwards.
If it is of any size, or rather will be open for any length of time, a
clone+branch is probably the way to go, but for immediate fixes (esp. when
your jena working copy is already "in use"), direct branches seem OK to me at
least.
Andy
>
> Dave
>
> On 09/06/2019 22:56, Dave Reynolds wrote:
>> I've spent some time on this but with limited results. At heart this
>> looks like a long-standing and deep-seated bug in the backward chainer,
>> but I'm not yet sure how to safely fix it.
>>
>> I can reproduce the problem fine using the recipe in the ticket but
>> haven't yet managed to create a stand-alone test case.
>>
>> More annoyingly, if I pause in a debugger, the test case works. I
>> thought this was because of the switch of tabledGoals in LBRuleEngine
>> to a weak hash map (for JENA-901), but if I remove the weakValues()
>> call the test remains non-deterministic.
>>
>> That aside, the problem is, as you might expect, the tabling of goals.
>>
>> The backward chainer can mark any or all predicates as tabled. Any
>> goal (basically a triple pattern) which is tabled is satisfied by
>> getting a Generator for it from the tabledGoals store. These goals
>> might be from the top level query or from body terms in the chain of
>> rules being fired. The Generator instance stores both the results for
>> the goal known so far and an interpreter instance with associated
>> state that can be used to generate more answers. Part of that
>> interpreter state can be an actual triple query to the underlying
>> data. That's what these TopLevelTripleMatchFrame instances are.
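[Editor's note: the tabling mechanism described above can be sketched in miniature. This is a hypothetical, simplified model for illustration only, not the actual LBRuleEngine/Generator API: a goal is reduced to a string, and an iterator stands in for the interpreter state.]

```java
import java.util.*;

// Toy model of goal tabling: each tabled goal maps to one shared Generator
// that caches the answers found so far and keeps the state needed to find
// more. (Hypothetical classes; not the real Jena ones.)
public class TablingDemo {
    static class Generator {
        final List<String> results = new ArrayList<>(); // answers known so far
        private final Iterator<String> source;          // stands in for interpreter state
        Generator(Iterator<String> source) { this.source = source; }

        // Pull one more answer from the underlying "interpreter" and cache it.
        Optional<String> next() {
            if (!source.hasNext()) return Optional.empty();
            String r = source.next();
            results.add(r);
            return Optional.of(r);
        }
    }

    // tabledGoals: one shared Generator per tabled goal (triple pattern).
    static final Map<String, Generator> tabledGoals = new HashMap<>();

    static Generator generatorFor(String goal, Iterator<String> source) {
        return tabledGoals.computeIfAbsent(goal, g -> new Generator(source));
    }

    public static void main(String[] args) {
        Generator g1 = generatorFor("?s rdf:type :Person",
                                    List.of("alice", "bob").iterator());
        g1.next(); // first query pulls one answer
        // A later query for the same goal gets the same Generator, with the
        // cached answer already available:
        Generator g2 = generatorFor("?s rdf:type :Person",
                                    Collections.emptyIterator());
        System.out.println(g1 == g2);   // the Generator is shared across queries
        System.out.println(g2.results); // and its cached results are reused
    }
}
```

A second query for the same goal sees the cached `results` first, which is why a Generator is expected to outlast any one query.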
>>
>> The Generators in the tabledGoals table are expected to outlast
>> individual queries (kind of the point of them), but that means that if
>> the query didn't run to completion the Generators can hold on to
>> state, including unclosed TopLevelTripleMatchFrame instances, and so to
>> open iterators in the store. Bad.
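[Editor's note: in miniature, the leak looks like this. A toy sketch with hypothetical classes, not the real Jena ones: `TripleMatch` stands in for TopLevelTripleMatchFrame and a plain list iterator stands in for the store.]

```java
import java.util.*;

// Toy sketch of the leak: a cached generator holds an open "triple match"
// on the store, and abandoning the query early closes nothing.
public class LeakDemo {
    // Stands in for TopLevelTripleMatchFrame: a cursor over the store that
    // must be closed to release the underlying store iterator.
    static class TripleMatch implements AutoCloseable {
        boolean closed = false;
        final Iterator<String> it = List.of("s1", "s2", "s3").iterator();
        @Override public void close() { closed = true; }
    }

    // Stands in for a Generator cached in tabledGoals: it outlives queries.
    static class Generator {
        final TripleMatch match = new TripleMatch();
        final List<String> results = new ArrayList<>();
        String next() {
            String r = match.it.next();
            results.add(r);
            return r;
        }
    }

    public static void main(String[] args) {
        Map<String, Generator> tabledGoals = new HashMap<>();
        Generator g = tabledGoals.computeIfAbsent("?s a :Person", k -> new Generator());
        g.next(); // LIMIT 1: take one answer, then abandon the query
        // The query is finished, but the generator stays in tabledGoals with
        // its TripleMatch (and hence a store iterator) still open:
        System.out.println("still open: " + !g.match.closed);
    }
}
```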
>>
>> If the top level query is closed early then the LPTopGoalIterator does
>> try to clean up the engine state by looking for Generators that have
>> completed, and does close the top level interpreter. However, it
>> doesn't seem to do anything about the remaining in-flight Generators,
>> and I assume that's the underlying problem. There needs to be some
>> sort of scan to close those down and remove them from
>> tabledGoals (so as not to poison future runs by having an incomplete
>> set of results in the goal table). That all looks really tricky to do
>> when there can be concurrent top-level queries all pulling on the same
>> set of tabled generators.
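[Editor's note: in outline, such a scan might look something like the hypothetical sketch below, with toy types. The hard part, as noted above, is doing this safely while concurrent top-level queries are still pulling on the same generators; this sketch ignores that entirely.]

```java
import java.util.*;

// Hypothetical cleanup scan over the goal table: close any generator that
// has not run to completion and drop it from the table, so its partial
// result set cannot poison later queries. (Toy types, no concurrency.)
public class CleanupDemo {
    static class Generator implements AutoCloseable {
        final boolean complete; // did this generator produce all its answers?
        boolean closed = false;
        Generator(boolean complete) { this.complete = complete; }
        @Override public void close() { closed = true; }
    }

    static void closeIncomplete(Map<String, Generator> tabledGoals) {
        Iterator<Map.Entry<String, Generator>> it = tabledGoals.entrySet().iterator();
        while (it.hasNext()) {
            Generator g = it.next().getValue();
            if (!g.complete) {
                g.close();   // release any underlying store iterators
                it.remove(); // forget the partial result set
            }
        }
    }

    public static void main(String[] args) {
        Map<String, Generator> table = new HashMap<>();
        table.put("done", new Generator(true));
        table.put("inflight", new Generator(false));
        closeIncomplete(table);
        System.out.println(table.keySet()); // only the complete generator survives
    }
}
```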
>>
>> However, there's lots I don't understand about the current behaviour
>> (amazing what you can forget about complex code in 10 years!).
>>
>> - I'm surprised that LPInterpreter doesn't also close any associated
>> TopLevelTripleMatchFrame. That on its own wouldn't (and doesn't) fix
>> this problem, because the relevant interpreter is not itself closed, but
>> I'm surprised we don't see lots of unclosed iterators lying around (or
>> at least none that we've noticed).
>>
>> - Given that some interpreters *are* closed (e.g. the top level one) I
>> would have expected those to need to be removed from the tabledGoals
>> if they weren't complete. I can't see any code to do that but if it's
>> not being done I can't understand how the system works at all!
>>
>> So no real help yet I'm afraid.
>>
>> Dave
>>
>>
>> On 08/06/2019 12:34, Andy Seaborne wrote:
>>> Even just some pointers as to where in the code it should close
>>> TopLevelTripleMatchFrame (if it should close it) would be helpful, if
>>> getting it set up is a barrier. I don't know the rules code well
>>> enough.
>>>
>>> -------------------
>>>
>>> The example on JENA-1719 after editing the <file:> and tdb2:location
>>> runs.
>>>
>>> Files in /home/afs/DIR/
>>>
>>>
>>> Load data:
>>>     cd DIR
>>>     tdb2.tdbloader --loc=graph data.nt
>>>
>>> I run it in Eclipse with
>>>
>>>     public static void main(String[] argv) throws Exception {
>>>         FusekiMainCmd.main("--config=/home/afs/DIR/example.ttl");
>>>     }
>>>
>>> My example.ttl below.
>>>
>>> Many thanks
>>> Andy
>>>
>>> On 08/06/2019 12:22, Dave Reynolds wrote:
>>>> Hi Andy,
>>>>
>>>> Sure, I'll try to take a look over the weekend if I can get a
>>>> working dev environment set up.
>>>>
>>>> Dave
>>>>
>>>> On 07/06/2019 13:46, Andy Seaborne wrote:
>>>>> Dave,
>>>>>
>>>>> I am hoping you can give me a some pointers so I can solve JENA-1719.
>>>>> https://issues.apache.org/jira/browse/JENA-1719
>>>>>
>>>>> Setup:
>>>>> data in TDB2.
>>>>> ontology in a memory model.
>>>>> RDFSExptRuleReasoner
>>>>>
>>>>> First query has a short limit:
>>>>> select * where {?s a <http://example.com/ns/Person>} limit 1
>>>>> This returns one answer.
>>>>>
>>>>> The iterator (LPTopGoalIterator) over the inf graph does get closed.
>>>>>
>>>>> Second query has a longer limit
>>>>> select * where {?s a <http://example.com/ns/Person>} limit 1000
>>>>>
>>>>> (If the long query is used first, the second query works)
>>>>>
>>>>> Problem:
>>>>>
>>>>> The second query is using an iterator created during the first query.
>>>>>
>>>>> During the LIMIT 1 query:
>>>>>
>>>>> TopLevelTripleMatchFrame constructor
>>>>> TopLevelTripleMatchFrame constructor
>>>>> TopLevelTripleMatchFrame constructor
>>>>> TopLevelTripleMatchFrame constructor
>>>>> TopLevelTripleMatchFrame constructor
>>>>> closeIterator[?s, rdf:type, http://example.com/ns/Person]  <-- The query
>>>>> LPInterpreter.close[cpFrame:ConsumerChoicePointFrame]
>>>>> LPInterpreter.close[cpFrame:null]
>>>>> LPInterpreter.close[cpFrame:ConsumerChoicePointFrame]
>>>>> LPInterpreter.close[cpFrame:null]
>>>>>
>>>>>
>>>>> TopLevelTripleMatchFrame.close is not called - and the same is true
>>>>> generally during startup.
>>>>>
>>>>> Thanks,
>>>>> Andy
>>>
>>> @prefix : <#> .
>>> @prefix fuseki: <http://jena.apache.org/fuseki#> .
>>> @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
>>> @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
>>> @prefix tdb2: <http://jena.apache.org/2016/tdb#> .
>>> @prefix ja: <http://jena.hpl.hp.com/2005/11/Assembler#> .
>>>
>>> <#service> rdf:type fuseki:Service ;
>>>     rdfs:label                        "Fuseki service" ;
>>>     fuseki:name                       "example" ;
>>>     fuseki:serviceQuery               "query" ;
>>>     fuseki:serviceQuery               "sparql" ;
>>>     fuseki:serviceUpdate              "update" ;
>>>     fuseki:serviceUpload              "upload" ;
>>>     fuseki:serviceReadWriteGraphStore "data" ;
>>>     fuseki:serviceReadGraphStore      "get" ;
>>>     fuseki:dataset                    <#inf_dataset> ;
>>>     .
>>>
>>> <#inf_dataset> rdf:type ja:RDFDataset ;
>>>     ja:defaultGraph <#model> .
>>>
>>> <#model> rdf:type ja:InfModel ;
>>>     ja:reasoner  <#reasoner> ;
>>>     ja:baseModel <#tdb_graph> .
>>>
>>> <#reasoner> rdf:type ja:ReasonerFactory ;
>>>     ja:reasonerURL <http://jena.hpl.hp.com/2003/RDFSExptRuleReasoner> ;
>>>     ja:schema <#ontology> .
>>>
>>> <#ontology> rdf:type ja:MemoryModel ;
>>>     ja:content [ ja:externalContent <file:///home/afs/DIR/example.owl> ] .
>>>
>>> <#tdb_graph> rdf:type tdb2:GraphTDB ;
>>>     tdb2:dataset <#tdb2_dataset> .
>>>
>>> <#tdb2_dataset> rdf:type tdb2:DatasetTDB2 ;
>>>     tdb2:location "/home/afs/DIR/graph" .