Posted to issues@jena.apache.org by GitBox <gi...@apache.org> on 2022/11/10 09:13:43 UTC
[GitHub] [jena] LorenzBuehmann opened a new issue, #1614: Property path handling in query optimizer and timeout handler
LorenzBuehmann opened a new issue, #1614:
URL: https://github.com/apache/jena/issues/1614
### Version
4.7.0-SNAPSHOT
### Question
Hi,
- Jena 4.7.0-SNAPSHOT
- TDB2
- Fuseki
### Part 1
We deployed the Wikidata truthy dump and have a query that (probably) runs "forever":
``` sparql
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?item ?itemLabel WHERE {
?item (rdfs:label|skos:altLabel) ?itemLabel;
rdf:type wikibase:Property.
FILTER(CONTAINS(LCASE(?itemLabel), "border"@en))
}
LIMIT 10
```
There are 7 results in total (as returned by the reordered query, which does terminate).
You can see the usage of a property path in the first triple pattern. It looks like the BGP is never reordered here? We also computed the TDB stats:
| Pattern | Count |
|------------------------------|---------------:|
| `rdfs:label` | 510 188 728 |
| `skos:altLabel` | 102 554 842 |
| `rdf:type wikibase:Property` | 9003 |
The second triple pattern matches only about 9,000 triples (9003 according to the stats), so it is fairly small.
Reordering the triple patterns indeed helps, but the user would have to know this. I can imagine that estimating the result size of a property path is difficult ...
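To illustrate why the order matters, here is a minimal Python sketch of stats-based ordering using the counts from the table above. It is not Jena's reorder algorithm: the names `estimate` and `order_patterns` are hypothetical, and estimating an alternative path `a|b` as the sum of its branches is an assumption made only for this example.

```python
# Hypothetical sketch: order patterns by estimated result size.
# Counts come from the TDB stats table above; this is NOT Jena's optimizer.
STATS = {
    "rdfs:label": 510_188_728,
    "skos:altLabel": 102_554_842,
    "rdf:type wikibase:Property": 9_003,
}


def estimate(pattern):
    """Estimate matches; an alternative path 'a|b' is assumed to cost |a| + |b|."""
    return sum(STATS[branch] for branch in pattern.split("|"))


def order_patterns(patterns):
    """Put the most selective pattern (smallest estimate) first."""
    return sorted(patterns, key=estimate)


patterns = ["rdfs:label|skos:altLabel", "rdf:type wikibase:Property"]
ordered = order_patterns(patterns)
print(ordered)
```

With these numbers the `rdf:type` pattern sorts first, so the path would only have to be evaluated for the roughly 9,000 bound `?item` values instead of starting from an ungrounded subject.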
### Part 2
With the same query we noticed that the query execution timeout is never honored. It looks like execution is stuck in a branch where query cancellation isn't checked, possibly while calling `GraphUtils.allNodes(graph)` from `PathLib` (line 245)?
Relevant part of the jstack dump:
```
"qtp123458189-28" #28 prio=5 os_prio=0 cpu=45409.83ms elapsed=198.44s tid=0x00007f3831342000 nid=0x3b runnable [0x00007f1eacefa000]
java.lang.Thread.State: RUNNABLE
at java.io.RandomAccessFile.seek0(java.base@11.0.16.1/Native Method)
at java.io.RandomAccessFile.seek(java.base@11.0.16.1/RandomAccessFile.java:591)
at org.apache.jena.dboe.base.file.BinaryDataFileRandomAccess.seek(BinaryDataFileRandomAccess.java:95)
at org.apache.jena.dboe.base.file.BinaryDataFileRandomAccess.read(BinaryDataFileRandomAccess.java:71)
at org.apache.jena.dboe.base.file.BinaryDataFileWriteBuffered.read(BinaryDataFileWriteBuffered.java:121)
- locked <0x00007f202a7f73f8> (a java.lang.Object)
at org.apache.jena.dboe.trans.data.TransBinaryDataFile.read(TransBinaryDataFile.java:197)
at org.apache.jena.tdb2.store.nodetable.TReadAppendFileTransport.read(TReadAppendFileTransport.java:74)
at org.apache.thrift.transport.TTransport.readAll(TTransport.java:100)
at org.apache.thrift.protocol.TCompactProtocol.readByte(TCompactProtocol.java:622)
at org.apache.thrift.protocol.TCompactProtocol.readFieldBegin(TCompactProtocol.java:522)
at org.apache.thrift.TUnion$TUnionStandardScheme.read(TUnion.java:247)
at org.apache.thrift.TUnion$TUnionStandardScheme.read(TUnion.java:227)
at org.apache.thrift.TUnion.read(TUnion.java:145)
at org.apache.jena.tdb2.store.nodetable.NodeTableTRDF.readNodeFromTable(NodeTableTRDF.java:82)
at org.apache.jena.tdb2.store.nodetable.NodeTableNative._retrieveNodeByNodeId(NodeTableNative.java:102)
- locked <0x00007f203af76558> (a org.apache.jena.tdb2.store.nodetable.NodeTableTRDF)
at org.apache.jena.tdb2.store.nodetable.NodeTableNative.getNodeForNodeId(NodeTableNative.java:52)
at org.apache.jena.tdb2.store.nodetable.NodeTableCache._retrieveNodeByNodeId(NodeTableCache.java:208)
- locked <0x00007f203af6b168> (a java.lang.Object)
at org.apache.jena.tdb2.store.nodetable.NodeTableCache.getNodeForNodeId(NodeTableCache.java:133)
at org.apache.jena.tdb2.store.nodetable.NodeTableWrapper.getNodeForNodeId(NodeTableWrapper.java:52)
at org.apache.jena.tdb2.store.nodetable.NodeTableInline.getNodeForNodeId(NodeTableInline.java:65)
at org.apache.jena.tdb2.lib.TupleLib.triple(TupleLib.java:77)
at org.apache.jena.tdb2.lib.TupleLib.triple(TupleLib.java:66)
at org.apache.jena.tdb2.lib.TupleLib.lambda$convertToTriples$2(TupleLib.java:48)
at org.apache.jena.tdb2.lib.TupleLib$$Lambda$643/0x00007f1e464db4b0.apply(Unknown Source)
at org.apache.jena.atlas.iterator.Iter$IterMap.next(Iter.java:417)
at org.apache.jena.atlas.iterator.IteratorWrapper.next(IteratorWrapper.java:41)
at org.apache.jena.dboe.transaction.txn.IteratorTxnTracker.next(IteratorTxnTracker.java:39)
at org.apache.jena.atlas.iterator.Iter$IterMap.next(Iter.java:417)
at org.apache.jena.atlas.iterator.Iter$IterMap.next(Iter.java:417)
at org.apache.jena.atlas.iterator.Iter.next(Iter.java:1109)
at org.apache.jena.util.iterator.WrappedIterator.next(WrappedIterator.java:107)
at org.apache.jena.sparql.util.graph.GraphUtils.allNodes(GraphUtils.java:240)
at org.apache.jena.sparql.path.PathLib.determineUngroundedStartingSet(PathLib.java:245)
at org.apache.jena.sparql.path.PathLib.execUngroundedPath(PathLib.java:182)
at org.apache.jena.sparql.path.PathLib.execTriplePath(PathLib.java:128)
at org.apache.jena.sparql.path.PathLib.execTriplePath(PathLib.java:108)
at org.apache.jena.sparql.engine.iterator.QueryIterPath.nextStage(QueryIterPath.java:47)
at org.apache.jena.sparql.engine.iterator.QueryIterRepeatApply.makeNextStage(QueryIterRepeatApply.java:100)
at org.apache.jena.sparql.engine.iterator.QueryIterRepeatApply.hasNextBinding(QueryIterRepeatApply.java:60)
at org.apache.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:116)
at org.apache.jena.sparql.engine.main.iterator.QueryIterGraph$QueryIterGraphInner.hasNextBinding(QueryIterGraph.java:121)
at org.apache.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:116)
at org.apache.jena.sparql.engine.iterator.QueryIterRepeatApply.hasNextBinding(QueryIterRepeatApply.java:69)
at org.apache.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:116)
at org.apache.jena.sparql.engine.iterator.QueryIterProcessBinding.hasNextBinding(QueryIterProcessBinding.java:77)
at org.apache.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:116)
at org.apache.jena.tdb2.solver.OpExecutorTDB2.optimizeExecuteQuads(OpExecutorTDB2.java:227)
at org.apache.jena.tdb2.solver.OpExecutorTDB2.execute(OpExecutorTDB2.java:164)
at org.apache.jena.sparql.engine.main.ExecutionDispatch.visit(ExecutionDispatch.java:66)
at org.apache.jena.sparql.algebra.op.OpQuadPattern.visit(OpQuadPattern.java:87)
at org.apache.jena.sparql.engine.main.ExecutionDispatch.exec(ExecutionDispatch.java:46)
at org.apache.jena.sparql.engine.main.OpExecutor.exec(OpExecutor.java:119)
at org.apache.jena.tdb2.solver.OpExecutorTDB2.exec(OpExecutorTDB2.java:87)
at org.apache.jena.sparql.engine.main.OpExecutor.execute(OpExecutor.java:230)
at org.apache.jena.sparql.engine.main.ExecutionDispatch.visit(ExecutionDispatch.java:130)
at org.apache.jena.sparql.algebra.op.OpSequence.visit(OpSequence.java:75)
at org.apache.jena.sparql.engine.main.ExecutionDispatch.exec(ExecutionDispatch.java:46)
at org.apache.jena.sparql.engine.main.OpExecutor.exec(OpExecutor.java:119)
at org.apache.jena.tdb2.solver.OpExecutorTDB2.exec(OpExecutorTDB2.java:87)
at org.apache.jena.sparql.engine.main.OpExecutor.execute(OpExecutor.java:391)
at org.apache.jena.sparql.engine.main.ExecutionDispatch.visit(ExecutionDispatch.java:267)
at org.apache.jena.sparql.algebra.op.OpProject.visit(OpProject.java:47)
at org.apache.jena.sparql.engine.main.ExecutionDispatch.exec(ExecutionDispatch.java:46)
at org.apache.jena.sparql.engine.main.OpExecutor.exec(OpExecutor.java:119)
at org.apache.jena.tdb2.solver.OpExecutorTDB2.exec(OpExecutorTDB2.java:87)
at org.apache.jena.sparql.engine.main.OpExecutor.execute(OpExecutor.java:401)
at org.apache.jena.sparql.engine.main.ExecutionDispatch.visit(ExecutionDispatch.java:307)
at org.apache.jena.sparql.algebra.op.OpSlice.visit(OpSlice.java:50)
at org.apache.jena.sparql.engine.main.ExecutionDispatch.exec(ExecutionDispatch.java:46)
at org.apache.jena.sparql.engine.main.OpExecutor.exec(OpExecutor.java:119)
at org.apache.jena.tdb2.solver.OpExecutorTDB2.exec(OpExecutorTDB2.java:87)
at org.apache.jena.sparql.engine.main.OpExecutor.execute(OpExecutor.java:90)
at org.apache.jena.sparql.engine.main.QC.execute(QC.java:53)
at org.apache.jena.sparql.engine.main.QueryEngineMain.eval(QueryEngineMain.java:55)
at org.apache.jena.tdb2.solver.QueryEngineTDB.eval(QueryEngineTDB.java:108)
at org.apache.jena.sparql.engine.QueryEngineBase.evaluate(QueryEngineBase.java:171)
at org.apache.jena.sparql.engine.QueryEngineBase.createPlan(QueryEngineBase.java:130)
at org.apache.jena.sparql.engine.QueryEngineBase.getPlan(QueryEngineBase.java:112)
at org.apache.jena.tdb2.solver.QueryEngineTDB$QueryEngineFactoryTDB.create(QueryEngineTDB.java:138)
at org.apache.jena.sparql.engine.QueryEngineFactoryWrapper.create(QueryEngineFactoryWrapper.java:49)
at org.apache.jena.sparql.exec.QueryExecDataset.getPlan(QueryExecDataset.java:531)
at org.apache.jena.sparql.exec.QueryExecDataset.startQueryIterator(QueryExecDataset.java:494)
at org.apache.jena.sparql.exec.QueryExecDataset.execute(QueryExecDataset.java:173)
at org.apache.jena.sparql.exec.QueryExecDataset.select(QueryExecDataset.java:167)
at org.apache.jena.sparql.exec.QueryExecutionAdapter.execSelect(QueryExecutionAdapter.java:115)
at org.apache.jena.fuseki.servlets.SPARQLQueryProcessor.executeQuery(SPARQLQueryProcessor.java:374)
at org.apache.jena.fuseki.servlets.SPARQLQueryProcessor.execute(SPARQLQueryProcessor.java:279)
at org.apache.jena.fuseki.servlets.SPARQLQueryProcessor.executeWithParameter(SPARQLQueryProcessor.java:224)
at org.apache.jena.fuseki.servlets.SPARQLQueryProcessor.execute(SPARQLQueryProcessor.java:209)
at org.apache.jena.fuseki.servlets.ActionService.executeLifecycle(ActionService.java:58)
at org.apache.jena.fuseki.servlets.SPARQLQueryProcessor.execPost(SPARQLQueryProcessor.java:84)
at org.apache.jena.fuseki.servlets.ActionProcessor.process(ActionProcessor.java:34)
at org.apache.jena.fuseki.servlets.ActionBase.process(ActionBase.java:54)
at org.apache.jena.fuseki.servlets.ActionExecLib.execActionSub(ActionExecLib.java:124)
at org.apache.jena.fuseki.servlets.ActionExecLib.execAction(ActionExecLib.java:98)
at org.apache.jena.fuseki.server.Dispatcher.dispatchAction(Dispatcher.java:164)
at org.apache.jena.fuseki.server.Dispatcher.process(Dispatcher.java:156)
at org.apache.jena.fuseki.server.Dispatcher.dispatch(Dispatcher.java:83)
at org.apache.jena.fuseki.servlets.FusekiFilter.doFilter(FusekiFilter.java:48)
at org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:202)
at org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1635)
at org.apache.shiro.web.servlet.ProxiedFilterChain.doFilter(ProxiedFilterChain.java:61)
at org.apache.shiro.web.servlet.AdviceFilter.executeChain(AdviceFilter.java:108)
at org.apache.shiro.web.servlet.AdviceFilter.doFilterInternal(AdviceFilter.java:137)
at org.apache.shiro.web.servlet.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:154)
at org.apache.shiro.web.servlet.ProxiedFilterChain.doFilter(ProxiedFilterChain.java:66)
at org.apache.shiro.web.servlet.AdviceFilter.executeChain(AdviceFilter.java:108)
at org.apache.shiro.web.servlet.AdviceFilter.doFilterInternal(AdviceFilter.java:137)
at org.apache.shiro.web.servlet.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:154)
at org.apache.shiro.web.servlet.ProxiedFilterChain.doFilter(ProxiedFilterChain.java:66)
at org.apache.shiro.web.servlet.AbstractShiroFilter.executeChain(AbstractShiroFilter.java:458)
at org.apache.shiro.web.servlet.AbstractShiroFilter$1.call(AbstractShiroFilter.java:373)
at org.apache.shiro.subject.support.SubjectCallable.doCall(SubjectCallable.java:90)
at org.apache.shiro.subject.support.SubjectCallable.call(SubjectCallable.java:83)
at org.apache.shiro.subject.support.DelegatingSubject.execute(DelegatingSubject.java:387)
at org.apache.shiro.web.servlet.AbstractShiroFilter.doFilterInternal(AbstractShiroFilter.java:370)
at org.apache.shiro.web.servlet.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:154)
at org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:202)
at org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1635)
at org.apache.jena.fuseki.servlets.CrossOriginFilter.handle(CrossOriginFilter.java:344)
at org.apache.jena.fuseki.servlets.CrossOriginFilter.doFilter(CrossOriginFilter.java:298)
at org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:210)
at org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1635)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:527)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:131)
at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:578)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:122)
at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:223)
at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1571)
at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:221)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1383)
at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:176)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:484)
at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1544)
at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:174)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1305)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:129)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:122)
at org.eclipse.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:822)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:122)
at org.eclipse.jetty.server.Server.handle(Server.java:563)
at org.eclipse.jetty.server.HttpChannel.lambda$handle$0(HttpChannel.java:505)
at org.eclipse.jetty.server.HttpChannel$$Lambda$768/0x00007f1d7cda2440.dispatch(Unknown Source)
at org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:762)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:497)
at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:282)
at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:314)
at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:100)
at org.eclipse.jetty.io.SelectableChannelEndPoint$1.run(SelectableChannelEndPoint.java:53)
at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.runTask(AdaptiveExecutionStrategy.java:421)
at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.consumeTask(AdaptiveExecutionStrategy.java:390)
at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.tryProduce(AdaptiveExecutionStrategy.java:277)
at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.produce(AdaptiveExecutionStrategy.java:199)
at org.eclipse.jetty.io.ManagedSelector$$Lambda$725/0x00007f1d7d757058.run(Unknown Source)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:933)
at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1077)
at java.lang.Thread.run(java.base@11.0.16.1/Thread.java:829)
```
Let me know if we can provide more details.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@jena.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@jena.apache.org
For additional commands, e-mail: issues-help@jena.apache.org
[GitHub] [jena] afs commented on issue #1614: Property path handling in query optimizer and timeout handler
Posted by GitBox <gi...@apache.org>.
afs commented on issue #1614:
URL: https://github.com/apache/jena/issues/1614#issuecomment-1327383351
Continued on GH-1629.
[GitHub] [jena] afs commented on issue #1614: Property path handling in query optimizer and timeout handler
Posted by GitBox <gi...@apache.org>.
afs commented on issue #1614:
URL: https://github.com/apache/jena/issues/1614#issuecomment-1310237230
> I guess, UNION clauses can't also be optimized
The filter is being pushed in; there isn't any reordering of rdf:type (which is usually assumed to be a risky choice).
```
(sequence
  (union
    (filter (contains (lcase ?itemLabel) "border"@en)
      (quadpattern (quad <urn:x-arq:DefaultGraphNode> ?item <http://www.w3.org/2000/01/rdf-schema#label> ?itemLabel)))
    (filter (contains (lcase ?itemLabel) "border"@en)
      (quadpattern (quad <urn:x-arq:DefaultGraphNode> ?item <http://www.w3.org/2004/02/skos/core#altLabel> ?itemLabel))))
  (quadpattern (quad <urn:x-arq:DefaultGraphNode> ?item <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://wikiba.se/ontology#Property>)))
```
[GitHub] [jena] afs commented on issue #1614: Property path handling in query optimizer and timeout handler
Posted by GitBox <gi...@apache.org>.
afs commented on issue #1614:
URL: https://github.com/apache/jena/issues/1614#issuecomment-1310233335
> in https://github.com/apache/jena/blob/main/jena-arq/src/main/java/org/apache/jena/sparql/algebra/optimize/TransformPathFlattern.java#L35 there is a comment that P_Alt is not converted into OpUnion, but do you know why? What was the downside of this particular optimisation?
No idea.
It might be a bad idea if the expression on either side were complex.
Seems reasonable if the LHS and RHS are simple.
[JENA-2325](https://issues.apache.org/jira/browse/JENA-2325)
Maybe: [JENA-1300](https://issues.apache.org/jira/browse/JENA-1300)
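The "simple LHS and RHS" condition can be sketched in a few lines of Python. This is only an illustration on a toy path representation, not the code of Jena's path transform: the classes `Link`, `Alt` and `Star` and the function `flatten_alt` are hypothetical names invented for this example. The rewrite is applied only when every branch of the alternative is a plain predicate; anything more complex is left as a path.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Link:
    """A single predicate IRI, e.g. rdfs:label (hypothetical toy type)."""
    iri: str


@dataclass(frozen=True)
class Alt:
    """Path alternative: left | right (hypothetical toy type)."""
    left: "Path"
    right: "Path"


@dataclass(frozen=True)
class Star:
    """Zero-or-more repetition, standing in for any 'complex' path."""
    sub: "Path"


Path = Union[Link, Alt, Star]


def flatten_alt(subject, path, obj):
    """Return the branches of a union as a list of triple patterns if every
    branch of the alternative is a plain link; return None to signal that
    the path should be left untouched."""
    if isinstance(path, Link):
        return [(subject, path.iri, obj)]
    if isinstance(path, Alt):
        left = flatten_alt(subject, path.left, obj)
        right = flatten_alt(subject, path.right, obj)
        if left is None or right is None:
            return None
        return left + right
    return None  # anything else counts as complex in this sketch


simple = Alt(Link("rdfs:label"), Link("skos:altLabel"))
print(flatten_alt("?item", simple, "?itemLabel"))
```

For the query in this issue, the sketch yields two plain triple patterns, which corresponds to the shape of the hand-written `UNION` query.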
[GitHub] [jena] afs commented on issue #1614: Property path handling in query optimizer and timeout handler
Posted by GitBox <gi...@apache.org>.
afs commented on issue #1614:
URL: https://github.com/apache/jena/issues/1614#issuecomment-1310761363
> Without having looked into this in detail, maybe it's possible to mitigate this aspect of the raised issues by using a QueryIterDistinct.
Should be possible.
And possibly push `QueryIterator` usage properly into `PathEvaluator`, which is built around normal iterators.
[GitHub] [jena] afs commented on issue #1614: Property path handling in query optimizer and timeout handler
Posted by GitBox <gi...@apache.org>.
afs commented on issue #1614:
URL: https://github.com/apache/jena/issues/1614#issuecomment-1310140269
> Jena 4.7.0-SNAPSHOT
Which date of 4.7.0-SNAPSHOT?
Did other versions of Jena work?
It might be the reordering; then again, it also does not seem to convert the path alternatives to a union.
[GitHub] [jena] LorenzBuehmann commented on issue #1614: Property path handling in query optimizer and timeout handler
Posted by GitBox <gi...@apache.org>.
LorenzBuehmann commented on issue #1614:
URL: https://github.com/apache/jena/issues/1614#issuecomment-1310230675
### Jena version
```
tdb2.tdbquery --version
Jena: VERSION: 4.7.0-SNAPSHOT
Jena: BUILD_DATE: 2022-11-10T12:32:30Z
```
### Property path query
```
tdb2.tdbquery --loc /data/coypu/tdb2/wikidata --explain "PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?item ?itemLabel WHERE {
?item rdfs:label|skos:altLabel ?itemLabel ;
rdf:type wikibase:Property.
FILTER(CONTAINS(LCASE(?itemLabel), 'border'@en))
}
LIMIT 10"
12:42:50 INFO exec :: QUERY
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX wikibase: <http://wikiba.se/ontology#>
SELECT ?item ?itemLabel
WHERE
{ ?item rdfs:label|skos:altLabel ?itemLabel .
?item rdf:type wikibase:Property
FILTER contains(lcase(?itemLabel), "border"@en)
}
LIMIT 10
12:42:50 INFO exec :: ALGEBRA
(slice _ 10
(project (?item ?itemLabel)
(sequence
(filter (contains (lcase ?itemLabel) "border"@en)
(graph <urn:x-arq:DefaultGraphNode>
(path ?item (alt <http://www.w3.org/2000/01/rdf-schema#label> <http://www.w3.org/2004/02/skos/core#altLabel>) ?itemLabel)))
(quadpattern (quad <urn:x-arq:DefaultGraphNode> ?item <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://wikiba.se/ontology#Property>)))))
12:42:50 INFO exec :: TDB2
(slice _ 10
(project (?item ?itemLabel)
(sequence
(filter (contains (lcase ?itemLabel) "border"@en)
(graph <urn:x-arq:DefaultGraphNode>
(path ?item (alt <http://www.w3.org/2000/01/rdf-schema#label> <http://www.w3.org/2004/02/skos/core#altLabel>) ?itemLabel)))
(quadpattern (quad <urn:x-arq:DefaultGraphNode> ?item <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://wikiba.se/ontology#Property>)))))
12:42:50 INFO exec :: TDB2
(path ?item (alt <http://www.w3.org/2000/01/rdf-schema#label> <http://www.w3.org/2004/02/skos/core#altLabel>) ?itemLabel)
12:42:50 INFO exec :: Path :: ?item <http://www.w3.org/2000/01/rdf-schema#label>|<http://www.w3.org/2004/02/skos/core#altLabel> ?itemLabel
```
### Union query
I guess, `UNION` clauses can't also be optimized, right? For the rewritten query the transformation would be trivial, but I guess it is not reordered either? For such tiny queries one could indeed estimate the size of the `UNION` (`|A| + |B|`) and optimize accordingly, but there always seems to be a trade-off: for bigger BGPs inside the `UNION` it gets more complex.
```
tdb2.tdbquery --loc /data/coypu/tdb2/wikidata --explain "
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?item ?itemLabel WHERE {
{?item rdfs:label ?itemLabel } UNION {?item skos:altLabel ?itemLabel;}
?item rdf:type wikibase:Property.
FILTER(CONTAINS(LCASE(?itemLabel), 'border'@en))
}
LIMIT 10"
12:37:15 INFO exec :: QUERY
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX wikibase: <http://wikiba.se/ontology#>
SELECT ?item ?itemLabel
WHERE
{ { ?item rdfs:label ?itemLabel }
UNION
{ ?item skos:altLabel ?itemLabel }
?item rdf:type wikibase:Property
FILTER contains(lcase(?itemLabel), "border"@en)
}
LIMIT 10
12:37:15 INFO exec :: ALGEBRA
(slice _ 10
(project (?item ?itemLabel)
(sequence
(union
(filter (contains (lcase ?itemLabel) "border"@en)
(quadpattern (quad <urn:x-arq:DefaultGraphNode> ?item <http://www.w3.org/2000/01/rdf-schema#label> ?itemLabel)))
(filter (contains (lcase ?itemLabel) "border"@en)
(quadpattern (quad <urn:x-arq:DefaultGraphNode> ?item <http://www.w3.org/2004/02/skos/core#altLabel> ?itemLabel))))
(quadpattern (quad <urn:x-arq:DefaultGraphNode> ?item <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://wikiba.se/ontology#Property>)))))
12:37:15 INFO exec :: TDB2
(slice _ 10
(project (?item ?itemLabel)
(sequence
(union
(filter (contains (lcase ?itemLabel) "border"@en)
(quadpattern (quad <urn:x-arq:DefaultGraphNode> ?item <http://www.w3.org/2000/01/rdf-schema#label> ?itemLabel)))
(filter (contains (lcase ?itemLabel) "border"@en)
(quadpattern (quad <urn:x-arq:DefaultGraphNode> ?item <http://www.w3.org/2004/02/skos/core#altLabel> ?itemLabel))))
(quadpattern (quad <urn:x-arq:DefaultGraphNode> ?item <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://wikiba.se/ontology#Property>)))))
12:37:15 INFO exec :: TDB2
(filter (contains (lcase ?itemLabel) "border"@en)
(quadpattern (quad <urn:x-arq:DefaultGraphNode> ?item <http://www.w3.org/2000/01/rdf-schema#label> ?itemLabel)))
12:37:15 INFO exec :: Execute :: ?item rdfs:label ?itemLabel
12:37:15 INFO exec :: TDB2
(filter (contains (lcase ?itemLabel) "border"@en)
(quadpattern (quad <urn:x-arq:DefaultGraphNode> ?item <http://www.w3.org/2004/02/skos/core#altLabel> ?itemLabel)))
12:37:15 INFO exec :: Execute :: ?item <http://www.w3.org/2004/02/skos/core#altLabel> ?itemLabel
12:37:28 INFO exec :: Execute :: ?item rdf:type <http://wikiba.se/ontology#Property>
```
[GitHub] [jena] SimonBin commented on issue #1614: Property path handling in query optimizer and timeout handler
Posted by GitBox <gi...@apache.org>.
SimonBin commented on issue #1614:
URL: https://github.com/apache/jena/issues/1614#issuecomment-1310172264
in https://github.com/apache/jena/blob/main/jena-arq/src/main/java/org/apache/jena/sparql/algebra/optimize/TransformPathFlattern.java#L35 there is a comment that P_Alt is not converted into OpUnion, but do you know why? What was the downside of this particular optimisation?
[GitHub] [jena] afs closed issue #1614: Property path handling in query optimizer and timeout handler
Posted by GitBox <gi...@apache.org>.
afs closed issue #1614: Property path handling in query optimizer and timeout handler
URL: https://github.com/apache/jena/issues/1614
[GitHub] [jena] Aklakan commented on issue #1614: Property path handling in query optimizer and timeout handler
Posted by GitBox <gi...@apache.org>.
Aklakan commented on issue #1614:
URL: https://github.com/apache/jena/issues/1614#issuecomment-1310277065
Just a comment on this stack trace snippet: @LorenzBuehmann mentioned that `GraphUtils.allNodes` attempts to load everything into memory, thereby ignoring query timeouts. Without having looked into this in detail, maybe it's possible to mitigate this issue by using a `QueryIterDistinct`.
```
at org.apache.jena.sparql.util.graph.GraphUtils.allNodes(GraphUtils.java:240)
at org.apache.jena.sparql.path.PathLib.determineUngroundedStartingSet(PathLib.java:245)
at org.apache.jena.sparql.path.PathLib.execUngroundedPath(PathLib.java:182)
at org.apache.jena.sparql.path.PathLib.execTriplePath(PathLib.java:128)
at org.apache.jena.sparql.path.PathLib.execTriplePath(PathLib.java:108)
at org.apache.jena.sparql.engine.iterator.QueryIterPath.nextStage(QueryIterPath.java:47)
```
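The difference between the two approaches can be sketched in Python. This is a language-neutral illustration, not Jena code: `all_nodes_eager` merely stands in for a method that materializes a full node set in one uninterruptible loop, while `all_nodes_lazy`, the `cancel` flag and `QueryCancelled` are hypothetical names for a distinct-style iterator that checks for cancellation as it goes. Treating subjects and objects as the "nodes" is also an assumption of this sketch.

```python
import threading


class QueryCancelled(Exception):
    """Raised when the (hypothetical) cancel flag is set during iteration."""


def all_nodes_eager(triples):
    """Materialize every node up front; nothing can interrupt this loop."""
    nodes = set()
    for s, _p, o in triples:
        nodes.add(s)
        nodes.add(o)
    return nodes


def all_nodes_lazy(triples, cancel):
    """Yield each distinct node once, checking the cancel flag per triple."""
    seen = set()
    for s, _p, o in triples:
        if cancel.is_set():
            raise QueryCancelled()
        for node in (s, o):
            if node not in seen:
                seen.add(node)
                yield node


triples = [("a", "p", "b"), ("b", "p", "c")]
cancel = threading.Event()
print(list(all_nodes_lazy(triples, cancel)))
```

The lazy variant still has to remember the nodes it has seen, so it does not necessarily reduce memory use, but a timeout that sets the flag would take effect at the next triple rather than after the whole scan.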