Posted to dev@jena.apache.org by "Andy Seaborne (Jira)" <ji...@apache.org> on 2020/01/13 12:30:00 UTC
[jira] [Resolved] (JENA-1813) Join optimization transform results in incorrect query results
[ https://issues.apache.org/jira/browse/JENA-1813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andy Seaborne resolved JENA-1813.
---------------------------------
Fix Version/s: Jena 3.14.0
Resolution: Fixed
Immediate fix as suggested.
See JENA-1815 for longer term work.
> Join optimization transform results in incorrect query results
> --------------------------------------------------------------
>
> Key: JENA-1813
> URL: https://issues.apache.org/jira/browse/JENA-1813
> Project: Apache Jena
> Issue Type: Bug
> Components: ARQ
> Affects Versions: Jena 3.13.1
> Reporter: Shawn Smith
> Assignee: Andy Seaborne
> Priority: Major
> Fix For: Jena 3.14.0
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> I think I've found a query where TransformJoinStrategy incorrectly decides that the query is linear, so a "join" operation is replaced by a "sequence" operation. As a result, the query returns incorrect results. Disabling optimizations with "qe.getContext().set(ARQ.optimization, false)" works around the issue.
> Here's the query:
> {noformat}
> PREFIX : <http://example.com/>
> SELECT ?a
> WHERE {
>   GRAPH :graph { :s :p ?a }
>   GRAPH :graph {
>     SELECT (?b AS ?a)
>     WHERE { :t :q ?b }
>     GROUP BY ?b
>   }
> }
> {noformat}
> Here's the data to test it with (two quads, as TriG):
> {noformat}
> @prefix : <http://example.com/> .
> :graph {
>   :s :p "a" .
>   :t :q "b" .
> }
> {noformat}
> I expected the query to return zero results because the two GRAPH clauses can't find compatible bindings for ?a. But, in practice, Jena returns ?a="a" and logs a warning:
> {noformat}
> [main] WARN BindingUtils - merge: Mismatch : "a" != "b"
> {noformat}
> Note the warning is actually coming from QueryIterProjectMerge.java, not BindingUtils.java. With more complicated queries and datasets, this issue can result in thousands or millions of logged warnings.
> The query plan before optimization looks like this:
> {noformat}
> (project (?a)
>   (join
>     (graph <http://example.com/graph>
>       (bgp (triple <http://example.com/s> <http://example.com/p> ?a)))
>     (graph <http://example.com/graph>
>       (project (?a)
>         (extend ((?a ?b))
>           (group (?b)
>             (bgp (triple <http://example.com/t> <http://example.com/q> ?b))))))))
> {noformat}
> Optimization replaces "join" with "sequence" which fails to detect conflicts on ?a:
> {noformat}
> (project (?a)
>   (sequence
>     (graph <http://example.com/graph>
>       (bgp (triple <http://example.com/s> <http://example.com/p> ?a)))
>     (graph <http://example.com/graph>
>       (project (?a)
>         (extend ((?a ?/b))
>           (group (?/b)
>             (bgp (triple <http://example.com/t> <http://example.com/q> ?/b))))))))
> {noformat}
> For convenience, here's Java code that reproduces the bug:
> {noformat}
> import org.apache.jena.query.ARQ;
> import org.apache.jena.query.Dataset;
> import org.apache.jena.query.DatasetFactory;
> import org.apache.jena.query.QueryExecution;
> import org.apache.jena.query.QueryExecutionFactory;
> import org.apache.jena.query.ResultSet;
> import org.apache.jena.riot.Lang;
> import org.apache.jena.riot.RDFParser;
> import org.junit.Test;
>
> public class QueryTest {
>     @Test
>     public void testGraphQuery() {
>         String query = "" +
>             "PREFIX : <http://example.com/>\n" +
>             "SELECT ?a\n" +
>             "WHERE {\n" +
>             " GRAPH :graph { :s :p ?a }\n" +
>             " GRAPH :graph {\n" +
>             " SELECT (?b AS ?a)\n" +
>             " WHERE { :t :q ?b }\n" +
>             " GROUP BY ?b\n" +
>             " }\n" +
>             "}\n";
>         String data = "" +
>             "@prefix : <http://example.com/> .\n" +
>             ":graph {\n" +
>             " :s :p \"a\" .\n" +
>             " :t :q \"b\" .\n" +
>             "}\n";
>         Dataset ds = DatasetFactory.create();
>         RDFParser.fromString(data).lang(Lang.TRIG).parse(ds);
>         try (QueryExecution qe = QueryExecutionFactory.create(query, ds)) {
>             qe.getContext().set(ARQ.optimization, true); // flipping this to false fixes the test
>             ResultSet rs = qe.execSelect();
>             if (rs.hasNext()) {
>                 System.out.println(rs.nextBinding());
>                 throw new AssertionError("Result set should be empty");
>             }
>         }
>     }
> }
> {noformat}
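The expectation in the report follows from SPARQL join semantics: two solutions combine only when they agree on every variable they share. A minimal plain-Java sketch of that compatibility check is below. It does not use Jena. The class name JoinSketch, its methods, and the map-based representation of a solution are assumptions made for illustration, not ARQ's implementation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: a solution is modelled as a map from variable name to value.
public class JoinSketch {

    // Merge two solutions if they agree on every shared variable; return null on conflict.
    public static Map<String, String> mergeIfCompatible(Map<String, String> left,
                                                        Map<String, String> right) {
        Map<String, String> merged = new HashMap<>(left);
        for (Map.Entry<String, String> e : right.entrySet()) {
            String existing = merged.get(e.getKey());
            if (existing != null && !existing.equals(e.getValue())) {
                return null; // conflict, e.g. ?a = "a" on the left and ?a = "b" on the right
            }
            merged.put(e.getKey(), e.getValue());
        }
        return merged;
    }

    // Nested-loop join: keep only the pairs of solutions that are compatible.
    public static List<Map<String, String>> join(List<Map<String, String>> lhs,
                                                 List<Map<String, String>> rhs) {
        List<Map<String, String>> out = new ArrayList<>();
        for (Map<String, String> l : lhs) {
            for (Map<String, String> r : rhs) {
                Map<String, String> m = mergeIfCompatible(l, r);
                if (m != null) {
                    out.add(m);
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Mirrors the data in the report: the first GRAPH binds ?a to "a",
        // the sub-select binds ?a to "b".
        List<Map<String, String>> lhs = List.of(Map.of("a", "\"a\""));
        List<Map<String, String>> rhs = List.of(Map.of("a", "\"b\""));
        System.out.println(join(lhs, rhs)); // prints []
    }
}
```

With ?a = "a" on one side and ?a = "b" on the other, no pair is compatible, so the join is empty. That is the result the report expects from the unoptimized plan.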