You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@jena.apache.org by "Shawn Smith (Jira)" <ji...@apache.org> on 2019/10/17 22:20:00 UTC
[jira] [Created] (JENA-1771) Spilling combined with DISTINCT ..
ORDER BY returns rows in the wrong order
Shawn Smith created JENA-1771:
---------------------------------
Summary: Spilling combined with DISTINCT .. ORDER BY returns rows in the wrong order
Key: JENA-1771
URL: https://issues.apache.org/jira/browse/JENA-1771
Project: Apache Jena
Issue Type: Bug
Components: ARQ
Affects Versions: Jena 3.13.1
Reporter: Shawn Smith
It looks like Jena assumes that OpDistinct preserves order, but order is not preserved when spilling occurs. This is only a problem when the ARQ.spillToDiskThreshold setting has been configured.
Consider the following query:
{code:java}
SELECT DISTINCT *
WHERE
{ ?x :p ?v }
ORDER BY ASC(?v)
{code}
Jena executes the ORDER BY ASC(?v) before the DISTINCT, relying on the SPARQL requirement:
bq. The order of Distinct(Ψ) must preserve any ordering given by OrderBy.
But, when spilling, QueryIterDistinct (which executes OpDistinct) creates a DistinctDataBag with a BindingComparator without any sort conditions. As a result, the DISTINCT operation doesn't preserve the ORDER BY ASC(?v) requirement.
You can reproduce this by injecting the following into QueryTest.java then running the ARQTestRefEngine tests:
{code:java}
void runTestSelect(Query query, QueryExecution qe)
{
qe.getContext().set(ARQ.spillToDiskThreshold, 4); // add this line
...
{code}
For example, "ARQTestRefEngine -> Algebra optimizations -> QueryTest.opt-top-05" will fail with:
{code}
Query:
PREFIX : <http://example/>
SELECT DISTINCT *
WHERE
{ ?x :p ?v }
ORDER BY ASC(?v)
LIMIT 5
Got: 5 --------------------------------
-------------
| x | v |
=============
| :x1 | 1 |
| :x2 | 2 |
| :x10 | 10 |
| :x11 | 11 |
| :x12 | 12 |
-------------
Expected: 5 -----------------------------
------------
| x | v |
============
| :x1 | 1 |
| :x2 | 2 |
| :x3 | 3 |
| :x3a | 3 |
| :x4 | 4 |
------------
junit.framework.AssertionFailedError: Results do not match
at junit.framework.Assert.fail(Assert.java:57)
at junit.framework.Assert.assertTrue(Assert.java:22)
at junit.framework.TestCase.assertTrue(TestCase.java:192)
at org.apache.jena.sparql.junit.QueryTest.runTestSelect(QueryTest.java:284)
at org.apache.jena.sparql.junit.QueryTest.runTestForReal(QueryTest.java:201)
at org.apache.jena.sparql.junit.EarlTestCase.runTest(EarlTestCase.java:88)
at junit.framework.TestCase.runBare(TestCase.java:141)
at junit.framework.TestResult$1.protect(TestResult.java:122)
at junit.framework.TestResult.runProtected(TestResult.java:142)
at junit.framework.TestResult.run(TestResult.java:125)
at junit.framework.TestCase.run(TestCase.java:129)
{code}
--
This message was sent by Atlassian Jira
(v8.3.4#803005)