You are viewing a plain text version of this content. The canonical link for it is here.
Posted to pr@jena.apache.org by GitBox <gi...@apache.org> on 2022/11/20 12:14:19 UTC

[GitHub] [jena] afs opened a new pull request, #1631: GH-1615: LATERAL join

afs opened a new pull request, #1631:
URL: https://github.com/apache/jena/pull/1631

   GitHub issue resolved #1615
   
   Pull request Description:
   Implementation of LATERAL join.
   
   Currently, there are different commits for the different implementation steps but each step does not necessarily work on its own.  That helps inspect the PR. 
   
   The PR will be cleaned up before merge.
   
   ----
   
    - [x] Tests are included.
    - [ ] Documentation change and updates are provided for the [Apache Jena website](https://github.com/apache/jena-site/)
    - [x] Commits have been squashed to remove intermediate development commit messages.
    - [ ] Key commit messages start with the issue number (GH-xxxx or JENA-xxxx)
   
   By submitting this pull request, I acknowledge that I am making a contribution to the Apache Software Foundation under the terms and conditions of the [Contributor's Agreement](https://www.apache.org/licenses/contributor-agreements.html).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: pr-unsubscribe@jena.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@jena.apache.org
For additional commands, e-mail: pr-help@jena.apache.org


[GitHub] [jena] afs commented on pull request #1631: GH-1615: LATERAL join

Posted by GitBox <gi...@apache.org>.
afs commented on PR #1631:
URL: https://github.com/apache/jena/pull/1631#issuecomment-1323343724

   Rebased to current main branch.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: pr-unsubscribe@jena.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@jena.apache.org
For additional commands, e-mail: pr-help@jena.apache.org


[GitHub] [jena] rvesse commented on a diff in pull request #1631: GH-1615: LATERAL join

Posted by GitBox <gi...@apache.org>.
rvesse commented on code in PR #1631:
URL: https://github.com/apache/jena/pull/1631#discussion_r1029549347


##########
jena-arq/src/main/java/org/apache/jena/sparql/algebra/optimize/TransformFilterImplicitJoin.java:
##########
@@ -120,10 +120,14 @@ private static Op apply(ExprList exprs, Op subOp) {
         // ---- Check if the subOp is the right shape to transform.
         Op op = subOp;
 
-        // Special case : deduce that the filter will always "eval unbound"
-        // hence eliminate all rows. Return the empty table.
+        // LATERAL : This is not longer true.
+//        // Special case : deduce that the filter will always "eval unbound"
+//        // hence eliminate all rows. Return the empty table.
+//        if (testSpecialCaseUnused(subOp, joins, remaining))
+//            return OpTable.empty();
+        // But simply skipping this causes (filter) to become (assign) which fails as (assign) does not handle errors.
         if (testSpecialCaseUnused(subOp, joins, remaining))
-            return OpTable.empty();
+            return null;

Review Comment:
   Would an alternative fix be overriding the transform of `OpLateral` in this class to return a direct copy of the original `opLateral` ignoring any transforms this might have generated?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: pr-unsubscribe@jena.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@jena.apache.org
For additional commands, e-mail: pr-help@jena.apache.org


[GitHub] [jena] rvesse commented on a diff in pull request #1631: GH-1615: LATERAL join

Posted by GitBox <gi...@apache.org>.
rvesse commented on code in PR #1631:
URL: https://github.com/apache/jena/pull/1631#discussion_r1030429038


##########
jena-arq/src/main/java/org/apache/jena/sparql/algebra/optimize/TransformFilterImplicitJoin.java:
##########
@@ -120,10 +120,14 @@ private static Op apply(ExprList exprs, Op subOp) {
         // ---- Check if the subOp is the right shape to transform.
         Op op = subOp;
 
-        // Special case : deduce that the filter will always "eval unbound"
-        // hence eliminate all rows. Return the empty table.
+        // LATERAL : This is not longer true.
+//        // Special case : deduce that the filter will always "eval unbound"
+//        // hence eliminate all rows. Return the empty table.
+//        if (testSpecialCaseUnused(subOp, joins, remaining))
+//            return OpTable.empty();
+        // But simply skipping this causes (filter) to become (assign) which fails as (assign) does not handle errors.
         if (testSpecialCaseUnused(subOp, joins, remaining))
-            return OpTable.empty();
+            return null;

Review Comment:
   > Is there any pointers as to when it has been beneficial?
   
   If memory serves this particular piece originated because of users.  This originated when I was at Cray and occasionally we'd see users make simple typos in their queries, especially around variable names, that would cause the engine to do a lot of work it later just threw away due to the bad filter clause so this was basically a case of protecting users from themselves
   
   So we'd get a bug of the form "Why does my query take ages and return nothing?" where the returning nothing was due to user error.  By spotting these cases in the optimiser we could return nothing much faster, and generally the users figured out the typo themselves in that case because there weren't distracted by the slow performance



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: pr-unsubscribe@jena.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@jena.apache.org
For additional commands, e-mail: pr-help@jena.apache.org


[GitHub] [jena] afs merged pull request #1631: GH-1615: LATERAL join

Posted by GitBox <gi...@apache.org>.
afs merged PR #1631:
URL: https://github.com/apache/jena/pull/1631


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: pr-unsubscribe@jena.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@jena.apache.org
For additional commands, e-mail: pr-help@jena.apache.org


[GitHub] [jena] afs commented on a diff in pull request #1631: GH-1615: LATERAL join

Posted by GitBox <gi...@apache.org>.
afs commented on code in PR #1631:
URL: https://github.com/apache/jena/pull/1631#discussion_r1030345799


##########
jena-arq/src/main/java/org/apache/jena/sparql/algebra/optimize/TransformFilterImplicitJoin.java:
##########
@@ -120,10 +120,14 @@ private static Op apply(ExprList exprs, Op subOp) {
         // ---- Check if the subOp is the right shape to transform.
         Op op = subOp;
 
-        // Special case : deduce that the filter will always "eval unbound"
-        // hence eliminate all rows. Return the empty table.
+        // LATERAL : This is not longer true.
+//        // Special case : deduce that the filter will always "eval unbound"
+//        // hence eliminate all rows. Return the empty table.
+//        if (testSpecialCaseUnused(subOp, joins, remaining))
+//            return OpTable.empty();
+        // But simply skipping this causes (filter) to become (assign) which fails as (assign) does not handle errors.
         if (testSpecialCaseUnused(subOp, joins, remaining))
-            return OpTable.empty();
+            return null;

Review Comment:
   Changing the walker is big change - I've tried it before. I'd like to get this PR in without resorting to that.
   
   I believe the special case is also a potential impact in other places. We haven't seen it reported because e.g. EXISTS {} is often simple.
   
   Is there any pointers as to when it has been beneficial?
   
   I tried skipping it rather than `return null` but `TestSemanticEquivalence.implicitJoinEvaluation3` and on a SPARQL WG test fail.
   
   In `implicitJoinEvaluation3`, it only fails on the query string part - the next two algebra level tests work. This doesn't make sense to me -- the algebra steps are supposed to cover the possibilities of the SPARQL query.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: pr-unsubscribe@jena.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@jena.apache.org
For additional commands, e-mail: pr-help@jena.apache.org


[GitHub] [jena] afs commented on a diff in pull request #1631: GH-1615: LATERAL join

Posted by GitBox <gi...@apache.org>.
afs commented on code in PR #1631:
URL: https://github.com/apache/jena/pull/1631#discussion_r1029650336


##########
jena-arq/src/main/java/org/apache/jena/sparql/algebra/optimize/TransformFilterImplicitJoin.java:
##########
@@ -120,10 +120,14 @@ private static Op apply(ExprList exprs, Op subOp) {
         // ---- Check if the subOp is the right shape to transform.
         Op op = subOp;
 
-        // Special case : deduce that the filter will always "eval unbound"
-        // hence eliminate all rows. Return the empty table.
+        // LATERAL : This is not longer true.
+//        // Special case : deduce that the filter will always "eval unbound"
+//        // hence eliminate all rows. Return the empty table.
+//        if (testSpecialCaseUnused(subOp, joins, remaining))
+//            return OpTable.empty();
+        // But simply skipping this causes (filter) to become (assign) which fails as (assign) does not handle errors.
         if (testSpecialCaseUnused(subOp, joins, remaining))
-            return OpTable.empty();
+            return null;

Review Comment:
   That's a possibility - I think it seems to designing for handling a bad query case (if not LATERAL), rather than optimization in LATERAL.
   
   It does seem to be a risk more widely, now and in future changes and not just in LATERAL - e.g. optimizing then substituting (e.g. parameterized queries). (NOT)EXISTS as well which is related to substitution. The fix for JENA-500 is a specific fix for the whole-query case.
   
   The transform walk is bottom up so it will run over the code - the walker needs to be changed to avoid certain ops.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: pr-unsubscribe@jena.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@jena.apache.org
For additional commands, e-mail: pr-help@jena.apache.org


[GitHub] [jena] afs commented on a diff in pull request #1631: GH-1615: LATERAL join

Posted by GitBox <gi...@apache.org>.
afs commented on code in PR #1631:
URL: https://github.com/apache/jena/pull/1631#discussion_r1030516768


##########
jena-arq/src/main/java/org/apache/jena/sparql/algebra/optimize/TransformFilterImplicitJoin.java:
##########
@@ -120,10 +120,14 @@ private static Op apply(ExprList exprs, Op subOp) {
         // ---- Check if the subOp is the right shape to transform.
         Op op = subOp;
 
-        // Special case : deduce that the filter will always "eval unbound"
-        // hence eliminate all rows. Return the empty table.
+        // LATERAL : This is not longer true.
+//        // Special case : deduce that the filter will always "eval unbound"
+//        // hence eliminate all rows. Return the empty table.
+//        if (testSpecialCaseUnused(subOp, joins, remaining))
+//            return OpTable.empty();
+        // But simply skipping this causes (filter) to become (assign) which fails as (assign) does not handle errors.
         if (testSpecialCaseUnused(subOp, joins, remaining))
-            return OpTable.empty();
+            return null;

Review Comment:
   Query timeouts will catch that (we hope).
   
   EXISTS uses the same injection of bindings as JENA-500 except it does it dynamically at runtime (like LATERAL - EXIST is the ASK version of LATERAL). Whether that's the current iffy form or the better one written up.
   
   It's is less likely to occur in EXISTS due to usage but it is possible. And like LATERAL, complicated because it may be bound sometimes and not others.
   
   Can we merge as-is which is functionally safe?
   
   ---
   
   I'm still mystified by `TestSemanticEquivalence.implicitJoinEvaluation3`.
   
   Maybe the test is wrong and comparing to zero happens for the wrong reasons but I can't see a difference to the algebra sub-tests.
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: pr-unsubscribe@jena.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@jena.apache.org
For additional commands, e-mail: pr-help@jena.apache.org


[GitHub] [jena] rvesse commented on a diff in pull request #1631: GH-1615: LATERAL join

Posted by GitBox <gi...@apache.org>.
rvesse commented on code in PR #1631:
URL: https://github.com/apache/jena/pull/1631#discussion_r1030190728


##########
jena-arq/src/main/java/org/apache/jena/sparql/algebra/optimize/TransformFilterImplicitJoin.java:
##########
@@ -120,10 +120,14 @@ private static Op apply(ExprList exprs, Op subOp) {
         // ---- Check if the subOp is the right shape to transform.
         Op op = subOp;
 
-        // Special case : deduce that the filter will always "eval unbound"
-        // hence eliminate all rows. Return the empty table.
+        // LATERAL : This is not longer true.
+//        // Special case : deduce that the filter will always "eval unbound"
+//        // hence eliminate all rows. Return the empty table.
+//        if (testSpecialCaseUnused(subOp, joins, remaining))
+//            return OpTable.empty();
+        // But simply skipping this causes (filter) to become (assign) which fails as (assign) does not handle errors.
         if (testSpecialCaseUnused(subOp, joins, remaining))
-            return OpTable.empty();
+            return null;

Review Comment:
   Yes the bottom up property means the optimisation would be attempted but then you could discard it when you reached the `OpLateral` level but that does mean unnecessary work is done.
   
   Having the walker be able to decide what operations to descend through could be a useful mechanism in general for limiting the scope of transformations



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: pr-unsubscribe@jena.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@jena.apache.org
For additional commands, e-mail: pr-help@jena.apache.org