You are viewing a plain text version of this content. The canonical link for it is here.
Posted to pr@jena.apache.org by GitBox <gi...@apache.org> on 2020/03/19 16:09:32 UTC

[GitHub] [jena] Aklakan opened a new pull request #712: Jena 1862: Query.clone is slow

Aklakan opened a new pull request #712: Jena 1862: Query.clone is slow
URL: https://github.com/apache/jena/pull/712
 
 
   This PR changes the implementation of Query.clone to make use of the syntax transform machinery.
   A parameterized test case runner with 1230 queries is provided in TestQueryCloningEssentials
   A test case for inital corner cases is provided in TestQueryCloningCornerCases - the main purpose is to check that modification of elements of a clone do not affect the original query. Note, that  Query.elementDataBlock is not cloned, I'd said this is an issue in QueryTransformOps.
   A little commented out by means of "// @ Test" benchmark runner is provided in TestQueryCloningCornerCases as well.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@jena.apache.org
For additional commands, e-mail: pr-help@jena.apache.org


[GitHub] [jena] Aklakan edited a comment on issue #712: JENA-1862: Query.clone is slow

Posted by GitBox <gi...@apache.org>.
Aklakan edited a comment on issue #712: JENA-1862: Query.clone is slow
URL: https://github.com/apache/jena/pull/712#issuecomment-602217879
 
 
   Oh, I haven't noticed the prefix leaking in testing this code, but I have seen it happen before. IIRC it may be related to a PrefixMapping2 (the one with local / global maps) getting [copied into a PrefixMappingImpl](https://github.com/apache/jena/blob/9d0689680f17a71167fd9553cc6146e8e9965ca6/jena-arq/src/main/java/org/apache/jena/sparql/syntax/syntaxtransform/QueryTransformOps.java#L180). Maybe PrefixMapping needs a clone method too?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@jena.apache.org
For additional commands, e-mail: pr-help@jena.apache.org


[GitHub] [jena] Aklakan commented on issue #712: JENA-1862: Query.clone is slow

Posted by GitBox <gi...@apache.org>.
Aklakan commented on issue #712: JENA-1862: Query.clone is slow
URL: https://github.com/apache/jena/pull/712#issuecomment-604693910
 
 
   Hi Andy, is there anything more I can do to support on this PR?
   
   * The `ElementSubQuery` case can be handled by just adding it to the if statement below the incorrect comment `Top level is always a group.`
   * As for wrapping the value with an element: This seems reasonable, however an element transform may yield another element instead. While this is valid for in-pattern ElementData blocks, this is not the case for a valueDataBlock. On the other hand, if one really wanted to transform a valueDataBlock to another element, one could just use a pre-transform to move the ElementData into an ElementGroup pattern.
   * As for the prefixes I could have another look.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@jena.apache.org
For additional commands, e-mail: pr-help@jena.apache.org


[GitHub] [jena] afs commented on issue #712: JENA-1862: Query.clone is slow

Posted by GitBox <gi...@apache.org>.
afs commented on issue #712: JENA-1862: Query.clone is slow
URL: https://github.com/apache/jena/pull/712#issuecomment-607336207
 
 
   Hi Claus,
   
   You OK for doing ElementSubQuery and ElementData? If those are done, I'll add the prefixes fixes.
   
   Prefix change:
   https://github.com/afs/jena/commit/5e16f6717e293dd21e9221d57b76002017454cc0
   and a small cleanup
   https://github.com/afs/jena/commit/6dd354b697855dbf4ab7f9556815d1133101b78a
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@jena.apache.org
For additional commands, e-mail: pr-help@jena.apache.org


[GitHub] [jena] afs commented on issue #712: JENA-1862: Query.clone is slow

Posted by GitBox <gi...@apache.org>.
afs commented on issue #712: JENA-1862: Query.clone is slow
URL: https://github.com/apache/jena/pull/712#issuecomment-614702099
 
 
   Time flies :-)
   
   I ran the code, together with prefixes patch, and there is one test failure: `TestSyntaxTransform.subst_query_10`. This is good! It expected a subquery nested with `{{ }}` and now `{ }` is used.
   
   Also, I chekced the prinint of queries with my previous investigation code.
   
   All looks good!
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@jena.apache.org
For additional commands, e-mail: pr-help@jena.apache.org


[GitHub] [jena] Aklakan commented on issue #712: JENA-1862: Query.clone is slow

Posted by GitBox <gi...@apache.org>.
Aklakan commented on issue #712: JENA-1862: Query.clone is slow
URL: https://github.com/apache/jena/pull/712#issuecomment-608106665
 
 
   Hi Andy, I will add ElementSubQuery and ElementData over the weekend

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@jena.apache.org
For additional commands, e-mail: pr-help@jena.apache.org


[GitHub] [jena] afs commented on issue #712: JENA-1862: Query.clone is slow

Posted by GitBox <gi...@apache.org>.
afs commented on issue #712: JENA-1862: Query.clone is slow
URL: https://github.com/apache/jena/pull/712#issuecomment-601662371
 
 
   I'll review ASAP especially the skipped tests.
   
   There is a significant code update as we switch to log4j2 - this should not have any impact here but with so many files touched in some minor way, the PR may not be still merge-clean.
   
   Also, as @rvesse is aware, the project has some other matter it needs to attend to.
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@jena.apache.org
For additional commands, e-mail: pr-help@jena.apache.org


[GitHub] [jena] afs edited a comment on issue #712: JENA-1862: Query.clone is slow

Posted by GitBox <gi...@apache.org>.
afs edited a comment on issue #712: JENA-1862: Query.clone is slow
URL: https://github.com/apache/jena/pull/712#issuecomment-605308084
 
 
   ElementSubQuery: Agreed
   
   ElementData: Seems reason to assume ElementData to ElementData at least for the trailing part of  a query. That can't be anything else. Bundling using a temporary ElementData is fine as well.
   
   Prefixes: I now think the best solution is to clear the prefixes during query parsing. The subquery does carry round the prefixes of the outer query. Subquery code predates SPARQL 1.1 - mayb the intent was subquery scoped prefixes.  Far too long ago to know.
   
   The parser can clear them after finishing the sub-query element, or maybe it does not need them at all. I'll investigate.
   
   Later: the parser does not use the prefixmap/base from the subquery, nor does printing query. The prologue can be set empty when the parser makes the inner query.
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@jena.apache.org
For additional commands, e-mail: pr-help@jena.apache.org


[GitHub] [jena] afs commented on issue #712: JENA-1862: Query.clone is slow

Posted by GitBox <gi...@apache.org>.
afs commented on issue #712: JENA-1862: Query.clone is slow
URL: https://github.com/apache/jena/pull/712#issuecomment-605308084
 
 
   ElementSubQuery: Agreed
   
   ElementData: Seems reason to assume ElementData to ElementData at least for the trailing part of  a query. That can't be anything else. Bundling using a temporary ElementData is fine as well.
   
   Prefixes: I now think the best solution is to clear the prefixes during query parsing. The subquery does carry round the prefixes of the outer query. Subquery code predates SPARQL 1.1 - mayb the intent was subquery scoped prefixes.  Far too long ago to know.
   
   The parser can clear them after finishing the sub-query element, or maybe it does not need them at all. I'll investigate.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@jena.apache.org
For additional commands, e-mail: pr-help@jena.apache.org


[GitHub] [jena] Aklakan commented on issue #712: JENA-1862: Query.clone is slow

Posted by GitBox <gi...@apache.org>.
Aklakan commented on issue #712: JENA-1862: Query.clone is slow
URL: https://github.com/apache/jena/pull/712#issuecomment-614063788
 
 
   Whoa, already 2 week passed - but the ElementSubQuery and ElementData implementation is ready for review. Note, that binding objects are not cloned in the process. Probably a clone of bindings is not desired anyway, as for all practical purposes, once they are constructed, they should be considered immutable anyway (the Binding interface does not provide modification methods).
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@jena.apache.org
For additional commands, e-mail: pr-help@jena.apache.org


[GitHub] [jena] Aklakan edited a comment on issue #712: JENA-1862: Query.clone is slow

Posted by GitBox <gi...@apache.org>.
Aklakan edited a comment on issue #712: JENA-1862: Query.clone is slow
URL: https://github.com/apache/jena/pull/712#issuecomment-604693910
 
 
   Hi Andy, is there anything more I can do to support on this PR?
   
   * The `ElementSubQuery` case can be handled by just adding it to the if statement below the incorrect comment `Top level is always a group.`
   * As for wrapping the valueDataBlock with an element: This seems reasonable, however an element transform may yield another element instead. While this is valid for in-pattern ElementData blocks, this is not the case for a valueDataBlock. On the other hand, if one really wanted to transform a valueDataBlock to another element, one could just use a pre-transform to move the ElementData into an ElementGroup pattern.
   * As for the prefixes I could have another look.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@jena.apache.org
For additional commands, e-mail: pr-help@jena.apache.org


[GitHub] [jena] afs commented on issue #712: JENA-1862: Query.clone is slow

Posted by GitBox <gi...@apache.org>.
afs commented on issue #712: JENA-1862: Query.clone is slow
URL: https://github.com/apache/jena/pull/712#issuecomment-602194877
 
 
   From [JENA-1862](https://issues.apache.org/jira/browse/JENA-1862):
   
   > Query.valueDataBlock is not copied by QueryTransformOps - probably
   > this should be done; maybe wrapping it as an ElementData passing it
   > through the given ElementTransformer?
   
   Yes, it should be transformed and that way seems better than any that adds a public method to `Query`. I had thought of a change `Query` from a `TableData` to using an `ElementData`, then call the transform on `ElementData`, `QueryTransformOp.transformat(Query,...)` which already processes projects etc. But seems to need `Query` to have new methods to get/set it.
   
   > QueryTransformOps has a statement that says that the top level
   > element is always a ElementGroup - however the 4 excluded test
   > case files parse into a Query with a ElementSubQuery as the top
   > level object. What behavior is preferred in this case?
   
   The comment isn't quite right - it can be an `ElementSubQuery`. There is a special syntax case because it does not need to be nested in { } -- it already prints them as it does when it's part of a group.
   
   There is a bug somewhere:
   
   ```
   PREFIX : <http://example/> 
   SELECT * {
      ?x ?y ?z 
      { SELECT ?s { ?s ?p ?o } }
    }
   ```
   
   when new-cloned and printed becomes 
   
   ```
   PREFIX  :     <http://example/>
   
   SELECT  *
   WHERE
     { ?x  ?y  ?z
       { PREFIX  :     <http://example/>
         
         SELECT  ?s
         WHERE
           { ?s  ?p  ?o }
       }
     }
   
   ```
   The prefixes have leaked into the sub-query.
   
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@jena.apache.org
For additional commands, e-mail: pr-help@jena.apache.org


[GitHub] [jena] Aklakan edited a comment on issue #712: JENA-1862: Query.clone is slow

Posted by GitBox <gi...@apache.org>.
Aklakan edited a comment on issue #712: JENA-1862: Query.clone is slow
URL: https://github.com/apache/jena/pull/712#issuecomment-614063788
 
 
   Whoa, already 2 weeks have passed - anyway, the ElementSubQuery and ElementData implementation is ready for review. Note, that binding objects are not cloned in the process. Probably a clone of bindings is not desired anyway, as for all practical purposes, once they are constructed, they should be considered immutable anyway (the Binding interface does not provide modification methods).
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@jena.apache.org
For additional commands, e-mail: pr-help@jena.apache.org


[GitHub] [jena] afs merged pull request #712: JENA-1862: Query.clone is slow

Posted by GitBox <gi...@apache.org>.
afs merged pull request #712: JENA-1862: Query.clone is slow
URL: https://github.com/apache/jena/pull/712
 
 
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@jena.apache.org
For additional commands, e-mail: pr-help@jena.apache.org


[GitHub] [jena] afs edited a comment on issue #712: JENA-1862: Query.clone is slow

Posted by GitBox <gi...@apache.org>.
afs edited a comment on issue #712: JENA-1862: Query.clone is slow
URL: https://github.com/apache/jena/pull/712#issuecomment-605308084
 
 
   ElementSubQuery: Agreed
   
   ElementData: Seems reason to assume ElementData to ElementData at least for the trailing part of  a query. That can't be anything else. Bundling using a temporary ElementData is fine as well.
   
   Prefixes: I now think the best solution is to clear the prefixes during query parsing. The subquery does carry round the prefixes of the outer query. Subquery code predates SPARQL 1.1 - mayb the intent was subquery scoped prefixes.  Far too long ago to know.
   
   The parser can clear them after finishing the sub-query element, or maybe it does not need them at all. I'll investigate.
   
   ---
   
   Investigation: the parser does not use the prefixmap/base from the subquery, nor does printing query. The prologue can be set empty when the parser makes the inner query.
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@jena.apache.org
For additional commands, e-mail: pr-help@jena.apache.org


[GitHub] [jena] Aklakan commented on issue #712: JENA-1862: Query.clone is slow

Posted by GitBox <gi...@apache.org>.
Aklakan commented on issue #712: JENA-1862: Query.clone is slow
URL: https://github.com/apache/jena/pull/712#issuecomment-602217879
 
 
   Oh, I haven't noticed the prefix leaking in testing this code, but I have seen it happen before. IIRC it may be related to a PrefixMapping2 (the one with local / global maps) getting [copied into a PrefixMappingImpl](https://github.com/apache/jena/blob/master/jena-arq/src/main/java/org/apache/jena/sparql/syntax/syntaxtransform/QueryTransformOps.java#L180). Maybe PrefixMapping needs a clone method too?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@jena.apache.org
For additional commands, e-mail: pr-help@jena.apache.org


[GitHub] [jena] afs commented on issue #712: JENA-1862: Query.clone is slow

Posted by GitBox <gi...@apache.org>.
afs commented on issue #712: JENA-1862: Query.clone is slow
URL: https://github.com/apache/jena/pull/712#issuecomment-610034513
 
 
   I've run into the fact that GH knows where repos are cloned from originally and haven't worked out how to have two+ clones of Jena on my account.
   
   Here is a diff to not have prefixes when parsing a sub-select:
   
   ```
   diff --git a/jena-arq/src/main/java/org/apache/jena/sparql/lang/SPARQLParserBase.java b/jena-arq/src/main/java/org/apache/jena/sparql/lang/SPARQLParserBase.java
   index d3fcd7b55f..274342f90d 100644
   --- a/jena-arq/src/main/java/org/apache/jena/sparql/lang/SPARQLParserBase.java
   +++ b/jena-arq/src/main/java/org/apache/jena/sparql/lang/SPARQLParserBase.java
   @@ -221,12 +221,12 @@ public class SPARQLParserBase extends ParserBase
        protected void startSubSelect(int line, int col)
        {
            pushQuery();
   -        query = newSubQuery(getPrologue()) ;
   +        query = newSubQuery() ;
        }
        
   -    protected Query newSubQuery(Prologue progloue)
   +    protected Query newSubQuery()
        {
   -        return new Query(getPrologue());
   +        return new Query();
        }
        
        protected void popQuery()
   ```
   
   
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@jena.apache.org
For additional commands, e-mail: pr-help@jena.apache.org


[GitHub] [jena] Aklakan edited a comment on issue #712: JENA-1862: Query.clone is slow

Posted by GitBox <gi...@apache.org>.
Aklakan edited a comment on issue #712: JENA-1862: Query.clone is slow
URL: https://github.com/apache/jena/pull/712#issuecomment-604693910
 
 
   Hi Andy, is there anything more I can do to support on this PR?
   
   * The `ElementSubQuery` case can be handled by just adding it to the if statement below the incorrect comment `Top level is always a group.`
   * As for wrapping the valueDataBlock with an ElementData: This seems reasonable, however an element transform may yield another element instead. While this is valid for in-pattern ElementData blocks, this is not the case for a valueDataBlock. On the other hand, if one really wanted to transform a valueDataBlock to another element, one could just use a pre-transform to move the ElementData into an ElementGroup pattern.
   * As for the prefixes I could have another look.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@jena.apache.org
For additional commands, e-mail: pr-help@jena.apache.org