You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2020/03/11 18:33:39 UTC

[GitHub] [lucene-solr] micpalmia opened a new pull request #1343: LUCENE-8103: Use two-phase iterator in Query- and DoubleValuesSource

micpalmia opened a new pull request #1343: LUCENE-8103: Use two-phase iterator in Query- and DoubleValuesSource
URL: https://github.com/apache/lucene-solr/pull/1343
 
 
   # Description
   QueryValueSource (in "queries" module) is a ValueSource representation of a Query; the score is the value. It ought to try to use a TwoPhaseIterator from the query if it can be offered.
   
   # Solution
   Always check whether a TwoPhaseIterator can be offered, and use it. Since its `matches()` function may only be called once, while there is no restrictions on QueryValueSource's or DoubleValuesSource's method calling, I unfortunately had to add some state.
   
   # Tests
   
   For each value source, I added a test case that makes use of a query that serves a TwoPhaseIterator. I don't think that's the best solution, but was not sure about how to solve this differently.
   
   # Checklist
   
   Please review the following and check all that apply:
   
   - [x] I have reviewed the guidelines for [How to Contribute](https://wiki.apache.org/solr/HowToContribute) and my code conforms to the standards described there to the best of my ability.
   - [x] I have created a Jira issue and added the issue ID to my pull request title.
   - [x] I have given Solr maintainers [access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork) to contribute to my PR branch. (optional but recommended)
   - [x] I have developed this patch against the `master` branch.
   - [x] I have run `ant precommit` and the appropriate test suite.
   - [x] I have added tests for my changes.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] micpalmia commented on a change in pull request #1343: LUCENE-8103: Use two-phase iteration in Query- and DoubleValuesSource

Posted by GitBox <gi...@apache.org>.
micpalmia commented on a change in pull request #1343: LUCENE-8103: Use two-phase iteration in Query- and DoubleValuesSource
URL: https://github.com/apache/lucene-solr/pull/1343#discussion_r392367810
 
 

 ##########
 File path: lucene/core/src/java/org/apache/lucene/search/DoubleValuesSource.java
 ##########
 @@ -604,18 +604,26 @@ public DoubleValues getValues(LeafReaderContext ctx, DoubleValues scores) throws
       Scorer scorer = weight.scorer(ctx);
       if (scorer == null)
         return DoubleValues.EMPTY;
-      DocIdSetIterator it = scorer.iterator();
+
       return new DoubleValues() {
+        private final TwoPhaseIterator tpi = scorer.twoPhaseIterator();
+        private final DocIdSetIterator disi = (tpi == null) ? scorer.iterator() : tpi.approximation();
+
+        private int scorerDoc = -1;
+        private boolean thisDocMatches = false;
+
         @Override
         public double doubleValue() throws IOException {
-          return scorer.score();
+          return thisDocMatches ? scorer.score() : Double.NaN;
         }
 
         @Override
         public boolean advanceExact(int doc) throws IOException {
-          if (it.docID() > doc)
-            return false;
-          return it.docID() == doc || it.advance(doc) == doc;
+          if (scorerDoc < doc) {
+            scorerDoc = disi.advance(doc);
+            thisDocMatches = tpi==null || tpi.matches();
 
 Review comment:
   I've done as you suggested. I'm a bit concerned about the performance implications of the unboxing though - this implementation is transforming a Boolean to a boolean for every matching document. I can look at running performance tests if you think that would be important.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] dsmiley commented on a change in pull request #1343: LUCENE-8103: Use two-phase iteration in Query- and DoubleValuesSource

Posted by GitBox <gi...@apache.org>.
dsmiley commented on a change in pull request #1343: LUCENE-8103: Use two-phase iteration in Query- and DoubleValuesSource
URL: https://github.com/apache/lucene-solr/pull/1343#discussion_r392381180
 
 

 ##########
 File path: lucene/core/src/java/org/apache/lucene/search/DoubleValuesSource.java
 ##########
 @@ -608,22 +608,25 @@ public DoubleValues getValues(LeafReaderContext ctx, DoubleValues scores) throws
       return new DoubleValues() {
         private final TwoPhaseIterator tpi = scorer.twoPhaseIterator();
         private final DocIdSetIterator disi = (tpi == null) ? scorer.iterator() : tpi.approximation();
-
-        private int scorerDoc = -1;
-        private boolean thisDocMatches = false;
+        private Boolean thisDocMatches = false;
 
         @Override
         public double doubleValue() throws IOException {
-          return thisDocMatches ? scorer.score() : Double.NaN;
+          return (thisDocMatches != null && thisDocMatches) ? scorer.score() : Double.NaN;
         }
 
         @Override
         public boolean advanceExact(int doc) throws IOException {
-          if (scorerDoc < doc) {
-            scorerDoc = disi.advance(doc);
-            thisDocMatches = tpi==null || tpi.matches();
+          if (disi.docID() < doc) {
 
 Review comment:
   Have you determined if this condition can be replaced with an assert?  I bet tests pass.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] micpalmia commented on a change in pull request #1343: LUCENE-8103: Use two-phase iteration in Query- and DoubleValuesSource

Posted by GitBox <gi...@apache.org>.
micpalmia commented on a change in pull request #1343: LUCENE-8103: Use two-phase iteration in Query- and DoubleValuesSource
URL: https://github.com/apache/lucene-solr/pull/1343#discussion_r392569092
 
 

 ##########
 File path: lucene/queries/src/java/org/apache/lucene/queries/function/valuesource/QueryValueSource.java
 ##########
 @@ -156,42 +135,38 @@ public float floatVal(int doc) {
   @Override
   public boolean exists(int doc) {
     try {
-      if (doc < lastDocRequested) {
-        if (noMatches) return false;
+      if (noMatches) return false;
+      if (scorer == null) {
         scorer = weight.scorer(readerContext);
-        scorerDoc = -1;
-        if (scorer==null) {
+        if (scorer == null) {
           noMatches = true;
           return false;
         }
-        it = scorer.iterator();
+        tpi = scorer.twoPhaseIterator();
+        disi = tpi==null ? scorer.iterator() : tpi.approximation();
+        thisDocMatches = false;
       }
+      assert doc >= lastDocRequested : "values requested out of order; last=" + lastDocRequested + ", requested=" + doc;
 
 Review comment:
   I also didn't think so until your last review, but I found the following on the docstring for `ValueSource.getValues`:
   
   > The values must be consumed in a forward docID manner, and you must call this method again to iterate through the values again.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] micpalmia edited a comment on issue #1343: LUCENE-8103: Use two-phase iteration in Query- and DoubleValuesSource

Posted by GitBox <gi...@apache.org>.
micpalmia edited a comment on issue #1343: LUCENE-8103: Use two-phase iteration in Query- and DoubleValuesSource
URL: https://github.com/apache/lucene-solr/pull/1343#issuecomment-598668787
 
 
   Thank you for your time and help! Fwiw, I appreciate it a lot. I'll be at Buzzwords too, hoping it's actually gonna happen.
   
   `QueryValueSource.exists(int doc)` is probably looking more complicated than it could because it assumes docs can be requested out of order and because it keeps state on the availability of a scorer. I tried to make it tidier, but it's purely aesthetical.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] dsmiley closed pull request #1343: LUCENE-8103: Use two-phase iteration in Query- and DoubleValuesSource

Posted by GitBox <gi...@apache.org>.
dsmiley closed pull request #1343: LUCENE-8103: Use two-phase iteration in Query- and DoubleValuesSource
URL: https://github.com/apache/lucene-solr/pull/1343
 
 
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] micpalmia commented on a change in pull request #1343: LUCENE-8103: Use two-phase iteration in Query- and DoubleValuesSource

Posted by GitBox <gi...@apache.org>.
micpalmia commented on a change in pull request #1343: LUCENE-8103: Use two-phase iteration in Query- and DoubleValuesSource
URL: https://github.com/apache/lucene-solr/pull/1343#discussion_r392568557
 
 

 ##########
 File path: lucene/core/src/java/org/apache/lucene/search/DoubleValuesSource.java
 ##########
 @@ -608,22 +608,25 @@ public DoubleValues getValues(LeafReaderContext ctx, DoubleValues scores) throws
       return new DoubleValues() {
         private final TwoPhaseIterator tpi = scorer.twoPhaseIterator();
         private final DocIdSetIterator disi = (tpi == null) ? scorer.iterator() : tpi.approximation();
-
-        private int scorerDoc = -1;
-        private boolean thisDocMatches = false;
+        private Boolean thisDocMatches = false;
 
         @Override
         public double doubleValue() throws IOException {
-          return thisDocMatches ? scorer.score() : Double.NaN;
+          return (thisDocMatches != null && thisDocMatches) ? scorer.score() : Double.NaN;
         }
 
         @Override
         public boolean advanceExact(int doc) throws IOException {
-          if (scorerDoc < doc) {
-            scorerDoc = disi.advance(doc);
-            thisDocMatches = tpi==null || tpi.matches();
+          if (disi.docID() < doc) {
 
 Review comment:
   You mean something like the following?
   ```
   assert disi.docID() < doc;
   disi.advance(doc);
   thisDocMatches = null;
   if (disi.docID() == doc) { 
   [...]
   ```
   
   The disi can advance potentially much further than the requested `doc`, and that means that, for potentially many requested `doc`s afterwards ,`disi.docID() > doc` might hold. Tests fail for the edit above.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] dsmiley commented on a change in pull request #1343: LUCENE-8103: Use two-phase iteration in Query- and DoubleValuesSource

Posted by GitBox <gi...@apache.org>.
dsmiley commented on a change in pull request #1343: LUCENE-8103: Use two-phase iteration in Query- and DoubleValuesSource
URL: https://github.com/apache/lucene-solr/pull/1343#discussion_r392387868
 
 

 ##########
 File path: lucene/queries/src/java/org/apache/lucene/queries/function/valuesource/QueryValueSource.java
 ##########
 @@ -156,42 +135,38 @@ public float floatVal(int doc) {
   @Override
   public boolean exists(int doc) {
     try {
-      if (doc < lastDocRequested) {
-        if (noMatches) return false;
+      if (noMatches) return false;
+      if (scorer == null) {
         scorer = weight.scorer(readerContext);
-        scorerDoc = -1;
-        if (scorer==null) {
+        if (scorer == null) {
           noMatches = true;
           return false;
         }
-        it = scorer.iterator();
+        tpi = scorer.twoPhaseIterator();
+        disi = tpi==null ? scorer.iterator() : tpi.approximation();
+        thisDocMatches = false;
       }
+      assert doc >= lastDocRequested : "values requested out of order; last=" + lastDocRequested + ", requested=" + doc;
 
 Review comment:
   Hmm; I wasn't sure about this but the more I look, it appears that the ValueSource API (here) also has a forward-only requirement.  A few years ago this was not the case.  So we can assume this and throw an exception.  See JoinDocFreqValueSource for an example.  I think this check you added here should be way up front at this method, thus clearly showing as a pre-condition, not down here.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] micpalmia commented on issue #1343: LUCENE-8103: Use two-phase iteration in Query- and DoubleValuesSource

Posted by GitBox <gi...@apache.org>.
micpalmia commented on issue #1343: LUCENE-8103: Use two-phase iteration in Query- and DoubleValuesSource
URL: https://github.com/apache/lucene-solr/pull/1343#issuecomment-597801420
 
 
   Copying comments by @dsmiley here:
   
   > minor: I suggest using the variable name "disi" for a DocIdSetIterator as this choice is quite common elsewhere.
   > 
   > In DoubleValueSource I think it'd be better to declare the state you need like tpi, disi inside the anonymous DoubleValues class to keep the scope smaller and to reduce extra hidden indirection that I think Java adds. Also declare final where applicable. scorerMatch boolean is confusing to me; and I think the initial value as written doesn't matter either. I suggest renaming this to thisDocMatches, initialize to false (since we are unpositioned at the start), and then update it appropriately. The logic you have now around scoreMatch seems to do maybe extra work and/or isn't as simple as it could be. If doubleValue is invoked and not thisDocMatches then don't delegate to the scorer, which is probably invalid. Return NaN.
   > 
   > In QueryValueSource.exists, maybe you could improve this a bit to be simpler; I find the way it was written prior to be confusing.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] dsmiley commented on a change in pull request #1343: LUCENE-8103: Use two-phase iteration in Query- and DoubleValuesSource

Posted by GitBox <gi...@apache.org>.
dsmiley commented on a change in pull request #1343: LUCENE-8103: Use two-phase iteration in Query- and DoubleValuesSource
URL: https://github.com/apache/lucene-solr/pull/1343#discussion_r392383623
 
 

 ##########
 File path: lucene/queries/src/java/org/apache/lucene/queries/function/valuesource/QueryValueSource.java
 ##########
 @@ -156,42 +135,38 @@ public float floatVal(int doc) {
   @Override
   public boolean exists(int doc) {
     try {
-      if (doc < lastDocRequested) {
-        if (noMatches) return false;
+      if (noMatches) return false;
+      if (scorer == null) {
         scorer = weight.scorer(readerContext);
-        scorerDoc = -1;
-        if (scorer==null) {
+        if (scorer == null) {
           noMatches = true;
           return false;
         }
-        it = scorer.iterator();
+        tpi = scorer.twoPhaseIterator();
+        disi = tpi==null ? scorer.iterator() : tpi.approximation();
+        thisDocMatches = false;
 
 Review comment:
   You should set thisDocMatches to null because at this point you do not know if there is a match.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] micpalmia commented on a change in pull request #1343: LUCENE-8103: Use two-phase iteration in Query- and DoubleValuesSource

Posted by GitBox <gi...@apache.org>.
micpalmia commented on a change in pull request #1343: LUCENE-8103: Use two-phase iteration in Query- and DoubleValuesSource
URL: https://github.com/apache/lucene-solr/pull/1343#discussion_r392568557
 
 

 ##########
 File path: lucene/core/src/java/org/apache/lucene/search/DoubleValuesSource.java
 ##########
 @@ -608,22 +608,25 @@ public DoubleValues getValues(LeafReaderContext ctx, DoubleValues scores) throws
       return new DoubleValues() {
         private final TwoPhaseIterator tpi = scorer.twoPhaseIterator();
         private final DocIdSetIterator disi = (tpi == null) ? scorer.iterator() : tpi.approximation();
-
-        private int scorerDoc = -1;
-        private boolean thisDocMatches = false;
+        private Boolean thisDocMatches = false;
 
         @Override
         public double doubleValue() throws IOException {
-          return thisDocMatches ? scorer.score() : Double.NaN;
+          return (thisDocMatches != null && thisDocMatches) ? scorer.score() : Double.NaN;
         }
 
         @Override
         public boolean advanceExact(int doc) throws IOException {
-          if (scorerDoc < doc) {
-            scorerDoc = disi.advance(doc);
-            thisDocMatches = tpi==null || tpi.matches();
+          if (disi.docID() < doc) {
 
 Review comment:
   You mean something like the following?
   ```
   public boolean advanceExact(int doc) throws IOException {
       assert disi.docID() < doc;
       disi.advance(doc);
       thisDocMatches = null;
       if (disi.docID() == doc) { 
           [...]
   ```
   
   The disi can advance potentially much further than the requested `doc`, and that means that, for potentially many requested `doc`s afterwards ,`disi.docID() > doc` might hold. Tests fail for the edit above.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] dsmiley commented on a change in pull request #1343: LUCENE-8103: Use two-phase iteration in Query- and DoubleValuesSource

Posted by GitBox <gi...@apache.org>.
dsmiley commented on a change in pull request #1343: LUCENE-8103: Use two-phase iteration in Query- and DoubleValuesSource
URL: https://github.com/apache/lucene-solr/pull/1343#discussion_r392379518
 
 

 ##########
 File path: lucene/core/src/java/org/apache/lucene/search/DoubleValuesSource.java
 ##########
 @@ -608,22 +608,25 @@ public DoubleValues getValues(LeafReaderContext ctx, DoubleValues scores) throws
       return new DoubleValues() {
         private final TwoPhaseIterator tpi = scorer.twoPhaseIterator();
         private final DocIdSetIterator disi = (tpi == null) ? scorer.iterator() : tpi.approximation();
-
-        private int scorerDoc = -1;
-        private boolean thisDocMatches = false;
+        private Boolean thisDocMatches = false;
 
         @Override
         public double doubleValue() throws IOException {
-          return thisDocMatches ? scorer.score() : Double.NaN;
+          return (thisDocMatches != null && thisDocMatches) ? scorer.score() : Double.NaN;
 
 Review comment:
   Just compare with Boolean.TRUE; one condition

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] dsmiley commented on a change in pull request #1343: LUCENE-8103: Use two-phase iteration in Query- and DoubleValuesSource

Posted by GitBox <gi...@apache.org>.
dsmiley commented on a change in pull request #1343: LUCENE-8103: Use two-phase iteration in Query- and DoubleValuesSource
URL: https://github.com/apache/lucene-solr/pull/1343#discussion_r392676630
 
 

 ##########
 File path: lucene/queries/src/java/org/apache/lucene/queries/function/valuesource/QueryValueSource.java
 ##########
 @@ -139,15 +138,14 @@ public boolean exists(int doc) {
     lastDocRequested = doc;
 
     try {
-      if (noMatches) return false;
       if (scorer == null) {
 
 Review comment:
   This now needs to check if disi isn't null, otherwise we might call weight.scorer a ton!

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] dsmiley commented on issue #1343: LUCENE-8103: Use two-phase iteration in Query- and DoubleValuesSource

Posted by GitBox <gi...@apache.org>.
dsmiley commented on issue #1343: LUCENE-8103: Use two-phase iteration in Query- and DoubleValuesSource
URL: https://github.com/apache/lucene-solr/pull/1343#issuecomment-597830616
 
 
   BTW thanks for your contributions lately.  Perhaps you will be at Berlin Buzzwords this June; I shall be there.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] micpalmia commented on issue #1343: LUCENE-8103: Use two-phase iteration in Query- and DoubleValuesSource

Posted by GitBox <gi...@apache.org>.
micpalmia commented on issue #1343: LUCENE-8103: Use two-phase iteration in Query- and DoubleValuesSource
URL: https://github.com/apache/lucene-solr/pull/1343#issuecomment-599032031
 
 
   > No auto-boxing concerns for Boolean :-)
   
   thanks for the reassurance :) - do you mean the performance hit is too low to be of interest, or that there's no trouble with performances at all? Do you maybe have any references on this?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] dsmiley commented on a change in pull request #1343: LUCENE-8103: Use two-phase iteration in Query- and DoubleValuesSource

Posted by GitBox <gi...@apache.org>.
dsmiley commented on a change in pull request #1343: LUCENE-8103: Use two-phase iteration in Query- and DoubleValuesSource
URL: https://github.com/apache/lucene-solr/pull/1343#discussion_r392222264
 
 

 ##########
 File path: lucene/core/src/java/org/apache/lucene/search/DoubleValuesSource.java
 ##########
 @@ -604,18 +604,26 @@ public DoubleValues getValues(LeafReaderContext ctx, DoubleValues scores) throws
       Scorer scorer = weight.scorer(ctx);
       if (scorer == null)
         return DoubleValues.EMPTY;
-      DocIdSetIterator it = scorer.iterator();
+
       return new DoubleValues() {
+        private final TwoPhaseIterator tpi = scorer.twoPhaseIterator();
+        private final DocIdSetIterator disi = (tpi == null) ? scorer.iterator() : tpi.approximation();
+
+        private int scorerDoc = -1;
+        private boolean thisDocMatches = false;
+
         @Override
         public double doubleValue() throws IOException {
-          return scorer.score();
+          return thisDocMatches ? scorer.score() : Double.NaN;
         }
 
         @Override
         public boolean advanceExact(int doc) throws IOException {
-          if (it.docID() > doc)
-            return false;
-          return it.docID() == doc || it.advance(doc) == doc;
+          if (scorerDoc < doc) {
+            scorerDoc = disi.advance(doc);
+            thisDocMatches = tpi==null || tpi.matches();
 
 Review comment:
   If scorerDoc is > "this" doc (doc param), then we should not even compute tpi.matches(); it's wasted potentially expensive computation.  Maybe if thisDocMatches is a Object Boolean, we can differentiate an unkinown state.
   
   Perhaps
   ```
   if (scorerDoc < doc) {
     scorerDoc = disi.advance(doc);
     thisDocMatches = null;
   }
   if (scorerDoc == doc) {
    if (thisDocMatches == null) {
     thisDocMatches = tpi == null || tpi.matches();
    }
    return thisDocMatches;
   } else {
     return thisDocMatches = false;
   }
   ```
   
   And I don't think we even need scorerDoc to be a field since it's disi.doc();  Local-variable is nice.
   Hmm; if I'm not mistaken, the very first check (if scorerDoc < doc) can be replaced with an assertion since otherwise the caller is violating the API contract.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] micpalmia commented on issue #1343: LUCENE-8103: Use two-phase iteration in Query- and DoubleValuesSource

Posted by GitBox <gi...@apache.org>.
micpalmia commented on issue #1343: LUCENE-8103: Use two-phase iteration in Query- and DoubleValuesSource
URL: https://github.com/apache/lucene-solr/pull/1343#issuecomment-598668787
 
 
   Thank you for your time and help! Fwiw, I appreciate it a lot. I'll be to Buzzwords too, hoping it's actually gonna happen.
   
   `QueryValueSource.exists(int doc)` is probably looking more complicated than it could because it assumes docs can be requested out of order and because it keeps state on the availability of a scorer. I tried to make it tidier, but it's purely aesthetical.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] dsmiley commented on a change in pull request #1343: LUCENE-8103: Use two-phase iteration in Query- and DoubleValuesSource

Posted by GitBox <gi...@apache.org>.
dsmiley commented on a change in pull request #1343: LUCENE-8103: Use two-phase iteration in Query- and DoubleValuesSource
URL: https://github.com/apache/lucene-solr/pull/1343#discussion_r392228458
 
 

 ##########
 File path: lucene/queries/src/java/org/apache/lucene/queries/function/valuesource/QueryValueSource.java
 ##########
 @@ -156,42 +136,35 @@ public float floatVal(int doc) {
   @Override
   public boolean exists(int doc) {
     try {
+      if (noMatches) return false;
       if (doc < lastDocRequested) {
-        if (noMatches) return false;
         scorer = weight.scorer(readerContext);
-        scorerDoc = -1;
         if (scorer==null) {
           noMatches = true;
           return false;
         }
-        it = scorer.iterator();
+        tpi = scorer.twoPhaseIterator();
+        it = tpi==null ? scorer.iterator() : tpi.approximation();
+        scorerDoc = -1;
+        thisDocMatches = false;
       }
       lastDocRequested = doc;
 
       if (scorerDoc < doc) {
         scorerDoc = it.advance(doc);
-      }
-
-      if (scorerDoc > doc) {
-        // query doesn't match this document... either because we hit the
-        // end, or because the next doc is after this doc.
-        return false;
+        thisDocMatches = tpi == null || tpi.matches();
 
 Review comment:
   Again, lets not compute thisDocMatches if the scorerDoc doesn't even align.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] dsmiley commented on a change in pull request #1343: LUCENE-8103: Use two-phase iteration in Query- and DoubleValuesSource

Posted by GitBox <gi...@apache.org>.
dsmiley commented on a change in pull request #1343: LUCENE-8103: Use two-phase iteration in Query- and DoubleValuesSource
URL: https://github.com/apache/lucene-solr/pull/1343#discussion_r392594785
 
 

 ##########
 File path: lucene/core/src/java/org/apache/lucene/search/DoubleValuesSource.java
 ##########
 @@ -608,22 +608,25 @@ public DoubleValues getValues(LeafReaderContext ctx, DoubleValues scores) throws
       return new DoubleValues() {
         private final TwoPhaseIterator tpi = scorer.twoPhaseIterator();
         private final DocIdSetIterator disi = (tpi == null) ? scorer.iterator() : tpi.approximation();
-
-        private int scorerDoc = -1;
-        private boolean thisDocMatches = false;
+        private Boolean thisDocMatches = false;
 
         @Override
         public double doubleValue() throws IOException {
-          return thisDocMatches ? scorer.score() : Double.NaN;
+          return (thisDocMatches != null && thisDocMatches) ? scorer.score() : Double.NaN;
         }
 
         @Override
         public boolean advanceExact(int doc) throws IOException {
-          if (scorerDoc < doc) {
-            scorerDoc = disi.advance(doc);
-            thisDocMatches = tpi==null || tpi.matches();
+          if (disi.docID() < doc) {
 
 Review comment:
   Doh!  Of course; thanks.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] micpalmia commented on a change in pull request #1343: LUCENE-8103: Use two-phase iteration in Query- and DoubleValuesSource

Posted by GitBox <gi...@apache.org>.
micpalmia commented on a change in pull request #1343: LUCENE-8103: Use two-phase iteration in Query- and DoubleValuesSource
URL: https://github.com/apache/lucene-solr/pull/1343#discussion_r392568557
 
 

 ##########
 File path: lucene/core/src/java/org/apache/lucene/search/DoubleValuesSource.java
 ##########
 @@ -608,22 +608,25 @@ public DoubleValues getValues(LeafReaderContext ctx, DoubleValues scores) throws
       return new DoubleValues() {
         private final TwoPhaseIterator tpi = scorer.twoPhaseIterator();
         private final DocIdSetIterator disi = (tpi == null) ? scorer.iterator() : tpi.approximation();
-
-        private int scorerDoc = -1;
-        private boolean thisDocMatches = false;
+        private Boolean thisDocMatches = false;
 
         @Override
         public double doubleValue() throws IOException {
-          return thisDocMatches ? scorer.score() : Double.NaN;
+          return (thisDocMatches != null && thisDocMatches) ? scorer.score() : Double.NaN;
         }
 
         @Override
         public boolean advanceExact(int doc) throws IOException {
-          if (scorerDoc < doc) {
-            scorerDoc = disi.advance(doc);
-            thisDocMatches = tpi==null || tpi.matches();
+          if (disi.docID() < doc) {
 
 Review comment:
   You mean something like the following?
   ```
   public boolean advanceExact(int doc) throws IOException {
       assert disi.docID() < doc;
       disi.advance(doc);
       thisDocMatches = null;
       if (disi.docID() == doc) { 
           [...]
   ```
   
   The disi can advance potentially much further than the requested `doc`, and that means that, for potentially many requested `doc`s afterwards ,`disi.docID() > doc` might hold. Tests fail for the edit above (on that assertion specifically).

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] micpalmia commented on a change in pull request #1343: LUCENE-8103: Use two-phase iteration in Query- and DoubleValuesSource

Posted by GitBox <gi...@apache.org>.
micpalmia commented on a change in pull request #1343: LUCENE-8103: Use two-phase iteration in Query- and DoubleValuesSource
URL: https://github.com/apache/lucene-solr/pull/1343#discussion_r392677444
 
 

 ##########
 File path: lucene/queries/src/java/org/apache/lucene/queries/function/valuesource/QueryValueSource.java
 ##########
 @@ -139,15 +138,14 @@ public boolean exists(int doc) {
     lastDocRequested = doc;
 
     try {
-      if (noMatches) return false;
       if (scorer == null) {
 
 Review comment:
   🤦🏻‍♂️

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] micpalmia commented on a change in pull request #1343: LUCENE-8103: Use two-phase iteration in Query- and DoubleValuesSource

Posted by GitBox <gi...@apache.org>.
micpalmia commented on a change in pull request #1343: LUCENE-8103: Use two-phase iteration in Query- and DoubleValuesSource
URL: https://github.com/apache/lucene-solr/pull/1343#discussion_r392569098
 
 

 ##########
 File path: lucene/queries/src/java/org/apache/lucene/queries/function/valuesource/QueryValueSource.java
 ##########
 @@ -156,42 +135,38 @@ public float floatVal(int doc) {
   @Override
   public boolean exists(int doc) {
     try {
-      if (doc < lastDocRequested) {
-        if (noMatches) return false;
+      if (noMatches) return false;
+      if (scorer == null) {
         scorer = weight.scorer(readerContext);
-        scorerDoc = -1;
-        if (scorer==null) {
+        if (scorer == null) {
           noMatches = true;
           return false;
         }
-        it = scorer.iterator();
+        tpi = scorer.twoPhaseIterator();
+        disi = tpi==null ? scorer.iterator() : tpi.approximation();
+        thisDocMatches = false;
 
 Review comment:
   The disi is positioned on -1, so definitely not a match, no?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] micpalmia commented on a change in pull request #1343: LUCENE-8103: Use two-phase iteration in Query- and DoubleValuesSource

Posted by GitBox <gi...@apache.org>.
micpalmia commented on a change in pull request #1343: LUCENE-8103: Use two-phase iteration in Query- and DoubleValuesSource
URL: https://github.com/apache/lucene-solr/pull/1343#discussion_r392568557
 
 

 ##########
 File path: lucene/core/src/java/org/apache/lucene/search/DoubleValuesSource.java
 ##########
 @@ -608,22 +608,25 @@ public DoubleValues getValues(LeafReaderContext ctx, DoubleValues scores) throws
       return new DoubleValues() {
         private final TwoPhaseIterator tpi = scorer.twoPhaseIterator();
         private final DocIdSetIterator disi = (tpi == null) ? scorer.iterator() : tpi.approximation();
-
-        private int scorerDoc = -1;
-        private boolean thisDocMatches = false;
+        private Boolean thisDocMatches = false;
 
         @Override
         public double doubleValue() throws IOException {
-          return thisDocMatches ? scorer.score() : Double.NaN;
+          return (thisDocMatches != null && thisDocMatches) ? scorer.score() : Double.NaN;
         }
 
         @Override
         public boolean advanceExact(int doc) throws IOException {
-          if (scorerDoc < doc) {
-            scorerDoc = disi.advance(doc);
-            thisDocMatches = tpi==null || tpi.matches();
+          if (disi.docID() < doc) {
 
 Review comment:
   You mean something like the following?
   ```
   public boolean advanceExact(int doc) throws IOException {
       assert disi.docID() < doc;
       disi.advance(doc);
       thisDocMatches = null;
       if (disi.docID() == doc) { 
           [...]
   ```
   
   The disi can advance potentially much further than the requested `doc`, and that means that, for potentially many requested `doc`s afterwards ,`disi.docID() > doc` might hold. Tests fail for the edit above (on the assertion specifically).

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org