You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2020/03/01 08:10:59 UTC

[GitHub] [lucene-solr] atris opened a new pull request #1303: LUCENE-9114: Improve ValueSourceScorer's Default Cost Implementation

atris opened a new pull request #1303: LUCENE-9114: Improve ValueSourceScorer's Default Cost Implementation
URL: https://github.com/apache/lucene-solr/pull/1303
 
 
   This commit makes ValueSourceScorer's costing algorithm to also take the delegated FunctionValues's cost into consideration when calculating its cost.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] dsmiley commented on a change in pull request #1303: LUCENE-9114: Improve ValueSourceScorer's Default Cost Implementation

Posted by GitBox <gi...@apache.org>.
dsmiley commented on a change in pull request #1303: LUCENE-9114: Improve ValueSourceScorer's Default Cost Implementation
URL: https://github.com/apache/lucene-solr/pull/1303#discussion_r387663374
 
 

 ##########
 File path: lucene/queries/src/java/org/apache/lucene/queries/function/FunctionValues.java
 ##########
 @@ -151,6 +164,11 @@ public ValueSourceScorer getScorer(Weight weight, LeafReaderContext readerContex
       public boolean matches(int doc) {
         return true;
       }
+      @Override
+      public float costEvaluationFunction() {
+        // Match everything
 
 Review comment:
   Nitpick comment: the point isn't that we're matching everything, the point is that matches() merely returns a constant and is thus virtually free.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] dsmiley commented on issue #1303: LUCENE-9114: Improve ValueSourceScorer's Default Cost Implementation

Posted by GitBox <gi...@apache.org>.
dsmiley commented on issue #1303: LUCENE-9114: Improve ValueSourceScorer's Default Cost Implementation
URL: https://github.com/apache/lucene-solr/pull/1303#issuecomment-593657932
 
 
   Why a separate PR for my proposed test?
   
   Your proposal is better than the status quo but I think is rather lacking if that's it.  If your proposal can also accommodate a query-time user supplied cost, especially by FunctionRangeQuery somehow, then I think we're then in good shape as it'll allow a user to set this on the fly.  (BTW ignore the identical named class in Solr, which I plan on removing).  Perhaps this cost could sneak in by putting the cost on the "context" Map supplied to ValueSource.getValues ?  Yeah; that'd be cool :-)
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] atris commented on issue #1303: LUCENE-9114: Improve ValueSourceScorer's Default Cost Implementation

Posted by GitBox <gi...@apache.org>.
atris commented on issue #1303: LUCENE-9114: Improve ValueSourceScorer's Default Cost Implementation
URL: https://github.com/apache/lucene-solr/pull/1303#issuecomment-594105306
 
 
   > Ehh; nevermind my ill-thought-out idea of a cost on the Map context. There are many ValueSource.getValues impls that'd need to parse it, and then there's a concern that we wouldn't want it to propagate to sub-FunctionValues.
   > 
   > Alternative proposal: When FunctionRangeQuery calls functionValues.getRangeScorer, it gets back a ValueSourceScorer. We could just add a mutable cost on VSC that if set will be returned by VSC and if not VSC will delegate to the proposed `FV.cost`. While the mutability of it isn't pretty, it's also quite minor. It saves FRQ from having to wrap the scorer only to specify a matchCost.
   
   Yeah, I had a similar idea but the per VS implementations' need for parsing was putting off.
   
   I am not a fan of the intrusive mutability ability but agree that it is the cleanest thing to do to support this functionality while not needing to define a cost model for every FV implementation in this PR. As a follow up to this PR, I plan to define cost models for DoubleValuesSource, IntFieldSource etc.
   
   Raised another iteration with the said approach. I am not sure how to comprehensively write a test for this functionality -- but Lucene and Solr test suites pass with this commit. Please see and let me know your thoughts and comments.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] atris edited a comment on issue #1303: LUCENE-9114: Improve ValueSourceScorer's Default Cost Implementation

Posted by GitBox <gi...@apache.org>.
atris edited a comment on issue #1303: LUCENE-9114: Improve ValueSourceScorer's Default Cost Implementation
URL: https://github.com/apache/lucene-solr/pull/1303#issuecomment-593133186
 
 
   @dsmiley Thinking further, I see no obvious way of ValueSourceScorer being able to determine a reasonable cost without having inputs from FunctionValues, and currently, the sane way of getting a cost out of FunctionValues is through its TPI (unless I am missing something?).
   
   The best cost metrics will come when specific implementations (such as IntFieldSource) expose their cost by internally evaluating their complexity in a source specific manner instead of delegating to the default FV matchCost implementation.
   
   Maybe have FunctionValues expose an abstract cost() method, have all FV derivatives implement it and then simply let VSC's matchCost use that method?
   
   (and oh, I realised that this PR is definitely wrong, for some reason I missed that FV delegates to VSC using itself as the delegated FV value. Need to add Lucene tests for VSC...)

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] dsmiley commented on a change in pull request #1303: LUCENE-9114: Improve ValueSourceScorer's Default Cost Implementation

Posted by GitBox <gi...@apache.org>.
dsmiley commented on a change in pull request #1303: LUCENE-9114: Improve ValueSourceScorer's Default Cost Implementation
URL: https://github.com/apache/lucene-solr/pull/1303#discussion_r387662567
 
 

 ##########
 File path: lucene/queries/src/java/org/apache/lucene/queries/function/FunctionValues.java
 ##########
 @@ -41,6 +41,9 @@
 //   want the Query carrying around big objects
 public abstract class FunctionValues {
 
+  // Default cost for FunctionValues -- ideally should be overriden by concrete implementations
+  public static final int DEFAULT_COST = 100;
 
 Review comment:
   This is fine with me but FWIW I wouldn't even bother defining it.  It has no value set aside like this; I doubt any user code would want to refer to it.  If we want to document what the default cost is, we should say so in cost()'s javadoc.  I know some devs like to make static constants for everything but IMO it's sometimes wasted ceremony.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] dsmiley commented on issue #1303: LUCENE-9114: Improve ValueSourceScorer's Default Cost Implementation

Posted by GitBox <gi...@apache.org>.
dsmiley commented on issue #1303: LUCENE-9114: Improve ValueSourceScorer's Default Cost Implementation
URL: https://github.com/apache/lucene-solr/pull/1303#issuecomment-593151099
 
 
   Lets add the following test to TestFunctionRangeQuery:
   ```
     @Test
     public void testTwoRangeQueries() throws IOException {
       Query rq1 = new FunctionRangeQuery(INT_VALUESOURCE, 2, 4, true, true);
       Query rq2 = new FunctionRangeQuery(INT_VALUESOURCE, 8, 10, true, true);
       Query bq = new BooleanQuery.Builder()
           .add(rq1, BooleanClause.Occur.SHOULD)
           .add(rq2, BooleanClause.Occur.SHOULD)
           .build();
   
       ScoreDoc[] scoreDocs = indexSearcher.search(bq, N_DOCS).scoreDocs;
       expectScores(scoreDocs, 10, 9, 8, 4, 3, 2);
     }
   ```
   
   This'll stack-overflow on your first implementation.
   
   > Maybe have FunctionValues expose an abstract cost() method, have all FV derivatives implement it and then simply let VSC's matchCost use that method?
   
   Yes; we certainly need the FV to provide the cost; the TPI.matchCost should simply look it up.  By the FV (or VS) having a cost, it then becomes straight-forward for anyone's custom FV/VS to specify what their cost is.  It's debatable wether this cost should be on the VS vs FV.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] atris commented on issue #1303: LUCENE-9114: Improve ValueSourceScorer's Default Cost Implementation

Posted by GitBox <gi...@apache.org>.
atris commented on issue #1303: LUCENE-9114: Improve ValueSourceScorer's Default Cost Implementation
URL: https://github.com/apache/lucene-solr/pull/1303#issuecomment-594352993
 
 
   > OH; an idea occurred to me. We don't actually need the cost to be mutable (which wasn't so pretty), we just need a matchCost that a ValueSourceScorer can choose to provide if it wants (which you just did). This way if someone (like Solr) wants to force the cost, it could wrap the original ValueSource to supply to FunctionRangeQuery -- one with a FunctionValues that overrides getRangeScorer to wrap subclass VSC to specify the cost. I know this is more code than the mutable cost but isn't as bad as what I was originally fearing, having to total black box, wrap a query, weight, scorer, etc. Perhaps this wrapping might even be a static convenience method on FunctionRangeQuery that takes in a VS & cost and returns a VS that is only to be used by FRQ.
   
   Agreed.
   
   I believe that the semantics here are completely driven from the use cases -- and yours is the use case we are chasing right now :) If the tradeoff of defining a new VS which overrides the cost method is fine, I am more than happy to remove the mutable cost. Raised an iteration for the same, please see and let me know your thoughts and comments.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] dsmiley commented on a change in pull request #1303: LUCENE-9114: Improve ValueSourceScorer's Default Cost Implementation

Posted by GitBox <gi...@apache.org>.
dsmiley commented on a change in pull request #1303: LUCENE-9114: Improve ValueSourceScorer's Default Cost Implementation
URL: https://github.com/apache/lucene-solr/pull/1303#discussion_r387438580
 
 

 ##########
 File path: lucene/queries/src/java/org/apache/lucene/queries/function/ValueSourceScorer.java
 ##########
 @@ -55,7 +59,13 @@ public boolean matches() throws IOException {
 
       @Override
       public float matchCost() {
-        return 100; // TODO: use cost of ValueSourceScorer.this.matches()
+        // If an external cost is set, use that
+        if (externallyMutableCost != 0.0) {
+          return externallyMutableCost;
+        }
+
+        // Cost of iteration is fixed cost + cost exposed by delegated FunctionValues instance
+        return DEF_COST + values.cost();
 
 Review comment:
   I suppose the purpose of DEF_COST here is to add on the cost of the ValueSourceScorer.matches code that is separate from fetching the value from the FunctionValues.  That cost varies by the VSC subclass.  If we want to be thorough then maybe _this_ should be settable somehow _as well_?   One easy way to do this is to simply refactor out this one line (def cost + FV cost) into a protected method on the VSC that may be over-ridden if desired.   Or we could just say that is too pedantic, and prefer simplicity.  I'm fine either way.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] atris commented on a change in pull request #1303: LUCENE-9114: Improve ValueSourceScorer's Default Cost Implementation

Posted by GitBox <gi...@apache.org>.
atris commented on a change in pull request #1303: LUCENE-9114: Improve ValueSourceScorer's Default Cost Implementation
URL: https://github.com/apache/lucene-solr/pull/1303#discussion_r387449993
 
 

 ##########
 File path: lucene/queries/src/java/org/apache/lucene/queries/function/ValueSourceScorer.java
 ##########
 @@ -55,7 +59,13 @@ public boolean matches() throws IOException {
 
       @Override
       public float matchCost() {
-        return 100; // TODO: use cost of ValueSourceScorer.this.matches()
+        // If an external cost is set, use that
+        if (externallyMutableCost != 0.0) {
+          return externallyMutableCost;
+        }
+
+        // Cost of iteration is fixed cost + cost exposed by delegated FunctionValues instance
+        return DEF_COST + values.cost();
 
 Review comment:
   I felt that DEF_COST defines the cost of the VSC.matches call itself and if the user prefers to do a complex cost, then override matchCost() :) I agree with your approach, added the cost evaluation method as a separate class method

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] dsmiley commented on issue #1303: LUCENE-9114: Improve ValueSourceScorer's Default Cost Implementation

Posted by GitBox <gi...@apache.org>.
dsmiley commented on issue #1303: LUCENE-9114: Improve ValueSourceScorer's Default Cost Implementation
URL: https://github.com/apache/lucene-solr/pull/1303#issuecomment-594728584
 
 
   Okay; add to "Improvements" section.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] dsmiley commented on a change in pull request #1303: LUCENE-9114: Improve ValueSourceScorer's Default Cost Implementation

Posted by GitBox <gi...@apache.org>.
dsmiley commented on a change in pull request #1303: LUCENE-9114: Improve ValueSourceScorer's Default Cost Implementation
URL: https://github.com/apache/lucene-solr/pull/1303#discussion_r387435286
 
 

 ##########
 File path: lucene/queries/src/java/org/apache/lucene/queries/function/FunctionValues.java
 ##########
 @@ -90,6 +93,9 @@ public int ordVal(int doc) throws IOException {
    * @return the number of unique sort ordinals this instance has
    */
   public int numOrd() { throw new UnsupportedOperationException(); }
+
 
 Review comment:
   Really needs javadoc explaining what this is.  See TPI.matchCost for inspiration.  I suggest:
   
   > An estimate of the expected cost to return a value for a document.
   > It's intended to be used by TwoPhaseIterator.matchCost implementations.
   > Returns an expected cost in number of simple operations like addition, multiplication,
   > comparing two numbers and indexing an array.
   > The returned value must be positive.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] dsmiley commented on a change in pull request #1303: LUCENE-9114: Improve ValueSourceScorer's Default Cost Implementation

Posted by GitBox <gi...@apache.org>.
dsmiley commented on a change in pull request #1303: LUCENE-9114: Improve ValueSourceScorer's Default Cost Implementation
URL: https://github.com/apache/lucene-solr/pull/1303#discussion_r387438763
 
 

 ##########
 File path: lucene/queries/src/java/org/apache/lucene/queries/function/ValueSourceScorer.java
 ##########
 @@ -94,4 +104,12 @@ public float getMaxScore(int upTo) throws IOException {
     return Float.POSITIVE_INFINITY;
   }
 
+  /**
+   * Used to externally set a mutable cost for this instance. If set, this cost gets preference over the FunctionValues's cost
+   *
+   * @lucene.experimental
+   */
+  public void setExternallyMutableCost(float cost) {
 
 Review comment:
   Why not simply setMatchCost ?  It's apparent it's "mutable", and it's public and thus "externally" so :-)
   I think the javadocs should reference TPI.matchCost.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] atris commented on a change in pull request #1303: LUCENE-9114: Improve ValueSourceScorer's Default Cost Implementation

Posted by GitBox <gi...@apache.org>.
atris commented on a change in pull request #1303: LUCENE-9114: Improve ValueSourceScorer's Default Cost Implementation
URL: https://github.com/apache/lucene-solr/pull/1303#discussion_r387794662
 
 

 ##########
 File path: lucene/queries/src/java/org/apache/lucene/queries/function/FunctionValues.java
 ##########
 @@ -151,6 +164,11 @@ public ValueSourceScorer getScorer(Weight weight, LeafReaderContext readerContex
       public boolean matches(int doc) {
         return true;
       }
+      @Override
+      public float costEvaluationFunction() {
+        // Match everything
 
 Review comment:
   Removed the comment, thanks

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] atris commented on issue #1303: LUCENE-9114: Improve ValueSourceScorer's Default Cost Implementation

Posted by GitBox <gi...@apache.org>.
atris commented on issue #1303: LUCENE-9114: Improve ValueSourceScorer's Default Cost Implementation
URL: https://github.com/apache/lucene-solr/pull/1303#issuecomment-594318472
 
 
   @dsmiley Any thoughts on this one?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] atris commented on a change in pull request #1303: LUCENE-9114: Improve ValueSourceScorer's Default Cost Implementation

Posted by GitBox <gi...@apache.org>.
atris commented on a change in pull request #1303: LUCENE-9114: Improve ValueSourceScorer's Default Cost Implementation
URL: https://github.com/apache/lucene-solr/pull/1303#discussion_r387451155
 
 

 ##########
 File path: lucene/queries/src/java/org/apache/lucene/queries/function/FunctionValues.java
 ##########
 @@ -90,6 +93,9 @@ public int ordVal(int doc) throws IOException {
    * @return the number of unique sort ordinals this instance has
    */
   public int numOrd() { throw new UnsupportedOperationException(); }
+
 
 Review comment:
   Used your recommendation, thanks

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] atris merged pull request #1303: LUCENE-9114: Improve ValueSourceScorer's Default Cost Implementation

Posted by GitBox <gi...@apache.org>.
atris merged pull request #1303: LUCENE-9114: Improve ValueSourceScorer's Default Cost Implementation
URL: https://github.com/apache/lucene-solr/pull/1303
 
 
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] atris commented on a change in pull request #1303: LUCENE-9114: Improve ValueSourceScorer's Default Cost Implementation

Posted by GitBox <gi...@apache.org>.
atris commented on a change in pull request #1303: LUCENE-9114: Improve ValueSourceScorer's Default Cost Implementation
URL: https://github.com/apache/lucene-solr/pull/1303#discussion_r387794239
 
 

 ##########
 File path: lucene/queries/src/java/org/apache/lucene/queries/function/FunctionValues.java
 ##########
 @@ -41,6 +41,9 @@
 //   want the Query carrying around big objects
 public abstract class FunctionValues {
 
+  // Default cost for FunctionValues -- ideally should be overriden by concrete implementations
+  public static final int DEFAULT_COST = 100;
 
 Review comment:
   Fixed, thanks

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] dsmiley commented on issue #1303: LUCENE-9114: Improve ValueSourceScorer's Default Cost Implementation

Posted by GitBox <gi...@apache.org>.
dsmiley commented on issue #1303: LUCENE-9114: Improve ValueSourceScorer's Default Cost Implementation
URL: https://github.com/apache/lucene-solr/pull/1303#issuecomment-593116806
 
 
   Wouldn't this result in an infinite loop?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] dsmiley commented on a change in pull request #1303: LUCENE-9114: Improve ValueSourceScorer's Default Cost Implementation

Posted by GitBox <gi...@apache.org>.
dsmiley commented on a change in pull request #1303: LUCENE-9114: Improve ValueSourceScorer's Default Cost Implementation
URL: https://github.com/apache/lucene-solr/pull/1303#discussion_r387660244
 
 

 ##########
 File path: lucene/queries/src/java/org/apache/lucene/queries/function/ValueSourceScorer.java
 ##########
 @@ -94,4 +97,17 @@ public float getMaxScore(int upTo) throws IOException {
     return Float.POSITIVE_INFINITY;
   }
 
+  /**
+   * Cost evaluation function which defines the cost of access for the TwoPhaseIterator for this class
+   * This method should be overridden for specifying custom cost methods. Used by {@link TwoPhaseIterator#matchCost()}
+   * for the instance owned by this class
+   *
+   * @return cost of access
+   *
+   * @lucene.experimental
+   */
+  protected float costEvaluationFunction() {
 
 Review comment:
   Can we call this simply matchCost now since, after all, the TPI.matchCost just calls it now with no mutable override?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] dsmiley commented on issue #1303: LUCENE-9114: Improve ValueSourceScorer's Default Cost Implementation

Posted by GitBox <gi...@apache.org>.
dsmiley commented on issue #1303: LUCENE-9114: Improve ValueSourceScorer's Default Cost Implementation
URL: https://github.com/apache/lucene-solr/pull/1303#issuecomment-593675019
 
 
   Ehh; nevermind my ill-thought-out idea of a cost on the Map context.  There are many  ValueSource.getValues impls that'd need to parse it, and then there's a concern that we wouldn't want it to propagate to sub-FunctionValues.
   
   Alternative proposal:  When FunctionRangeQuery calls functionValues.getRangeScorer, it gets back a ValueSourceScorer.  We could just add a mutable cost on VSC that if set will be returned by VSC and if not VSC will delegate to the proposed `FV.cost`.  While the mutability of it isn't pretty, it's also quite minor.  It saves FRQ from having to wrap the scorer only to specify a matchCost.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] atris commented on issue #1303: LUCENE-9114: Improve ValueSourceScorer's Default Cost Implementation

Posted by GitBox <gi...@apache.org>.
atris commented on issue #1303: LUCENE-9114: Improve ValueSourceScorer's Default Cost Implementation
URL: https://github.com/apache/lucene-solr/pull/1303#issuecomment-593544138
 
 
   > Lets add the following test to TestFunctionRangeQuery:
   > 
   > ```
   >   @Test
   >   public void testTwoRangeQueries() throws IOException {
   >     Query rq1 = new FunctionRangeQuery(INT_VALUESOURCE, 2, 4, true, true);
   >     Query rq2 = new FunctionRangeQuery(INT_VALUESOURCE, 8, 10, true, true);
   >     Query bq = new BooleanQuery.Builder()
   >         .add(rq1, BooleanClause.Occur.SHOULD)
   >         .add(rq2, BooleanClause.Occur.SHOULD)
   >         .build();
   > 
   >     ScoreDoc[] scoreDocs = indexSearcher.search(bq, N_DOCS).scoreDocs;
   >     expectScores(scoreDocs, 10, 9, 8, 4, 3, 2);
   >   }
   > ```
   Thanks, will add it in a separate PR.
   
   > > Maybe have FunctionValues expose an abstract cost() method, have all FV derivatives implement it and then simply let VSC's matchCost use that method?
   > 
   > Yes; we certainly need the FV to provide the cost; the TPI.matchCost should simply look it up. By the FV (or VS) having a cost, it then becomes straight-forward for anyone's custom FV/VS to specify what their cost is. It's debatable wether this cost should be on the VS vs FV.
   
   Ok, so my next iteration will have a cost method in FV (I am inclined to add the method in FV since VSC has a direct member of that type?) and have VSC owned TPI's matchCost() refer to the cost of the delegated FV.
   
   I also feel that FV class itself should have a default implementation of cost method where it just returns a dumb value (100?) as the cost. The idea is that if you are using the default FV, you probably dont care about costs. If you plug in your own FV which has a smarter costing algorithm, VSC will automatically pick it up.
   
   WDYT?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] atris commented on issue #1303: LUCENE-9114: Improve ValueSourceScorer's Default Cost Implementation

Posted by GitBox <gi...@apache.org>.
atris commented on issue #1303: LUCENE-9114: Improve ValueSourceScorer's Default Cost Implementation
URL: https://github.com/apache/lucene-solr/pull/1303#issuecomment-593126930
 
 
   > Wouldn't this result in an infinite loop?
   
   The idea was that the underlying TwoPhaseIterator implementation for the nested FunctionValues would be an actual VSC derivative and not using the default matchCost version. I planned to merge this PR only for 8x and make matchCost an abstract method for master. BTW, what would be your suggestion to better evaluate FunctionValues's cost? Maybe we could look at the doc each time and see if it matches or not and then return a cost based on that?
   
   Another thing -- the Lucene test suite passes fine with this change. Does that mean that we are lacking comprehensive tests for VSC where two nested FunctionValues use the default VSC matchCost?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] dsmiley commented on a change in pull request #1303: LUCENE-9114: Improve ValueSourceScorer's Default Cost Implementation

Posted by GitBox <gi...@apache.org>.
dsmiley commented on a change in pull request #1303: LUCENE-9114: Improve ValueSourceScorer's Default Cost Implementation
URL: https://github.com/apache/lucene-solr/pull/1303#discussion_r387436290
 
 

 ##########
 File path: lucene/queries/src/java/org/apache/lucene/queries/function/ValueSourceScorer.java
 ##########
 @@ -39,9 +39,13 @@
  * @lucene.experimental
  */
 public abstract class ValueSourceScorer extends Scorer {
+  // Fixed cost for a single iteration of the TwoPhaseIterator instance
+  private static final int DEF_COST = 5;
+
   protected final FunctionValues values;
   private final TwoPhaseIterator twoPhaseIterator;
   private final DocIdSetIterator disi;
+  private float externallyMutableCost;
 
 Review comment:
   Or we could use a Float object to more clearly show as user-settable via non-null?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] atris commented on issue #1303: LUCENE-9114: Improve ValueSourceScorer's Default Cost Implementation

Posted by GitBox <gi...@apache.org>.
atris commented on issue #1303: LUCENE-9114: Improve ValueSourceScorer's Default Cost Implementation
URL: https://github.com/apache/lucene-solr/pull/1303#issuecomment-594654890
 
 
   @dsmiley Updated, please see and let me know your thoughts and comments. I have not added CHANGES.txt entry to avoid conflicts, will do so just before committing.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] atris edited a comment on issue #1303: LUCENE-9114: Improve ValueSourceScorer's Default Cost Implementation

Posted by GitBox <gi...@apache.org>.
atris edited a comment on issue #1303: LUCENE-9114: Improve ValueSourceScorer's Default Cost Implementation
URL: https://github.com/apache/lucene-solr/pull/1303#issuecomment-593133186
 
 
   @dsmiley Thinking further, I see no obvious way of ValueSourceScorer being able to determine a reasonable cost without having inputs from FunctionValues, and currently, the sane way of getting a cost out of FunctionValues is through its TPI (unless I am missing something?).
   
   The best cost metrics will come when specific implementations (such as IntFieldSource) expose their cost by internally evaluating their complexity in a source specific manner instead of delegating to the default FV matchCost implementation.
   
   Maybe have FunctionValues expose an abstract cost() method, have all FV derivatives implement it and then simply let VSC's matchCost use that method?
   
   (and oh, I realised that this PR is definitely wrong, thanks for pointing it out. For some reason I missed that FV delegates to VSC using itself as the delegated FV value. Need to add Lucene tests for VSC...)

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] atris commented on issue #1303: LUCENE-9114: Improve ValueSourceScorer's Default Cost Implementation

Posted by GitBox <gi...@apache.org>.
atris commented on issue #1303: LUCENE-9114: Improve ValueSourceScorer's Default Cost Implementation
URL: https://github.com/apache/lucene-solr/pull/1303#issuecomment-593133186
 
 
   @dsmiley Thinking further, I see no obvious way of ValueSourceScorer being able to determine a reasonable cost without having inputs from FunctionValues, and currently, the sane way of getting a cost out of FunctionValues is through its TPI (unless I am missing something?).
   
   The best cost metrics will come when specific implementations (such as IntFieldSource) expose their cost by internally evaluating their complexity in a source specific manner instead of delegating to the default FV matchCost implementation.
   
   Maybe have FunctionValues expose an abstract cost() method, have all FV derivatives implement it and then simply let VSC's matchCost use that method?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] atris edited a comment on issue #1303: LUCENE-9114: Improve ValueSourceScorer's Default Cost Implementation

Posted by GitBox <gi...@apache.org>.
atris edited a comment on issue #1303: LUCENE-9114: Improve ValueSourceScorer's Default Cost Implementation
URL: https://github.com/apache/lucene-solr/pull/1303#issuecomment-594352993
 
 
   > OH; an idea occurred to me. We don't actually need the cost to be mutable (which wasn't so pretty), we just need a matchCost that a ValueSourceScorer can choose to provide if it wants (which you just did). This way if someone (like Solr) wants to force the cost, it could wrap the original ValueSource to supply to FunctionRangeQuery -- one with a FunctionValues that overrides getRangeScorer to wrap subclass VSC to specify the cost. I know this is more code than the mutable cost but isn't as bad as what I was originally fearing, having to total black box, wrap a query, weight, scorer, etc. Perhaps this wrapping might even be a static convenience method on FunctionRangeQuery that takes in a VS & cost and returns a VS that is only to be used by FRQ.
   
   @dsmiley , Agreed.
   
   I believe that the semantics here are completely driven from the use cases -- and yours is the use case we are chasing right now :) If the tradeoff of defining a new VS which overrides the cost method is fine, I am more than happy to remove the mutable cost. Raised an iteration for the same, please see and let me know your thoughts and comments.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org