You are viewing a plain text version of this content. The canonical link for it is here.

Posted to issues@lucene.apache.org by "jtibshirani (via GitHub)" <gi...@apache.org> on 2023/02/14 18:45:37 UTC

[GitHub] [lucene] jtibshirani opened a new pull request, #12148: Improve DocAndScoreQuery#toString

jtibshirani opened a new pull request, #12148:
URL: https://github.com/apache/lucene/pull/12148

   Tiny improvements to DocAndScoreQuery:
   * Make toString more informative
   * Remove unnecessary 'k' parameter


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org

[GitHub] [lucene] jtibshirani commented on a diff in pull request #12148: Improve DocAndScoreQuery#toString

Posted by "jtibshirani (via GitHub)" <gi...@apache.org>.

jtibshirani commented on code in PR #12148:
URL: https://github.com/apache/lucene/pull/12148#discussion_r1106384945


##########
lucene/core/src/java/org/apache/lucene/search/AbstractKnnVectorQuery.java:
##########
@@ -302,9 +298,9 @@ public Weight createWeight(IndexSearcher searcher, ScoreMode scoreMode, float bo
         public Explanation explain(LeafReaderContext context, int doc) {
           int found = Arrays.binarySearch(docs, doc + context.docBase);
           if (found < 0) {
-            return Explanation.noMatch("not in top " + k);
+            return Explanation.noMatch("not in top docs");
           }
-          return Explanation.match(scores[found] * boost, "within top " + k);
+          return Explanation.match(scores[found] * boost, "within top docs");

Review Comment:
   Personally I don't find including this number helpful -- we don't usually include any query details within explanations. Storing 'k' in `DocAndScoreQuery` also felt out of place, since it's only used in these messages and has no bearing on the query's functionality (not included in equals/ hashCode, etc.)
   
   I don't feel strongly though, happy to undo this part. Or we could switch the message to "not in k nearest neighbors" to make it clearer?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org

[GitHub] [lucene] jtibshirani commented on a diff in pull request #12148: Improve DocAndScoreQuery#toString

Posted by "jtibshirani (via GitHub)" <gi...@apache.org>.

jtibshirani commented on code in PR #12148:
URL: https://github.com/apache/lucene/pull/12148#discussion_r1106621137


##########
lucene/core/src/java/org/apache/lucene/search/AbstractKnnVectorQuery.java:
##########
@@ -302,9 +298,9 @@ public Weight createWeight(IndexSearcher searcher, ScoreMode scoreMode, float bo
         public Explanation explain(LeafReaderContext context, int doc) {
           int found = Arrays.binarySearch(docs, doc + context.docBase);
           if (found < 0) {
-            return Explanation.noMatch("not in top " + k);
+            return Explanation.noMatch("not in top docs");
           }
-          return Explanation.match(scores[found] * boost, "within top " + k);
+          return Explanation.match(scores[found] * boost, "within top docs");

Review Comment:
   Thanks, I gave that a try. I like this compromise, but overall I think it'd be too complex/ confusing if DocAndScoreQuery tried to fully represent the parent kNN vector query (in its query name, explanations, toString, etc.). It's just one of the consequences of our rewrite-based approach, that the kNN query is transformed into a separate query type.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org

[GitHub] [lucene] jtibshirani merged pull request #12148: Improve DocAndScoreQuery#toString

Posted by "jtibshirani (via GitHub)" <gi...@apache.org>.

jtibshirani merged PR #12148:
URL: https://github.com/apache/lucene/pull/12148


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org

[GitHub] [lucene] benwtrent commented on a diff in pull request #12148: Improve DocAndScoreQuery#toString

Posted by "benwtrent (via GitHub)" <gi...@apache.org>.

benwtrent commented on code in PR #12148:
URL: https://github.com/apache/lucene/pull/12148#discussion_r1106345767


##########
lucene/core/src/java/org/apache/lucene/search/AbstractKnnVectorQuery.java:
##########
@@ -302,9 +298,9 @@ public Weight createWeight(IndexSearcher searcher, ScoreMode scoreMode, float bo
         public Explanation explain(LeafReaderContext context, int doc) {
           int found = Arrays.binarySearch(docs, doc + context.docBase);
           if (found < 0) {
-            return Explanation.noMatch("not in top " + k);
+            return Explanation.noMatch("not in top docs");
           }
-          return Explanation.match(scores[found] * boost, "within top " + k);
+          return Explanation.match(scores[found] * boost, "within top docs");

Review Comment:
   I am not sure about this. The purpose of this class is to hold the results from the KNN search. Its explanation should reflect that its the results from a `knn` search for a given `k`.
   
   Is the reason for removing the `k` is simply because its only used in this method? Or that the number of docs matched could be less than `k`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org

[GitHub] [lucene] jtibshirani commented on a diff in pull request #12148: Improve DocAndScoreQuery#toString

Posted by "jtibshirani (via GitHub)" <gi...@apache.org>.

jtibshirani commented on code in PR #12148:
URL: https://github.com/apache/lucene/pull/12148#discussion_r1106384945


##########
lucene/core/src/java/org/apache/lucene/search/AbstractKnnVectorQuery.java:
##########
@@ -302,9 +298,9 @@ public Weight createWeight(IndexSearcher searcher, ScoreMode scoreMode, float bo
         public Explanation explain(LeafReaderContext context, int doc) {
           int found = Arrays.binarySearch(docs, doc + context.docBase);
           if (found < 0) {
-            return Explanation.noMatch("not in top " + k);
+            return Explanation.noMatch("not in top docs");
           }
-          return Explanation.match(scores[found] * boost, "within top " + k);
+          return Explanation.match(scores[found] * boost, "within top docs");

Review Comment:
   Personally I don't find including this number helpful -- we don't usually include any query details within explanations. Storing 'k' in `DocAndScoreQuery` also felt out of place, since it's only used in these messages and has no bearing on the query's functionality (not included in equals/ hashCode, etc.)
   
   I don't feel strongly though, happy to undo this part. Or we could switch the message to "not in top k nearest neighbors" to make it clearer?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org

[GitHub] [lucene] msokolov commented on a diff in pull request #12148: Improve DocAndScoreQuery#toString

Posted by "msokolov (via GitHub)" <gi...@apache.org>.

msokolov commented on code in PR #12148:
URL: https://github.com/apache/lucene/pull/12148#discussion_r1106608697


##########
lucene/core/src/java/org/apache/lucene/search/AbstractKnnVectorQuery.java:
##########
@@ -302,9 +298,9 @@ public Weight createWeight(IndexSearcher searcher, ScoreMode scoreMode, float bo
         public Explanation explain(LeafReaderContext context, int doc) {
           int found = Arrays.binarySearch(docs, doc + context.docBase);
           if (found < 0) {
-            return Explanation.noMatch("not in top " + k);
+            return Explanation.noMatch("not in top docs");
           }
-          return Explanation.match(scores[found] * boost, "within top " + k);
+          return Explanation.match(scores[found] * boost, "within top docs");

Review Comment:
   Could we report the length of the docs array instead of passing k around? 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org

[GitHub] [lucene] jtibshirani commented on a diff in pull request #12148: Improve DocAndScoreQuery#toString

Posted by "jtibshirani (via GitHub)" <gi...@apache.org>.

jtibshirani commented on code in PR #12148:
URL: https://github.com/apache/lucene/pull/12148#discussion_r1106410582


##########
lucene/core/src/java/org/apache/lucene/search/AbstractKnnVectorQuery.java:
##########
@@ -302,9 +298,9 @@ public Weight createWeight(IndexSearcher searcher, ScoreMode scoreMode, float bo
         public Explanation explain(LeafReaderContext context, int doc) {
           int found = Arrays.binarySearch(docs, doc + context.docBase);
           if (found < 0) {
-            return Explanation.noMatch("not in top " + k);
+            return Explanation.noMatch("not in top docs");
           }
-          return Explanation.match(scores[found] * boost, "within top " + k);
+          return Explanation.match(scores[found] * boost, "within top docs");

Review Comment:
   Personally I didn't find including this number helpful, without more context. Storing 'k' in `DocAndScoreQuery` also felt out of place, since it's only used in these messages and has no bearing on the query's functionality (not included in equals/ hashCode, etc.)
   
   I don't feel strongly though, happy to undo this part. Or we could switch the message to "not in k nearest neighbors" to make it clearer?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org

[GitHub] [lucene] jtibshirani commented on pull request #12148: Improve DocAndScoreQuery#toString

Posted by "jtibshirani (via GitHub)" <gi...@apache.org>.

jtibshirani commented on PR #12148:
URL: https://github.com/apache/lucene/pull/12148#issuecomment-1430220297

   Same as #12146 -- I've been testing an AI coding assistant on some Lucene code, and it suggested these small improvements.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org