You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2022/05/23 12:39:54 UTC

[GitHub] [lucene] mocobeta opened a new pull request, #920: LUCENE-10589: increase upper bound of test range query to the maximum value + 1

mocobeta opened a new pull request, #920:
URL: https://github.com/apache/lucene/pull/920

   This is a small tweak for `TestKnnVectorQuery.testRandomWithFilter()`.
   
   See https://issues.apache.org/jira/browse/LUCENE-10589.
   
   On main:
   ```
   ./gradlew test --tests TestKnnVectorQuery.testRandomWithFilter -Dtests.seed=1DA39B92702DAC45 -Dtests.multiplier=3
   
   org.apache.lucene.search.TestKnnVectorQuery > testRandomWithFilter FAILED
       java.lang.UnsupportedOperationException: exact search is not supported
           at __randomizedtesting.SeedInfo.seed([1DA39B92702DAC45:6BEAC2197AD96AE0]:0)
           at org.apache.lucene.search.TestKnnVectorQuery$ThrowingKnnVectorQuery.exactSearch(TestKnnVectorQuery.java:715)
   ```
   
   With this patch:
   ```
   ./gradlew test --tests TestKnnVectorQuery.testRandomWithFilter -Dtests.seed=1DA39B92702DAC45 -Dtests.multiplier=3
   
   :lucene:core:test (SUCCESS): 1 test(s)
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] msokolov commented on pull request #920: LUCENE-10589: increase upper bound of test range query to the maximum value + 1

Posted by GitBox <gi...@apache.org>.
msokolov commented on PR #920:
URL: https://github.com/apache/lucene/pull/920#issuecomment-1134805154

   I suspect what's happening is RandomIndexWriter is causing some very small segment to be written, and within that segment the query *is* highly selective causing us to fall back to brute force scan. I would probably fix by using a more "normal" IndexWriter and always indexing a single segment?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] mocobeta merged pull request #920: LUCENE-10589: increase upper bound of test range query to the maximum value + 1

Posted by GitBox <gi...@apache.org>.
mocobeta merged PR #920:
URL: https://github.com/apache/lucene/pull/920


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] mocobeta commented on pull request #920: LUCENE-10589: increase upper bound of test range query to the maximum value + 1

Posted by GitBox <gi...@apache.org>.
mocobeta commented on PR #920:
URL: https://github.com/apache/lucene/pull/920#issuecomment-1134641142

   It looks like the test can be tweaked not to fall into the corner cases but I'm not fully sure if this is correct - is there a better way to fix it?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] jtibshirani commented on pull request #920: LUCENE-10589: increase upper bound of test range query to the maximum value + 1

Posted by GitBox <gi...@apache.org>.
jtibshirani commented on PR #920:
URL: https://github.com/apache/lucene/pull/920#issuecomment-1135164651

   Thank you @mocobeta for looking into this! I don't think the failure is caused having multiple segments, since we make sure to force merge to one segment before starting the searches. Stepping through what happens, it looks like we just hit a really unlucky query + data combination where it takes more than 150 steps to conclude the search.
   
   Your proposed fix makes sense to me. Another option is to decrease `k` to make the search more restrictive (currently it's set to 5, I think 1 would work instead).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] mocobeta commented on pull request #920: LUCENE-10589: increase upper bound of test range query to the maximum value + 1

Posted by GitBox <gi...@apache.org>.
mocobeta commented on PR #920:
URL: https://github.com/apache/lucene/pull/920#issuecomment-1136964941

   I'll merge this, thanks for your quick response!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] msokolov commented on pull request #920: LUCENE-10589: increase upper bound of test range query to the maximum value + 1

Posted by GitBox <gi...@apache.org>.
msokolov commented on PR #920:
URL: https://github.com/apache/lucene/pull/920#issuecomment-1134842970

   Yeah, I'm just not sure that all the implications are desirable for this test. For example if we have a segment with 5 docs, their "tags" might all be > the threshold of the filter?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] mocobeta commented on pull request #920: LUCENE-10589: increase upper bound of test range query to the maximum value + 1

Posted by GitBox <gi...@apache.org>.
mocobeta commented on PR #920:
URL: https://github.com/apache/lucene/pull/920#issuecomment-1134837438

   According to the javadocs of the test, using a randomly skewed index with RandomIndexWriter is an intentional choice I think? 
   ```
   /** Tests with random vectors and a random filter. Uses RandomIndexWriter. */
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] mocobeta commented on pull request #920: LUCENE-10589: increase upper bound of test range query to the maximum value + 1

Posted by GitBox <gi...@apache.org>.
mocobeta commented on PR #920:
URL: https://github.com/apache/lucene/pull/920#issuecomment-1135958878

   @jtibshirani thanks for reviewing.
   
   > Stepping through what happens, it looks like we just hit a really unlucky query + data combination where it takes more than 150 steps to conclude the search.
   
   Yes, it looks like it hits a really unlucky combination of query and data - once you add a line something like `random().nextInt();` somewhere in the test code to move forward the random state, it becomes all green. First I was confused about what was happening there.
   
   > Another option is to decrease k to make the search more restrictive (currently it's set to 5, I think 1 would work instead).
   
   Smaller `k` (<= 4) surely fixes the problem. I cannot determine which approach is better in this context. Let me know if we should tweak `k` instead of the range query's upper bound, then I'll update this!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org