Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2021/08/10 14:54:29 UTC

[GitHub] [lucene] msokolov commented on pull request #235: LUCENE-9614: add KnnVectorQuery implementation

msokolov commented on pull request #235:
URL: https://github.com/apache/lucene/pull/235#issuecomment-896098498


   Thanks for all the comments; I'll follow up with a new commit that addresses them soon. `1 / (1 + x)` makes a lot of sense; I was groping towards it :)
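
As an illustration of the `1 / (1 + x)` form, here is a minimal Java sketch. It assumes `x` is a non-negative distance between the query vector and a document vector (the comment itself does not say what `x` is), and the class and method names are hypothetical, not Lucene API.

```java
// Hypothetical helper, not part of Lucene: turns a distance into a score.
final class DistanceScore {
    private DistanceScore() {}

    // Maps a non-negative distance x to a score in (0, 1].
    // A distance of 0 scores 1, and the score decreases as the distance grows.
    static float score(float distance) {
        if (Float.isNaN(distance) || distance < 0f) {
            throw new IllegalArgumentException("distance must be non-negative: " + distance);
        }
        return 1f / (1f + distance);
    }
}
```

Under that assumption the mapping is bounded and monotonically decreasing, so closer vectors always receive higher scores.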
   
   Re: the random-distribution assumption for segments: I believe this depends very much on the use case. Our experience in e-commerce is that it *usually* holds. We've seen occasional outlying cases (more popular media products get re-indexed more often, and that can introduce correlation if *popularity* is an important query feature, which it is), but these are the exception rather than the rule. OTOH a time-series index is likely to be heavily correlated, so a different strategy is appropriate there (also, sequential operation can more easily re-use thresholds across segments, and if the segments can be sorted, that will help). Perhaps the vanilla approach (collect K per segment) is best as a safe first step, but I think some optimization here will have a large impact, since `K` directly influences the number of nodes explored in the graph, and hence the query cost. Maybe it will deserve some kind of parameterization. So yes, I agree: let's remove this for now and follow up later.
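
To make the trade-off concrete, the sketch below shows one way a per-segment `K` could be derived under the random-distribution assumption. It is not the code from the pull request. It assumes the global top `K` hits fall into a segment in proportion to the segment's share of documents, approximates the spread with a binomial standard deviation, and adds a caller-chosen safety margin. The names `PerSegmentK`, `estimate`, and `margin` are hypothetical.

```java
// Hypothetical sketch, not Lucene API: estimate how many hits to collect
// from one segment when documents are assumed to be randomly distributed.
final class PerSegmentK {
    private PerSegmentK() {}

    // k           : number of hits wanted over the whole index
    // segmentDocs : number of documents in this segment
    // totalDocs   : number of documents in the whole index
    // margin      : safety margin, in standard deviations (assumed tuning knob)
    static int estimate(int k, int segmentDocs, int totalDocs, double margin) {
        if (k <= 0 || segmentDocs <= 0 || totalDocs < segmentDocs || margin < 0) {
            throw new IllegalArgumentException("invalid arguments");
        }
        double p = (double) segmentDocs / totalDocs;   // segment's share of the index
        double expected = k * p;                       // expected top-k hits in this segment
        double stdDev = Math.sqrt(k * p * (1.0 - p));  // binomial approximation of the spread
        int perSegment = (int) Math.ceil(expected + margin * stdDev);
        // Never ask for more than the vanilla approach (k), and always for at least one hit.
        return Math.max(1, Math.min(k, perSegment));
    }
}
```

With `margin` large enough the estimate simply saturates at `K`, which is the vanilla collect-K-per-segment behaviour. For a correlated index such as a time series, the proportional estimate would undercount the segments that hold most of the best hits, which is why the assumption matters.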


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


