Posted to issues@lucene.apache.org by "Mayya Sharipova (Jira)" <ji...@apache.org> on 2022/05/25 20:42:00 UTC

[jira] [Created] (LUCENE-10592) Should we build HNSW graph on the fly during indexing

Mayya Sharipova created LUCENE-10592:
----------------------------------------

             Summary: Should we build HNSW graph on the fly during indexing
                 Key: LUCENE-10592
                 URL: https://issues.apache.org/jira/browse/LUCENE-10592
             Project: Lucene - Core
          Issue Type: Improvement
            Reporter: Mayya Sharipova


Currently, when we index vectors for KnnVectorField, we buffer those vectors in memory, and on flush, during segment construction, we build an HNSW graph. Because building an HNSW graph is very expensive, this makes the flush operation take a long time. It also makes overall indexing performance quite unpredictable (the number of flushes is determined by the memory used and by the presence of concurrent searches): some indexing operations return almost instantly, while others that trigger a flush take a long time.

Building the HNSW graph on the fly as we index vectors would avoid this problem and spread the cost of HNSW graph construction evenly across indexing operations.
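
To make the trade-off concrete, here is a minimal sketch in Java that compares where the construction cost lands under the two strategies. It is not Lucene code: ToyGraph, FlushCostSketch, and all their methods are hypothetical names, and the "graph" is a brute-force stand-in that links each new vector to its nearest existing one. Real HNSW insertion is much cheaper per vector than this linear scan, so only the distribution of the cost is meant to be illustrative, not its size.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy stand-in for HNSW construction (hypothetical): each insert scans all existing nodes. */
final class ToyGraph {
    private final List<float[]> nodes = new ArrayList<>();
    private final List<Integer> nearestNeighbor = new ArrayList<>();
    long distanceComputations = 0; // proxy for graph-building cost

    void insert(float[] vector) {
        int best = -1;
        float bestDistance = Float.MAX_VALUE;
        for (int i = 0; i < nodes.size(); i++) {
            float d = squaredDistance(nodes.get(i), vector);
            distanceComputations++;
            if (d < bestDistance) {
                bestDistance = d;
                best = i;
            }
        }
        nodes.add(vector);
        nearestNeighbor.add(best); // -1 for the very first node
    }

    private static float squaredDistance(float[] a, float[] b) {
        float sum = 0f;
        for (int i = 0; i < a.length; i++) {
            float diff = a[i] - b[i];
            sum += diff * diff;
        }
        return sum;
    }
}

final class FlushCostSketch {
    /** Current behavior: adds only buffer; flush builds the graph. Returns {costDuringAdds, costAtFlush}. */
    static long[] bufferThenBuild(List<float[]> vectors) {
        List<float[]> buffer = new ArrayList<>();
        for (float[] v : vectors) {
            buffer.add(v); // no graph work while indexing
        }
        ToyGraph graph = new ToyGraph();
        for (float[] v : buffer) {
            graph.insert(v); // all graph work happens at flush
        }
        return new long[] {0L, graph.distanceComputations};
    }

    /** Proposed behavior: each add inserts into the graph. Returns {costDuringAdds, costAtFlush}. */
    static long[] buildOnTheFly(List<float[]> vectors) {
        ToyGraph graph = new ToyGraph();
        for (float[] v : vectors) {
            graph.insert(v); // graph work is paid per indexing operation
        }
        // In this sketch, flush only has to write out the already-built graph.
        return new long[] {graph.distanceComputations, 0L};
    }

    /** Deterministic two-dimensional sample vectors, for illustration only. */
    static List<float[]> sampleVectors(int count) {
        List<float[]> vectors = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            vectors.add(new float[] {i, i * 0.5f});
        }
        return vectors;
    }

    public static void main(String[] args) {
        long[] buffered = bufferThenBuild(sampleVectors(100));
        long[] onTheFly = buildOnTheFly(sampleVectors(100));
        System.out.println("buffer then build: adds=" + buffered[0] + " flush=" + buffered[1]);
        System.out.println("build on the fly:  adds=" + onTheFly[0] + " flush=" + onTheFly[1]);
    }
}
```

The total work is the same in both cases; what changes is whether it is paid in one burst at flush time or in small pieces on each add.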



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
