Posted to issues@lucene.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2021/09/25 01:03:00 UTC

[jira] [Commented] (LUCENE-10109) Increase default 'beam width' for HNSW

    [ https://issues.apache.org/jira/browse/LUCENE-10109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17420035#comment-17420035 ] 

ASF subversion and git services commented on LUCENE-10109:
----------------------------------------------------------

Commit eaa421094d3efbba7b8961616402dc1f49ead485 in lucene's branch refs/heads/main from Julie Tibshirani
[ https://gitbox.apache.org/repos/asf?p=lucene.git;h=eaa4210 ]

LUCENE-10109: Bump default beam width for HNSW (#312)

Lucene90HnswVectorsFormat has a default 'beam width' of 16. This is quite low
and produces poor recall on typical-sized datasets.

This commit bumps it to 100. The new default tries to balance good search
performance with indexing speed. Most runs in ann-benchmarks set the parameter
between ~400 and 800, but those runs heavily favor search performance over
indexing speed.
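
To make the trade-off concrete, here is a minimal, self-contained sketch of a
beam-limited best-first graph search. It is a toy, not Lucene code: the class
BeamSearchSketch, its methods, and the four-node data set are all hypothetical
and chosen only for illustration. It assumes the usual HNSW behavior, in which
index construction finds neighbors for each new node with this kind of search
and the beam width caps the size of the candidate result set. The sketch shows
only the search step, not graph construction.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

/** Toy illustration only. This is not Lucene code and does not use the Lucene API. */
final class BeamSearchSketch {

  // Hypothetical 1-D data set: node i sits at position POINTS[i].
  static final double[] POINTS = {0.0, 6.0, 3.0, 9.5};
  // Hypothetical neighbor lists. Node 1 is a dead end; node 3 is reachable only via node 2.
  static final int[][] NEIGHBORS = {{1, 2}, {0}, {0, 3}, {2}};

  /** Outcome of one search: the closest node found and how many nodes were scored. */
  static final class Result {
    final int nearest;
    final int scored;

    Result(int nearest, int scored) {
      this.nearest = nearest;
      this.scored = scored;
    }
  }

  static double dist(double[] points, int node, double query) {
    return Math.abs(points[node] - query);
  }

  /** Best-first search that keeps at most beamWidth nodes in its result set. */
  static Result search(double[] points, int[][] neighbors, int entry, double query, int beamWidth) {
    Comparator<Integer> byDistance = Comparator.comparingDouble(i -> dist(points, i, query));
    PriorityQueue<Integer> candidates = new PriorityQueue<>(byDistance); // closest first
    PriorityQueue<Integer> results = new PriorityQueue<>(byDistance.reversed()); // farthest first
    boolean[] seen = new boolean[points.length];

    seen[entry] = true;
    candidates.add(entry);
    results.add(entry);
    int scored = 1;

    while (!candidates.isEmpty()) {
      int current = candidates.poll();
      // Stop once the beam is full and the best open candidate cannot improve it.
      if (results.size() >= beamWidth
          && dist(points, current, query) > dist(points, results.peek(), query)) {
        break;
      }
      for (int next : neighbors[current]) {
        if (seen[next]) {
          continue;
        }
        seen[next] = true;
        scored++; // one distance computation per newly seen node
        double d = dist(points, next, query);
        if (results.size() < beamWidth || d < dist(points, results.peek(), query)) {
          candidates.add(next);
          results.add(next);
          if (results.size() > beamWidth) {
            results.poll(); // evict the farthest node to keep the beam bounded
          }
        }
      }
    }

    // Pick the closest node that survived in the beam.
    int best = results.peek();
    for (int node : results) {
      if (dist(points, node, query) < dist(points, best, query)) {
        best = node;
      }
    }
    return new Result(best, scored);
  }

  public static void main(String[] args) {
    for (int beamWidth : new int[] {1, 2}) {
      Result r = search(POINTS, NEIGHBORS, 0, 10.0, beamWidth);
      System.out.println(
          "beamWidth=" + beamWidth + " nearest=" + r.nearest + " scored=" + r.scored);
    }
  }
}
```

On this toy graph with query 10.0 and entry node 0, a beam width of 1 stops at
the dead-end node 1 after scoring 3 nodes, while a beam width of 2 reaches the
true nearest node 3 after scoring 4. A wider beam costs more distance
computations per search but is less likely to get stuck in a local minimum,
which is the same tension between indexing speed and recall described above.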

> Increase default 'beam width' for HNSW
> --------------------------------------
>
>                 Key: LUCENE-10109
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10109
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Julie Tibshirani
>            Priority: Minor
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> {{Lucene90HnswVectorsFormat}} has a default 'beam width' of 16. This is quite low and doesn't produce good recall on typical-sized datasets. Lucene's 'beam width' roughly corresponds to the efConstruction parameter in HNSW. As a reference, the runs in ann-benchmarks set efConstruction between ~400 and 800; the most common value seems to be 500.
> I think we should bump the default for beam width to something like 100 or 500 to produce decent results out-of-the-box. I don't think it will slow down tests too much, but we could always set it lower in tests if necessary.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
