Posted to issues@lucene.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2022/05/12 19:23:00 UTC

[jira] [Commented] (LUCENE-10527) Use bigger maxConn for last layer in HNSW

    [ https://issues.apache.org/jira/browse/LUCENE-10527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17536299#comment-17536299 ] 

ASF subversion and git services commented on LUCENE-10527:
----------------------------------------------------------

Commit ea5c40686f87bc74b6ac6c1af27ca10d1698a451 in lucene's branch refs/heads/main from Mayya Sharipova
[ https://gitbox.apache.org/repos/asf?p=lucene.git;h=ea5c40686f8 ]

LUCENE-10527 Use 2*maxConn for last layer in HNSW (#872)

The original HNSW paper (https://arxiv.org/pdf/1603.09320.pdf) suggests
to use a different maxConn for the upper layers vs. the bottom one
(which contains the full neighborhood graph). Specifically, they
suggest using maxConn=M for upper layers and maxConn=2*M for the bottom.

This patch ensures that we follow this recommendation and use
maxConn=2*M for the bottom layer.
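
A hedged illustration of this rule (a minimal, self-contained sketch with hypothetical names, not the actual Lucene internals):

{code:java}
/**
 * Sketch of the per-layer cap recommended by the paper: maxConn = M on
 * the upper layers and 2*M on the bottom layer (level 0), which holds
 * the full neighborhood graph.
 */
public class MaxConnPolicy {
  private final int m; // the user-configured maxConn (M in the paper)

  public MaxConnPolicy(int m) {
    this.m = m;
  }

  public int maxConnOnLevel(int level) {
    return level == 0 ? 2 * m : m;
  }

  public static void main(String[] args) {
    MaxConnPolicy policy = new MaxConnPolicy(32);
    System.out.println(policy.maxConnOnLevel(3)); // upper layer -> 32
    System.out.println(policy.maxConnOnLevel(0)); // bottom layer -> 64
  }
}
{code}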

> Use bigger maxConn for last layer in HNSW
> -----------------------------------------
>
>                 Key: LUCENE-10527
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10527
>             Project: Lucene - Core
>          Issue Type: Task
>            Reporter: Julie Tibshirani
>            Assignee: Mayya Sharipova
>            Priority: Minor
>         Attachments: image-2022-04-20-14-53-58-484.png
>
>          Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> Recently I was rereading the HNSW paper ([https://arxiv.org/pdf/1603.09320.pdf]) and noticed that they suggest using a different maxConn for the upper layers vs. the bottom one (which contains the full neighborhood graph). Specifically, they suggest using maxConn=M for upper layers and maxConn=2*M for the bottom. This differs from what we do, which is to use maxConn=M for all layers.
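> To make the suggestion concrete, here is a minimal hedged sketch (hypothetical names, not actual Lucene code) of where the per-level cap would apply when a new node's neighbors are selected:
> {code:java}
> import java.util.Arrays;
> 
> /**
>  * Illustrative only: candidates arrive sorted best-first; the paper's
>  * diversity heuristic is omitted for brevity.
>  */
> public class NeighborSelection {
>   private final int maxConn; // the configured M
> 
>   public NeighborSelection(int maxConn) {
>     this.maxConn = maxConn;
>   }
> 
>   public int[] selectForLevel(int level, int[] sortedCandidates) {
>     int cap = (level == 0) ? 2 * maxConn : maxConn; // 2*M only on layer 0
>     return Arrays.copyOf(sortedCandidates, Math.min(cap, sortedCandidates.length));
>   }
> }
> {code}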
> I tried updating our logic using a hacky patch, and noticed an improvement in latency for higher recall values (which is consistent with the paper's observation):
> *Results on glove-100-angular*
> Parameters: M=32, efConstruction=100
> !image-2022-04-20-14-53-58-484.png|width=400,height=367!
> As we'd expect, indexing becomes a bit slower (about 29% in this run):
> {code:java}
> Baseline: Indexed 1183514 documents in 733s 
> Candidate: Indexed 1183514 documents in 948s{code}
> When we benchmarked Lucene HNSW against hnswlib in LUCENE-9937, we noticed a big difference in recall for the same settings of M and efConstruction. (Even adding graph layers in LUCENE-10054 didn't really affect recall.) With this change, the recall is now very similar:
> *Results on glove-100-angular*
> Parameters: M=32, efConstruction=100
> {code:java}
> k            Approach                                                  Recall     QPS
> 10           luceneknn dim=100 {'M': 32, 'efConstruction': 100}        0.563     4410.499
> 50           luceneknn dim=100 {'M': 32, 'efConstruction': 100}        0.798     1956.280
> 100          luceneknn dim=100 {'M': 32, 'efConstruction': 100}        0.862     1209.734
> 500          luceneknn dim=100 {'M': 32, 'efConstruction': 100}        0.958      341.428
> 800          luceneknn dim=100 {'M': 32, 'efConstruction': 100}        0.974      230.396
> 1000         luceneknn dim=100 {'M': 32, 'efConstruction': 100}        0.980      188.757
> 10           hnswlib ({'M': 32, 'efConstruction': 100})                0.552    16745.433
> 50           hnswlib ({'M': 32, 'efConstruction': 100})                0.794     5738.468
> 100          hnswlib ({'M': 32, 'efConstruction': 100})                0.860     3336.386
> 500          hnswlib ({'M': 32, 'efConstruction': 100})                0.956      832.982
> 800          hnswlib ({'M': 32, 'efConstruction': 100})                0.973      541.097
> 1000         hnswlib ({'M': 32, 'efConstruction': 100})                0.979      442.163
> {code}
> I think it'd be nice to update maxConn so that we faithfully implement the paper's algorithm. This is probably least surprising for users, and I don't see a strong reason to take a different approach from the paper? Let me know what you think!


