Posted to issues@lucene.apache.org by "Balmukund Mandal (Jira)" <ji...@apache.org> on 2022/05/09 17:00:00 UTC

[jira] [Commented] (LUCENE-9625) Benchmark KNN search with ann-benchmarks

    [ https://issues.apache.org/jira/browse/LUCENE-9625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17533911#comment-17533911 ] 

Balmukund Mandal commented on LUCENE-9625:
------------------------------------------

I was trying to run the benchmark and have a couple of questions. Indexing takes a long time, so is there a way to configure the benchmark to search an already existing index? Also, is there a way to configure the benchmark to use multiple threads for indexing? It looks to me like indexing is single-threaded.

> Benchmark KNN search with ann-benchmarks
> ----------------------------------------
>
>                 Key: LUCENE-9625
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9625
>             Project: Lucene - Core
>          Issue Type: New Feature
>            Reporter: Michael Sokolov
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> In addition to benchmarking with luceneutil, it would be good to be able to make use of ann-benchmarks, which is publishing results from many approximate knn algorithms, including the hnsw implementation from its authors. We don't expect to challenge the performance of these native code libraries; however, it would be good to know just how far off we are.
> I started looking into this and posted a fork of ann-benchmarks that uses the KnnGraphTester class to run these: https://github.com/msokolov/ann-benchmarks. It's still a WIP; you have to manually copy jars and the KnnGraphTester.class to the test host machine rather than downloading from a distribution. KnnGraphTester needs some modifications in order to support this process - this issue is mostly about that.
> One thing I noticed is that some of the index builds with higher fanout (efConstruction) settings time out at 2h (on an AWS c5 instance). This is concerning, and I'll open a separate issue for trying to improve that.


