Posted to issues@lucene.apache.org by "Michael Sokolov (Jira)" <ji...@apache.org> on 2020/11/30 13:59:00 UTC

[jira] [Created] (LUCENE-9625) Benchmark KNN search with ann-benchmarks

Michael Sokolov created LUCENE-9625:
---------------------------------------

             Summary: Benchmark KNN search with ann-benchmarks
                 Key: LUCENE-9625
                 URL: https://issues.apache.org/jira/browse/LUCENE-9625
             Project: Lucene - Core
          Issue Type: New Feature
            Reporter: Michael Sokolov


In addition to benchmarking with luceneutil, it would be good to be able to use ann-benchmarks, which publishes results for many approximate KNN algorithms, including the HNSW implementation from its authors. We don't expect to match the performance of these native-code libraries, but it would be good to know just how far off we are.
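
For readers unfamiliar with how approximate KNN results are usually scored, here is a minimal sketch of a simplified, id-based recall@k against brute-force ground truth. The function names and the toy data are hypothetical and are not taken from ann-benchmarks or Lucene; the metric definitions in ann-benchmarks itself may differ in detail.

```python
import math

def exact_knn(query, vectors, k):
    """Brute force: ids of the k vectors closest to query (Euclidean distance)."""
    ranked = sorted(range(len(vectors)), key=lambda i: math.dist(query, vectors[i]))
    return ranked[:k]

def recall_at_k(approx_ids, exact_ids):
    """Fraction of the true nearest neighbors that the approximate search returned."""
    if not exact_ids:
        return 0.0
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)

# Toy data; a real run would use a benchmark dataset and a real index.
vectors = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (5.0, 5.0)]
query = (0.1, 0.1)
truth = exact_knn(query, vectors, k=2)
approx = [0, 3]  # hypothetical output of an approximate index
print(recall_at_k(approx, truth))
```

A benchmark would typically report this recall alongside query throughput, so that implementations can be compared at the same accuracy level.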

I started looking into this and posted a fork of ann-benchmarks that uses the KnnGraphTester class to run these benchmarks: https://github.com/msokolov/ann-benchmarks. It's still a WIP; you have to manually copy the jars and KnnGraphTester.class to the test host machine rather than downloading them from a distribution. KnnGraphTester needs some modifications to support this process - this issue is mostly about that.

One thing I noticed is that some of the index builds with higher fanout (efConstruction) settings time out at 2h (on an AWS c5 instance). This is concerning, and I'll open a separate issue for trying to improve that.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
