Posted to dev@lucene.apache.org by Michael Wechner <mi...@wyona.com> on 2021/11/01 17:53:08 UTC

VectorField renamed to KnnVectorField?

Hi

In May 2021 I wrote a vector search implementation based on Lucene 9.0.0-SNAPSHOT, using the following code

FieldType vectorFieldType = VectorField.createHnswType(vector.length, VectorValues.SimilarityFunction.DOT_PRODUCT, 16, 500);
VectorField vectorField = new VectorField(VECTOR_FIELD, vector, vectorFieldType);
doc.add(vectorField);

and

class KnnWeight extends Weight {

    KnnWeight() {
        super(KnnQuery.this);
    }

    @Override
    public Scorer scorer(LeafReaderContext context) throws IOException {
        log.debug("Get scorer.");
        return new TopDocScorer(this, context.reader().searchNearestVectors(field, vector, topK, fanout));
    }

    // ... remaining Weight methods not shown
}

where fanout is of type "int".
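
For readers unfamiliar with the arguments passed to searchNearestVectors above: vector is the query vector and topK is the number of nearest neighbours to return. Lucene answers such queries approximately, using an HNSW graph. The following is only a minimal brute-force sketch of the same idea in plain Java with no Lucene dependency. The names BruteForceKnn and Hit are hypothetical, made up for illustration, and the sketch ignores the fanout parameter entirely.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

/** Brute-force top-K search by dot product. Illustration only, not Lucene code. */
final class BruteForceKnn {

    /** A vector's position in the input array paired with its similarity score. */
    static final class Hit {
        final int doc;
        final float score;

        Hit(int doc, float score) {
            this.doc = doc;
            this.score = score;
        }
    }

    /** Plain dot product of two vectors of equal dimension. */
    static float dotProduct(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch: " + a.length + " vs " + b.length);
        }
        float sum = 0f;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    /** Returns up to topK hits with the highest dot product against the query, best first. */
    static List<Hit> search(float[][] vectors, float[] query, int topK) {
        if (topK <= 0) {
            return new ArrayList<>();
        }
        Comparator<Hit> byScore = Comparator.comparingDouble((Hit h) -> h.score);
        // Min-heap on score: the weakest of the current best hits sits on top.
        PriorityQueue<Hit> heap = new PriorityQueue<>(byScore);
        for (int doc = 0; doc < vectors.length; doc++) {
            heap.add(new Hit(doc, dotProduct(vectors[doc], query)));
            if (heap.size() > topK) {
                heap.poll(); // drop the weakest hit so the heap never exceeds topK
            }
        }
        List<Hit> hits = new ArrayList<>(heap);
        hits.sort(byScore.reversed()); // highest score first
        return hits;
    }
}
```

Calling BruteForceKnn.search(vectors, query, 10) scans every vector, which is exactly the cost an HNSW index is meant to avoid. The sketch is only useful for understanding what the result of a top-K query looks like, or as a baseline to compare approximate results against.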

I have now updated the Lucene source and rebuilt 9.0.0-SNAPSHOT, and I get various compile errors.

I assume VectorField got renamed to KnnVectorField, right?

Does anybody have sample code showing how vector search is implemented with the most recent Lucene code?

Thanks

Michael


Re: VectorField renamed to KnnVectorField?

Posted by Michael Wechner <mi...@wyona.com>.
Hi Vigya

Great, thank you very much for these links!

All the best

Michael

On 02.11.21 at 01:45, Vigya Sharma wrote:
> Hi Michael,
>
> Glad you got it working. There is also a KNN vector search demo that
> was added not long ago. You might want to check it out. It has
> examples of, among other things, how to compute embeddings and build
> KNN vector queries
> <https://github.com/apache/lucene/blob/main/lucene/demo/src/java/org/apache/lucene/demo/SearchFiles.java#L272-L292>.


Re: VectorField renamed to KnnVectorField?

Posted by Vigya Sharma <vi...@gmail.com>.
Hi Michael,

Glad you got it working. There is also a KNN vector search demo that was
added not long ago. You might want to check it out. It has examples of,
among other things, how to compute embeddings and build KNN vector queries
<https://github.com/apache/lucene/blob/main/lucene/demo/src/java/org/apache/lucene/demo/SearchFiles.java#L272-L292>.

   - Search Files:
   https://github.com/apache/lucene/blob/main/lucene/demo/src/java/org/apache/lucene/demo/SearchFiles.java

   - Index Files:
   https://github.com/apache/lucene/blob/main/lucene/demo/src/java/org/apache/lucene/demo/IndexFiles.java

   - knn demo files:
   https://github.com/apache/lucene/tree/main/lucene/demo/src/java/org/apache/lucene/demo/knn
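
For anyone wondering what "computing an embedding" can look like in its simplest form: one common technique is to average pre-trained word vectors over the words of a text. Whether the demo does exactly this is an assumption, so check the linked knn demo files for the actual approach. The sketch below is plain Java with no Lucene dependency; the class name AveragedEmbedding and the wordVectors lookup table are hypothetical.

```java
import java.util.Locale;
import java.util.Map;

/** Averages per-word vectors into one text embedding. Illustration only. */
final class AveragedEmbedding {

    /**
     * @param text        the text to embed
     * @param wordVectors hypothetical lookup table from lowercase word to its vector;
     *                    every vector is assumed to have exactly {@code dimension} entries
     * @param dimension   length of the returned embedding
     * @return the mean of the vectors of all known words, or a zero vector if none is known
     */
    static float[] embed(String text, Map<String, float[]> wordVectors, int dimension) {
        float[] sum = new float[dimension];
        int known = 0;
        // Naive tokenization: lowercase, then split on anything that is not a letter or digit.
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{Nd}]+")) {
            float[] v = wordVectors.get(token);
            if (v == null) {
                continue; // unknown word (or empty token): contributes nothing
            }
            for (int i = 0; i < dimension; i++) {
                sum[i] += v[i];
            }
            known++;
        }
        if (known > 0) {
            for (int i = 0; i < dimension; i++) {
                sum[i] /= known;
            }
        }
        return sum;
    }
}
```

The resulting float[] is the kind of value that would be passed as the vector when indexing a document or building a KNN query.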


- Vigya


On Mon, Nov 1, 2021 at 2:44 PM Michael Wechner <mi...@wyona.com>
wrote:

> I was able to update my code
>
> -        FieldType vectorFieldType = VectorField.createHnswType(vector.length, VectorValues.SimilarityFunction.DOT_PRODUCT, 16, 500);
> -        VectorField vectorField = new VectorField(VECTOR_FIELD, vector, vectorFieldType);
> +        FieldType vectorFieldType = KnnVectorField.createFieldType(vector.length, VectorSimilarityFunction.DOT_PRODUCT);
> +        KnnVectorField vectorField = new KnnVectorField(VECTOR_FIELD, vector, vectorFieldType);
>
> and
>
> -            return new TopDocScorer(this, context.reader().searchNearestVectors(field, vector, topK, fanout));
> +            return new TopDocScorer(this, context.reader().searchNearestVectors(field, vector, topK, null));
>
> Indexing and searching work again :-)

-- 
- Vigya

Re: VectorField renamed to KnnVectorField?

Posted by Michael Wechner <mi...@wyona.com>.
I was able to update my code

-        FieldType vectorFieldType = VectorField.createHnswType(vector.length, VectorValues.SimilarityFunction.DOT_PRODUCT, 16, 500);
-        VectorField vectorField = new VectorField(VECTOR_FIELD, vector, vectorFieldType);
+        FieldType vectorFieldType = KnnVectorField.createFieldType(vector.length, VectorSimilarityFunction.DOT_PRODUCT);
+        KnnVectorField vectorField = new KnnVectorField(VECTOR_FIELD, vector, vectorFieldType);

and

-            return new TopDocScorer(this, context.reader().searchNearestVectors(field, vector, topK, fanout));
+            return new TopDocScorer(this, context.reader().searchNearestVectors(field, vector, topK, null));

Indexing and searching work again :-)
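
One thing to watch with DOT_PRODUCT: as far as I know, the Lucene 9 javadoc for VectorSimilarityFunction.DOT_PRODUCT describes it as an optimized form of cosine similarity that expects all vectors, both document and query, to be of unit length. Please verify that against the version you build. If the vectors are not already normalized, a minimal plain-Java sketch for scaling them could look like this; the class name UnitVectors is hypothetical and nothing here calls into Lucene.

```java
/** Scales a vector to unit length so that dot product equals cosine similarity. */
final class UnitVectors {

    /** Returns a unit-length copy of v; the input array is left unchanged. */
    static float[] normalize(float[] v) {
        // Accumulate in double to limit rounding error for long vectors.
        double sumOfSquares = 0.0;
        for (float x : v) {
            sumOfSquares += (double) x * x;
        }
        if (sumOfSquares == 0.0) {
            throw new IllegalArgumentException("cannot normalize a zero vector");
        }
        float inverseNorm = (float) (1.0 / Math.sqrt(sumOfSquares));
        float[] unit = new float[v.length];
        for (int i = 0; i < v.length; i++) {
            unit[i] = v[i] * inverseNorm;
        }
        return unit;
    }
}
```

With this, the vector would be passed through UnitVectors.normalize(vector) both before it is wrapped in the field at indexing time and before it is used as the query vector at search time.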

Thanks

Michael
